Apr 07 2013
 

While doing some updates to the blog, such as adding author information and other details that will hopefully help legitimize things, I spent about two hours trying to figure out why a shortcode in the theme I use (Suffusion) would not parse arguments. The tag suffusion-the-author would display the author name, but if I added an argument to make it display the description or a link instead of just the name, it would always fail and revert to the author name only. Some quick googling told me that this didn't seem to be a problem anyone else had, and after four major versions of the theme, I'm pretty sure someone would have noticed if a basic feature were entirely nonfunctional. After lots of debugging, I discovered that the problem was happening inside WordPress itself, in the definition of shortcode_parse_atts(), which parses the attributes matched in the shortcode into an array for use inside the callback. For me, instead of producing that array, it discarded all of the tag arguments and returned an empty string. In the end, I traced the issue to a regular expression that appears to be designed to replace certain types of unicode spaces with normal spaces, to simplify the argument-parsing regular expression that comes after it:

$text = preg_replace("/[\x{00a0}\x{200b}]+/u", " ", $text);

On my system, this would simply set $text to an empty string, every time. Initially, I wanted to just comment it out, because I wasn't expecting to see any of these characters inside tags, but that bothered me: this has been a part of the WordPress code for a very long time, and nobody else appears to have problems with it. Finally, I concluded that this must mean that unicode support was not enabled, so I began searching through yum. Bad news: mbstring and pcre were both already installed, and php -i told me that php was built with multibyte string support enabled, as well as pcre regex support. I tested my pcre libraries and they claimed to support unicode as well.
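If you want a more direct check than reading php -i, you can run the offending replacement itself from the command line and see what comes back (this is just the WordPress pattern applied to a throwaway string):

[root@thelonepole /]# php -r 'var_dump(preg_replace("/[\x{00a0}\x{200b}]+/u", " ", "author=link"));'

A working install prints string(11) "author=link"; getting NULL (usually accompanied by a warning about UTF-8 support) or an empty string back points at the same problem I was having.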

In the end, I solved the problem by updating php and pcre to the latest version, which was an update from php-5.3.13 to php-5.3.24 in my case, and pcre was updated from 7.x to 8.21. It appears that there was some kind of incompatibility between php 5.3 and the 7.x branch of pcre (if you build php 5.3 from source, it will target pcre 8.x), which prevented me from using unicode despite support for it being present everywhere. So, if you have trouble getting your php installation to handle unicode inside regular expressions, and you’re running on Amazon Web Services, make sure you update to the latest versions of php and pcre, as there appear to be some issues with older packages.

Jul 24 2011
 

After setting up a basic web server, I wanted to secure access to privileged resources using SSL. Normally, if I were dealing with customers or others who need to establish some kind of trust with me from the outside world, I would have to purchase a properly signed and verified SSL certificate from VeriSign or another Certificate Authority. However, because my goal is to secure internal resources that must be exposed publicly due to the nature of the internet (webmail and phpMyAdmin), it is cheaper (read: free) to create and sign my own certificates. End users do not need to trust me, and because I can personally verify the origin of my CA certificate, I do not need one signed by an external authority. I can then use this to sign additional certificates in order to create an area of my server that requires a user to present a certificate signed by my personal CA certificate in order to proceed, allowing very tight access control over sensitive areas.

Using the 64-bit Basic Linux AMI, OpenSSL is already installed. The configuration file, openssl.cnf, is located at /etc/pki/tls/openssl.cnf. Luckily, it has a pretty reasonable set of defaults, and in particular the directory tree is already set up, with appropriate permissions (0700) on the directory used for storing the CA private key. Because openssl.cnf points to the CA certificate and the private key directly, I chose to rename my keys from the default values and modified the values for certificate and private_key to reflect this. With the basic OpenSSL configuration complete, it was time to create a self-signed CA certificate, set up the directory structure, and configure apache to use mod_ssl.

Signing the CA Certificate

Creating a certificate that you can use to sign other requests is very straightforward:

[root@thelonepole /]# openssl req -newkey rsa:2048 -keyout tlpCA.key -x509 -days 365 -out tlpCA.crt

This creates a new X.509 format certificate along with its private key. It is automatically self-signed (this form of the command combines the key generation step, certificate request step, and self-signing step into a single command). My certificate requires that a password be entered when it is used to sign a certificate request, but for a small private server with exactly one trusted user the password can be removed by adding the -nodes flag, for convenience. You may also want to change the number of days for which the certificate is valid so that you don’t have to recreate it every year (for a CA certificate, you might as well set this to 10 or more years).
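For example, a CA certificate good for ten years, with no passphrase on the key, could be generated like this (names and lifetime are just my choices; adjust to taste):

[root@thelonepole /]# openssl req -newkey rsa:2048 -nodes -keyout tlpCA.key -x509 -days 3650 -out tlpCA.crt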

Once the certificate and key were ready, I moved them to the locations specified earlier in openssl.cnf for certificate and private_key. Then, I had to create the file used to track the next available serial number. For some reason this isn't created by default, possibly because of some obscure vulnerability when the next available serial number is known, but we are in a relatively low security situation here, so starting at 1 is pretty reasonable (note that openssl wants the serial written as an even number of hex digits, hence 01):

[root@thelonepole /]# touch /etc/pki/CA/serial
[root@thelonepole /]# echo 01 > /etc/pki/CA/serial

If this step is omitted, openssl will complain when it comes time to sign certificate requests because it does not know which serial number to fill in.
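Depending on your openssl.cnf, you may also need to create the (initially empty) database file that openssl ca uses to record issued certificates; on this layout it sits next to the serial file:

[root@thelonepole /]# touch /etc/pki/CA/index.txt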

Signing Other Certificates

At this point, with the CA certificate in place, it is easy to create and sign certificate requests. It is necessary to create a different certificate for each subdomain that you wish to use SSL with, because the certificate includes the domain that it is valid for (the Common Name entered when generating the request). If this doesn't match the ServerName directive, apache will generate a warning at startup, and users (even those who have added your CA certificate to their trusted store, i.e. you) will also see a warning in their browsers when using SSL. Creating and signing a certificate is done with two commands:

[root@thelonepole /]# openssl req -newkey rsa:2048 -nodes -keyout webmail.key -out webmail.csr
[root@thelonepole /]# openssl ca -in webmail.csr -out webmail.crt

Note that the certificates created for use with apache should be generated WITHOUT a password, using the -nodes flag. If you don't do this, you'll be prompted for a password each time you start apache, which is a huge problem if your server goes down and/or is automatically restarted and you aren't available to enter the password. The certificates generated by the above commands will also only be valid for the period of time specified in your openssl.cnf. If you want them to last longer, be sure to specify -days XXX.
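To double-check that a freshly signed certificate actually chains back to your CA certificate (using the file names from the examples above; point -CAfile at wherever your CA certificate lives), openssl verify will tell you immediately:

[root@thelonepole /]# openssl verify -CAfile tlpCA.crt webmail.crt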

Copy these certificates to a known location, such as /etc/pki/tls, and note their location so that it can be used when configuring apache. You will also want to create a certificate (in exactly the same way) for yourself, which will be used for client authentication to the SSL-enabled subdomains. However, for a client certificate there is an additional step: it must be converted from X.509 format to the PKCS #12 format that is often the only one browsers understand. The single output file combines both the signed certificate and the private key, and you will be carrying it around on any computer or device that you expect to use to access your secure areas, so BE SURE to use a strong export password when prompted during this step. The conversion can be done with one command, which takes in your private key and signed certificate and exports them as a client certificate in PKCS #12 format:

[root@thelonepole /]# openssl pkcs12 -export -clcerts -out greg.p12 -in greg.crt -inkey greg.key

Then, transfer the PKCS #12 file to your computer (I used sftp) and install it in your web browser. Under Chrome, I had the option to install it specifically as a key for client authentication, but I am not sure whether all browsers categorize it in this manner. You will also need to install a copy of your CA certificate into the trusted root certificate store, or the browser will not accept the client certificate. Luckily you can simply download the X.509 format certificate from your server and tell your browser about it without converting it (which wouldn't make sense anyway, because it'd require the private key…).

Configuring mod_ssl

Finally, there are several directives to add to the VirtualHost entries in order to configure them to use SSL. First, if you intend to only allow SSL to be used for a specific subdomain, like I do with my webmail, create an entry for normal HTTP that redirects all traffic to HTTPS:

<VirtualHost *:80>
	ServerName webmail.thelonepole.com
	Redirect permanent / https://webmail.thelonepole.com
</VirtualHost>

Next, a VirtualHost entry must be created that has the desired SSL settings, listening on port 443. Obviously, it must include SSLEngine On in order to enable SSL, but there are actually several extra directives that should be included in order to have client authentication work. Be sure to set SSLCACertificateFile to point to your local CA certificate and enable client authentication with SSLVerifyClient. In order to require that clients use certificates directly signed by your CA certificate, set SSLVerifyDepth to 1. I’ve included a complete VirtualHost entry as an example, based on the configuration I use here.

<VirtualHost *:443>
	ServerAdmin greg@thelonepole.com
	DocumentRoot /var/www/webmail
	ServerName webmail.thelonepole.com
	ErrorLog logs/webmail-error_log
	CustomLog logs/webmail-access_log combined
	SSLEngine On
	SSLCACertificateFile /path/to/CA/tlpCA.crt
	SSLCertificateFile /path/to/apache/ssl/webmail.crt
	SSLCertificateKeyFile /path/to/apache/ssl/webmail.key
	SSLVerifyClient require
	SSLVerifyDepth 1
	SetEnvIf User-Agent ".*MSIE.*" nokeepalive ssl-unclean-shutdown
	CustomLog logs/webmail-ssl_request_log "%t %h %{SSL_PROTOCOL}x %{SSL_CIPHER}x \"%r\" %b"
</VirtualHost>

At this point, simply restart apache and, if DNS is already set up, you should be prompted for a client certificate when accessing your subdomain. If you have not added your certificate to the browser's store (or want to verify that a client certificate really is required by using an unconfigured browser), you should instead see a message notifying you that an error with code “ssl_error_handshake_failure_alert” happened during SSL negotiation.
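You can also exercise the handshake from the command line with openssl s_client, presenting the PEM certificate and key that went into the PKCS #12 file (file names as in the earlier examples):

[root@thelonepole /]# openssl s_client -connect webmail.thelonepole.com:443 -CAfile tlpCA.crt -cert greg.crt -key greg.key

Running the same command without -cert and -key should fail the handshake, which is a quick way to confirm that SSLVerifyClient require is actually being enforced.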

Jul 10 2011
 

Cloud hosting differs from a normal unmanaged VPS because of the way that server instances are started and execute. Server data does not automatically persist between instances, and each instance has a unique IP address and default storage. With services like Amazon Web Services, it's possible to have persistent storage that is attached to an instance after booting, but the instance will always create a new root device based on the snapshot associated with the Amazon Machine Image (AMI) that was selected to be run. Living within the cloud also changes how a server is accessed, because all traffic must pass through the Amazon firewall first. As a consequence, configuring a basic AWS instance is a bit different from a normal unmanaged VPS, so, having recently set up my own server, I'll describe the entire process for creating a basic web hosting server to run a small blog (such as this one).

Launching an Instance

The AWS Management Console is much easier to use than the command line tools because it displays all of the options for you, so I would recommend using it when possible. As a new user, I did everything through the AWS Management Console except packaging an AMI from a snapshot, which can only be done with the command line tools. For the first instance you launch, if you plan on making use of the free usage tier (perfect for a small website, although you will likely run $1-$2/month in storage costs for your image snapshots unless you decrease the size of your root device when you create your own AMI), select one of the two Basic Amazon Linux AMIs. After choosing an AMI, you should choose which availability zone to launch in. For the first instance you ever use, you can leave this as “No Preference,” but if you plan on attaching persistent storage in the future, you must note which availability zones contain your persistent EBS Volumes and launch your server instance in the same zone. Snapshots, however, are available for launching AMIs in all zones. Currently, I'm running a 64-bit micro instance in the us-east-1b zone, because the micro instance is well suited to my needs: infrequent accesses (low traffic) with occasional usage spikes, during which up to two virtual cores are available.

On the advanced options page, I chose to use all of the default settings. The AWS guidelines recommend turning on instance termination protection (which is free), but I didn't bother because the AMI defaults to a Stopped state on shutdown, which means that the EBS volume created for the instance will not be deleted and the instance can be restarted without data loss. For the first instance, I planned on creating a snapshot to use for a custom AMI anyway, so I actually wanted the instance and its associated EBS Volume to disappear by default when I terminated it. Similarly, on the tags page, because I had not configured any special cloud-init scripts, I didn't need to tag my instances with anything, although I chose to add the domain name to the instance “Name” tag.

The first time I launched the instance, I generated a keypair, which is a relatively straightforward process. Because I am using Windows, after downloading the keypair I had to import it for use with PuTTY, which is also very easy: I simply loaded the keyfile downloaded from Amazon into puttygen and saved the private key in PuTTY's internal format. I also loaded the key into pageant so that I can use it with psftp or for tunneling with svn+ssh later on. The final system that I configured allows only key-based logins (no passwords), as recommended by Amazon, so it is necessary to expose the key to all of the local programs that I will use to communicate with my server.

Finally, for the firewall, I created a new policy. This firewall uses a whitelist-only approach where all traffic is blocked by default, so you can only specify which blocks of addresses may access specific ports on the machine. This is great for a basic firewall, as you can block all traffic to a port or allow only yourself to access the machine via SSH, but it is insufficient for conveniently blocking malicious addresses without either denying access to all of your users or adding a large number of security rules manually; for that you will still need to fall back on traditional approaches such as iptables. However, for restricting access to SSH this works great, so initially I added rules for SSH to allow access only from my home machine, and I added additional rules to allow SMTP, HTTP, and HTTPS traffic from all addresses. Here, the source must be specified in CIDR notation, such as 107.20.240.83/32 for a single address or 0.0.0.0/0 for all incoming addresses.

After the instance launched, I obtained an Elastic IP address so that I could use a common IP for all of my DNS records, and assigned it to the running instance. The public DNS name of each instance is assigned at boot time and will not be the same from run to run, even for a “Stopped” instance that is restarted. Using an Elastic IP that is associated with an instance after it boots is the only way to ensure that it has a specific IP address.
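I did all of this through the Management Console, but for reference, the same thing can be scripted with the classic EC2 API tools, assuming their usual syntax (the instance ID and address below are placeholders):

C:\> ec2-allocate-address
C:\> ec2-associate-address -i i-12345678 107.20.240.83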

Installing Server Software

Now that my instance was running, I connected with PuTTY, using my private key to authenticate. The default user name on the Amazon Linux AMIs is ec2-user, which I didn't feel like using day to day. So, I created a user account for myself, greg, with the permissions that I would normally expect in order to perform basic server administration tasks:

[ec2-user@thelonepole /]$ sudo useradd -m -g users -G wheel greg

Much to my surprise, after changing to my new account, I was unable to use sudo to perform any tasks. This is because, by default on the Linux AMIs, the /etc/sudoers file does not have an active entry for wheel, even though the group exists and ec2-user is a member. I chose to simply add an entry for my new account to allow me to use sudo without a password, in the same style as ec2-user was originally configured:

greg ALL = NOPASSWD: ALL
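Alternatively, assuming your /etc/sudoers resembles the stock one, you could just enable the existing wheel entry rather than adding a per-user line; either way, make the edit through visudo so a syntax error can't lock you out of sudo entirely:

[root@thelonepole /]# visudo
## then uncomment (or add) one of the wheel lines:
# %wheel ALL=(ALL) ALL             # wheel members may sudo with a password
# %wheel ALL=(ALL) NOPASSWD: ALL   # or without one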

Now that my account was configured, it was time to install some basic server applications: Apache and MySQL, with PHP 5.3. Luckily, all of these applications are available, properly compiled for a virtualized server, from the built-in yum repository. I come from a Gentoo background, so the naming seemed a little strange here and there, as did the separation of basic features into distinct “packages” rather than USE flags, but every distribution has to be different somehow (and USE flags only really work when you're compiling from source). As root (or with sudo), use yum to install packages:

[root@thelonepole /]# yum install httpd

In total, to get a basic server with support for PHP 5.3, MySQL, and SSL connections, I had to install these packages: httpd, mod_ssl, mysql, mysql-server, php (which includes php-cli and php-common), php-gd, php-mcrypt, php-mysql, php-pdo, and php-mbstring. If you plan on using it, APC is also available as php-pecl-apc. Not all of the php modules are available, but presumably you can compile from source or from PECL if you need something that isn't packaged.
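In other words, assuming the package names above, a single yum invocation should pull everything in at once (add php-pecl-apc if you want APC):

[root@thelonepole /]# yum install httpd mod_ssl mysql mysql-server php php-gd php-mcrypt php-mysql php-pdo php-mbstring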

With the binaries installed, all of these services had to be configured. Luckily, for PHP, the default settings were pretty much perfect, as they disable register_globals and magic_quotes_gpc. The default memory limit is a bit high at 128M, but this shouldn’t actually be a problem because it doesn’t reserve this memory. A correct php.conf is added to the apache config directory, so PHP will be enabled by default next time you start the httpd service.

Next was apache. The default configuration structure for Apache on the Linux AMI is a little bit different from the Gentoo structure (which, of course, I think is more logical), so I had a few small hiccups while setting things up. All configuration is stored in /etc/httpd, with httpd.conf stored in /etc/httpd/conf/ and further generic configuration files (which are automatically included) stored in /etc/httpd/conf.d/. My DNS entries have multiple CNAMEs for the various services I host on my site, including one for webmail and one for this blog. To set this up from apache's perspective, I used the NameVirtualHost feature to enable vhosts, and then I configured all vhosts by name, NOT by IP address. This is important because the Elastic IP address used for the server is not exposed to the server instance itself: with ifconfig it is easy to see that eth0 is assigned a private address behind a NAT. Therefore, even if you put the Elastic IP into the apache configuration, it will never match any incoming traffic and the vhosts won't work, so, given the unpredictability of IP addresses for a server instance, it is most useful to bind to all incoming addresses and do all vhost configuration by name only. The end result is that my vhosts.conf file looks pretty standard, only with no IP-based vhosts:

NameVirtualHost *:80
NameVirtualHost *:443

# Default entry as fall-through, also handles www subdomain, matches main server config
<VirtualHost *:80>
	SSLEngine Off
	ServerAdmin greg@thelonepole.com
	DocumentRoot /var/www/html
	ServerName www.thelonepole.com
</VirtualHost>

<VirtualHost *:80>
	SSLEngine Off
	ServerAdmin greg@thelonepole.com
	DocumentRoot /var/www/blog
	ServerName blog.thelonepole.com
</VirtualHost>

I also set up several SSL-enabled vhosts to secure my webmail traffic, but I will talk about those another day, because their setup was almost completely routine and therefore not really a part of my experience with AWS. However, there is one important thing to mention about the default configuration file included in the mod_ssl package, which will cause a problem whether you plan on using SSL immediately or not. The default ssl.conf file includes an enabled VirtualHost entry that references several nonexistent keys. I am unsure why those keys are referenced there, but for some reason they are, so the best thing to do is either create the keys or delete the configuration. I chose the latter route because it was faster, so I removed the entire VirtualHost entry from ssl.conf. The rest of the file, which sets up global options for the SSL engine, is very useful and serves as a fine set of defaults.

With Apache successfully configured, that left only mysqld. Luckily, only the root password needs to be set from the command line; the rest can be configured from a much more straightforward interface, such as phpMyAdmin. Unfortunately, like the system root password, the MySQL root password appears to be scrambled during the mysql-server package install process. This created a huge headache for me at first, but the MySQL documentation includes useful information for skipping the grant tables at start time so that the server can be secured and the root password reset:

[root@thelonepole /]# mysqld_safe --skip-grant-tables &
[root@thelonepole /]# mysql_secure_installation

The secure installation script also let me remove anonymous access and drop the test database during setup, saving me time later on. Then, after unpacking a phpMyAdmin tarball into the webroot, it was easy to log in to MySQL as root and add additional users.

After all of the services were configured properly, I had to set them to start automatically. On CentOS-style systems this is done with the chkconfig tool, so adding mysqld and httpd to runlevels 3, 4, and 5 (if the system boots into runlevel 2, we are not going to get anywhere anyway, so it doesn't really matter) took only two commands:

[root@thelonepole /]# chkconfig --level 345 httpd on
[root@thelonepole /]# chkconfig --level 345 mysqld on
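You can verify the resulting runlevel configuration afterwards with chkconfig --list:

[root@thelonepole /]# chkconfig --list httpd
[root@thelonepole /]# chkconfig --list mysqld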

At this point, I started both Apache and MySQL with the service interface used on CentOS:

[root@thelonepole /]# service httpd start
[root@thelonepole /]# service mysqld start

and I was able to access both standard html files and phpmyadmin, as well as manage my databases. With the basic server configuration done, only one thing remained: preparing an AMI so that next time I launch an instance, I do not need to do any configuration.

Creating an AMI

AMIs are backed by snapshots of EBS volumes that were attached to running server instances, which means that an AMI cannot be created from an EBS Volume directly. In fact, you cannot create an AMI through the current AWS Management Console unless you first export a snapshot to Amazon S3, something that I have had no interest in doing. Assuming that you're able to set up the EC2 tools as described in the EC2 User Guide, it is very easy to create a new AMI from a snapshot of the EBS volume.

First, from the Management Console, I created a new snapshot of the volume attached to my running server instance, from the Volumes section of the EC2 tab. When the snapshot was ready, I opened my local command prompt and ran straight through the configuration process described in the “setting up your tools” section of the User Guide. There is one important omission from the User Guide, however: the JAVA_HOME environment variable must be set. On Windows it will usually look something like this:

C:\> set JAVA_HOME="c:\program files\java\jre6"

From here there is only one command required to create a new AMI from the snapshot:

C:\> ec2-register -n Image_Name -d Image_description --root-device-name /dev/sda1 -b /dev/sda1=[snapshot-id]:[size]

Note: Image_Name and Image_description cannot contain spaces. The Management Console doesn't seem to parse them properly anyway, so there isn't much point in putting details into the description in particular.

This creates a private AMI backed by EBS storage (creating one with instance storage is not recommended) that will start by creating an EBS volume from the given snapshot. Note that the snapshot-id is NOT its Name value (the Name field is just for human administrators), but actually the value listed in Snapshot-ID in the Management Console. The size is specified in GiB, so for a default instance based on the Amazon Basic Linux AMI it would be 8, but it can also be left blank, which will cause AWS to infer its value from the listed size of the snapshot (not disk usage). The size parameter can also be used to increase the size of a drive, which is great if you are nearing maximum capacity on your server root. If you have multiple snapshots that you need to attach at boot time, for instance if you mount /var or /home on a separate drive, then additional -b parameters should be given, such as -b /dev/sdf=[snapshot-id]. For some reason, Amazon recommends attaching secondary drives using /dev/sdf through /dev/sdp and NOT numbering partitions.
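For example, an image that also attaches a data volume at /dev/sdf at boot might be registered like this (the snapshot IDs and the 8 GiB root size are placeholders):

C:\> ec2-register -n Image_Name -d Image_description --root-device-name /dev/sda1 -b /dev/sda1=snap-11111111:8 -b /dev/sdf=snap-22222222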

Personally, I use a multiple-drive setup where I've separated the base operating system from my data, so most of /var is on a separate drive that is attached at boot time. By separating my OS from my configuration data, I can swap out the underlying system without having to migrate my data, so long as both systems are configured to look in the same place. One caveat of such a system is that because I use snapshots to generate EBS Volumes at boot, if I run two servers at once I will get two separate drives, which can create an issue for synchronizing data. I feel like there are two ways to resolve this for multi-server systems, but I haven't really explored either option fully because I don't need to run two servers for performance. (1) Set up a server that has the “real” drives attached to it and then expose them as NFS volumes to the other servers, so that they immediately see each other's modifications and do not have write conflicts. (2) Set up dedicated servers for each service, as in a normal infrastructure situation: one for handling email, one (or more) for hosting content, and one master plus multiple slaves for handling databases, along with a load balancer for directing traffic. I think (1) is acceptable as a transition solution (the drive is shared but the servers are heterogeneous, so there are no database conflicts) or in specific situations (such as serving static content on multiple IP addresses) but would not scale or survive long in a high-traffic environment. I think that (2) is more robust because it forces all data to be written in one place (so there aren't two separate databases trying to write to the same files), although it will be more expensive to run that many dedicated servers. I will revisit this in the future and explain how I've moved things like /var/www off of my main drive onto a data drive, as well as tricks for connecting to persistent EBS Volumes (not ones based off of snapshots) at boot time, using the cloud-init scripts.

This just about covers all of the basic system configuration that I had to do to get a small web server running using AWS. I also configured sendmail so that I could send and receive email, as well as an IMAP daemon (cyrus) for use with webmail (squirrelmail) and deployed self-signed SSL certificates to encrypt sensitive traffic (such as webmail and phpmyadmin). I also created my own (untrusted) CA certificate in order to sign certificates for use with client authentication, to further restrict access to the webmail and phpmyadmin services, but all of that configuration is beyond the scope of simply setting up an AWS server and will have to wait for another day.