Difference between revisions of "Umlaut Deployment"

From Code4Lib

Revision as of 15:45, 27 December 2007

So you have your Umlaut running by executing "./script/server" in the Umlaut directory, which launches it on port 3000 by default, and you connect to port 3000. This is fine for some initial confirmation that you've got your setup working, and for development, but how do you actually deploy it?

Turns out there are several possible deploy environments for a Rails application. There is not necessarily one standard or best one at the moment; different people use different ones in different circumstances.

Jonathan Rochkind at Hopkins uses mongrel, mongrel_cluster and Apache mod_proxy and mod_proxy_balancer on a unix system for his deploy environment. He went down this road because it was what was recommended by the Agile Web Development with Rails book. Georgia Tech is currently using a different deploy environment for its Umlaut version 1, but mongrel_cluster/apache is the only currently demonstrated way to deploy the current Umlaut version. We would definitely not be optimistic about running Umlaut on Windows.

Since jrochkind is writing this documentation, he can only tell you how to do it the way he did. You do need a version of Apache that includes mod_proxy_balancer (Apache 2.2 or later, as far as we know), but if you have one, jrochkind is fairly happy with the solution.

These two pages from the mongrel website, Apache Best Practice Deployment and Using Mongrel Cluster, are pretty good how-tos for mongrel. But we will also take you through it here, with specific directions, Umlaut recommendations, and the pitfalls we ran into.

Umlaut Deployment with Mongrel and Apache

There are basically two parts to getting Umlaut (or any Rails app) deployed in this setup. First is getting your Rails app running, and second is configuring Apache to connect to it properly.

There are a few decisions to make. Run just one instance of Umlaut, or run multiple load balancing instances? Because of the nature of the way Umlaut works, we strongly recommend running multiple Umlaut instances regardless of how little traffic you expect. Even in a low-traffic environment, the fact that Umlaut can take several seconds to respond to a request means that multiple instances are a good idea to keep Umlaut from seeming even slower than it is. We're completely guessing, but 3 is probably a pretty good number for just about any Umlaut site, from low to high traffic.

Also think about which unix account you want to run Umlaut as (we recommend creating a special low-privilege account), and whether your Umlaut URLs can be the base URLs for a host (i.e., findit.library.jhu.edu points directly to Umlaut), or whether you use a 'prefix' (i.e., findit.library.jhu.edu/some/path/findit). Using apache virtual hosts and mounting Umlaut at the base is typical, but the prefix can work too.

Setting up mongrel_cluster

First you've got to install mongrel and mongrel_cluster:


sudo gem install mongrel
sudo gem install mongrel_cluster


The point of mongrel_cluster is to save configuration information for multiple mongrel instances in one configuration file, so that you can start, stop, or restart them all with one command, without having to remember that config information each time (and possibly get it wrong or make a typo).

By default, mongrel_cluster keeps that configuration file in a Rails app's config/mongrel_cluster.yml. You could do that with Umlaut, but we like to keep local config files in $Umlaut/config/umlaut_config instead (see Umlaut Local Configuration Architecture), so we recommend putting it there. You can use the mongrel_rails command to write this config for you (see Using Mongrel Cluster; make sure to use the -C argument to put the config file in umlaut_config, if that's what you want), but here we'll just give you our actual mongrel_cluster.yml config, annotated. (You are certainly allowed to write it by hand.)


# Unix account to run your processes as:
user: umlaut  

# Unix group to run processes as:
group: umlaut 

# Install dir of Umlaut you want to run from:
cwd: /data/web/findit/Umlaut 
log_file: log/mongrel.log # Leave like this. 

# Start port for your instances. Any high port will do. Does NOT need
# to be open through firewall externally. 
port: 8000 
environment: production # Leave like this
address: 127.0.0.1 # Leave like this 
pid_file: tmp/pids/mongrel.pid # Leave like this

# How many instances to run. port: 8000 with servers:3 means you'll
# have a server on 8000, 8001, and 8002. 
servers: 3

# Only if you want to start at a web path other than base / :
prefix: /findit       # for instance. Start with slash, and don't end with one.

Now you can start all three of these mongrel instances by executing:

sudo mongrel_rails cluster::start -C $Umlaut/config/umlaut_config/mongrel_cluster.yml

The 'sudo' is necessary because we've told mongrel_cluster to start the instances as user 'umlaut'; you need to be root before you can start a process as another user. The same command also accepts cluster::stop, cluster::restart, and cluster::status.
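
Once the cluster is started, one way to confirm that each instance is answering is to request a page from every port in turn. The following is only a sketch, not something that ships with Umlaut or mongrel_cluster: the function names cluster_ports and check_cluster are made up for illustration, and it assumes curl is installed and that the instances are mounted at / (with a prefix such as /findit you would request that path instead).

```shell
#!/bin/sh
# Hypothetical helpers, not part of Umlaut or mongrel_cluster.

# Print the ports a cluster occupies, given its start port and server count.
cluster_ports() {
  port=$1
  last=$(( $1 + $2 - 1 ))
  while [ "$port" -le "$last" ]; do
    printf '%s\n' "$port"
    port=$(( port + 1 ))
  done
}

# Request / from each instance on localhost and print the HTTP status code.
# curl reports 000 when nothing is listening on the port.
check_cluster() {
  for p in $(cluster_ports "$1" "$2"); do
    code=$(curl -s -o /dev/null -w '%{http_code}' "http://127.0.0.1:$p/")
    printf 'port %s: HTTP %s\n' "$p" "$code"
  done
}

# Example, using the port and servers values from the config above:
#   check_cluster 8000 3
```

With port: 8000 and servers: 3 this checks 8000, 8001, and 8002, which are the same ports the Apache balancer needs to know about.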

See below to automate the startup of these processes on boot.

If you are choosing to run as a particular unix account, make sure your install dir can be read by that account! The log and tmp dirs need to be writable too. The easiest thing to do is to run "sudo chgrp -R umlaut" (or whatever other group you are choosing) on your entire $Umlaut installation. Note that the parent directory (and all of its parents) needs to have "x" permission for the user/group too.
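
The parent-directory requirement is easy to miss, so here is a rough sketch of how you might check it. The function names ancestors and check_traversable are hypothetical, and the check assumes you can run "sudo -u" as the account in question.

```shell
#!/bin/sh
# Hypothetical helpers, not part of Umlaut.

# Print every ancestor directory of DIR, nearest first, ending with /.
ancestors() {
  dir=$(cd "$1" && pwd) || return 1
  while [ "$dir" != "/" ]; do
    dir=$(dirname "$dir")
    printf '%s\n' "$dir"
  done
}

# Report each ancestor of DIR that USER cannot traverse (lacks "x").
# Usage: check_traversable USER DIR
check_traversable() {
  ancestors "$2" | while IFS= read -r d; do
    sudo -u "$1" test -x "$d" </dev/null ||
      printf 'no x permission for %s: %s\n' "$1" "$d"
  done
}

# Example, using the account and install dir from the config above:
#   check_traversable umlaut /data/web/findit/Umlaut
```

If this prints nothing, every parent directory can be traversed by that account.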

Apache Setup

Now we set up apache using mod_proxy to 'reverse proxy' to our mongrel instances, with clustered load balancing. Make sure you have mod_proxy and mod_proxy_balancer installed and configured. Now, in your apache conf, probably in the specific virtual host you want to use for Umlaut:

# Very important, make sure you aren't inadvertently making an open proxy with mod_proxy
ProxyRequests off 

# Set up the mod_proxy balancer, with our three instances running on 8000-8002
# Note: Do not put trailing / on these
<Proxy balancer://umlaut_cluster>
BalancerMember http://127.0.0.1:8000
BalancerMember http://127.0.0.1:8001
BalancerMember http://127.0.0.1:8002
</Proxy>

# Now set up the ProxyPass directives to reverse proxy to that cluster
# Note: DO put trailing / on these.

ProxyPass / balancer://umlaut_cluster/ 
ProxyPassReverse / balancer://umlaut_cluster/ 
ProxyPreserveHost on

# Or, if you were using a prefix, these would look like, eg:
# ProxyPass /findit balancer://umlaut_cluster/findit/
# ProxyPassReverse /findit balancer://umlaut_cluster/findit/
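
Before restarting apache, it can help to confirm that the proxy modules really are loaded and that the config parses. This is just a sketch under a few assumptions: has_module is a made-up helper, the control script may be called apachectl, apache2ctl, or httpd on your system, and mod_proxy_http is included in the check because it is normally what handles http:// backends like the ones above.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the named module appears in a module
# listing (as printed by "apachectl -M") read from stdin.
has_module() {
  grep -Eq "^[[:space:]]*${1}_module([[:space:]]|\$)"
}

# Example usage (older Apache versions print the listing on stderr,
# hence the 2>&1):
#   for m in proxy proxy_balancer proxy_http; do
#     apachectl -M 2>&1 | has_module "$m" || echo "mod_$m is not loaded"
#   done
#   apachectl configtest
```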

Dealing with bad query strings: More Apache Setup

Mongrel refuses to accept a malformed query string. EBSCOHost, however, insists on sending such, for example query strings with unescaped greater-than or less-than chars in them. We want to take care of this by putting directives in the apache config to rewrite these bad URLs into properly escaped URLs. The apache mod_rewrite external map function (RewriteMap with an external program) is most convenient to use here, and a program to serve as an external map is included with Umlaut. The following apache directives will take care of rewriting bad URLs. As always, $Umlaut stands for your Umlaut install dir.

  # We want to re-write URLs with 'bad' < and > chars in the query
  # string (eg from EBSCO) to escape them.
  RewriteEngine on
  RewriteMap query_escape prg:$Umlaut/distribution/script/rewrite_map.pl
  RewriteLock /var/lock/subsys/apache.rewrite.lock
  RewriteCond %{QUERY_STRING} ^(.*[\>\<].*)$
  RewriteRule ^(.*)$ $1?${query_escape:%1} [R,L,NE]

Note: Due to a bug in Apache, ampersand chars in the query string end up 'double escaped' when put through the map. We have code in a before filter in application_controller to take care of this.
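
To make the external map idea concrete: apache hands the map program one lookup key per line on its stdin (here, the bad query string) and reads one replacement line back on stdout. The sketch below is NOT the rewrite_map.pl that ships with Umlaut, which may well escape more than this. It is only an illustration of that contract, with a made-up function name, that escapes the two characters discussed above.

```shell
#!/bin/sh
# Illustration only, not the bundled rewrite_map.pl.
# Read one query string per line and print it back with < and >
# percent-encoded. Each line is answered as soon as it is read, since
# the map program is expected to reply to each lookup without waiting
# for more input.
escape_query() {
  while IFS= read -r line; do
    printf '%s\n' "$line" | sed -e 's/</%3C/g' -e 's/>/%3E/g'
  done
}

# Example:
#   printf '%s\n' 'title=<i>x</i>&id=1' | escape_query
```

The [NE] flag on the RewriteRule matters here: without it, apache would escape the percent signs that the map has just produced.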

Start at Boot?

Follow the directions at Using Mongrel Cluster, which are basically:

sudo mkdir /etc/mongrel_cluster
sudo ln -s $Umlaut/config/umlaut_config/mongrel_cluster.yml /etc/mongrel_cluster/umlaut.yml
sudo cp /path/to/mongrel_cluster_gem/resources/mongrel_cluster /etc/init.d/
sudo chmod +x /etc/init.d/mongrel_cluster

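Now your cluster will start at boot (depending on your distribution, you may also need to register the init script with a tool such as chkconfig or update-rc.d; see Using Mongrel Cluster), and you can also start, stop, or restart it (and any other clusters you link into /etc/mongrel_cluster) with: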

sudo /etc/init.d/mongrel_cluster {start|stop|restart}