Difference between revisions of "Umlaut Deployment"

Latest revision as of 16:19, 19 June 2012

WARNING: This is Outdated Documentation!!!!

THIS IS OUTDATED DOCUMENTATION See new Umlaut documentation at http://github.com/team-umlaut/umlaut/wiki


So you have Umlaut running: you executed "./script/server" in the Umlaut directory, it launched on port 3000 by default, and you connected to port 3000. This is fine for initial confirmation that your setup works, and for development, but how do you actually deploy it?

It turns out there are several possible deploy environments for a Rails application. There is not necessarily one standard or best one at the moment; different people use different ones in different circumstances.

Jonathan Rochkind at Hopkins uses mongrel, mongrel_cluster, and Apache mod_proxy and mod_proxy_balancer on a unix system for his deploy environment. He went down this road because it was what the Agile Web Development with Rails book recommended. We may explore other deploy environments (such as 'passenger') in the future. We would definitely not be optimistic about running Umlaut on Windows. *Lately mod_rails/Passenger is clearly the preferred Rails deployment in general, and jrochkind wants to find time to set it up and test it, but hasn't yet.*

Since jrochkind is writing this documentation, he can only tell you how he did it. You do need a version of Apache that includes mod_proxy_balancer (Apache 2.2 or later), but if you have one, jrochkind is fairly happy with the solution.

These two pages from the mongrel website, Apache Best Practice Deployment and Using Mongrel Cluster, are pretty good how-tos for mongrel. But we will also take you through it here, with specific directions, Umlaut recommendations, and pitfalls we ran into.

Alternatively, based on Ross Singer's recommendation, scotdalton is using thin and Apache at NYU. The setup is almost identical to mongrel and is detailed at Umlaut Deployment with Thin and Apache.

Quick start 'wizard'

New! A Rails generator to set up config files for you, and make deployment with mongrel cluster and apache much easier.

Prerequisites

  • Install 'mongrel' and 'mongrel_cluster' gems.
  • Apache needs mod_proxy and mod_proxy_balancer (which means Apache 2.2 or later, with those modules turned on).
  • You need to have access to an apache conf file to add some statements to hook your mongrel cluster up to the web.

To run

You can simply run ./script/generate mongrel_deploy_files to generate config files for an Umlaut mongrel cluster deployment. This makes some assumptions, detailed below. If you aren't happy with these defaults, run ./script/generate mongrel_deploy_files --help to see the command line arguments that change them.

You can run this command at any time. It will interactively prompt you if you want to overwrite your existing files, and give you a diff. (Or you can say --force to force overwriting of existing files). You can also run ./script/destroy mongrel_deploy_files to remove anything created by the generator.

This process will add two files in umlaut_config/deploy, and one file in $UMLAUT/script/local/. Next, you need to hook up apache, and start your mongrels.

Default assumptions

Several mongrel processes are being configured. By default, this is four mongrel processes on internal ports beginning at port 4001. Both of these things can be changed.

By default these mongrel processes will be run as unix user 'umlaut', group 'umlaut'. So either create such a user and group, or add arguments to choose another user/group.

By default the generator assumes that you are going to be deploying at 'document root' (/) in a particular apache (virtual) host. If you'd like to instead install at a sub-path, use the --prefix argument.

Hook up apache

A file was created for you in umlaut_config/deploy/umlaut_http.conf. You need to edit your apache conf file to "Include" this umlaut_http.conf in the virtual host of your choice (or main host). You need to set up the virtual host yourself, if you want one. Then simply "Include /path/to/umlaut/config/umlaut_config/deploy/umlaut_http.conf".

Start mongrels

Apache is now pointing to a balanced cluster of mongrels on the ports specified by the generator, from the path specified by the generator. But those mongrels aren't running yet. You need to start them. You can do this by running:

sudo mongrel_rails cluster::start -C /path/to/umlaut/config/umlaut_config/deploy/mongrel_cluster.yml

Or, for convenience, the generator installed a little bash script to do this all for you:

$UMLAUT/script/local/my_mongrel_ctl (start|stop|restart|status)

You can set things up to auto-start your mongrels on boot; see "Start at Boot?" below.

The details: Umlaut Deployment with Mongrel and Apache

There are basically two parts to getting Umlaut (or any Rails app) deployed in this setup. First is getting your Rails app running, and second is configuring Apache to connect to it properly.

There are a few decisions to make. Run just one instance of Umlaut, or run multiple load balancing instances? Because of the nature of the way Umlaut works, we strongly recommend running multiple Umlaut instances regardless of how little traffic you expect. Even in a low-traffic environment, the fact that Umlaut can take several seconds to respond to a request means that multiple instances are a good idea to keep Umlaut from seeming even slower than it is. We're completely guessing, but 3 is probably a pretty good number for just about any Umlaut site, from low to high traffic.

Also think about which unix account you want to run Umlaut as (we recommend creating a special low-privilege account), and whether your Umlaut URLs can be the base URLs for a host (i.e., findit.library.jhu.edu points directly to Umlaut) or whether you use a 'prefix' (e.g., findit.library.jhu.edu/some/path/findit). Using Apache virtual hosts and mounting Umlaut at the base is typical, but the prefix can work too.

Setting up mongrel_cluster

First you've got to install mongrel and mongrel_cluster:


sudo gem install mongrel
sudo gem install mongrel_cluster

We recommend making sure you have mongrel >= 1.1.4 and mongrel_cluster >= 1.0.5, and that you do NOT have previous versions installed. When I had mongrel_cluster 1.0.3 installed alongside, it was being used even though it shouldn't have been, and its bugs were affecting me.


The point of mongrel_cluster is to save configuration information for multiple mongrel instances in one configuration file, so that you can start, stop, or restart them all with one command, without having to remember that config information each time (and possibly get it wrong or mistype it).

By default, mongrel_cluster keeps that configuration file in a Rails app's config/mongrel_cluster.yml. You could do that with Umlaut, but we like to keep your local config files in $Umlaut/config/umlaut_config instead (see Umlaut Local Configuration Architecture), so we recommend putting it in $Umlaut/config/umlaut_config. You can use the mongrel_rails command to write this config for you (see Using Mongrel Cluster; make sure to use the -C argument to put the config file in umlaut_config, if that's what you want), but here we'll just give you our actual mongrel_cluster.yml config, annotated. (You are certainly allowed to write it by hand).


# Unix account to run your processes as:
user: umlaut  

# Unix group to run processes as:
group: umlaut 

# Install dir of Umlaut you want to run from:
cwd: /data/web/findit/Umlaut 
log_file: log/mongrel.log # Leave like this. 

# Start port for your instances. Any high port will do. Does NOT need
# to be open through firewall externally. 
port: 8000 
environment: production # Leave like this
address: 127.0.0.1 # Leave like this 
pid_file: tmp/pids/mongrel.pid # Leave like this

# How many instances to run. port: 8000 with servers:3 means you'll
# have a server on 8000, 8001, and 8002. 
servers: 3

# Only if  you want to start at web path other than base / :
prefix: /findit       # for instance. Start with slash, and don't end with one.

Now you can start all three of these mongrel instances by executing:

sudo mongrel_rails cluster::start -C $Umlaut/config/umlaut_config/mongrel_cluster.yml

The 'sudo' is necessary because we've told mongrel_cluster to start the apps as user 'umlaut'; you need to be root before you can start a process as another user. The cluster::stop, cluster::restart, and cluster::status commands work the same way.

We're still not sure exactly how many mongrels are necessary to handle an Umlaut installation of a given size.

See below to automate the startup of these processes on boot.

If you are choosing to start as a particular unix account, make sure your install dir can be read by that account! The log and tmp dirs need to be writeable too. The easiest thing to do is just "sudo chgrp -R umlaut", or whatever other group you are choosing, on your entire $Umlaut installation. Note that the parent directory (and all of its parents) needs to have "x" permission for the user/group too.
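To check these permissions, something like the following shell function can help. It is only a sketch: the function name check_umlaut_dir is made up for this example, and it tests permissions for whichever account runs it, so run it as the account your mongrels will use (for example via sudo -u umlaut).

```shell
# Hypothetical helper: report permission problems in an Umlaut install dir
# for the account that runs this function. Prints one line per problem.
check_umlaut_dir() {
  dir=$1
  status=0
  # The install dir itself must be readable.
  [ -r "$dir" ] || { echo "not readable: $dir"; status=1; }
  # log and tmp must be writeable.
  for sub in log tmp; do
    [ -w "$dir/$sub" ] || { echo "not writable: $dir/$sub"; status=1; }
  done
  # Walk up towards / and check the "x" (search) bit on every directory.
  parent=$dir
  while [ "$parent" != "/" ] && [ "$parent" != "." ]; do
    [ -x "$parent" ] || { echo "no x permission: $parent"; status=1; }
    parent=$(dirname "$parent")
  done
  return $status
}
```

For example, check_umlaut_dir /data/web/findit/Umlaut prints nothing and returns success if everything looks fine for the current account.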

Apache Setup

Now we set up apache using mod_proxy to 'reverse proxy' to our mongrel instances, with clustered load balancing. Make sure you have mod_proxy and mod_proxy_balancer installed and configured. Now, in your apache conf, probably in the specific virtual host you want to use for Umlaut:

# Very important, make sure you aren't inadvertently making an open proxy with mod_proxy
ProxyRequests off 

# Set up the mod_proxy balancer, with our three instances running on 8000-8002
# Note: Do not put trailing / on these
<Proxy balancer://umlaut_cluster>
  BalancerMember http://127.0.0.1:8000
  BalancerMember http://127.0.0.1:8001
  BalancerMember http://127.0.0.1:8002
</Proxy>

# Set up ProxyPass directive to reverse proxy to SFX for handling SFX journal subscription cgi posts
# This should come before the cluster ProxyPass directive.
ProxyPass /resolve/cgi/core/journal_subscription.cgi http://your.sfx.host.edu:port/your_instance/cgi/core/journal_subscription.cgi
ProxyPassReverse /resolve/cgi/core/journal_subscription.cgi http://your.sfx.host.edu:port/your_instance/cgi/core/journal_subscription.cgi

# Now set up the ProxyPass directives to reverse proxy to that cluster
# Note: DO put trailing / on these.

ProxyPass / balancer://umlaut_cluster/ 
ProxyPassReverse / balancer://umlaut_cluster/ 
ProxyPreserveHost on

# Or, if you were using a prefix, these would look like, eg:
# ProxyPass /findit balancer://umlaut_cluster/findit/
# ProxyPassReverse /findit balancer://umlaut_cluster/findit/
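If you change port or servers in mongrel_cluster.yml, the BalancerMember lines have to change with them. Here is a small shell sketch that prints a matching block, assuming all mongrels listen on 127.0.0.1 as configured above. The function name balancer_block is ours, not part of mongrel or Apache.

```shell
# Print a <Proxy> balancer block for a mongrel cluster.
# Arguments: start port, number of servers, balancer name
# (the name defaults to umlaut_cluster).
balancer_block() {
  port=$1
  servers=$2
  name=${3:-umlaut_cluster}
  echo "<Proxy balancer://$name>"
  i=0
  while [ "$i" -lt "$servers" ]; do
    # No trailing slash on BalancerMember URLs.
    echo "  BalancerMember http://127.0.0.1:$((port + i))"
    i=$((i + 1))
  done
  echo "</Proxy>"
}
```

Running balancer_block 8000 3 prints the same three-member block shown above; balancer_block 4001 4 would match the generator defaults.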

SSL/https

If you are setting up apache to allow https requests, it should still proxy to an http mongrel as above, because mongrel doesn't speak https. However, you should include this line in the relevant SSL virtual host, to set the request header to let the Rails app know it's fronted by ssl:

   RequestHeader set X_FORWARDED_PROTO 'https'

Dealing with bad query strings: More Apache Setup

Mongrel refuses to accept a malformed query string. EBSCOHost, however, insists on sending such: for example, query strings with unescaped greater-than or less-than chars in them. We take care of this by putting directives in the apache config to rewrite these bad URLs into properly escaped URLs. The apache mod_rewrite external map function (RewriteMap) is most convenient to use here, and a program to serve as an external map is included with Umlaut. The following apache directives will take care of rewriting bad URLs. As always, $Umlaut stands for your Umlaut install dir.

  # We want to re-write URLs with 'bad' < and > chars in the query
  # string (eg from EBSCO) to escape them.
  RewriteEngine on
  RewriteMap query_escape prg:$umlaut/distribution/script/rewrite_map.pl
  RewriteLock /var/lock/subsys/apache.rewrite.lock
  RewriteCond %{query_string} ^(.*[\>\<].*)$
  RewriteRule ^(.*)$ $1?${query_escape:%1} [R,L,NE]

Note: Due to a bug in Apache, ampersand chars in query string end up 'double escaped' when put through the map. We have code in a before filter in application_controller to take care of this.
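For the curious, an external map program (prg:) is just a long-running process: Apache writes one lookup key per line to its stdin and reads one result line back from its stdout. The sketch below shows that protocol in shell. It is not the rewrite_map.pl that ships with Umlaut, and it only escapes < and >; the function name escape_query is made up for illustration.

```shell
# Illustration of the RewriteMap prg: protocol, NOT Umlaut's rewrite_map.pl.
# Reads one query string per line on stdin and writes the escaped version,
# one line per input line, on stdout.
escape_query() {
  while IFS= read -r line; do
    # A separate sed per line means each answer is written out right away,
    # which matters because Apache waits for the reply to every key.
    printf '%s\n' "$line" | sed -e 's/</%3C/g' -e 's/>/%3E/g'
  done
}
```

A standalone map script built this way would simply end by calling escape_query, so that it keeps answering lookups until Apache closes its stdin.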

Start at Boot?

Follow the directions at Using Mongrel Cluster, which are basically:

sudo mkdir /etc/mongrel_cluster
sudo ln -s $UMLAUT/config/umlaut_config/mongrel_cluster.yml /etc/mongrel_cluster/umlaut.yml
sudo cp /path/to/mongrel_cluster_gem/resources/mongrel_cluster /etc/init.d/
sudo chmod +x /etc/init.d/mongrel_cluster

Now your cluster will start at boot, and you can also start, stop, or restart it (and any other clusters you link into /etc/mongrel_cluster) with:

sudo /etc/init.d/mongrel_cluster {start|stop|restart}


NOTE/WARNING

There is a problem in mongrel_cluster that will prevent mongrels from starting up again if your machine (or mongrels) die ungracefully leaving stale pids. See http://www.ruby-forum.com/topic/105849

My better fix: Edit the /etc/init.d/mongrel_cluster bash script you installed above. Change line:

mongrel_cluster_ctl start -c $CONF_DIR

to:

mongrel_cluster_ctl start -c $CONF_DIR --clean

Note the addition of the --clean argument.

I know this works with mongrel 1.1.4 and mongrel_cluster 1.0.5. An earlier mongrel_cluster did not respect the --clean argument properly, and I found that having a simultaneous install of the earlier mongrel_cluster for some reason caused it to be used instead of the later one. gem isn't supposed to work that way. But best make sure you have no mongrel_clusters earlier than 1.0.5 installed.
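To see whether you are in this situation before restarting, you can look for pid files that point at processes which no longer exist. Here is a minimal sketch, assuming the pid files live in a directory such as $Umlaut/tmp/pids (as set by pid_file above) and end in .pid. The function name stale_pid_files is made up here; it only reports files and does not reproduce what --clean does internally.

```shell
# Hypothetical helper: list pid files in a directory whose process is gone.
# kill -0 sends no signal, it only tests whether the process can be signalled,
# so run this as root or as the account that owns the mongrels.
stale_pid_files() {
  pid_dir=$1
  for f in "$pid_dir"/*.pid; do
    # Skip the unexpanded pattern when the directory has no pid files.
    [ -e "$f" ] || continue
    pid=$(cat "$f")
    kill -0 "$pid" 2>/dev/null || echo "$f"
  done
}
```

For example, stale_pid_files /data/web/findit/Umlaut/tmp/pids prints nothing when every pid file belongs to a running process.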

SFX Configuration

Institute Feature

This only matters if you use the SFX institute feature. Umlaut sends a req.ip=[client ip] param, which SFX is supposed to use to treat the request as if it came from that IP, not umlaut's ip. That works if the req.ip matches an SFX institute. But if it does not match any institute, you want SFX to treat the request as if it did not match any institute. Instead it consults the actual umlaut server IP and connects THAT to an institute. This is bad.

As a workaround, define an institute in SFX that is listed first alphabetically (e.g., "aaa_umlaut_server") and that matches the Umlaut server's IP address(es). Now if req.ip doesn't match anything, SFX will decide the request matches the "aaa_umlaut_server" institute, which won't affect anything and will be treated just like a non-local address, instead of matching on the Umlaut server address, which might match a wrong institute.

This bug has been reported to Ex Libris.