Difference between revisions of "Umlaut Technical Overview"

From Code4Lib
# [[ServiceResponse data structures and generation]] -- Includes guide to writing your own services.  
# [[View architecture and control flow]]
==ServiceResponse data and generation==
===ServiceResponse and related data structures===
Before talking about how the services generate data, we should talk about the data format of a [http://umlaut.rubyforge.org/api/classes/ServiceResponse.html ServiceResponse].  A ServiceResponse is basically a unit of information generated by a Service, generally for display somewhere on the link resolver menu page. For example, there might be a ServiceResponse representing a fulltext link, a help link, or an abstract. ServiceResponses almost always link out somewhere, along with providing other data for display.
The ServiceResponse entity has a few 'standard' properties (display_text, url, notes), but also a property, service_data, consisting of a serialized hash for holding arbitrary key/value information. Different service types might require different key/values here.  The [] operator on ServiceResponse conveniently allows you to store arbitrary key/value information in this property (and also access/set the 'built-in' properties). For appropriately loose coupling between the data stored, the service generating it, and the view, we define some conventions for which key/value pairs are used for which purposes in each response type, in comments on the [http://umlaut.rubyforge.org/api/classes/ServiceResponse.html ServiceResponse class definition].
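As a rough illustration of the [] convention described above, here is a minimal plain-Ruby sketch. SketchServiceResponse is a hypothetical stand-in, not the real ServiceResponse (which is a database-backed model), and the :coverage key is just an example of an arbitrary key.

```ruby
# Hypothetical stand-in for Umlaut's ServiceResponse; only the [] / []=
# convention is sketched here, not the real model.
class SketchServiceResponse
  BUILT_IN = [:display_text, :url, :notes].freeze

  attr_accessor(*BUILT_IN)
  attr_reader :service_data

  def initialize
    @service_data = {} # the real model serializes this hash into one property
  end

  # Built-in properties go to their own attribute; anything else goes
  # into the service_data hash.
  def []=(key, value)
    key = key.to_sym
    if BUILT_IN.include?(key)
      send("#{key}=", value)
    else
      @service_data[key] = value
    end
  end

  def [](key)
    key = key.to_sym
    BUILT_IN.include?(key) ? send(key) : @service_data[key]
  end
end

response = SketchServiceResponse.new
response[:display_text] = "Full text from Example Publisher"
response[:url]          = "http://example.org/article/1"
response[:coverage]     = "1995 to present" # arbitrary key, lands in service_data
```

The caller uses the same syntax for both kinds of property, which is what lets services and views agree on keys by convention only.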
A ServiceResponse also records which Service generated it, using that Service's service_id/name as defined in config/umlaut_config/services.yml, and generally retrievable from the ServiceList.
So what do we mean by a 'service type'?  The list of all valid service types is defined in the ServiceTypeValue table. Each ServiceTypeValue has a one-word internal identifier token (name), a display_name for user presentation, and optionally a display_name_pluralized (to override standard Rails pluralization). The values in this table are initialized from db/orig_fixed_data/service_type_values.yml when you run rake umlautdb:load_initial_data. We intend the local implementer to be able to create locally defined ServiceTypeValues too, if necessary. 
ServiceTypeValue uses the acts_as_enumerated plug-in to conveniently allow the developer to refer to an individual ServiceTypeValue by name:  ServiceTypeValue[:fulltext] ==> the ServiceTypeValue object with name == 'fulltext'.  acts_as_enumerated does efficient caching.
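The lookup-by-name idea can be sketched in plain Ruby as follows. This is not the acts_as_enumerated implementation; SketchServiceTypeValue, its in-memory table, and the display names are assumptions made up for illustration.

```ruby
# Hypothetical stand-in: the real ServiceTypeValue is backed by a database
# table and the acts_as_enumerated plug-in.
class SketchServiceTypeValue
  attr_reader :name, :display_name

  def initialize(name, display_name, display_name_pluralized = nil)
    @name = name
    @display_name = display_name
    @display_name_pluralized = display_name_pluralized
  end

  # Falls back to a naive "add an s" rule; Rails pluralization is smarter.
  def display_name_pluralized
    @display_name_pluralized || "#{@display_name}s"
  end

  # In-memory table standing in for rows loaded from the YAML fixture.
  # The values are illustrative only.
  ALL = [
    new("fulltext", "Full Text", "Full Text"),
    new("help", "Help", "Help"),
    new("abstract", "Abstract")
  ].freeze

  # Cache keyed by name, so repeated lookups do not rescan the table.
  def self.[](name)
    @by_name ||= ALL.each_with_object({}) { |value, hash| hash[value.name] = value }
    @by_name[name.to_s]
  end
end

fulltext = SketchServiceTypeValue[:fulltext]
```

Because the lookup returns the same cached object each time, callers can compare type values by identity.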
Obviously, the ServiceTypeValue that a given ServiceResponse is intended for needs to be recorded somewhere. But you won't find it in ServiceResponse, which might be confusing at first. In fact, there's a somewhat confusingly named three-way join object called ServiceType, which ties together:
* a ServiceResponse
* a ServiceTypeValue
* a Request
This architecture theoretically allows:
* One ServiceResponse to belong to multiple Requests (ServiceResponse caching across requests/sessions).
* One ServiceResponse to be assigned ''multiple'' ServiceTypeValues and thus listed multiple times with a given Request.
In fact, Umlaut does not currently use ServiceResponse caching across requests; it turned out to be tricky to get right without clear gain. And very few (if any?) current services register the same ServiceResponse to a request with multiple ServiceTypeValues. But, the architecture is there to support it if needed in the future.
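To make the three-way join concrete, here is a small sketch using plain Ruby Structs. All the Join* names and the responses_of_type helper are hypothetical; in Umlaut the corresponding objects are database-backed models with their own convenience methods.

```ruby
# Hypothetical plain-Ruby stand-ins for Request, ServiceResponse and
# ServiceTypeValue.
JoinRequest   = Struct.new(:id)
JoinResponse  = Struct.new(:display_text)
JoinTypeValue = Struct.new(:name)
# The three-way join that the text calls ServiceType.
JoinServiceType = Struct.new(:request, :service_response, :service_type_value)

request  = JoinRequest.new(1)
response = JoinResponse.new("Article abstract and full text")
fulltext = JoinTypeValue.new("fulltext")
abstract = JoinTypeValue.new("abstract")

# One response registered under two type values for the same request.
joins = [
  JoinServiceType.new(request, response, fulltext),
  JoinServiceType.new(request, response, abstract)
]

# Responses of a given type for a request are found through the join,
# not through the response itself.
def responses_of_type(joins, request, type_name)
  joins.select { |j| j.request == request && j.service_type_value.name == type_name }
       .map(&:service_response)
end

fulltext_responses = responses_of_type(joins, request, "fulltext")
```

The response object itself never records its type, which is why it can appear under more than one type value.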
This data structure architecture ends up somewhat confusing (and ServiceType is probably not a clear name for that three-way join) but there are usually convenience methods defined to avoid the complexity; they should be used. See for example (tbd).
[http://bibwild.wordpress.com/files/2008/02/umlaut-serviceresponse.jpg Data structure diagram] Trying to figure out how to make this display inline, sorry.
===Obligations of Service logic===
What you need to know to write a new Service. How to generate data, and callback methods service logic can or must provide.
Recall that an umlaut "service" is defined in config/umlaut_config/services.yml to be a particular class holding the service logic, and some configuration parameters.
That class holding the service logic is called a "service adaptor", or somewhat ambiguously, sometimes just a "service". Service adaptors live in lib/service_adaptors, and extend [http://umlaut.rubyforge.org/api/classes/Service.html Service].
====Service adaptor implementation====
Service logic should generally be written to be state-less. The same Service object, defined in services.yml, is initialized once and generally re-used for the life of an application instance (cached by ServiceList).  So any state you store can end up persisting from request to request and session to session, which you probably don't intend. Umlaut architecture for background services also involves threads and forks, and while there's normally no reason a given service object would be in two threads simultaneously, better safe than sorry. It's safest to store no non-universal state in the service object.
=====Disclosure methods=====
A service adaptor must define [http://umlaut.rubyforge.org/api/classes/Service.html#service_types_generated service_types_generated()] to return an Array of ServiceTypeValues constituting the types of ServiceResponses the service may generate.
A service adaptor may optionally list some required configuration params. If they are not supplied, an exception will be thrown when the service is initialized from services.yml. eg:
: required_config_params :api_key, :base_url
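One way such a declaration could be enforced is sketched below. This is not Umlaut's actual Service code: SketchService, SketchApiService, the required_params helper, and the choice of ArgumentError are all assumptions for illustration.

```ruby
# Hypothetical stand-in for the Service base class, showing one way a
# class-level required_config_params declaration could be enforced.
class SketchService
  def self.required_config_params(*params)
    @required_config_params = params
  end

  def self.required_params
    @required_config_params || []
  end

  # config is the hash of parameters for this service, as it might be
  # read from services.yml (string keys assumed).
  def initialize(config)
    missing = self.class.required_params.reject { |param| config.key?(param.to_s) }
    unless missing.empty?
      raise ArgumentError, "missing required config params: #{missing.join(', ')}"
    end
    @config = config
  end
end

class SketchApiService < SketchService
  required_config_params :api_key, :base_url
end

service = SketchApiService.new("api_key" => "abc", "base_url" => "http://example.org")
```

Failing at initialization time means a misconfigured service is caught when the configuration is loaded rather than in the middle of a user's request.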
=====The handle method=====
The heart of a typical service is in implementing the [http://umlaut.rubyforge.org/api/classes/Service.html#handle handle] method. When Umlaut wants a service to do its thing, Umlaut will pass the request in, and it's up to the Service to do its work.
The service can examine all metadata from the request, and even examine ServiceResponses generated by other services, and the status of other services in progress or finished. (See [http://umlaut.rubyforge.org/api/classes/Request.html Request#dispatched_services], Request#dispatched, Request#services_in_progress, etc.)
The service can then enhance any metadata if desired (likely data in [http://umlaut.rubyforge.org/api/classes/Referent.html Referent], from Request#referent).
The service can create one or more ServiceResponses. A ServiceResponse normally represents a discrete package of data that will be displayed on some part of the resolve menu. ServiceResponses should generally be created with the convenience method [http://umlaut.rubyforge.org/api/classes/Request.html Request]#add_service_response. 
The Service code is also responsible for registering a DispatchedService object with the completion state of the service. This should be done with the convenience method [http://umlaut.rubyforge.org/api/classes/Request.html Request]#dispatched.  If the service throws an uncaught exception, Umlaut itself will register a DispatchedService with status FailedFatal. But otherwise, the service is responsible for registering a completion status; if it does not, Umlaut may not realize the service is complete, and may keep running it over and over again or report it as timed out.
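The overall shape of a handle method might look like the sketch below. The stand-in classes and the argument lists of add_service_response and dispatched are assumptions, as is the :successful status value; consult the Request API documentation for the real signatures.

```ruby
# Hypothetical stand-in for Request. The real Request#add_service_response
# and Request#dispatched may take different arguments.
class SketchMenuRequest
  attr_reader :responses, :dispatch_status

  def initialize
    @responses = []
    @dispatch_status = {}
  end

  def add_service_response(data, type_names)
    @responses << { :data => data, :types => type_names }
  end

  def dispatched(service, status)
    @dispatch_status[service.service_id] = status
  end
end

class SketchHelpService
  attr_reader :service_id

  def initialize(service_id)
    @service_id = service_id
  end

  def service_types_generated
    [:help]
  end

  # All per-request data stays in local variables, so the shared service
  # object carries no state from one request to the next.
  def handle(request)
    request.add_service_response(
      { :service => self, :display_text => "Ask a librarian",
        :url => "http://example.org/help" },
      service_types_generated
    )
    # Always record completion, or the caller cannot tell we finished.
    request.dispatched(self, :successful)
  end
end

request = SketchMenuRequest.new
SketchHelpService.new("sketch_help").handle(request)
```

Note that the completion status is registered as the last step of handle, after all responses have been added.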
====Callback methods====
The Service can play an interactive role with the view elements of Umlaut in determining how to display the ServiceResponse and how to generate an external url for it if the user clicks on it. A Service doesn't need to do this--it can simply include properties in generated ServiceResponses for the conventional keys mentioned in [http://umlaut.rubyforge.org/api/classes/ServiceResponse.html ServiceResponse],  including a pre-generated url in the :url property.
However, for more complicated processing (including not generating urls until the point-of-need when a user actually clicks on one), callback methods can instead be implemented.
These callback methods include [http://umlaut.rubyforge.org/api/classes/Service.html Service]#view_data_from_service_type(service_type_obj), #to_[name of service type goes here] (e.g. #to_fulltext or #to_help), Service#response_to_view_data, and Service#response_url.
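As a sketch of the point-of-need idea, a response_url callback could build the link only when the user clicks. The argument list shown here, the :search_title key, and the example.org URL are all assumptions; the real Service#response_url signature may differ.

```ruby
require "uri"

# Hypothetical service that defers URL generation until click time.
class SketchDeferredLinkService
  # response_data stands in for the stored ServiceResponse data.
  # A pre-generated :url wins; otherwise the link is built on demand.
  def response_url(response_data)
    return response_data[:url] if response_data[:url]

    query = URI.encode_www_form("q" => response_data[:search_title])
    "http://example.org/search?#{query}"
  end
end

service = SketchDeferredLinkService.new
deferred_url = service.response_url(:search_title => "Moby Dick")
```

Deferring the work this way keeps handle cheap when most generated links are never clicked.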
For more information, see the Technical Overview section on view logic. (tbd).
====Alternate Service Tasks====
Services were originally designed to do one thing, as described above. However, it has been useful to use the service architecture to perform other 'tasks' too, basically other sorts of plug-ins. What plug-in 'task' a service will be called upon to perform depends on the task config property in services.yml, which defaults to 'standard' when empty.
The other service task currently defined is 'link_out_filter'. A task:link_out_filter service will never have its handle method called. Instead, it will have a [http://umlaut.rubyforge.org/api/classes/Service.html Service#]link_out_filter method defined, which is called at the appropriate control point.  Examples of link_out_filter services are ezproxy and sfx_backchannel_record.
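The task-based dispatch could be pictured roughly as below. SketchProxyFilter, the proxy_prefix config key, the one-argument link_out_filter signature, and the filter_link helper are hypothetical; they only illustrate the 'standard' default and the idea of filters rewriting an outgoing URL.

```ruby
# Hypothetical link_out_filter-style service that prefixes outgoing URLs.
class SketchProxyFilter
  attr_reader :task

  def initialize(config)
    task = config["task"].to_s
    # An empty or missing task falls back to 'standard'.
    @task = task.empty? ? "standard" : task
    @proxy_prefix = config["proxy_prefix"]
  end

  def link_out_filter(url)
    "#{@proxy_prefix}#{url}"
  end
end

# Only services configured with task 'link_out_filter' get to rewrite the URL;
# each filter receives the output of the previous one.
def filter_link(services, url)
  services.select { |s| s.task == "link_out_filter" }
          .inject(url) { |current, s| s.link_out_filter(current) }
end

proxy    = SketchProxyFilter.new("task" => "link_out_filter",
                                 "proxy_prefix" => "http://proxy.example.org/login?url=")
standard = SketchProxyFilter.new("proxy_prefix" => "http://ignored.example.org/")

filtered = filter_link([standard, proxy], "http://example.org/article/1")
```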

Revision as of 20:38, 6 February 2008

To give you an overview of the technical architecture of umlaut, we will go through a typical Resolve request, identifying all the classes involved, and pointing to their api doc if possible.

OpenURLs are sent to the default index action of the resolve controller.

In the resolve controller, a before filter method called init_processing is run to parse the OpenURL and set up the Umlaut request (or retrieve an existing request).

Technical Overview Sections

  1. Request Setup and Environmental Context
  2. ServiceResponse data structures and generation -- Includes guide to writing your own services.
  3. View architecture and control flow