Umlaut wishlist

From Code4Lib
Latest revision as of 16:22, 19 June 2012


WARNING: This is Outdated Documentation!!!!

THIS IS OUTDATED DOCUMENTATION. See the new Umlaut documentation at http://github.com/team-umlaut/umlaut/wiki


Some actual current future plans:

  • JournalTOCs ToC?
  • Use OCLC xISBN to find HT and Internet Archive/OCA matches?
  • Internet Archive -- use new OL/IA API, discover search-inside-the-book.
  • WorldCat -- use new API, link directly to nearest public library in 'see also' or elsewhere.
  • CiteSeerX -- source of 'cited by' info, AND, most excitingly, open access pre-prints. But their Atom/RSS feeds (the only API I could find) don't seem to advertise enough info to actually use these features. Would need to talk to the developer team -- possibly offer to help code? Also not entirely clear how big their corpus actually is, or whether it's worth it.
  • Try screen-scraping Google Scholar (and maybe Microsoft Academic) to get the open access full text links they find. Also, there's a Springer API for open access content now: http://dev.springer.com/docs/Restful_operations
  • When no full text is found, provide a link to search on Google Scholar, or Bing Academic? Need to have sufficient metadata to create the search. The Oct 2010 Library Technology Reports article has some ideas, I think.


old Desired or planned features.

  • Check for similar articles from: http://biosemantics.org/jane/faq.php#api
  • Full-text availability check from http://chroniclingamerica.loc.gov/ -- check by title/city, check by LCCN (?), able to check particular dates/link to particular dates and/or pages of a paper?
  • Allow a service_response to have a tree relationship to children, so for instance alternate versions of a text can be attached as children of the main link, expandable by the user.
  • Journal ToC from CiteULike
  • Add information about the conversation happening around an article with Scintilla if we have a URL, PMID or DOI. (Alf at Scintilla would prefer us NOT to use the API for high traffic, but we can copy his techniques internally in Umlaut. CrossRef and PubMed for "cited by" on DOI and PMID identifiers are a good idea. He has also reverse-engineered the Scopus JavaScript API to allow server-side JSON access: http://hublog.hubmed.org/archives/001512.html)
    http://hublog.hubmed.org/archives/001609.html
    Unofficially it will return JSON:
    http://scintilla.nature.com/conversations?uri=info%3Adoi%2F10.1371%2Fjournal.pmed.0020124&format=json
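
The tree-shaped service_response idea above could look roughly like the following. This is only a sketch in Python for illustration: the `ServiceResponse` class, its fields, and the example.org URLs are all hypothetical and are not Umlaut's actual model.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class ServiceResponse:
    """Hypothetical stand-in for an Umlaut service_response (not the real model)."""
    label: str
    url: str
    children: List["ServiceResponse"] = field(default_factory=list)

    def add_child(self, child: "ServiceResponse") -> "ServiceResponse":
        """Attach an alternate version beneath this response and return it."""
        self.children.append(child)
        return child

    def walk(self, depth: int = 0) -> Iterator[Tuple[int, "ServiceResponse"]]:
        """Yield (depth, response) pairs, each parent before its children."""
        yield depth, self
        for child in self.children:
            yield from child.walk(depth + 1)


# The main link, with alternate versions attached as children.
main = ServiceResponse("Full text", "http://example.org/fulltext")
main.add_child(ServiceResponse("Alternate version: preprint", "http://example.org/preprint"))
main.add_child(ServiceResponse("Alternate version: scan", "http://example.org/scan"))

# A view could render depth 0 as the visible link and deeper rows as expandable.
for depth, resp in main.walk():
    print("  " * depth + resp.label)
```

Keeping children on the response itself means a view only needs one traversal to decide what is shown by default and what sits behind an "expand" control.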



  • Rochester “Getting Users Fulltext” style code to skip right to the full text, skipping content-provider metadata pages.


  • UMich Mirlyn for metadata enrichment?
    http://webservices.itcs.umich.edu/mediawiki/MLibraryAPI/index.php/Mirlynapi:Home


  • xISBN/thingISBN use. (Some thought is required in how to integrate this while avoiding false positives). Bowker ISSN service for metadata enhancement. OCLC xISSN? Integrate preceding/succeeding title information from OPAC or xISSN?
  • LibraryLookup: http://xisbn.worldcat.org/liblook/index.htm -- at least until xISBN is baked in, we could provide a link to this service. It increases the chances of finding a desired book in the catalog through work-set grouping. Used by LibX.
     http://xisbn.worldcat.org/liblook/resolve.htm?res_id=http://www.iucat.iu.edu&rft.isbn=0451530942&url_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:book
  • Journal covers from Ulrich's via screen-scraping (or an Ulrich's/sersol built-in API?)
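
Building a LibraryLookup link like the example above is mostly a matter of assembling query parameters. A minimal sketch, assuming the parameter names seen in that example URL (`res_id`, `rft.isbn`, `url_ver`, `rft_val_fmt`); the function name is made up here, and whether the service accepts other parameters is not something this page says.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Base URL taken from the example link in the list above.
LIBLOOK_BASE = "http://xisbn.worldcat.org/liblook/resolve.htm"


def library_lookup_url(catalog_url: str, isbn: str) -> str:
    """Build a LibraryLookup link for one ISBN against one catalog (hypothetical helper)."""
    params = [
        ("res_id", catalog_url),
        ("rft.isbn", isbn.replace("-", "").strip()),
        ("url_ver", "Z39.88-2004"),
        ("rft_val_fmt", "info:ofi/fmt:kev:mtx:book"),
    ]
    # urlencode percent-encodes ':' and '/', unlike the hand-written example,
    # but the decoded parameter values are the same.
    return LIBLOOK_BASE + "?" + urlencode(params)


url = library_lookup_url("http://www.iucat.iu.edu", "0451530942")
print(url)
print(parse_qs(urlsplit(url).query)["res_id"])
```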


  • Connotea integration


  • Fetch ToC from LC. Screen scrape, I guess? Or Z39.50? Any other content from LC?


  • Link to Books In Print à la Notre Dame. Example:

http://www.library.nd.edu/eresources/findit/findit.cgi?doc_num=001939269&aleph_session=U5AVHRXD5QB1CGDFDSVJ9DSY2UA6QNCGVEU8EYRX9NNMIQ429Q-54668%22

  • BIP search URL?:

http://www.booksinprint.com/merge_shared/Search/advsearch.asp%3FdateState%3DY%26txtAction%3D%26BooleanSearch%3D%26SType%3Dadv%26collection%3DBIP%26QueryMode%3DSimple%26ResultCount%3D25%26ResultTemplate%3Dmbbookresult_fl.hts%26navPage%3D1%26SrchFrm%3DAdv%26ScoreThreshold%3D0%26Criteria1%3DISBN%26CriteriaText1%3D0838935370


  • SFX plugin: Notice when the first title given is non-roman, and if so, look for a roman title to enhance the metadata with.
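
One way the non-roman check could work, sketched in Python: treat a title as roman when all of its letters have Unicode names containing "LATIN". Both function names are hypothetical, and this says nothing about how SFX actually orders or labels its titles.

```python
import unicodedata
from typing import Iterable, Optional


def is_roman(text: str) -> bool:
    """True if every alphabetic character is Latin script (digits and punctuation ignored)."""
    letters = [ch for ch in text if ch.isalpha()]
    return all("LATIN" in unicodedata.name(ch, "") for ch in letters)


def pick_roman_title(titles: Iterable[str]) -> Optional[str]:
    """If the first title is non-roman, return the first roman alternative, else None."""
    titles = list(titles)
    if not titles or is_roman(titles[0]):
        return None  # nothing to enhance
    for candidate in titles[1:]:
        if is_roman(candidate):
            return candidate
    return None


print(pick_roman_title(["Война и мир", "Voina i mir"]))
```

Accented Latin letters still count as roman under this test, so a French or German title would not trigger the fallback.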


  • HIP and other OPAC searchers should pull ToC from MARC 505 when present. And 856s judged to be ToC should go in the ToC section, not full text.
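
A rough sketch of both halves of that idea, in Python. It assumes a basic 505 contents note whose entries are separated by " -- " (enhanced 505s with separate title/responsibility subfields are not handled), and it judges an 856 purely by keywords in its link text, which is a guess at a heuristic rather than anything HIP provides. Both function names are hypothetical.

```python
import re
from typing import List


def toc_from_505(contents_note: str) -> List[str]:
    """Split a basic MARC 505 contents note into individual ToC entries."""
    parts = [p.strip() for p in re.split(r"\s+--\s+", contents_note.strip())]
    entries = [p for p in parts if p]
    if entries:
        # The note as a whole usually ends with a period; drop it from the last entry.
        entries[-1] = entries[-1].rstrip(". ")
    return entries


def looks_like_toc_link(link_text: str) -> bool:
    """Keyword heuristic: does this 856 link text describe a table of contents?"""
    return bool(re.search(r"table of contents|\bcontents\b|\btoc\b", link_text, re.IGNORECASE))


print(toc_from_505("The dark ages -- The Renaissance -- Modern times."))
print(looks_like_toc_link("Table of contents only"))
```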


  • Fix Umlaut Referent to more easily allow multiple authors. Architectural change necessary to get a lot of this stuff working right.


  • Enhance metadata so we have full metadata for a RefWorks etc. export. Using: CrossRef? Metalib? Anything else?


  • A general purpose responsecache. Schema: Date, service/source, key. Use for caching image urls, ToC urls from LC, etc.
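
The response cache schema above (date, service/source, key) could be sketched like this. Python and SQLite are used only for illustration; the `value` column, the table name, the class name, and the one-day default expiry are all assumptions added here so the example does something useful.

```python
import sqlite3
import time
from typing import Optional


class ResponseCache:
    """Hypothetical general-purpose cache keyed by (service, key), with a stored date."""

    def __init__(self, path: str = ":memory:", max_age_seconds: float = 86400.0):
        self.max_age = max_age_seconds
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS response_cache ("
            " service TEXT NOT NULL, key TEXT NOT NULL,"
            " value TEXT NOT NULL, cached_at REAL NOT NULL,"
            " PRIMARY KEY (service, key))"
        )

    def put(self, service: str, key: str, value: str, now: Optional[float] = None) -> None:
        """Store or overwrite the cached value for this service and key."""
        now = time.time() if now is None else now
        self.db.execute(
            "INSERT OR REPLACE INTO response_cache VALUES (?, ?, ?, ?)",
            (service, key, value, now),
        )
        self.db.commit()

    def get(self, service: str, key: str, now: Optional[float] = None) -> Optional[str]:
        """Return the cached value, or None if it is missing or older than max_age."""
        now = time.time() if now is None else now
        row = self.db.execute(
            "SELECT value, cached_at FROM response_cache WHERE service = ? AND key = ?",
            (service, key),
        ).fetchone()
        if row is None or now - row[1] > self.max_age:
            return None
        return row[0]


cache = ResponseCache(max_age_seconds=60)
cache.put("lc_toc", "lccn:2001012345", "http://example.org/toc.html", now=1000.0)
print(cache.get("lc_toc", "lccn:2001012345", now=1030.0))
```

Passing `now` explicitly is just there to make expiry easy to exercise; real callers would leave it out.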


  • Fix Worldcat registry auto-discovery.


  • Add a Worldcat search that uses API, instead of screen scrape.


  • Switch OCA search to use OCA native APIs, instead of indexdata mirror index.


  • Fix unAPI in Umlaut. unAPI to rsi? For Zotero.


  • Change background processing to use the Spawn plugin instead of manual threading. Investigating using Spawn with fork instead of thread (Terry Reese on a limited pool of forks).


  • Crazy idea for an abstract interface/architecture to support querying web service apis that require client side javascript, like Google Books and Scopus.


  • Integrate my various local document delivery services into menu of options when full text isn’t available. More generally, a clear architecture for providing localized doc delivery services in addition to a single ILL link.


  • SFX adaptor: Add a "rollup" feature that pays attention to dates to avoid eliminating coverage.
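
One reading of the date-aware "rollup" above: when collapsing several targets into fewer links, only drop a target if some kept target already covers its whole date range. The sketch below (Python, for illustration) assumes year-granularity coverage and that the targets passed in have already been grouped as candidates for rollup; the `Target` class and both functions are hypothetical, not the SFX adaptor's real data.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Target:
    name: str
    start_year: int
    end_year: Optional[int] = None  # None means coverage runs to the present


def covers(a: Target, b: Target) -> bool:
    """True if a's coverage range contains all of b's."""
    if a.start_year > b.start_year:
        return False
    if a.end_year is None:
        return True
    return b.end_year is not None and b.end_year <= a.end_year


def rollup(targets: List[Target]) -> List[Target]:
    """Drop only those targets whose coverage is fully contained in a kept target."""
    def sort_key(t: Target):
        end = float("inf") if t.end_year is None else t.end_year
        return (t.start_year, -end)  # earliest start first, then widest range

    kept: List[Target] = []
    for t in sorted(targets, key=sort_key):
        if not any(covers(k, t) for k in kept):
            kept.append(t)
    return kept


targets = [
    Target("Provider A current", 1997),
    Target("Provider A backfile", 1950, 1996),
    Target("Provider A subset", 2000, 2005),
]
print([t.name for t in rollup(targets)])
```

This does not notice a target that is covered only by the union of two kept targets; that case would simply be kept as an extra link.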