PBCore RDF Hackathon
SIGN UP HERE: https://docs.google.com/spreadsheets/d/1R4cSuYCrkQWx0IJZzBrWu_vc9_TSK_5Z-SqQY8ZwYqY/edit?usp=sharing
Please also fill out this form: http://goo.gl/forms/nAvL52W9MI
When, Where, What time?
Date: Saturday & Sunday, February 7-8, 2015
Time: ~8:30am-5pm (with option of continued work throughout the conference at the same location)
Location: 4104 Northeast 73rd Avenue, Portland, Oregon, 97218
What will be the format of the event?
In advance of the hackathon, participants are asked to fill out this form so that we can get a sense of the experience and skills of those who plan to attend. On the first day of the event, we will begin with welcome and introductions, review the agenda, and then break into groups to work on a variety of tasks. Groups may be organized around intellectual content, intellectual property, technical topics, etc.
The days themselves will be structured something like this. Coffee/tea will be provided. Lunch is on your own.
Saturday, February 7
8:30am - Welcome, introductions
9am - 9:45am - Discuss and determine the domain and scope of the ontology
9:45am - noon - Review of existing ontologies (DC terms, MODS, EBUCore, BIBFRAME, PREMIS) to determine what can be used for PBCore. Snacks and coffee to be served.
Noon - 1pm - Lunch on your own.
1pm - 2pm - Generate a comprehensive list of terms that are needed in the ontology. Snacks and coffee will be served.
2pm - 4:45pm - Begin developing the class hierarchy and defining properties of concepts. Use existing vocabularies and harness the EBUCore data model when appropriate.
4:45pm - 5pm - Review and wrap up.
Sunday, February 8
8:30am - Review progress to date; introductions of new participants
8:45am - noon - Continue working on class hierarchy and properties
Noon - 1pm - Lunch on your own.
1pm - 3pm - Define the facets of the properties (value type, allowed values, number of values/cardinality, and other features). Review facets of existing ontologies. Do they meet the needs of PBCore users?
3pm - 4:30pm - As a larger group, review progress and suggestions of smaller groups.
4:30pm - 5pm - Return to smaller groups, make suggested edits, finalize documentation.
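To make the Sunday afternoon "facets" task concrete, here is a minimal Python sketch of how value type, allowed values, and cardinality could be written down for one property and checked against sample data. The class name `PropertyFacets`, the property name `audienceLevel`, and its allowed values are all hypothetical and chosen only for illustration; they are not PBCore decisions.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class PropertyFacets:
    """Facets recorded for one property (hypothetical structure)."""
    name: str
    value_type: type
    allowed_values: Optional[Tuple[str, ...]] = None  # None means unrestricted
    min_count: int = 0
    max_count: Optional[int] = None  # None means unbounded

    def problems(self, values):
        """Return a list of human-readable problems for the given values."""
        found = []
        if len(values) < self.min_count:
            found.append(f"{self.name}: expected at least {self.min_count} value(s)")
        if self.max_count is not None and len(values) > self.max_count:
            found.append(f"{self.name}: expected at most {self.max_count} value(s)")
        for value in values:
            if not isinstance(value, self.value_type):
                found.append(f"{self.name}: {value!r} is not a {self.value_type.__name__}")
            elif self.allowed_values is not None and value not in self.allowed_values:
                found.append(f"{self.name}: {value!r} is not an allowed value")
        return found


# Illustrative facets only; the vocabulary below is an assumption.
AUDIENCE_LEVEL = PropertyFacets(
    name="audienceLevel",
    value_type=str,
    allowed_values=("General", "Adult"),
    max_count=1,
)

if __name__ == "__main__":
    print(AUDIENCE_LEVEL.problems(["General"]))
    print(AUDIENCE_LEVEL.problems(["General", "Adult"]))
    print(AUDIENCE_LEVEL.problems(["Toddler"]))
```

Writing facets down this way treats them as checks on data. Whether a given facet belongs in the ontology itself or in separate application rules is one of the questions for the group.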
Summary & Background
The PBCore RDF Ontology Hackathon responds to a growing need among PBCore users to express their metadata in RDF. A number of PBCore users contribute to and are part of the Project Hydra community, a collaborative, open source effort to build digital repository software solutions at archival institutions. Hydra is built on a framework that uses Fedora Commons as the repository for storing metadata. Many users are seeking to update their Fedora repositories to the latest version (Fedora 4), which provides a great opportunity to develop an RDF data structure. An RDF ontology for PBCore would make it easier for PBCore users to take full advantage of Fedora 4's data management capabilities and would encourage adoption of Fedora 4.
We envision building upon existing, well-established knowledge bases. In particular, we hope to harmonize the EBUCore ontology with PBCore, determine which existing terms from the EBUCore vocabulary can be re-used, and identify concepts unique to PBCore that would require additional terms.
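As a rough illustration of this re-use exercise, the Python sketch below sorts a few PBCore XML element names into those with an obvious existing term and those with no candidate yet. The DC Terms pairings are illustrative guesses, not agreed mappings. EBUCore candidates are left out on purpose, since they would need to be checked against the published ontology. The function name `split_by_reuse` is hypothetical.

```python
# Hypothetical starting point for a term re-use review.
# The pairings below are illustrative assumptions, not agreed mappings.
DCTERMS = "http://purl.org/dc/terms/"

CANDIDATE_REUSE = {
    "pbcoreTitle": DCTERMS + "title",
    "pbcoreSubject": DCTERMS + "subject",
    "pbcoreDescription": DCTERMS + "description",
    "pbcoreCreator": DCTERMS + "creator",
    "pbcoreContributor": DCTERMS + "contributor",
    "pbcorePublisher": DCTERMS + "publisher",
    # None marks an element for which no candidate term has been identified.
    "pbcoreAssetType": None,
    "pbcoreAudienceLevel": None,
}


def split_by_reuse(candidates):
    """Return (reusable, needs_review) lists of element names, each sorted."""
    reusable = sorted(name for name, term in candidates.items() if term)
    needs_review = sorted(name for name, term in candidates.items() if not term)
    return reusable, needs_review


if __name__ == "__main__":
    reusable, needs_review = split_by_reuse(CANDIDATE_REUSE)
    print("Candidates for re-use:", ", ".join(reusable))
    print("No candidate yet:", ", ".join(needs_review))
```

A shared table like this gives each working group a single place to record which terms it proposes to borrow and which concepts still need discussion.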
PBCore is a metadata schema for audiovisual materials. Its original development in 2004 was funded by the Corporation for Public Broadcasting, with a goal of creating a metadata standard for public broadcasters to share information about their video and audio assets within and among public media stations. Since its inception, PBCore has been adopted by a growing number of audiovisual archives and organizations that need a way to describe their archival audiovisual collections. The schema has been reviewed multiple times and is currently in further development via the American Archive of Public Broadcasting and the Association of Moving Image Archivists (AMIA) PBCore Advisory Subcommittee.
The Schema Team is working on an updated version of PBCore (PBCore 2.1), which will consist of minor tweaks and bug fixes and is expected to be released in March 2015. Other Teams on the Subcommittee are working on PBCore outreach, education, documentation, and a new website.
Participants should sign up for a working group. On the days of the event, these sections will be filled with suggestions and links to documentation created by the working groups.
Intellectual Content Working Group
This group will focus on the intellectual content part of the knowledge base. Intellectual content in PBCore XML is currently expressed through elements like pbcoreTitle, pbcoreAssetType, pbcoreAssetDate, pbcoreSubject, pbcoreDescription, pbcoreGenre, pbcoreRelation, pbcoreCoverage, pbcoreAudienceLevel, pbcoreAudienceRating, pbcoreAnnotation, etc.
Casey E. Davis, WGBH, @caseyedavis1
Intellectual Property Working Group
This group will focus on the intellectual property part of the knowledge base. Intellectual property in PBCore XML is currently expressed through elements like pbcoreCreator, pbcoreContributor, pbcorePublisher, pbcoreRightsSummary, and roles.
Rebecca Guenther, LC and NYU/MIAP, @rguenther52, email@example.com
Instantiation Working Group
This group will focus on the instantiation part of the knowledge base, excluding essence tracks.
Essence Track Working Group
This group will focus on the essence track part of the knowledge base.
Name, Institution, Twitter handle/email address
Documentation Working Group
This group will create, gather, and organize documentation produced during the hackathon. One person from each of the other working groups should also serve on the documentation working group.
Casey E. Davis, WGBH, @caseyedavis1
Suggested Reading & Preparation
- Sign up for a Code4Lib wiki account (if you don't already have an account)
- Everyone should read at least the first chapters of the Allemang book, Semantic Web for the Working Ontologist.
- Everyone should understand the RDF meaning of classes, properties, domain and range before beginning. (cf: http://kcoyle.blogspot.com/2014/11/classes-in-rdf.html)
- Review PBCore Schema: http://pbcore.org/elements/
- Read this awesome Ontology Development 101 publication: http://protege.stanford.edu/publications/ontology_development/ontology101-noy-mcguinness.html
- Read about RDF on the W3C website: http://www.w3.org/RDF/
- Read this article: "Multi-Entity Models of Resource Description in the Semantic Web: A comparison of FRBR, RDA and BIBFRAME." (http://kcoyle.net/LHTv32n4preprint.pdf)
- Review existing ontologies:
  - EBUCore: http://www.ebu.ch/metadata/ontologies/ebucore/index.html and http://www.ebu.ch/metadata/ontologies/ebucore/ebucore.rdf
  - MODS: http://www.loc.gov/standards/mods/modsrdf/
  - BIBFRAME: http://www.loc.gov/bibframe/
  - DC Terms: http://dublincore.org/documents/2012/06/14/dcmi-terms/?v=terms#
  - FOAF: http://www.foaf-project.org/
  - PREMIS: http://id.loc.gov/ontologies/premis.html
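The reading list asks everyone to understand the RDF meaning of domain and range before arriving. One point that often surprises people coming from XML is that in RDFS these are inference rules, not validation constraints. The Python sketch below applies the two relevant RDFS entailment rules (rdfs2 for domain, rdfs3 for range) to a tiny set of triples. The `ex:` names are hypothetical, and prefixed names are kept as plain strings for brevity.

```python
from collections import defaultdict

RDF_TYPE = "rdf:type"
RDFS_DOMAIN = "rdfs:domain"
RDFS_RANGE = "rdfs:range"


def infer_types(triples):
    """Apply RDFS rules rdfs2 (domain) and rdfs3 (range) once.

    Returns the set of rdf:type triples that follow from the input.
    Nothing is rejected: domain and range add facts, they do not validate.
    """
    domains = defaultdict(set)
    ranges = defaultdict(set)
    for s, p, o in triples:
        if p == RDFS_DOMAIN:
            domains[s].add(o)
        elif p == RDFS_RANGE:
            ranges[s].add(o)

    inferred = set()
    for s, p, o in triples:
        for cls in domains.get(p, ()):
            inferred.add((s, RDF_TYPE, cls))  # subject gets the domain class
        for cls in ranges.get(p, ()):
            inferred.add((o, RDF_TYPE, cls))  # object gets the range class
    return inferred


# Hypothetical schema and data; none of these names are PBCore terms.
TRIPLES = [
    ("ex:hasInstantiation", RDFS_DOMAIN, "ex:Asset"),
    ("ex:hasInstantiation", RDFS_RANGE, "ex:Instantiation"),
    ("ex:tape42", "ex:hasInstantiation", "ex:file7"),
]

if __name__ == "__main__":
    for triple in sorted(infer_types(TRIPLES)):
        print(triple)
```

Here `ex:tape42` was never declared to be an `ex:Asset`; using the property is enough for that to be inferred. A domain declared too narrowly will therefore assign unintended classes to data rather than flag an error.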
Tips and Advice from the Community
from Karen Coyle
- Don't lean too heavily on Protege. Protege is very OWL-oriented and can lead one far astray. It's easy to click on check boxes without knowing what they really mean. Do as much development as you can without using Protege, and do your development in RDFS, not OWL. Later you can use Protege to check your work, or to complete the code.
- Develop in N-Triples or Turtle but NOT RDF/XML. RDF differs from XML in some fundamental ways that are not obvious, and developing in RDF/XML masks these differences and often leads to poor ontologies.
from Jean-Pierre Evain
- I personally have no issue whatsoever with Protégé or RDF/XML for the type of ontology we seem to be aiming at.
- I agree that OWL is probably not required. But this doesn't prevent using Protégé. Of course one needs to know what is specific to OWL.
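For anyone who has not seen N-Triples before, the Python sketch below writes out a very small RDFS fragment (one class, one property with a domain and range) by hand, one triple per line. It uses only the standard library. The `http://example.org/pbcore#` namespace and the `Asset` and `title` terms are placeholders, not proposed PBCore terms.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
EX = "http://example.org/pbcore#"  # placeholder namespace, not a real PBCore URI


def iri(value):
    """Wrap an absolute IRI in angle brackets, as N-Triples requires."""
    return "<" + value + ">"


def literal(text):
    """Return a plain string literal with the basic N-Triples escapes applied."""
    escaped = (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return '"' + escaped + '"'


def ntriple(subject, predicate, obj):
    """Format one triple as a single N-Triples line."""
    return f"{subject} {predicate} {obj} ."


def build_schema():
    """Return a tiny RDFS schema as a list of N-Triples lines."""
    asset = iri(EX + "Asset")
    title = iri(EX + "title")
    return [
        ntriple(asset, iri(RDF + "type"), iri(RDFS + "Class")),
        ntriple(asset, iri(RDFS + "label"), literal("Asset")),
        ntriple(title, iri(RDF + "type"), iri(RDF + "Property")),
        ntriple(title, iri(RDFS + "domain"), asset),
        ntriple(title, iri(RDFS + "range"), iri(RDFS + "Literal")),
    ]


if __name__ == "__main__":
    print("\n".join(build_schema()))
```

Each output line is a complete subject, predicate, object statement, which makes the graph structure visible in a way that nested RDF/XML does not. The same statements can be loaded into Protégé or another tool later to check the work.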
Need more info?
If you have questions or need more information, feel free to contact Casey Davis at casey_davis [at] wgbh [dot] org.