Final Post

As we are now at the end of the formal part of the project, here is a final post. This blog will remain in place for new information about the project, though, so this may not be the final post at all!

Outputs:

  • Adaptation of the COMET project’s IPR analysis workflow, embedded in an automated record selection process which runs before the IPR-cleared data is published as Open Data.
  • Cataloguing of 8,178 pre-1950 Examination Papers and related historical textbooks in the subjects of History and Science/technology.
  • Use, as a proof of concept, of a low-barrier cataloguing interface which increases the volume of retrospective cataloguing that can be achieved whilst retaining an acceptable standard of data quality.
  • Release of approximately 439,000 records as downloadable files from the Library and Archives catalogue systems under the Public Domain Dedication and License v1.0, for use by all. The breakdown of numbers is:
  1.     Library catalogue: 214,000 (bibs); 198,000 (authority)
  2.     Archive catalogue: 26,800
  • Working connectors between the four contributing systems, with the data indexed as RDF and available via a SPARQL endpoint.
  • A working incremental record update workflow which synchronizes the data derived from the sources.
  • Release to Google of records which were previously not indexed.
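
To illustrate the kind of incremental update workflow listed above, here is a minimal sketch in Python. It is not the project's actual script: it assumes each source record can be exported as a dictionary keyed by a record id, and that a content digest per record is saved after each run. The names `record_digest` and `plan_sync` are hypothetical.

```python
import hashlib
import json


def record_digest(record: dict) -> str:
    """Return a stable SHA-256 digest of a record's content."""
    # sort_keys makes the digest independent of field order.
    canonical = json.dumps(record, sort_keys=True, ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def plan_sync(source: dict, indexed_digests: dict) -> dict:
    """Work out which records changed since the last synchronization.

    source:          record id -> record dict (assumed export shape)
    indexed_digests: record id -> digest saved at the last run
    """
    current = {rid: record_digest(rec) for rid, rec in source.items()}
    return {
        # New in the source, not yet in the index.
        "create": sorted(rid for rid in current if rid not in indexed_digests),
        # Present in both, but the content has changed.
        "update": sorted(
            rid for rid, digest in current.items()
            if rid in indexed_digests and indexed_digests[rid] != digest
        ),
        # Gone from the source, still in the index.
        "delete": sorted(rid for rid in indexed_digests if rid not in current),
    }
```

A daily job along these lines would call `plan_sync` with the latest export, apply only the three resulting lists to the index, and then store the new digests for the next run.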

Lessons learnt:

  • Connecting library systems to Drupal in a way which is scalable to large complex datasets is difficult using the currently available modules.
  • Synchronization (daily incremental updating) is possible but relies on scripting and is probably not sustainable in the long term.
  • Once your data is properly ingested into Drupal, adding RDF mappings at a granular level is a trivial task which non-technical staff could do.
  • It is relatively easy to embed an IPR filtering mechanism into the workflow which suits the institution’s risk appetite.
  • A low-barrier cataloguing mechanism is a good way to surface previously hidden (print) resources with semantically rich records, economically and without requiring specialist cataloguing expertise. We experimented with various methods of creating subject index headings, and the use of scripts was far more economical than any method that required even minimal professional staff input at the record level. We also enriched records after they had been created by asking researchers to provide indexing terms for those OEM-UK books they consulted. Using a low-barrier cataloguing mechanism enabled us to focus limited resources on semantically enriching every OEM-UK record, in a process that continues after the records are created.
  • There are simple technical methods by which basic records can be downloaded to Drupal from well-known third-party catalogue sources, but the IPR position and the rights and responsibilities in this area remain unclear.
  • Building the RDF index on Drupal (with a large dataset) was challenging, but once we had worked out how to do it, the rebuild was simple to schedule.
  • Releasing records to search engines requires some thought about the sitemap and META tag content in order to ensure maximum visibility to crawlers.
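
As an illustration of the sitemap point above, the following is a minimal sketch of generating a sitemap file for catalogue record pages with Python's standard library. It is not the approach used on the project's Drupal site: the function name `build_sitemap` and the `example.org` URLs are placeholders, and the input is assumed to be a list of (URL, last-modified date) pairs.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries) -> bytes:
    """Build a sitemap document from (url, lastmod) pairs.

    lastmod is expected as an ISO date string such as "2012-01-31".
    """
    # Serialize the sitemap namespace as the default (unprefixed) one.
    ET.register_namespace("", SITEMAP_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for url, lastmod in entries:
        node = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(node, f"{{{SITEMAP_NS}}}loc").text = url
        ET.SubElement(node, f"{{{SITEMAP_NS}}}lastmod").text = lastmod
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)


if __name__ == "__main__":
    # Placeholder record URLs, for illustration only.
    records = [
        ("https://example.org/record/1", "2012-01-31"),
        ("https://example.org/record/2", "2012-02-14"),
    ]
    print(build_sitemap(records).decode("utf-8"))
```

Including a `lastmod` value for each record gives crawlers a hint about which pages have changed since their last visit, which matters when records are updated incrementally.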

Opportunities and Possibilities:

  • We intend to examine how far it is possible to update our catalogue records using other linked data sources such as VIAF.
  • We will monitor traffic coming from Google via Google Analytics to demonstrate tangible increases in use resulting from opening the data.
  • Better advice/training for systems librarians on SEO would be both welcome and useful in opening data to a wider audience.
  • We will continue to use the low-barrier cataloguing methodology to increase our retrospective cataloguing output.
