In this post I am going to explore the various methods of applying indexing that we have used as part of OEM-UK.
At OEM-UK we are looking at cost-effective ways to catalogue existing uncatalogued collections (see the previous post for what the OEM-UK way of cataloguing entails). We are concentrating on uncatalogued collections (or collections that are only represented in paper-based legacy metadata) for two reasons:
1. We are trying to create a low-barrier, (relatively) inexpensive methodology that any library can re-use to tackle existing ‘backlogs’.
2. We are taking advantage of the opportunities offered by collections that have had sorting/grouping applied to them (e.g. by type – school textbooks collection) or that have legacy metadata (e.g. the card catalogue or hardcopy lists) to open up new methodologies for us to apply subject indexing.
I will mainly concentrate on the OEM-UK school textbooks in this post.
The school textbooks are physically grouped together in our remote store, but they are not grouped by subject. There are many broad subject areas covered in the collection but OEM-UK will only tackle two: history and science & technology.
The textbooks have running numbers on the spines and they are listed in the card catalogue, which records both the subjects of the textbooks and the running number (OCR’ing did not work on these cards, in case anyone thinks we missed an opportunity!).
The OEM-UK cataloguer took 1 day to pick out the catalogue cards with the subjects of history or science & technology on them and listed the numbers in 2 Excel spreadsheets. We then had 1890 history and science & technology school textbooks transported to our office and we were ready to catalogue the textbooks as part of OEM-UK.
Here at the IOE a book catalogued from scratch and indexed using our in-house thesaurus (LET) takes on average 20 minutes; 10 minutes of this is for the subject indexing. The timings for cataloguing older material downloaded via Z39.50 are similar.
In order to deliver OEM-UK on time we had to achieve an average time of 5 minutes per book – including the indexing!
Two things need to be emphasised from the outset:
1. The OEM-UK methodologies for subject indexing tried so far only work with existing (backlog / uncatalogued) collections that share common attributes; the OEM-UK way of indexing is not a replacement for the human indexing of new material.
2. We are not tied to one methodology. The highly configurable nature of Drupal means that in seconds I can create new fields that are instantly available to the OEM-UK cataloguer, and I can remove redundant fields in order to tailor the form to the actual work at hand; this is one of the major advantages that Drupal has over a traditional LMS.
The 3 different methodologies we have tried (so far) are:
1. Using scripts to match words in the Drupal records (e.g. title) with subject terms in IOE controlled vocabularies – we used this methodology on the exam papers (see subject indexing of exam papers in a previous post) and the science & technology textbooks. The subjects of these books were almost always explicitly stated in the title: only 19 of the 790 science & technology textbooks could not have their subjects identified using scripts. On average the science & technology books took 3 minutes 55 seconds each to catalogue and the exam papers took 1 minute.
2. ‘Subject carding’ – the history textbooks cover a wide range of subjects that are not always obvious from the title. In collaboration with our subject experts we created a ‘core’ history LET subject list of 10 subjects (Ancient history, Classical history, Medieval history, Early modern history, Late modern history, European history, British history, Religious history, and Church history). We created cards listing these subjects, each with a tick box. For 5 weeks our subject experts spent 2 hours per week ‘subject carding’ the 1100 history textbooks (inspecting the book, ticking the appropriate subjects, and inserting the card into the book). The list of subjects was replicated in Drupal and the OEM-UK cataloguer ticked the appropriate subjects in the record. On average these books took 4 minutes 10 seconds to catalogue.
3. Enhancing existing records – the highly configurable nature of Drupal has enabled us to use our researchers’ expertise to enhance the subject indexing of existing Drupal records. I will use one real-life example to illustrate this process. A postgraduate from the US visited us recently; his thesis is on Darwinism and religion in England’s state secondary schools, 1920s–1970s. We struck a deal. Using Drupal I would find all of the possibly relevant science and technology textbooks, based upon year of publication and broad subject area, and give him this list. I let him use a vacant desk in our office to conduct his research. We kept aside the books that he identified as being relevant. We knew these books could be given the additional terms ‘Darwinism’ and ‘Secondary education’; it took me a few seconds to add these 2 terms (temporarily) to the Drupal form. It took on average 1 minute per book to find the record on Drupal and add in the extra indexing terms.
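To give a feel for the first methodology above (matching words in a title against controlled vocabulary terms), here is a minimal sketch in Python. It is not the script we actually ran: the vocabulary entries, the sample titles and the function name `match_subjects` are all made up for illustration, and it assumes that a simple whole-word match on the title is enough.

```python
import re

# Hypothetical slice of a controlled vocabulary: lower-cased trigger phrase
# mapped to the preferred subject term. Illustrative only; these are not
# the real IOE vocabularies.
VOCABULARY = {
    "chemistry": "Chemistry",
    "physics": "Physics",
    "biology": "Biology",
    "natural history": "Natural history",
    "mechanics": "Mechanics",
}


def match_subjects(title, vocabulary=VOCABULARY):
    """Return the sorted preferred terms whose trigger phrase appears
    as whole words in the title."""
    # Lower-case the title and reduce it to words separated by single spaces,
    # so punctuation and spacing do not affect matching.
    normalised = " ".join(re.findall(r"[a-z0-9]+", title.lower()))
    found = set()
    for phrase, term in vocabulary.items():
        # \b keeps "chemistry" from matching inside "biochemistry".
        if re.search(r"\b" + re.escape(phrase) + r"\b", normalised):
            found.add(term)
    return sorted(found)


# Made-up titles, purely for demonstration.
titles = [
    "Elementary Chemistry for Schools",
    "A First Course in Physics and Mechanics",
    "Nature Rambles",
]
for title in titles:
    subjects = match_subjects(title)
    print(title, "->", subjects if subjects else "needs manual indexing")
```

Titles that come back with no match are the ones a cataloguer would still need to index by hand, like the 19 science & technology textbooks mentioned above.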
So far in 5 months 1 FTE has created 7487 Drupal records and every one of them is subject indexed; the OEM-UK Drupal cataloguing is ahead of schedule and is 5 times quicker than traditional cataloguing and indexing.
I don’t expect the OEM-UK indexing to have the same granularity/quality as the indexing done by a professional indexer. But what is the qualitative difference between OEM-UK indexing and that of professional indexers?
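As a rough check on the textbook timings quoted in this post, the short calculation below combines the two textbook groups into one weighted average and compares it with the 5-minute target and the 20-minute traditional figure. It only uses numbers already given above; it leaves out the exam papers and the enhanced records, so it is a sanity check on the textbooks rather than a full account of the 7487 records.

```python
# Figures quoted in this post: (number of books, average seconds per book).
groups = {
    "science & technology": (790, 3 * 60 + 55),
    "history": (1100, 4 * 60 + 10),
}
TARGET_SECONDS = 5 * 60        # OEM-UK target per book
TRADITIONAL_SECONDS = 20 * 60  # traditional cataloguing and indexing per book

total_books = sum(count for count, _ in groups.values())
total_seconds = sum(count * seconds for count, seconds in groups.values())

# Weighted average: each group's time counts in proportion to its size.
average_seconds = total_seconds / total_books
speed_up = TRADITIONAL_SECONDS / average_seconds

minutes, seconds = divmod(round(average_seconds), 60)
print(f"{total_books} textbooks, average {minutes} min {seconds} s per book")
print(f"Within target: {average_seconds < TARGET_SECONDS}")
print(f"Roughly {speed_up:.1f} times quicker than traditional cataloguing")
```

On these figures the textbooks alone come out a little over 4 minutes each, which is inside the 5-minute target and close to the 5-times figure.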
I have now set 4 of our experienced indexers the task of cataloguing and indexing a range of the OEM-UK textbooks so that I can compare how OEM-UK indexing shapes up against traditional indexing. This will be the subject of a later post.
In my next post I am going to explore the costings for OEM-UK cataloguing.