When we had finished cataloguing on Drupal, we exported the records to a CSV file. I did this by setting up a special Drupal view which exported the fields I wanted, in the order I wanted. I then wrote a script which converted the CSV into Sirsi's flat ASCII format. The essence of this was very simple: map each CSV field to a MARC field number and write it to the output file. The detail turned out to be rather more difficult.
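The core of such a converter can be sketched as follows. This is a minimal illustration, not the actual script: the CSV column names, the field mapping, and the exact flat-file layout (record boundary lines, indicator handling) are all assumptions for the sake of example.

```python
import csv

# Hypothetical mapping from Drupal CSV column names to MARC tag/subfield
# pairs; the real mapping covers many more fields.
FIELD_MAP = {
    "title": ("245", "a"),
    "author": ("100", "a"),
    "publisher": ("260", "b"),
}

def csv_to_flat(csv_path, out_path):
    """Convert a Drupal CSV export to flat-ASCII-style MARC records."""
    with open(csv_path, newline="", encoding="utf-8") as src, \
         open(out_path, "w", encoding="utf-8") as dst:
        for row in csv.DictReader(src):
            # Each CSV row becomes one record in the output file.
            dst.write("*** DOCUMENT BOUNDARY ***\n")
            dst.write("FORM=MARC\n")
            for column, (tag, subfield) in FIELD_MAP.items():
                value = row.get(column, "").strip()
                if value:
                    dst.write(f".{tag}.   |{subfield}{value}\n")
```

Each row of the spreadsheet becomes one record, and each mapped column becomes one MARC field line; unmapped or empty columns are simply skipped.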
The first issue was repeatable fields. Drupal has a concept of repeated fields, but it outputs them all as a single comma-separated value in the CSV. By splitting on the commas and iterating over the items, it was relatively easy to turn them back into repeatable MARC fields. The exception was the AACR2 rule governing the 100 and 700 fields and how their values interact with the statement of responsibility in subfield c of the 245. After some jiggling, we were able to make this work satisfactorily. The script is not pretty, but it is fast (on a thousand records) and does the job. I will release it shortly. We loaded 1,083 history textbook records via the Symphony import report in one minute. Later, we repeated the exercise with 786 science and technical textbooks.
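The split-and-repeat step might look something like this. Again a sketch rather than the released script: the tag, subfield, separator, and output layout are assumptions, and the special 100/700/245 $c interaction is deliberately left out since it needed case-by-case handling.

```python
def expand_repeats(raw_value, tag, subfield="a", sep=","):
    """Split a Drupal comma-separated multi-value field into one
    MARC field line per value."""
    lines = []
    for item in raw_value.split(sep):
        item = item.strip()
        if item:  # ignore empty items from trailing or doubled commas
            lines.append(f".{tag}.   |{subfield}{item}")
    return lines
```

So a CSV cell like `"History, Geography"` mapped to the 650 tag yields two separate 650 fields instead of one field containing a comma-separated string.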
The difference with the latter was that we mined subject-indexing terms from the title and series fields using a pre-defined mapping to our London Education Thesaurus (LET); 1,718 subject terms were created across the 786 records. This was possible because we were working with a dataset with a fairly tightly defined subject area, and one in which the title (by the very nature of the material) often described a subject itself. Our next challenge is to see what we can do with the exam papers when we import them into Symphony, and how far the automatic subject-allocation system can be taken there.
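The mining step can be illustrated with a small keyword lookup. The trigger words and LET terms below are invented examples; the real mapping was defined against the actual thesaurus and is far larger.

```python
import re

# Hypothetical keyword-to-LET mapping for illustration only.
LET_MAP = {
    "arithmetic": "Mathematics education",
    "chemistry": "Chemistry",
    "mechanics": "Physics",
}

def mine_subjects(title, series=""):
    """Return LET subject terms whose trigger keywords appear in the
    title or series statement (case-insensitive whole-word match)."""
    text = f"{title} {series}".lower()
    terms = []
    for keyword, term in LET_MAP.items():
        if re.search(rf"\b{re.escape(keyword)}\b", text) and term not in terms:
            terms.append(term)
    return terms
```

Because textbook titles in a narrow subject area tend to name their subject directly, even a simple whole-word match like this can assign useful indexing terms; titles that match nothing are simply left for manual indexing.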