Ere Maijala edited this page Nov 28, 2023 · 33 revisions

Introduction

RecordManager provides metadata management functionality that is useful, for example, for getting data into the search index of a discovery service. Out of the box, RecordManager supports VuFind.

RecordManager uses MongoDB or MySQL/MariaDB as the database for storing and processing the metadata. Multiple record formats can be used with pluggable record drivers. RecordManager provides the following main functions:

  • Harvest records using the OAI-PMH protocol
  • Import records from files
  • Split records using PHP classes or XSLT
  • Normalize records using PHP classes or XSLT
  • Deduplicate records using the fancy built-in algorithm
  • Export records to files
  • Provide records to other harvesters with the built-in OAI-PMH Provider
  • Harvest [SFX knowledgebase records](wiki/Harvesting SFX Objects)
  • Directly update a Solr index (VuFind)
  • Record preview support for e.g. Voyager’s “Send Record to WebVoyage VuFind” function (requires a bit of custom code in VuFind)
  • Enrich records with pluggable enrichment modules (includes a sample enrichment that fetches additional data for topics from an ontology)

The basic idea is that metadata can be harvested from multiple data sources and indexed into a single Solr index. To make sure records from different sources can be identified and kept separate, RecordManager prefixes every ID field with the data source ID (e.g. local record ID 123 becomes source.123). Any software using the Solr index can therefore identify the source of a record, but it must also strip the prefix before, for instance, linking to a UI in the source system or fetching holdings information via an API.
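
As a rough illustration of the prefixing convention described above, the following Python sketch builds and splits such IDs. The function names are hypothetical and not part of RecordManager, and the sketch assumes that the data source ID itself contains no period, so the first period marks the end of the prefix.

```python
from typing import Tuple


def add_source_prefix(source_id: str, local_id: str) -> str:
    """Build an index ID of the form '<source ID>.<local record ID>'."""
    return f"{source_id}.{local_id}"


def split_source_prefix(index_id: str) -> Tuple[str, str]:
    """Split an index ID into (source ID, local record ID).

    Assumption: the source ID contains no '.', so only the first
    period is treated as the separator. Local IDs may contain periods.
    """
    source_id, separator, local_id = index_id.partition(".")
    if not separator or not source_id or not local_id:
        raise ValueError(f"Not a prefixed record ID: {index_id!r}")
    return source_id, local_id


if __name__ == "__main__":
    index_id = add_source_prefix("source", "123")
    print(index_id)                       # source.123
    print(split_source_prefix(index_id))  # ('source', '123')
```

A client would call something like split_source_prefix before passing the local ID on to the source system, for example when building a link or requesting holdings.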

The original records are kept mostly as-is in RecordManager. This allows one to harvest or import everything once and then work on the set of records. For example, they can be converted to other formats for OAI-PMH (this would happen on the fly) or indexed into Solr, where all records share a set of common index fields in addition to the original metadata.

Installation

See the accompanying README.md file for short installation instructions.

Getting Records into RecordManager

There are currently the following ways to get records into RecordManager's database:

  1. OAI-PMH harvesting
  2. Direct file import
  3. SFX export harvesting
  4. Harvesting via the Innovative Sierra REST API

The usual ways to get data in are OAI-PMH harvesting and file import. The configuration page explains all the related settings, and there are also multiple sample configurations in conf/datasources.ini.sample.

OAI-PMH is driven via the harvest.php script and file loads are done using the import.php script. See the Usage page for examples and more information on the tools.
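
For readers unfamiliar with OAI-PMH, the following Python sketch shows what a ListRecords request URL looks like at the protocol level. It is not RecordManager code and says nothing about how harvest.php is implemented. The function name and the example endpoint are hypothetical. The query parameters (verb, metadataPrefix, set, from, resumptionToken) come from the OAI-PMH 2.0 specification.

```python
from typing import Optional
from urllib.parse import urlencode


def list_records_url(
    base_url: str,
    metadata_prefix: str = "oai_dc",
    set_spec: Optional[str] = None,
    from_date: Optional[str] = None,
    resumption_token: Optional[str] = None,
) -> str:
    """Build an OAI-PMH ListRecords request URL.

    Per the protocol, resumptionToken is an exclusive argument: when it
    is present, only 'verb' and 'resumptionToken' are sent.
    """
    if resumption_token is not None:
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if set_spec:
            params["set"] = set_spec
        if from_date:
            params["from"] = from_date
    return f"{base_url}?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical endpoint, used for illustration only.
    print(list_records_url("https://example.org/oai", set_spec="books"))
    print(list_records_url("https://example.org/oai", resumption_token="tok123"))
```

A harvester typically repeats the request with the resumption token from each response until the server stops returning one.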

Configuration

See the Configuration wiki page for information on the settings.

Usage

See the Usage wiki page for basic instructions.

Command Line Functions

See the [Command Line Reference](wiki/Command Line) for more information.

Additional Information

See the Customizing RecordManager page for information on how to extend the functionality without having to modify the core Base module.