This repository has been archived by the owner on Jan 25, 2024. It is now read-only.

Find the Hidden Object

Mike Caprio edited this page Mar 24, 2017 · 64 revisions

Use Computer Vision, Shape and Object Recognition to Tag Images with Richer Metadata

Hackathon Findings

The Hackathon team created a solution that could recognize like objects and suggest matches in the Museum's Library Image Database, Digital Special Collections (Omeka).

The object-matching solution was impressive. Taking this challenge further could address a rich opportunity: linking thousands of artifact images in the ethnology collections with thousands of expedition photos in the Library. To meet that potential, future attempts should give the Hackathon team additional key elements, such as a more comprehensive discussion of expeditions and related collections and a definition of each data field within Digital Special Collections, which would help capture more access points from which to find relationships. This would allow for more widely applicable solutions that better represent the collections. Future events might address this by offering two presentations: a brief introduction of the challenge to the entire participant group, followed by a more detailed presentation, with Q&A, for the participants who selected the challenge. The second presentation would highlight access points within the image records and explain how missing data could be found by using other fields as clues to greater context.

Hackathon Projects

Background

Digital Special Collections (Omeka) is an ongoing effort to provide Museum staff, researchers, and the general public with access to the rich Photographic Collection in the Museum's Library, which holds over a million images. More than 20,000 images are currently cataloged and public, with more added each week. So far they are assigned to 15 named collections and a larger general collection. The images cover all of the topics in the natural sciences, including ethnology.

The Photographic Collection chronicles the history of scientific study and exploration by Museum staff around the globe since the Museum's founding in 1869. Images document field work, expeditions, science education, specimens, and exhibition preparation. Cataloging this volume and range of data with standards and controlled vocabularies is a unique and specialized effort within the Library's digitization program.

Object: Shuswap basket in this photo and winter dress in this photo

Some images belong to collections that have a large amount of historical data to organize, and some images have no data at all. In many cases image catalogers must do research using a variety of Museum and Library sources. Using clues from the image source, such as location or date, the cataloger searches the AMNH Anthropology Collections Database on available data, such as object name, material, culture name, or geographic location, to locate an item (this database is available in the challenge repo in the file anthro_web_public.zip). Items that have a specimen number are cross-referenced in the same Anthropology database. In both cases, data are verified and/or entered into an image record in Omeka. Correctly identifying and naming images of specimens, artifacts, exhibits, and field work is an essential task for making images searchable and creating context throughout the Museum's science topics and its history.
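
The field-based lookup described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the Museum's tooling: the `AnthroRecord` field names, the `search` helper, and the three sample records are all made up for the example, and the actual layout of the data in anthro_web_public.zip may differ.

```python
from dataclasses import dataclass


@dataclass
class AnthroRecord:
    # Hypothetical field names; the real export may use different columns.
    catalog_no: str
    object_name: str
    material: str
    culture: str
    locale: str


def search(records, **clues):
    """Return records whose fields contain every clue (case-insensitive)."""
    hits = []
    for rec in records:
        if all(str(value).lower() in getattr(rec, field).lower()
               for field, value in clues.items()):
            hits.append(rec)
    return hits


# Made-up sample records, for illustration only.
records = [
    AnthroRecord("A-001", "Basket", "Cedar root", "Shuswap", "British Columbia"),
    AnthroRecord("A-002", "Coat", "Hide", "Example culture", "Example region"),
    AnthroRecord("A-003", "Basket", "Spruce root", "Example culture", "Alaska"),
]

# A cataloger who knows only the object type and a location from the photo:
matches = search(records, object_name="basket", locale="british columbia")
for rec in matches:
    print(rec.catalog_no, rec.object_name, rec.culture)
```

Requiring every clue to match keeps the candidate list short. A real tool would probably also want fuzzy matching, since culture names and place names are often spelled several ways.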

Object numbered 50.1/1408 in a field notes sketch, from Omeka, and in Anthropology respectively

The Anthropology Collections Database has images of many of the same items for which the Library has images, but the Anthropology database has more comprehensive and reliable data. Images in the Library's Omeka database that include a specimen number are the most helpful; catalogers manually verify them against the Anthropology database. However, many Library images, such as images of expeditions and exhibits, do not include specimen numbers. Catalogers would benefit from a more automated method to identify artifacts in images and to verify Library data through the Anthropology Collections Database. It would be helpful if software could visually recognize objects and specimen numbers to connect data between Digital Special Collections (Omeka) and the Anthropology Collections Database. Ultimately, the goal is to discover the gaps between our Omeka database and the Anthropology database and to enhance our data in Omeka wherever possible.
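
As a rough sketch of the specimen-number cross-reference, the Python below pulls numbers shaped like 50.1/1408 out of free text, looks them up in an index, and fills only the fields that are empty on the Omeka side. The regular expression is an assumption generalized from that single example, so real specimen numbers may take other forms. The `anthro_index` dictionary, the field names, and the placeholder values are hypothetical.

```python
import re

# Assumed pattern, generalized from the single example "50.1/1408":
# digits, a dot, digits, a slash, digits. Real numbers may vary.
SPECIMEN_RE = re.compile(r"\b(\d+\.\d+)\s*/\s*(\d+)\b")


def extract_specimen_numbers(text):
    """Return normalized specimen numbers found in free text."""
    return [f"{prefix}/{serial}" for prefix, serial in SPECIMEN_RE.findall(text)]


def fill_gaps(omeka_item, anthro_record):
    """Copy Anthropology values only into fields that are empty in Omeka."""
    merged = dict(omeka_item)
    for field, value in anthro_record.items():
        if not merged.get(field):
            merged[field] = value
    return merged


# Hypothetical Anthropology index keyed by specimen number; values are placeholders.
anthro_index = {
    "50.1/1408": {"object_name": "<name from Anthropology>", "material": "<material>"},
}

# Hypothetical Omeka item whose title mentions a specimen number.
omeka_item = {
    "title": "Field notes sketch, no. 50.1 / 1408",
    "object_name": "",
    "material": None,
}

numbers = extract_specimen_numbers(omeka_item["title"])
enriched = omeka_item
for number in numbers:
    record = anthro_index.get(number)
    if record is not None:
        enriched = fill_gaps(enriched, record)

print(numbers)
print(enriched)
```

Filling only empty fields is a deliberately cautious choice. Existing Omeka values are left for a cataloger to review rather than being overwritten automatically.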


Solutions

  • Vision-match, human-assisted training system. Use computer vision to recognize objects for which there is no data in Omeka, suggest possible matches for object name, material, design, or technique from the data that is available, and allow a human to indicate actual matches.

  • Interface prototype for data merging between Anthropology and Library. Bring data from the Anthropology Collections Database into Digital Special Collections (Omeka). This could include an interface that pulls up the Anthropology data for a match and allows a cataloger to enter it into DSC (Omeka).

  • Connect Anthropology image records with Library images. This could be an interface that lets a researcher add metadata links out to other resources and discover "similar items" (visually or by shared characteristics). It could also link to additional resources and possibly provide a Museum location if an item is on exhibit.
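
To make the "suggest matches, then let a human confirm" idea concrete, here is a minimal Python sketch. It uses a simple average hash over tiny grayscale grids as a stand-in for real computer vision. It is not the Hackathon team's method, and a production system would need a far more robust image model. The image identifiers, pixel values, and function names are all hypothetical.

```python
def average_hash(pixels):
    """Bit list: 1 where a pixel is brighter than the image mean."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return [1 if p > mean else 0 for p in flat]


def hamming(a, b):
    """Number of positions at which two equal-length bit lists differ."""
    return sum(x != y for x, y in zip(a, b))


def suggest_matches(query, catalog, top_n=3):
    """Rank catalog images by hash distance to the query (smaller is closer)."""
    query_hash = average_hash(query)
    scored = [(hamming(query_hash, average_hash(img)), image_id)
              for image_id, img in catalog.items()]
    return sorted(scored)[:top_n]


# Hypothetical 2x2 grayscale thumbnails standing in for Anthropology images.
catalog = {
    "anthro-A": [[10, 200], [10, 200]],
    "anthro-B": [[200, 10], [200, 10]],
}

# Hypothetical thumbnail of an uncataloged Library photo.
query = [[20, 190], [30, 210]]

suggestions = suggest_matches(query, catalog)

# The human stays in the loop: only confirmed matches are recorded.
confirmed = {}


def confirm(library_image_id, anthro_image_id):
    confirmed[library_image_id] = anthro_image_id


confirm("library-photo-1", suggestions[0][1])
print(suggestions, confirmed)
```

The confirmed pairs could later serve as training labels, which is the "human-assisted training" part of the first idea in the list.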


Resources