
File analyzer component packages

Terry Brady edited this page Jan 16, 2014 · 30 revisions

The File Analyzer code base is broken into multiple modules containing different levels of functionality.

Core Package - requires only Java

The Core package contains File Test Rules and File Import Rules

  • with general applicability to multiple institutions
  • that do not depend on libraries other than the core java libraries

Count Files By Type

This test counts the files found for each file extension. The generated report lists, for each extension, the number of files found and the cumulative size of those files in bytes.
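The counting described above can be sketched in plain Java. This is a minimal illustration that assumes nothing about the File Analyzer's own classes: the class name `CountByExtensionSketch` and its methods are hypothetical, and only the core Java libraries are used.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical standalone sketch; not the File Analyzer's actual rule class.
class CountByExtensionSketch {

    // Returns extension -> {file count, cumulative bytes}, sorted by extension.
    static Map<String, long[]> countByExtension(Path root) throws IOException {
        List<Path> files;
        try (Stream<Path> paths = Files.walk(root)) {
            files = paths.filter(Files::isRegularFile).collect(Collectors.toList());
        }
        Map<String, long[]> totals = new TreeMap<>();
        for (Path file : files) {
            long[] t = totals.computeIfAbsent(extensionOf(file), k -> new long[2]);
            t[0]++;                    // one more file of this type
            t[1] += Files.size(file);  // add its size to the running byte total
        }
        return totals;
    }

    // Lower-cased text after the last dot, or "" when the name has no extension.
    static String extensionOf(Path file) {
        String name = file.getFileName().toString();
        int dot = name.lastIndexOf('.');
        if (dot <= 0 || dot == name.length() - 1) {
            return "";
        }
        return name.substring(dot + 1).toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) throws IOException {
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        countByExtension(root).forEach((ext, t) ->
            System.out.printf("%-10s %8d files %12d bytes%n",
                ext.isEmpty() ? "(none)" : ext, t[0], t[1]));
    }
}
```

Treating extensions case-insensitively is a choice made for this sketch; the actual rule may group them differently.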

A separate Core rule generates a listing of the full path to every file it finds. Its purpose is to produce a file list for import into other applications.
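A full-path listing of this kind can be approximated with the core Java libraries alone. The sketch below is illustrative only: `ListFullPathsSketch` and `listFullPaths` are hypothetical names, and the output format (one absolute path per line, sorted) is an assumption rather than the rule's documented behavior.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical standalone sketch; not the File Analyzer's actual rule class.
class ListFullPathsSketch {

    // Walks the tree under root and returns the absolute path of every regular file.
    static List<String> listFullPaths(Path root) throws IOException {
        try (Stream<Path> paths = Files.walk(root)) {
            return paths.filter(Files::isRegularFile)
                        .map(p -> p.toAbsolutePath().normalize().toString())
                        .sorted()
                        .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        // One path per line, ready to be imported into another application.
        listFullPaths(root).forEach(System.out::println);
    }
}
```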

Demo Package - integrated with Apache Tika (for metadata extraction) and BagIt

The Demo package contains File Test Rules and File Import Rules

  • that depend on external libraries
  • that may not have applicability to multiple institutions
  • that demonstrate how to customize Core code to implement institution-specific business logic

DSpace Package - automation of DSpace ingestion tasks

Georgetown University has successfully automated a number of DSpace ingest tasks using the File Analyzer.

Overview Presentations

DSpace Tools Overview

OpenRepositories 2013 Presentation

WRLC (Washington Research Library Consortium) Forum Presentation - Sept 2013

Custom Packages

An institution can create its own module containing highly customized rules.

Create Your Own FileAnalyzer

import java.io.File;

public class GUFileAnalyzer extends DirectoryTable {

    public GUFileAnalyzer(File f, boolean modifyAllowed) {
        super(f, modifyAllowed);
        // Institution-specific window title and description
        this.title = "Georgetown University Libraries File Analyzer";
        this.message = "File Analyzer customized for use by the Georgetown University Libraries.";
        this.refreshTitle();
    }

    // Supply the registry of custom File Test Rules
    protected ActionRegistry getActionRegistry() {
        return new GUActionRegistry(this, modifyAllowed);
    }

    // Supply the registry of custom File Import Rules
    protected ImporterRegistry getImporterRegistry() {
        return new GUImporterRegistry(this);
    }

    // An optional first argument is passed to the constructor as the File to open
    public static void main(String[] args) {
        if (args.length > 0)
            new GUFileAnalyzer(new File(args[0]), false);
        else
            new GUFileAnalyzer(null, false);
    }

}

Create an ActionRegistry to register your custom File Test Rule classes (and to remove default ones)

public class GUActionRegistry extends DemoActionRegistry {

    private static final long serialVersionUID = 1L;

    public GUActionRegistry(FTDriver dt, boolean modifyAllowed) {
        super(dt, modifyAllowed);

        // Remove the default rules that are being replaced
        removeFT(IngestInventory.class);
        removeFT(IngestValidate.class);
        // Register the Georgetown-specific versions
        add(new GUIngestInventory(dt));
        add(new GUIngestValidate(dt));
    }

}

Create an ImporterRegistry to register your custom File Import Rule classes

public class GUImporterRegistry extends DemoImporterRegistry {

    private static final long serialVersionUID = 1L;

    public GUImporterRegistry(FTDriver dt) {
        super(dt);
        // Register a custom File Import Rule in addition to the inherited ones
        add(new OutputToBursar(dt));
    }

}