Alexander Dutton edited this page Sep 19, 2011 · 5 revisions

Software packaging break-out group

General Status

(As of 19th September 2011)

Packaging cannot go ahead as planned until various suboptimal parts of DataStage are fixed. The tasks below should address most of these issues.

Things to do

The following are to be done in Sprint 5.

Code audit

Carry out a proper line-by-line code audit.

Need to make sure we're using the issue tracker to track the code audit.

Possibly move to Jira for issue tracking (who should look into this?). The license is free, but we need to find somewhere visible to host it (issues.dataflow.ox.ac.uk?).

Ben to look through DataBank.

Alex and Anusha to look at DataStage.

External dependencies and licenses

DataBank

Everything required is installable from Debian packages. There are also some Python dependencies.

DataStage

Various JS libraries (see #1). Lots of these will be pulled out in time as we're going to move away from an entirely-JS-based site.

Layout of software packaging files

It would probably make sense to pull all the Debian packaging material into its own repository. We can then include whatever is needed to package Python dependencies that are neither in Debian nor part of DataFlow.

(This has been done as https://github.com/dataflow/debian-packaging)

BagIt (data packaging)

Make DataBank compliant with the latest BagIt spec and structure (checksum manifests).

We need to check the file structure of whatever DataStage sends to DataBank.
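As a rough illustration of what checking a bag's structure and checksum manifest could involve, here is a minimal Python sketch. It is not DataBank or DataStage code. The function names (`write_bag`, `check_bag`, `file_digest`), the MD5 default, and the `0.97` version string are all assumptions made for the example. The layout it checks (a `bagit.txt` declaration, a `data/` payload directory, and a `manifest-<algorithm>.txt` listing one checksum and path per line) follows the general BagIt structure.

```python
import hashlib
import tempfile
from pathlib import Path


def file_digest(path, algorithm="md5"):
    """Return the hex digest of a file, reading it in chunks."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_bag(bag_dir, files, algorithm="md5", version="0.97"):
    """Write a tiny bag (hypothetical helper; the version string is an assumption)."""
    bag = Path(bag_dir)
    (bag / "data").mkdir(parents=True, exist_ok=True)
    (bag / "bagit.txt").write_text(
        f"BagIt-Version: {version}\nTag-File-Character-Encoding: UTF-8\n",
        encoding="utf-8",
    )
    lines = []
    for name, content in sorted(files.items()):
        target = bag / "data" / name
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(content)
        # Manifest lines are "<checksum> <path relative to the bag root>".
        lines.append(f"{file_digest(target, algorithm)}  data/{name}")
    (bag / f"manifest-{algorithm}.txt").write_text(
        "\n".join(lines) + "\n", encoding="utf-8"
    )


def check_bag(bag_dir, algorithm="md5"):
    """Return a list of problems found in a bag; an empty list means none found."""
    bag = Path(bag_dir)
    problems = []
    if not (bag / "bagit.txt").is_file():
        problems.append("missing bagit.txt")
    if not (bag / "data").is_dir():
        problems.append("missing data/ directory")
    manifest = bag / f"manifest-{algorithm}.txt"
    if not manifest.is_file():
        problems.append(f"missing {manifest.name}")
        return problems

    listed = set()
    for line in manifest.read_text(encoding="utf-8").splitlines():
        parts = line.strip().split(None, 1)
        if not parts:
            continue  # skip blank lines
        if len(parts) != 2:
            problems.append(f"malformed manifest line: {line!r}")
            continue
        expected, rel_path = parts
        listed.add(rel_path)
        target = bag / rel_path
        if not target.is_file():
            problems.append(f"listed but missing: {rel_path}")
        elif file_digest(target, algorithm) != expected.lower():
            problems.append(f"checksum mismatch: {rel_path}")

    # Payload files that the manifest does not mention.
    if (bag / "data").is_dir():
        for path in sorted((bag / "data").rglob("*")):
            rel = path.relative_to(bag).as_posix()
            if path.is_file() and rel not in listed:
                problems.append(f"not in manifest: {rel}")
    return problems


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        write_bag(tmp, {"readme.txt": b"example payload"})
        print(check_bag(tmp))
```

A check along these lines could be run against a sample of what DataStage actually sends, to see how far it is from a valid bag before any changes are made on the DataBank side.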

DataStage should really be talking BagIt and SWORD (though we probably don't have time for this yet).