Resources_and_technology

Oliver Stueker edited this page May 5, 2015 · 2 revisions

Computational infrastructure

Current components

 * [[JUMBO-Converters]] (Java) for legacy2CML and other transformations
 * [[Lensfield2]] build system with dependencies
 * [[http://quixote.wikispot.org/Resources_and_technology|RESTful]] system for uploading and aggregation
 * [[http://quixote.wikispot.org/Resources_and_technology|Greenchain]] server on a virtual machine at Cambridge, which allows free upload (ca. 25 GB available)
 * [[Chempound]], a standalone database server for archiving the outputs of computational chemistry calculations.

Current development

 * [[ANTLR]] technology for parsing (QB and Weerapong developing).

Dependencies and underlying technology

 * [[http://www.oracle.com/technetwork/java/javase/downloads/index.html|Java 1.5 JDK]]
 * [[http://quixote.wikispot.org/Tutorials_and_problems#Maven|Maven]] for resolving Java dependencies. Check [[Maven]] for a basic tutorial and known problems.
 * [[http://mercurial.selenic.com/|Mercurial]] for interacting with software repositories.  Check [[Mercurial]] for a basic tutorial and known problems.
 * [[http://sourceforge.net/projects/avogadro/files/avogadro/1.0.1/|Avogadro]] for visualizing the parsed output. Check [[Avogadro]] for instructions about how to install it.

Optional (open-source) components

 * [[NWChem]] - a powerful open-source electronic structure code.
 * [[http://avogadro.openmolecules.net/|Avogadro]] - an open-source molecular modelling environment.

Parsing (and chunking) QC datafiles

Ways to create semantic compchem

 * Embed calls in the code. Current libraries include:
    1. [[FoX]] (FORTRAN95, Toby White).
    1. [[JUMBO]]/[[CMLXOM]] (Java, Peter Murray-Rust)
    1. [[http://www.codalogic.com/lmx/|LMX]] (move to another section if I am wrong)
 * Write scripts or programs that read files and convert into semantic form:
    1. [[JUMBO-Converters]]
    1. [[Openbabel]]
    1. [[http://cclib.sourceforge.net/wiki/index.php/Main_Page|cclib]]
 * Write high-level parsers:
    1. [[ANTLR]]

The current approach adopted by the Quixote Project is to use the JUMBO-Converters.
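
As a rough illustration of the second route (a script that reads a file and converts it into semantic form), here is a minimal Python sketch. It is not how JUMBO-Converters works internally. It assumes the input is plain XYZ-format text rather than a real QC logfile, and it emits only a CML `molecule` with an `atomArray`. The function name `xyz_to_cml`, the sample data `WATER_XYZ` and the water geometry are made up for illustration.

```python
import xml.etree.ElementTree as ET

CML_NS = "http://www.xml-cml.org/schema"  # CML namespace URI

# Illustrative input only: XYZ format, geometry values are placeholders.
WATER_XYZ = """3
water, illustrative geometry
O 0.000 0.000 0.117
H 0.000 0.757 -0.469
H 0.000 -0.757 -0.469
"""


def xyz_to_cml(xyz_text, molecule_id="m1"):
    """Convert XYZ-format text into a CML <molecule> string (hypothetical helper)."""
    lines = xyz_text.strip().splitlines()
    n_atoms = int(lines[0])             # first line: number of atoms
    atom_lines = lines[2:2 + n_atoms]   # second line is a free-text comment
    if len(atom_lines) != n_atoms:
        raise ValueError("XYZ block is shorter than its declared atom count")

    molecule = ET.Element("molecule", {"xmlns": CML_NS, "id": molecule_id})
    atom_array = ET.SubElement(molecule, "atomArray")
    for index, line in enumerate(atom_lines, start=1):
        symbol, x, y, z = line.split()[:4]
        # Coordinates are copied through as text so no precision is lost.
        ET.SubElement(atom_array, "atom", {
            "id": "a%d" % index,
            "elementType": symbol,
            "x3": x, "y3": y, "z3": z,
        })
    return ET.tostring(molecule, encoding="unicode")
```

Calling `xyz_to_cml(WATER_XYZ)` returns a single `molecule` element with three `atom` children. A real converter would also need to capture the attributes listed under "Attributes for Compchem", which this sketch ignores.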

Attributes for Compchem

**NB:** we are currently working out the prototype on the Prototype_data page.
Taken from the EtherPad at http://okfnpad.org/zcam2010
**Metadata:**
 * author email in logfile (e.g. through title) (is this normally in the logfile, or would it be good practice to add it?)
 * datacite DOI in logfile (pre-publication)
 * publication associated to the logfile (if published)
**Definition of the system:**
 * geometry/structure/nuclear coordinates (it's all the same thing, size n)
 * charge/spin/state (from my point of view, the spin and the state go in the provenance section, since they are constraints on the wavefunction)
**Provenance (type of calculation):**
 * level of the theory (RHF, B3LYP, MP2, AM1, etc.)
 * basis set (either with an agreed-upon name, as in the Basis Set Exchange (BSE), or custom basis sets)
 * additional details to the level of the theory (frozen core, etc.)
 * convergence parameters for SCF, CC iterations, etc.
 * initial guess for the iterative procedures (e.g., Hückel guess for SCF)
 * algorithm used for the iterative procedures
**Results of the calculation (observables):**
 * energy
 * energy gradient (size n)
 * energy hessian (size n^2^)
 * wave function (size n)
 * density matrix (size n^2^)
 * Mulliken charges (or some other type) (size n)
 * normal modes: Hessian eigenvalues and eigenvectors
**Performance of the calculation:**
 * wall-clock time
 * CPU time
 * number of cores it ran on
 * total RAM used
 * scratch disk space used
 * whether the code exited successfully or unsuccessfully
**CML examples:**
 * [[http://cml.svn.sourceforge.net/viewvc/cml/schema2/trunk/examples/complex/calcite1.xml?revision=161|GULP output]]
 * [[http://cml.svn.sourceforge.net/viewvc/cml/schema2/trunk/examples/complex/castep2.xml?revision=161|CASTEP output (shows use of properties)]]
 * [[http://cml.svn.sourceforge.net/viewvc/cml/schema2/trunk/examples/complex/castep3.xml?revision=161|similar CASTEP output]]
 * [[http://cml.svn.sourceforge.net/viewvc/cml/schema2/trunk/examples/complex/dlpoly.xml?revision=161|DLPOLY output]]
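
To make the grouping of attributes above more concrete, here is one possible way to hold them in a data structure. This is a sketch in Python dataclasses, not an agreed schema: every class and field name (`CompchemRecord`, `Provenance`, `size_problems`, and so on) is hypothetical, and only a subset of the listed attributes is included. It follows the list above in treating the gradient as size n and the Hessian as n x n.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Provenance:
    level_of_theory: str                                 # e.g. "RHF", "B3LYP", "MP2", "AM1"
    basis_set: str                                       # agreed-upon name or custom label
    details: List[str] = field(default_factory=list)     # e.g. ["frozen core"]
    initial_guess: Optional[str] = None                  # e.g. "Hückel"


@dataclass
class Results:
    energy: float
    gradient: List[float] = field(default_factory=list)        # size n
    hessian: List[List[float]] = field(default_factory=list)   # size n x n


@dataclass
class Performance:
    wall_clock_seconds: float
    cpu_seconds: float
    cores: int
    exited_successfully: bool


@dataclass
class CompchemRecord:
    author_email: Optional[str]        # metadata
    charge: int                        # definition of the system
    coordinates: List[List[float]]     # definition of the system
    provenance: Provenance
    results: Results
    performance: Performance

    def size_problems(self):
        """Return human-readable messages for inconsistent array sizes."""
        problems = []
        n = len(self.results.gradient)
        hessian = self.results.hessian
        # Only compare when both arrays are present.
        if n and hessian:
            if len(hessian) != n or any(len(row) != n for row in hessian):
                problems.append("Hessian is not %d x %d" % (n, n))
        return problems
```

A check like `size_problems()` is the kind of validation that becomes easy once the attributes are held in a structured form instead of free text in a logfile.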

Dictionaries

 * [[Dictionaries_examples]]
 * [[Creating_dictionaries]]

Formats for storing structured QC data

General formats

 * [[CML]]
 * [[http://www.hdfgroup.org/HDF5/|HDF5]]
 * [[http://abigrid.cineca.it/abigrid/the-docs-archive/q5cost/index_html|Q5cost]]

QC formats

 * CMLcomp: [[http://cml.sourceforge.net/schema/cmlComp/HTMLDOCS/cmlcomp.pdf|Old schema]], [[http://como.cheng.cam.ac.uk/preprints/c4e-Preprint-97.pdf|Newer preprint]]

QC codes and their datafiles' structure

Some of the codes we intend to support

 * [[DALTON]]
 * [[Gaussian]]
 * [[GAMESS]]
 * [[GAMESS-UK]]
 * [[MOLCAS]]
 * [[MOLDEN]]
 * [[MOLPRO]]
 * [[MPQC]]
 * [[MOPAC7]]
 * [[NWChem]]
 * [[ORCA]]
 * [[QChem]]
 * [[TURBOMOLE]]

A long list of quantum chemistry and solid state physics codes: http://en.wikipedia.org/wiki/List_of_quantum_chemistry_and_solid_state_physics_software

Please edit the pages that these point to. Add:

 * any home page(s)
 * examples of files
 * notes on any already existing parsers
 * contact person

Open datasets

 * [[Pablo_Echenique's_dataset]]
 * [[cclib_test_set]]
 * [[http://www.oci.uzh.ch/group.pages/baldridge/efiles/UZH_GC3_M06_O3ADD6.tgz|Mark Monroe's dataset]]
 * [[Three_NWChem_6.0_single-points]]

Uploading and downloading

We expect a variety of approaches to upload and download as people try out different mechanisms. Here we describe the simple REST approach.

Server

We have set up a server at http://greenchain.ch.cam.ac.uk/patents/quixote/ (the "patents" is historical; we may be able to rename it). This server is wide open in all respects, so please don't advertise it to spammers. We may tackle security later; if the server is hacked or damaged, we will simply close it and start again.

You can use a REST-based approach to:

 * upload files to server
 * download files/URLs from server
 * list files in a server "directory"
 * delete files/URLs on server

The URL structure reflects the local file hierarchy directly, so I shall often simply call the webpages "files" and "directories".
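
Because URLs mirror the file hierarchy, turning a relative path into a URL is mostly string handling. The Python sketch below shows one way to do it. The helper `url_for` is hypothetical, as are the directory and file names in the usage note. It assumes that directories are addressed with a trailing slash, which this page does not confirm.

```python
import posixpath
from urllib.parse import quote

BASE_URL = "http://greenchain.ch.cam.ac.uk/patents/quixote/"


def url_for(relative_path, base_url=BASE_URL):
    """Map a path relative to the server root onto its URL (hypothetical helper)."""
    # Normalise separators and collapse "." segments and duplicate slashes.
    clean = posixpath.normpath(relative_path.replace("\\", "/")).lstrip("/")
    if clean == ".":
        return base_url                      # the root "directory" itself
    if clean == ".." or clean.startswith("../"):
        raise ValueError("path escapes the server root: %r" % relative_path)
    # Assumption: a trailing separator marks a directory and is kept in the URL.
    is_directory = relative_path.endswith(("/", "\\"))
    url = base_url + quote(clean)            # quote() leaves "/" untouched
    return url + "/" if is_directory else url
```

For example, `url_for("nwchem/water.cml")` gives the base URL followed by `nwchem/water.cml`, and spaces in names are percent-encoded.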

The power of REST is:

 * its simplicity
 * the library support
 * the warm feeling you get from doing something really simple that works exactly as you want.

The current HTTP commands are:

 * PUT (or possibly POST) puts a file
 * GET gets a file
 * DELETE deletes a file

That's it.
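
To show how little is needed, here is a minimal sketch of the three commands using only the Python standard library. It is an illustration under assumptions, not the project's client code: the function names are made up, and it assumes the server accepts plain PUT, GET and DELETE with no authentication, as described above. The builder functions only prepare requests; nothing touches the network until `send` is called.

```python
import urllib.request

BASE_URL = "http://greenchain.ch.cam.ac.uk/patents/quixote/"


def put_request(url, payload):
    """Build a PUT request that uploads `payload` (bytes) to `url`."""
    return urllib.request.Request(url, data=payload, method="PUT")


def get_request(url):
    """Build a GET request for a file, or for a directory listing."""
    return urllib.request.Request(url, method="GET")


def delete_request(url):
    """Build a DELETE request for a file."""
    return urllib.request.Request(url, method="DELETE")


def send(request, timeout=30):
    """Send a prepared request and return (status code, body bytes).

    urlopen raises urllib.error.HTTPError for 4xx/5xx responses.
    """
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status, response.read()
```

An upload would then look like `send(put_request(BASE_URL + "example/water.cml", data))`, where `example/water.cml` is a hypothetical path and `data` holds the file contents as bytes. Whether the server expects PUT or POST for uploads is left open above, so the method may need changing.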

REST is supported in almost all languages (I don't know about FORTRAN and I wouldn't use it anyway for this). We shall use Java from: http://bitbucket.org/petermr/lensfieldjumbo. Check this out (if you haven't done so already). The code (if you need it) is in: . The routines we shall use are:

 * }
 * 
 * 

Download

Misc

 * [[http://harmful.cat-v.org/software/xml/soap/simple|The S stands for Simple]] Why PMR uses REST, not SOAP
 * [[http://rest.elkstein.org/|Learn REST: A Tutorial (by M. Elkstein)]]