General ReadMe

The primary objective of WDS-JniPMML-XLL is to provide model evaluators to Excel. In particular, access to the standard PMML evaluator is a starting point, both for direct use and for comparison. Later versions will include other model specs and implement other evaluators.

Please see documentation articles for a brief introduction on use.

Other evaluators aside, providing access to the standard PMML evaluator, jpmml, poses a technical challenge because it crosses programming languages. Under the hood, to create a fast, efficient Excel interface that insulates the user from the technical details, the usual AddIn languages (C#/VB/VBA) must take data from the workbook (in multiple columns and possibly multiple rows), transform it, call jpmml in Java, and then return the appropriate data (again with possibly multiple columns and rows) to the workbook.

This effort does not preclude writing a PMML evaluator (or rewriting jpmml, which may be a good idea) in another language. However, as other evaluators are added, a common Excel-based interface then provides a basis for comparison.
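The workbook-to-Java round trip described above can be pictured with a small, purely illustrative Java sketch. It assumes the worksheet block arrives as an `Object[][]` whose first row holds column names, and it uses a stand-in `Function` in place of a real evaluator. The names `RangeRoundTrip`, `toRecords`, and `evaluate` are hypothetical and are not part of WDS-JniPMML-XLL or jpmml.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch: none of these names come from WDS-JniPMML-XLL or jpmml.
final class RangeRoundTrip {

    // Turn a worksheet-style block (first row = column names) into one map per observation.
    static List<Map<String, Object>> toRecords(Object[][] range) {
        List<Map<String, Object>> records = new ArrayList<>();
        Object[] header = range[0];
        for (int r = 1; r < range.length; r++) {
            Map<String, Object> record = new LinkedHashMap<>();
            for (int c = 0; c < header.length; c++) {
                record.put(String.valueOf(header[c]), range[r][c]);
            }
            records.add(record);
        }
        return records;
    }

    // Evaluate every observation and lay the results out as rows x outputColumns.
    static Object[][] evaluate(Object[][] range,
                               Function<Map<String, Object>, Map<String, Object>> model,
                               String... outputColumns) {
        List<Map<String, Object>> records = toRecords(range);
        Object[][] out = new Object[records.size()][outputColumns.length];
        for (int r = 0; r < records.size(); r++) {
            Map<String, Object> result = model.apply(records.get(r));
            for (int c = 0; c < outputColumns.length; c++) {
                out[r][c] = result.get(outputColumns[c]);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Object[][] range = {
            {"x1", "x2"},
            {1.0, 2.0},
            {3.0, 4.0}
        };
        // Stand-in "model" that sums the two inputs; a real evaluator would go here.
        Function<Map<String, Object>, Map<String, Object>> model = row -> {
            double sum = ((Number) row.get("x1")).doubleValue()
                       + ((Number) row.get("x2")).doubleValue();
            Map<String, Object> result = new LinkedHashMap<>();
            result.put("score", sum);
            return result;
        };
        for (Object[] row : evaluate(range, model, "score")) {
            System.out.println(row[0]);
        }
    }
}
```

The point of the sketch is only the shape of the data flow: a rectangular block in, one record per row, and a rectangular block out that a worksheet function could return.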

Through this version, WDS-JniPMML-XLL provides:

  • A pair of Excel AddIns (XLLs) and VBA support for:
    • Evaluating PMML models
      • As an Excel function call
      • Using the de facto standard implementation, jpmml.evaluator
      • Using input data from an in-worksheet table
      • Uses XmlMap'd exportable ListObjects, but provides tools to facilitate this
      • Can evaluate one or multiple observations (rows) per call
      • Results returned as normal function outputs
      • With cacheable models for efficiency
    • Additional data wrangling tools for
      • Importing/Exporting HDF5 compound datasets
      • Importing/Exporting flat files
    • Additional VBA module handling
  • A Java wrapper of jpmml.evaluator
    • Callable from the XLL via jni
    • Testable as a standalone from the command line
    • Can also be called through the Excel AddIn using the JVM
    • Input and output data can be:
      • HDF5 compound datasets
      • Flat files
      • In memory (as when called through jni)
  • A launch script and examples are included
    • WDS-JniPMML-XLL-Launch.bat: a script for launching a new Excel instance, running the AddIns without installing
    • WDS-JniPMML-XLL-Test-Launch.bat: a script for running the AddIns and the example workbook WDS-JniPMML-XLL-Test.xlsm
    • test/data: A test set of the usual PMML cases
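To illustrate why cacheable models matter when a worksheet function is recalculated many times, here is a minimal Java sketch of a load-once cache. It is an assumption about how such a cache could look, not the project's implementation: `ModelCache`, `get`, and `loadCount` are hypothetical names, and the loader is a placeholder for whatever parses a model file.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical sketch: the class and method names are illustrative only.
final class ModelCache<M> {
    private final Map<String, M> models = new ConcurrentHashMap<>();
    private final Function<String, M> loader;
    private final AtomicInteger loads = new AtomicInteger();

    ModelCache(Function<String, M> loader) {
        this.loader = loader;
    }

    // Return the cached model for this path, loading it only on first use.
    M get(String path) {
        return models.computeIfAbsent(path, p -> {
            loads.incrementAndGet();
            return loader.apply(p);
        });
    }

    // Number of times the loader actually ran.
    int loadCount() {
        return loads.get();
    }

    public static void main(String[] args) {
        // Placeholder loader; a real one would parse the model file.
        ModelCache<String> cache = new ModelCache<>(path -> "parsed:" + path);
        cache.get("model.pmml");
        cache.get("model.pmml"); // second call is served from the cache
        System.out.println(cache.loadCount());
    }
}
```

Keying the cache by path keeps repeated calls from the same cell cheap, since parsing happens once per model rather than once per call.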

A Few Project Organization Notes

  • JniPMML-[AAA]: Language-specific libraries that directly relate to project objectives
    • Wherever possible, code naming conventions and structure are kept as similar as possible across languages.
  • WDS-[AAA]: Language-specific utility libraries that can be used independently of the JniPMML-[AAA] libs
  • lib: compiled final products which could be used directly
  • scripts: make scripts, for cross-language documentation building in particular

Prerequisites

  • 64-bit Excel
    • 32-bit support could possibly be added if compiling.
  • Access to the VBA project object model (if using the VBA module handlers)
  • HDF5 and HDFView
    • The HDF5 and HDFView libs are required if compiling, but the functionality could be removed.
    • The provided jars require that at least HDFView be on the path, or that its path be passed in as a command line option when starting Excel.
  • Java jdk-12
    • Required when using the latest HDFView install.
  • Compiling environment
    • The GitHub configurations are for Visual Studio Community Edition and IntelliJ Community Edition.
  • DocFx
    • DocFx is used for the documentation build, including DocFxDoclet on the JavaDoc side.

License Note

All code contributions and development from Wypasek Data Science, Inc. (WDataSci) published on its public GitHub site are released under the MIT license. Code from other sources is noted as such, and any assemblies, XLLs, and/or jars that may contain other software (for example, as Apache's Maven or ExcelDna may bundle from other sources) are released along with the commonly used IDE project and/or solution files used to generate them.