This repository hosts a set of libraries and a command-line tool for automating parts of the onboarding workflow. It gives the user the ability to apply rule-based mapping automation, ingest multiple source files, review loadsheet consistency, and validate entity definitions against a pre-defined ontology (i.e., Google's Digital Buildings Ontology).
This repo contains the following critical pieces:
- A well-defined ontology (`./ontology`)
- A command line interface for dynamically building and checking loadsheets (`./programs/cli.py`)
- Associated support libraries for the command line interface (and for future enhancement):
- An ontology validator
- A loadsheet validator
- A handler class that sits atop all the relevant classes
- A rules engine for applying regular expression pattern matching
- A representations class set for converting the loadsheet into ontology-usable objects
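As a rough sketch of how these layers relate (the class names here are hypothetical, not the repo's actual API):

```python
# Hypothetical layering sketch -- names are illustrative only.
class RulesEngine:
    """Applies regex-based mapping rules to raw points."""

class LoadsheetValidator:
    """Checks loadsheet rows for consistency and completeness."""

class OntologyValidator:
    """Validates entity definitions against the ontology."""

class Representations:
    """Converts loadsheet rows into ontology-usable objects."""

class Handler:
    """Sits atop the other classes and coordinates the workflow."""
    def __init__(self) -> None:
        self.rules = RulesEngine()
        self.loadsheet_validator = LoadsheetValidator()
        self.ontology_validator = OntologyValidator()
        self.representations = Representations()
```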
This repo requires the following libraries to be installed prior to use:
- pyyaml (for parsing YAML documents)
- pyfiglet (for fancy CLI name)
- openpyxl (for Excel read/write)
- pandas (for loadsheet backend)
- ruamel.yaml (for round-trip YAML processing)
If not already installed, you can install the libraries by running `requirements.py` from your command line:
>>> python requirements.py
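Alternatively, the same dependencies can be installed directly with pip (these are the published package names):
>>> pip install pyyaml pyfiglet openpyxl pandas ruamel.yaml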
Start the Command Line Interface (LoadBoy2000):
- Run the program:
>>> python cli.py
Loadsheet process:
- Prepare the loadsheet
  - Obtain a point list (in XLSX or CSV format)
  - Format the point list to adhere to the loadsheet template sheet
  - Run the RULE ENGINE over the data (see the sketch after this list)
  - Manually review the unmapped points
- Validate the loadsheet
- Match to existing DBO types
- Create new types, as needed
- Apply types to the loadsheet
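The RULE ENGINE step is regex-driven. A minimal sketch of the idea follows; the rule shape shown is illustrative, not the actual format of `google_rules.json`:

```python
import re

# Illustrative rules: a regex over the raw point name mapped to a standard
# field name. The real rules file format may differ.
RULES = [
    (re.compile(r"(zn|zone).*(temp|tmp)", re.IGNORECASE), "zone_air_temperature_sensor"),
    (re.compile(r"(sa|supply).*(temp|tmp)", re.IGNORECASE), "supply_air_temperature_sensor"),
]

def map_point(raw_name: str) -> str | None:
    """Return the first matching standard field name, or None if unmapped."""
    for pattern, standard_field in RULES:
        if pattern.search(raw_name):
            return standard_field
    return None  # unmapped: left for manual review

print(map_point("VAV-101.ZN-TMP"))  # -> zone_air_temperature_sensor
```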
Example workflow:

1. Import the ontology:
>>> import ontology '../ontology/yaml/resources'
If successful, you should get CLI confirmation.
Manual (optional) unit tests:
- Add a fake field to the field list ('bacon_sensor') -- should return error
- Add a fake field with valid subfields ('supply_sensor') -- will NOT return an error.
- Add a new type with a fake field -- should return error
- Add duplicate fields to fake type -- should return error
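The 'bacon_sensor' and 'supply_sensor' cases above differ because field names are validated subfield by subfield: 'supply' and 'sensor' are defined subfields, while 'bacon' is not. A minimal sketch of that check, with a hypothetical hard-coded subfield set standing in for the ontology's YAML definitions:

```python
# Hypothetical subset of valid subfields; the real set comes from the
# ontology's YAML resources (parsed with pyyaml/ruamel.yaml).
VALID_SUBFIELDS = {"supply", "return", "zone", "air", "temperature", "sensor", "setpoint"}

def subfields_are_valid(field_name: str) -> bool:
    """A field passes only if every '_'-separated subfield is defined."""
    return all(part in VALID_SUBFIELDS for part in field_name.split("_"))

print(subfields_are_valid("bacon_sensor"))   # False -- 'bacon' is not a subfield
print(subfields_are_valid("supply_sensor"))  # True  -- both subfields exist
```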
2. Clean the raw loadsheet:
>>> clean '../loadsheet/Loadsheet_ALC.xlsx'
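In spirit, the clean step reshapes a raw point export toward the loadsheet template. A hedged pandas sketch of that kind of operation (the actual `clean` command's behavior and output columns may differ):

```python
import pandas as pd  # pandas uses openpyxl for .xlsx files

# Read the raw export, drop fully blank rows, and write a tidied copy.
raw = pd.read_excel("../loadsheet/Loadsheet_ALC.xlsx")
cleaned = raw.dropna(how="all").reset_index(drop=True)
cleaned.to_excel("../loadsheet/Loadsheet_ALC_Clean.xlsx", index=False)
```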
3. Import the cleaned loadsheet:
>>> import loadsheet '../loadsheet/Loadsheet_ALC.xlsx'
If successful, you should get CLI confirmation.
4. Normalize the loadsheet (AKA apply the ruleset):
>>> normalize '../resources/rules/google_rules.json'
If successful, you should get CLI confirmation.
5. Export to a new loadsheet for review:
>>> export excel '../loadsheet/Loadsheet_ALC_Normalized.xlsx'
Rules should have been applied. You should see a new file with the normalized columns (e.g., `required`, `assetName`, and `standardFieldName`) filled in.
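To gauge how much manual review remains, the exported file can also be spot-checked outside the CLI, e.g. with pandas (assuming the `standardFieldName` column described above):

```python
import pandas as pd

df = pd.read_excel("../loadsheet/Loadsheet_ALC_Normalized.xlsx")
# Rows the rule engine could not map are left without a standardFieldName.
unmapped = df[df["standardFieldName"].isna()]
print(f"{len(unmapped)} of {len(df)} points still need manual mapping")
```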
6. Perform a manual review and repeat steps 3, 4, and 5 as necessary.
7. Import and validate the finished loadsheet:
>>> import loadsheet '../loadsheet/Loadsheet_ALC_Final.xlsx'
>>> validate
Validation will fail for common errors:
- duplicate `standardFieldName` and `fullAssetPath` combinations
- an invalid `standardFieldName` (i.e., not defined in the referenced ontology, or misspelled)
- missing BACnet info (e.g., missing `objectId`)
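The duplicate check, for example, amounts to looking for repeated (standardFieldName, fullAssetPath) pairs, which you can reproduce with pandas before running `validate`:

```python
import pandas as pd

df = pd.read_excel("../loadsheet/Loadsheet_ALC_Final.xlsx")
# Each (standardFieldName, fullAssetPath) pair must be unique.
dupes = df[df.duplicated(subset=["standardFieldName", "fullAssetPath"], keep=False)]
if not dupes.empty:
    print(dupes[["fullAssetPath", "standardFieldName"]])
```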
8. When no validation errors are present, assets in the loadsheet can be matched to DBO entity types:
>>> match
9. Perform a review of type matches and assign each to a valid canonical type:
>>> review generalTypes
>>> review generalTypes VAV
>>> review generalTypes VAV 1
or
>>> review matches
10. Apply the matched types. Either review all matches made, using
>>> apply all
or auto-apply exact matches and review only the inexact ones, using
>>> apply close
11. Convert the normalized loadsheet to an ABEL spreadsheet:
>>> convert abel ./path/to/building/payload.csv
The following is a list of issues that need to be addressed before widespread use:
- Add rigorous typing to all methods
- Make the necessary fields in `handler.py` and `representations.py` private
- Increase the match success rate of the rules JSON (and potentially provide tooling or templates for users to create their own ruleset)
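For the first item, "rigorous typing" means annotations along these lines (a hypothetical function, not code from the repo):

```python
def map_field(raw_name: str, rules: dict[str, str]) -> str | None:
    """Hypothetical signature showing the intended level of annotation."""
    ...
```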