Tagger extends Josch (see the Josch section below) by integrating Tagged Unions as well as further schema extraction approaches. To use Tagger in Josch, first complete the Josch setup described below, then follow the steps in this section.
First, open `tools/Tagger/Tagger-main/default-config.json` and change the value of `out-dir` to any directory you have permission to write to.
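Before editing the config, it can help to confirm that the directory you intend to use for `out-dir` is actually writable. The following is a minimal sketch, assuming a POSIX shell; the function name `check_out_dir` and the example path are illustrative and not part of Tagger.

```sh
# Hypothetical helper: succeeds only if the given directory exists
# and the current user is allowed to write to it.
check_out_dir() {
    dir=$1
    if [ -d "$dir" ] && [ -w "$dir" ]; then
        echo "ok: $dir is writable"
    else
        echo "error: $dir is missing or not writable" >&2
        return 1
    fi
}

# Example call (the path is a placeholder):
# check_out_dir "$HOME/tagger-out"
```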
Then start Josch as described in the Josch section below.
Afterwards, navigate to the `Tool Settings` pane of Josch, click on `Select tools folder`, and use the directory navigator to point to the `tools` directory in Josch.
If you want to use the approach of Spoth et al., you need a valid Java 8 installation.
Select its folder accordingly and make sure that the path to Java 8 ends with `/bin/java`.
If you want to use the approach of Frozza et al., you have to provide a path to a working version of this tool. We provide one here, but you may use your own if you wish.
Follow the instructions in the README of this project to build the tool.
In Josch, provide a valid path and make sure that it ends with `approaches/frozza`.
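Both paths are only accepted if they end with the expected suffix, so it can be worth checking them before entering them in Josch. This is a minimal sketch for a POSIX shell; the helper `path_ends_with` and both example paths are placeholders chosen for illustration, not locations defined by Josch or Tagger.

```sh
# Hypothetical helper: succeeds if the path in $1 ends with the suffix in $2.
path_ends_with() {
    case $1 in
        *"$2") return 0 ;;
        *)     return 1 ;;
    esac
}

# Placeholder paths; replace them with your own installation locations.
java8_path="/usr/lib/jvm/java-8-openjdk/bin/java"
frozza_path="/path/to/frozza-tool/approaches/frozza"

path_ends_with "$java8_path" "/bin/java" \
    || echo "Java 8 path must end with /bin/java" >&2
path_ends_with "$frozza_path" "approaches/frozza" \
    || echo "Frozza path must end with approaches/frozza" >&2
```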
Afterwards, connect to the database and use Tagged Union Extraction by navigating to the `Tagger` pane.
Implemented by Valentin Gittinger and Stefan Klessinger.
This work was published as a demo at EDBT 2023. To cite this work, please use the following BibTeX entry:
@inproceedings{DBLP:conf/edbt/KlessingerFGKSS23,
author = {Stefan Klessinger and
Michael Fruth and
Valentin Gittinger and
Meike Klettke and
Uta St{\"{o}}rl and
Stefanie Scherzinger},
editor = {Julia Stoyanovich and
Jens Teubner and
Nikos Mamoulis and
Evaggelia Pitoura and
Jan M{\"{u}}hlig and
Katja Hose and
Sourav S. Bhowmick and
Matteo Lissandrini},
title = {Tagger: {A} Tool for the Discovery of Tagged Unions in {JSON} Schema
Extraction},
booktitle = {Proceedings 26th International Conference on Extending Database Technology,
{EDBT} 2023, Ioannina, Greece, March 28-31, 2023},
pages = {827--830},
publisher = {OpenProceedings.org},
year = {2023},
url = {https://doi.org/10.48786/edbt.2023.75},
doi = {10.48786/EDBT.2023.75},
timestamp = {Sat, 29 Apr 2023 13:06:22 +0200},
biburl = {https://dblp.org/rec/conf/edbt/KlessingerFGKSS23.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}
Josch is a cockpit application that combines schema extraction and JSON Schema containment checking to exploit their interactions. It can be used with schema-less NoSQL document stores, but is currently geared towards MongoDB. Josch does not implement schema extraction or JSON Schema containment checking itself; instead, it uses third-party tools for these tasks and lets the user switch between them easily in the user interface.
Schema Extraction: Josch analyzes a MongoDB collection and extracts a JSON Schema or a MongoDB validator that describes the structure of the stored data.
JSON Schema Containment: Josch compares two JSON Schema documents to check whether the language defined by one schema is a subset of, a superset of, equivalent to, or incomparable with the language defined by the other JSON Schema document.
Josch uses Maven to preserve a modular architecture, which makes it easy to extend Josch with new tools for schema extraction or JSON Schema containment checking. Support for other document stores can be added as well.
- Extract a JSON Schema using different extraction tools.
  - Use relative or absolute sampling to extract.
  - Switch between the extraction tools within the application.
- Compare two JSON Schemas semantically using different containment tools.
  - Switch between the tools within the application.
- Compare two JSON Schemas syntactically and highlight the differences.
- Store and browse historic schema versions.
  - Add a personal note when storing.
  - Filter JSON Schemas by the date of storing.
- Validate all or individual documents against a JSON Schema.
  - Find the documents that do not validate.
  - Get the number of valid documents.
  - Find out why a single document fails validation.
- Load, modify and create a JSON Schema.
- Show the available databases and collections of the database server.
- Show random document samples for a given collection.
  - Show all documents if the collection is not too big.
- Insert a document into the collection.
- Extract a MongoDB validator.
- Generate a MongoDB validator from a given JSON Schema.
- Register a new MongoDB validator at the database with specific validation action and level.
- Validate all or individual documents against a MongoDB validator.
  - Find the documents that do not validate.
  - Get the number of valid documents.
Josch is implemented in Java, but the third-party tools used by Josch require other compilers and runtimes. These have to be installed and accessible.
Some aspects of Josch require environment variables (short: variables, EV) to be set. How to set them depends on your operating system (OS); please refer to its manual to find out how to set and modify them.
Whenever a command is given, please execute it in your OS's shell/terminal. The shell is the command line interface of your operating system. Please note that the shell has to be restarted after each environment variable is set.
Josch uses Java 14 or higher. You can use OpenJDK or Oracle JDK.
To use the JSON Schema Containment tool jsonsubschema, the following needs to be installed:
The schema containment checking tool jsonsubschema requires Python 3.8 or higher.
Pipenv creates and manages virtual environments for Python projects. There are two ways to install it: isolated or pragmatic. For further information, see the Pipenv documentation.
We generally suggest performing an isolated installation, which includes adding Pipenv to the `PATH` variable.
Josch requires that the location of Pipenv is part of your `PATH` variable, so please ensure that Pipenv is accessible from your shell by running the command `pipenv --version`.
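The check described above can be scripted so that a missing tool is reported clearly. This is a minimal sketch for a POSIX shell; `require_on_path` is a hypothetical helper name, and it relies only on the standard `command -v` lookup.

```sh
# Hypothetical helper: reports whether a command can be found on PATH.
require_on_path() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1"
    else
        echo "missing: $1 is not on PATH" >&2
        return 1
    fi
}

# Example call: print the Pipenv version only if Pipenv is reachable.
# require_on_path pipenv && pipenv --version
```

The same helper could be reused for the other tools this document relies on, such as `java`, `mvn` or `yarn`.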
Open the cloned directory of this repository, navigate (via your shell) to `tools\JsonSubSchema`, and execute the command `pipenv install` to install all required Python modules.
You can also move this directory to another place, but please make sure to specify the correct path in Josch (settings can be applied in the user interface).
To use the JSON Schema Containment tool is-json-schema-subset, the following needs to be installed:
Open the cloned directory of this repository, navigate (via your shell) to `tools/IsJsonSchemaSubset`, and execute the command `yarn install` to install all required Node modules there.
You can also move this directory to another place, but please make sure to specify the correct path in Josch (settings can be applied in the user interface).
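The dependency installation for both containment tools can be combined into one step. The sketch below assumes a POSIX shell with forward-slash paths and that neither tool directory has been moved; the function name `install_containment_tools` is illustrative only.

```sh
# Hypothetical helper: installs the dependencies of both containment tools.
# $1 is the root of the cloned repository.
install_containment_tools() {
    repo_root=$1
    # Subshells keep the caller's working directory unchanged.
    ( cd "$repo_root/tools/JsonSubSchema" && pipenv install ) || return 1
    ( cd "$repo_root/tools/IsJsonSchemaSubset" && yarn install ) || return 1
}

# Example call (the path is a placeholder):
# install_containment_tools "$HOME/src/josch-repo"
```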
Hackolade is a commercial tool that extracts a JSON Schema and a MongoDB validator from a MongoDB database server. To use Hackolade with Josch, a Professional Edition Licence is required. Before starting Josch and using Hackolade, it has to be installed and set up using the following steps:
- Start the application and click on `common tasks`. Then click on `Reverse-Engineer target`.
- Choose `MongoDB` with the `target version` that matches your database. Now click the `Create` button and finally the `Add` button.
- Configure the connection to your database and enter a name of your choice. Note that you have to remember the name and pass it to Josch later on. Confirm the settings by hitting the `save` button.
- After saving the connection, your database should show up in the list. Hackolade isn't required anymore and can be closed.
- Add the installation path of Hackolade to a `hackolade` environment variable and add it to the `PATH` variable as well.
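On a Unix-like system, the last step could look like the following sketch. The installation directory `/opt/hackolade` is a placeholder, and the commands only affect the current shell session; to make the setting permanent, add them to your shell's startup file or use your OS's own mechanism for environment variables.

```sh
# Placeholder: replace with the directory where Hackolade is installed.
hackolade_dir="/opt/hackolade"

# Expose the installation path through the `hackolade` variable ...
export hackolade="$hackolade_dir"
# ... and append it to PATH as well.
export PATH="$PATH:$hackolade"
```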
As this is a Java library, it is contained in Josch.
Josch is developed as a multi-module Maven Project. You can either use your Java IDE to execute it or you can use Maven directly.
- Navigate to the `josch` directory of the repository via your shell. It holds a `pom.xml` and the submodules.
- Execute the command `mvn clean install`.
- Navigate to the subdirectory `josch.presentation\josch.presentation.gui\josch.presentation.gui.controller\target`.
- Execute the command `java -jar josch-1.0-jar-with-dependencies.jar`. For this command to work, the Java application has to be on your `PATH` variable.
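Taken together, the steps above can be sketched as a single shell function. This assumes a POSIX shell, so the subdirectory is written with forward slashes, and that `mvn` and `java` are on your `PATH`; the function name `build_and_run_josch` is illustrative only.

```sh
# Hypothetical helper: builds Josch with Maven and starts the packaged JAR.
# $1 is the root of the cloned repository. The body runs in a subshell,
# so the caller's working directory stays unchanged.
build_and_run_josch() (
    cd "$1/josch" || exit 1
    mvn clean install || exit 1
    cd josch.presentation/josch.presentation.gui/josch.presentation.gui.controller/target || exit 1
    java -jar josch-1.0-jar-with-dependencies.jar
)

# Example call (the path is a placeholder):
# build_and_run_josch "$HOME/src/josch-repo"
```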
Import Josch as a `Maven Project` via your IDE and build the project accordingly. The main class and method to launch the application is `josch.presentation.gui.controller.App.main()`.
- Easy integration of new NoSQL document stores that are based on JSON data.
- Easy integration of new schema extraction and containment tools.
- Different color themes that can be extended upon.
As Josch is a multi-module Maven project, it can be extended easily. Extensions can be made in any given layer. The implementation of extensions is similar for all layers and extensions, except for the presentation layer, because it has no layer above.
In order to extend Josch, you need to create a new Maven submodule in the respective component (`josch.services.<COMPONENT>`). To make your implementation fit into Josch, use the interfaces and abstract classes in the corresponding layer (`josch.<LAYER>.interfaces`). Each submodule is required to have a `module-info.java` in order to avoid transitive dependencies and needs to be registered in the parent `pom.xml`. Examples can be found in every leaf module, e.g. `josch.services.comparison.jsonsubschema`.
After you have implemented the new module, you have to register it within Josch:
- To make the module selectable in the user interface, register the module as a new value in the enum of the respective component. The enums can be found at `josch.model.enums`. E.g. to register a new module for checking containment, add it to `josch.model.EContainmentTools.java`.
- To make the module work in Josch internally, register it in the respective factory. Each layer has its own factory (`josch.<LAYER>.factory`). In order to register it, add it to the respective `switch` statement.
Implemented by @daubersc