
Working with JSON-LD

Pedro Szekely edited this page May 20, 2015 · 22 revisions

Converting JSON-LD to One Object Per Line

  1. Download and install jq from http://stedolan.github.io/jq/
  2. To convert a JSON array to one JSON object per line, run the following from the command line:
   jq ".[]" -c <filepath> > <outputfilepath>
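
If jq is not available, the same conversion can be sketched in Python. This is a minimal sketch, not part of the original tooling: the function name array_to_json_lines is hypothetical, and it assumes the whole array fits in memory.

```python
import json


def array_to_json_lines(array_text):
    """Return one compact JSON document per line for a JSON array given as text."""
    items = json.loads(array_text)
    if not isinstance(items, list):
        raise ValueError("expected a top-level JSON array")
    # Compact separators approximate the output of jq's -c flag;
    # ensure_ascii=False keeps non-ASCII characters unescaped.
    return "\n".join(
        json.dumps(item, separators=(",", ":"), ensure_ascii=False)
        for item in items
    )
```

For example, array_to_json_lines('[{"a": 1}, {"b": [1, 2]}]') yields two lines, {"a":1} and {"b":[1,2]}.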

Merging JSON-LD Files

When you model multiple sources you produce multiple JSON-LD files, which you may then want to merge into a single file. There are two cases:

  • Reducing: combines top-level JSON-LD objects that share a URI. The reducer first combines objects at the top level and then proceeds recursively to combine objects at all levels of the tree.
  • Joining: Need to provide an example.

Reducing JSON-LD
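
The idea of reducing by URI can be illustrated with a small Python sketch. This is not the actual reducer; it is a simplified illustration under stated assumptions: objects are identified by the standard JSON-LD "@id" key (change ID_KEY if your files use an alias such as "uri"), conflicting values are collected into a list, and all function names are hypothetical.

```python
ID_KEY = "@id"  # assumption: the key that holds each object's URI


def as_list(value):
    """Wrap a single value in a list; return lists unchanged."""
    return value if isinstance(value, list) else [value]


def same_object(a, b):
    """True when both values are objects that carry the same URI."""
    return (isinstance(a, dict) and isinstance(b, dict)
            and ID_KEY in a and a.get(ID_KEY) == b.get(ID_KEY))


def merge_values(left, right):
    """Combine two property values, merging nested objects that share a URI."""
    merged = []
    for item in as_list(left) + as_list(right):
        for index, existing in enumerate(merged):
            if same_object(existing, item):
                merged[index] = merge_objects(existing, item)  # recurse
                break
            if existing == item:
                break  # exact duplicate, keep one copy
        else:
            merged.append(item)
    # Simplification: a single surviving value is returned without a list.
    return merged[0] if len(merged) == 1 else merged


def merge_objects(a, b):
    """Merge two objects with the same URI, property by property."""
    result = dict(a)
    for key, value in b.items():
        result[key] = merge_values(result[key], value) if key in result else value
    return result


def reduce_by_uri(objects):
    """Combine top-level objects by URI, then recursively at lower levels."""
    by_uri = {}
    for obj in objects:
        uri = obj[ID_KEY]
        by_uri[uri] = merge_objects(by_uri[uri], obj) if uri in by_uri else obj
    return list(by_uri.values())
```

Given two top-level objects with the same URI whose nested seller objects also share a URI, reduce_by_uri returns a single object whose seller carries the properties of both inputs.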

Joining JSON-LD

To join JSON-LD files, set up Hadoop and Hive on your machine and then run a script to join your files.

Hadoop and Hive Setup

Hadoop 2.6.0: http://www.apache.org/dyn/closer.cgi/hadoop/common/hadoop-2.6.0/hadoop-2.6.0.tar.gz
Hive: http://www.apache.org/dyn/closer.cgi/hive/

Loading JSON-LD into Elasticsearch

  1. git clone https://github.com/usc-isi-i2/dig-elasticsearch.git
  2. Change directory to types/webpage/scripts
  3. Run python loadDataElasticSearch.py -h to display the help for the script, shown below:
   usage: loadDataElasticSearch.py [-h] [-hostname HOSTNAME] [-port PORT]
                                   [-mappingFilePath MAPPINGFILEPATH] dataFileType
                                   filepath indexname doctype

   positional arguments:
      filepath            json file to be loaded in ElasticSearch
      indexname           desired name of the index in ElasticSearch
      doctype             type of the document to be indexed
      dataFileType        Specify '0' if every line in the data file is
                          different json object or '1' otherwise

   optional arguments:
      -h, --help                       show this help message and exit
      -hostname HOSTNAME               Elastic Search Server hostname, defaults to 'localhost'
      -port PORT                       Elastic Search Server port,defaults to 9200
      -mappingFilePath MAPPINGFILEPATH mapping/setting file for the index

  4. Execute:

python loadDataElasticSearch.py <filepath> <index-name> WebPage

If you don't have Elasticsearch, download it from https://www.elastic.co/products/elasticsearch and follow the installation instructions.
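
For readers who want to see what loading one-object-per-line data can look like at the HTTP level, the sketch below builds a request body for the Elasticsearch _bulk endpoint. It is an illustration only and does not claim to reproduce what loadDataElasticSearch.py does internally. The function name build_bulk_body is hypothetical, and the "_type" field assumes an older Elasticsearch release that still supports document types.

```python
import json


def build_bulk_body(json_lines, index_name, doc_type):
    """Build an Elasticsearch _bulk request body from one-JSON-object-per-line text."""
    # Every document is preceded by an action line naming the target index and type.
    action = json.dumps({"index": {"_index": index_name, "_type": doc_type}})
    parts = []
    for line in json_lines.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        json.loads(line)  # validate; raises ValueError on malformed JSON
        parts.append(action)
        parts.append(line)
    # The bulk format requires a trailing newline after the last line.
    return "\n".join(parts) + "\n"
```

The resulting text could then be sent with an HTTP POST to the _bulk endpoint of the server (by default localhost, port 9200, as in the help output above).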