On the impact of sameAs on schema matching

This repository contains the Python scripts and data needed to replicate the experiments of our paper "On the impact of sameAs on schema matching" by Joe Raad, Erman Acar, and Stefan Schlobach.

With these experiments, we aim to answer the following two research questions:

Q1. Does the inclusion of instance-level interlinks enhance instance-based schema alignments? (We test this both with and without considering the transitive closure of the class subsumption relation.)

Q2. Is there a correlation between the quality of the instance-level interlinks and the quality of the resulting schema alignments?
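To make Q1 concrete, here is a minimal sketch of how instance-level interlinks can change an instance-based similarity between two classes. It assumes a Jaccard overlap of class extensions, which is only one possible measure and not necessarily the one used in the paper. The names `jaccard`, `canonicalise`, and `representative` are hypothetical and do not come from this repository's scripts.

```python
def jaccard(a, b):
    """Jaccard overlap of two instance collections (0.0 if both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def canonicalise(instances, representative):
    """Replace each instance by the representative of its sameAs
    equivalence class; instances without interlinks are kept as-is."""
    return {representative.get(i, i) for i in instances}


# Toy data (made up for illustration): two classes from different datasets.
class_a = {"a1", "a2"}
class_b = {"b1", "b2"}

# Hypothetical interlink: a1 owl:sameAs b1, with a1 as the representative.
representative = {"a1": "a1", "b1": "a1"}

# Without interlinks the extensions are disjoint.
without_links = jaccard(class_a, class_b)

# With interlinks, a1 and b1 collapse into one identity.
with_links = jaccard(
    canonicalise(class_a, representative),
    canonicalise(class_b, representative),
)
```

In this toy case the overlap rises from zero to one shared instance out of three, which is the kind of effect Q1 asks about at scale.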

A number of external resources are necessary for replicating these experiments:

Download the LOD-a-lot dataset.

This data set contains 28.3 billion triples collected from the 2015 LOD Laundromat crawl of over 650K data documents from the Web. It is exposed in an HDT file that is 524GB in size (including its additional index), and is publicly accessible via an LDF interface.

Download the Equivalence Classes.

This data set of equivalence classes results from the closure of all 558 million owl:sameAs links in the sameAs.cc data set. It also contains two additional sets of equivalence classes, obtained (a) after discarding all owl:sameAs links with an error degree > 0.99, and (b) after discarding all owl:sameAs links with an error degree > 0.4.
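For intuition, the closure of owl:sameAs links into equivalence classes, with an optional error-degree cutoff, can be sketched with a union-find structure. This is an illustrative sketch on toy data, not the procedure used to build the downloadable data set. The function name `equivalence_classes`, the `(left, right, error)` tuple layout, and the `max_error` parameter are assumptions made for this example.

```python
from collections import defaultdict


def equivalence_classes(links, max_error=None):
    """Group terms connected by sameAs links into equivalence classes.

    links: iterable of (left, right, error_degree) tuples (assumed layout).
    max_error: if given, links with error_degree > max_error are discarded.
    Only terms that occur in a retained link appear in the result.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        root = x
        while parent[root] != root:
            root = parent[root]
        # Path compression: point every node on the path at the root.
        while parent[x] != root:
            parent[x], x = root, parent[x]
        return root

    for left, right, error in links:
        if max_error is not None and error > max_error:
            continue  # discard links above the error threshold
        parent[find(left)] = find(right)

    groups = defaultdict(set)
    for term in list(parent):
        groups[find(term)].add(term)
    return {frozenset(group) for group in groups.values()}


# Toy links with made-up error degrees.
links = [("a", "b", 0.1), ("b", "c", 0.5), ("d", "e", 0.995)]
all_classes = equivalence_classes(links)
classes_099 = equivalence_classes(links, max_error=0.99)
classes_04 = equivalence_classes(links, max_error=0.4)
```

Lowering the threshold removes links, so classes can only split or shrink: here the class {a, b, c} survives the 0.99 cutoff but is reduced to {a, b} at 0.4.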

Install the HDT Python library.

This library makes it easy to read and query HDT documents in Python.
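As a rough illustration of the kind of query these experiments rely on, the sketch below collects the instances of a class from an HDT document. It assumes the document object exposes `search_triples(subject, predicate, object)` returning an iterator of triples together with a cardinality, with empty strings acting as wildcards, which is how the pyHDT `HDTDocument` class behaves to our knowledge. The helper `instances_of` and the file path in the comments are hypothetical.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def instances_of(document, class_iri):
    """Return the set of subjects that are typed with class_iri.

    `document` is expected to behave like hdt.HDTDocument: its
    search_triples method returns (iterator_of_triples, cardinality).
    """
    triples, _cardinality = document.search_triples("", RDF_TYPE, class_iri)
    return {subject for subject, _predicate, _object in triples}


# Assumed usage with the real library (path is a placeholder):
#
#   from hdt import HDTDocument
#   document = HDTDocument("path/to/LOD_a_lot.hdt")
#   people = instances_of(document, "http://xmlns.com/foaf/0.1/Person")
```

Passing the document as a parameter keeps the helper independent of where the HDT file lives, which matters given the size of the LOD-a-lot file.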
