Scala tool to extract data from local BabelNet indices
Currently, the tool does the following:
- reads nouns from nouns_no_header.csv
- queries local BabelNet with these nouns (the expected results are also nouns in English)
- stores the BabelNet ID, candidate FrameNet ID, BabelNet definition, candidate FrameNet definition, and all the edges for each synset in bn_entries.csv and bn_edges.csv
And that's basically it so far.
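
The read, query, and store flow described above can be sketched roughly as follows. This is a minimal illustration, not the code in Extract.scala: the `Entry` and `Edge` row shapes, the column order, and the `lookup` function are assumptions made for the example, and `lookup` is only a placeholder for the local BabelNet query rather than a BabelNet API call.

```scala
import java.io.{File, PrintWriter}
import scala.io.Source

object ExtractSketch {
  // Hypothetical row shapes; the real column layout of the tool's CSV files may differ.
  final case class Entry(bnId: String, fnId: String, bnDefinition: String, fnDefinition: String)
  final case class Edge(sourceId: String, relation: String, targetId: String)

  // Quote a CSV field and double embedded quotes, since definitions often contain commas.
  def csvField(s: String): String = "\"" + s.replace("\"", "\"\"") + "\""

  // Read one item per line, skipping blank lines (the input file has no header row).
  def readNouns(file: File): List[String] = {
    val src = Source.fromFile(file, "UTF-8")
    try src.getLines().map(_.trim).filter(_.nonEmpty).toList
    finally src.close()
  }

  // Write the given lines to a file, one per row.
  def writeLines(file: File, lines: Seq[String]): Unit = {
    val out = new PrintWriter(file, "UTF-8")
    try lines.foreach(line => out.println(line))
    finally out.close()
  }

  // `lookup` stands in for the local BabelNet query (placeholder, not the BabelNet API).
  def run(nounsFile: File, entriesFile: File, edgesFile: File,
          lookup: String => Seq[(Entry, Seq[Edge])]): Unit = {
    val results = readNouns(nounsFile).flatMap(lookup)
    writeLines(entriesFile, results.map { case (e, _) =>
      Seq(e.bnId, e.fnId, e.bnDefinition, e.fnDefinition).map(csvField).mkString(",")
    })
    writeLines(edgesFile, results.flatMap(_._2).map { ed =>
      Seq(ed.sourceId, ed.relation, ed.targetId).map(csvField).mkString(",")
    })
  }
}
```

Passing the query in as a function keeps the file handling testable without local indices; in the actual tool the lookup would be backed by the BabelNet API.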
- Clone the repository:
  git clone https://github.com/slowwavesleep/BabelNetExtractor.git
- Install IntelliJ IDEA (Community version will suffice) if you don't have it
- Open the cloned repository with IDEA (File -> Open)
- Add BabelNet API to the libraries:
  - File -> Project Structure -> Libraries -> + -> Java
  - Navigate to BabelNet API and select the corresponding .jar file (such as /BabelNet-API-3.6.1/babelnet-api-3.6.1.jar)
  - Add the lib directory the same way (such as /BabelNet-API-3.6.1/lib)
  - The list of libraries should then show both the .jar file and the lib directory
- Copy the config directory from BabelNet API to the project
- Point to the local indices in config/babelnet.var.properties (for example, babelnet.dir=/home/user/BabelNet/BabelNet-3.6) and comment out the rest of the parameters in this file (unless you know what you're doing)
- Run src/main/scala/Extract.scala using IDEA (right click on it, then Run 'Extract')
- After a few minutes bn_entries.csv and bn_edges.csv should appear in src/main/resources
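
If the output files do not appear, one thing worth checking is whether babelnet.dir actually points to an existing directory. The sketch below is a hypothetical helper, not part of this repository: it assumes config/babelnet.var.properties can be read as an ordinary Java properties file and that it is run from the project root.

```scala
import java.io.{File, FileInputStream}
import java.util.Properties

object ConfigCheck {
  // Returns the configured index directory if `babelnet.dir` is set
  // and points to an existing directory; commented-out lines are ignored.
  def indexDir(propsFile: File): Option[File] = {
    val props = new Properties()
    val in = new FileInputStream(propsFile)
    try props.load(in) finally in.close()
    Option(props.getProperty("babelnet.dir"))
      .map(_.trim)
      .filter(_.nonEmpty)
      .map(path => new File(path))
      .filter(_.isDirectory)
  }

  def main(args: Array[String]): Unit = {
    // Path relative to the project root, as in the setup steps above.
    val propsFile = new File("config/babelnet.var.properties")
    if (!propsFile.isFile) println(s"Missing ${propsFile.getPath}")
    else indexDir(propsFile) match {
      case Some(dir) => println(s"babelnet.dir -> ${dir.getPath}")
      case None      => println("babelnet.dir is unset or does not point to a directory")
    }
  }
}
```

This only confirms that the directory exists; it does not verify that the directory contains valid BabelNet indices.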