R wrapper to the DBpedia Spotlight API. All official documentation is online.
"It is a tool for automatically annotating mentions of DBpedia resources in text, providing a solution for linking unstructured information sources to the Linked Open Data cloud through DBpedia." - DBpedia Spotlight
Global options are set with spot_set_opts. If you plan to use DBpedia Spotlight extensively, you are encouraged to deploy spotlight-docker on your own server and then point to it with the base_url argument of spot_set_opts.
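As a minimal sketch of pointing the package at a self-hosted instance: the only names taken from this README are spot_set_opts and its base_url argument. The URL below is a placeholder, and the exact host, port, and path your deployment exposes (and that the package expects) are assumptions you should check against your own server.

```r
library(spotlight)

# Placeholder address for a self-hosted spotlight-docker instance.
# Assumption: replace host, port, and path with those of your own deployment.
spot_set_opts(base_url = "http://localhost:2222/rest")
```

Set this once per session, before any annotation call.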
You can install spotlight from GitHub with:
# install.packages("devtools")
devtools::install_github("news-r/spotlight")
library(spotlight)
# Data to extract entities from
text <- c(
"Donald Trump is probably in Washington DC.",
"szzza dasdazsd azzsd daawq", # garbage
"" # empty document
)
# remove empty documents
text <- spot_filter(text)
# Annotate
places <- spot_annotate(text, types = "DBpedia:Place")
#> Annotating 2 documents.
#> 100% annotating document: 2 [====================================================================] eta: 0s
knitr::kable(places)
text | confidence | support | types | sparql | policy | resource_URI | resource_support | resource_types | resource_surfaceForm | resource_offset | resource_similarityScore | resource_percentageOfSecondRank |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Donald Trump is probably in Washington DC. | 0.5 | 0 | DBpedia:Place | | whitelist | http://dbpedia.org/resource/Washington_(state) | 43066 | Wikidata:Q3455524,Schema:Place,Schema:AdministrativeArea,DBpedia:Region,DBpedia:PopulatedPlace,DBpedia:Place,DBpedia:Location,DBpedia:AdministrativeRegion | Washington | 28 | 0.5234317 | 0.4408709 |
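
The result is a regular data frame, so it can be post-processed with base R. The sketch below is illustrative only: it rebuilds a few columns of the row above by hand so it runs without calling the API. The threshold min_score is a hypothetical value, not a recommendation from the package. Scores are passed through as.numeric in case they are returned as character strings.

```r
# Hand-built stand-in for a few columns of the `places` result shown above.
places <- data.frame(
  text = "Donald Trump is probably in Washington DC.",
  resource_URI = "http://dbpedia.org/resource/Washington_(state)",
  resource_surfaceForm = "Washington",
  resource_similarityScore = 0.5234317,
  resource_types = "Schema:Place,DBpedia:Region,DBpedia:PopulatedPlace,DBpedia:Place",
  stringsAsFactors = FALSE
)

# Hypothetical cut-off; pick a value that suits your own data.
min_score <- 0.5

# Keep only annotations whose similarity score reaches the cut-off.
scores <- as.numeric(places$resource_similarityScore)
kept <- places[scores >= min_score, ]

# resource_types is a comma-separated string; split it into one vector per row.
type_list <- strsplit(kept$resource_types, ",", fixed = TRUE)

print(kept[, c("resource_surfaceForm", "resource_URI")])
print(type_list[[1]])
```

Note that in the example output the surface form "Washington" was linked to the state rather than the city, so inspecting resource_URI alongside the scores is worthwhile before relying on the annotations.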