From LOGICS.EMAP.FGV.BR to WNPT.BRLCLOUD.COM
============================================
Notes:
a) All methods mentioned here are from WORDNET-EDITOR.
b) "Context" and "graph" mean the same thing. AllegroGraph uses
"graph"; we'll use "context".
High level steps:
1. Fix incorrect IRIs in logics.emap.fgv.br
2. Rename namespaces
3. Fix default graph
4. Process suggestions
5. Deal with blank nodes
Detailed steps:
1. Some IRIs in logics.emap.fgv.br contained spaces, which
AllegroGraph rejects when loading the files. To fix this, run the
method PROCESS-ALL-WORDS-WITH-INVALID-CHARS (clean-up.lisp).
2. After this is done, export all triples with context information
using the method EXPORT-TRIPLES (export.lisp)
3. Execute the script rename-namespace.sh
4. Re-import the files, preserving the context information, as follows:
    agload -g "<file://own-pt.nt>" wn30 nomlex.nt
    agload -g "<file://wnlite.nt>" wn30 wnlite.nt
    agload -g "<file://own-pt.nt>" wn30 own-pt.nt
    agload -g "<file://wordnet-en.nt>" wn30 wordnet-en.nt
    agload wn30 default-graph.nt
5. At this point the triple-store has the same contents as the
original LOGICS.EMAP.FGV.BR triple-store, but with proper
namespaces and context information.
6. Process suggestions from the wnpt.brlcloud.com/wn site: execute
PROCESS-SUGGESTIONS (suggestions.lisp)
7. Make sure none of the newly added suggestions were added to the
default graph. The following should evaluate to 0:
    (length (get-triples-list :g (default-graph-upi *db*)))
8. Fix the remaining default-graph issues by executing
FIX-DEFAULT-GRAPH (clean-up.lisp) and
fix-remaining-blank-nodes-in-default-graph.
9. We need to clean the literals of the synsetId and tagCount
properties (they need to be xsd:nonNegativeInteger).
# Run: (FIX-INCORRECTLY-TYPED-LITERALS) in clean-up.lisp
10. Fix all blank word senses by executing
PROCESS-ALL-BLANK-WORDSENSES.
11. Fix all blank word nodes by executing PROCESS-ALL-BLANK-PT-WORDS
and PROCESS-ALL-BLANK-EN-WORDS.
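The problem behind step 1 (IRIs containing spaces) can be spot-checked on an exported N-Triples file before it is handed to agload. The Python sketch below is not part of WORDNET-EDITOR. The function name, the regular expressions and the sample triples are assumptions made for illustration. It flags any IRI containing a character that the N-Triples grammar does not allow between angle brackets, and it skips quoted literals so that angle brackets inside strings are not mistaken for IRIs.

```python
import re

# Characters not allowed inside <...> by the N-Triples grammar:
# control characters and space (U+0000 to U+0020), plus < " { } | ^ ` \
BAD_IRI_CHAR = re.compile(r'[\x00-\x20<"{}|^`\\]')

# Scan left to right for either a quoted literal (ignored) or an IRI (captured).
TOKEN = re.compile(r'"(?:[^"\\]|\\.)*"|<([^>]*)>')


def find_invalid_iris(lines):
    """Return (line_number, iri) pairs for IRIs with disallowed characters."""
    problems = []
    for number, line in enumerate(lines, start=1):
        for match in TOKEN.finditer(line):
            iri = match.group(1)  # None when the token was a literal
            if iri is not None and BAD_IRI_CHAR.search(iri):
                problems.append((number, iri))
    return problems


# Hypothetical sample data, not taken from the real export.
sample = [
    '<http://example.org/word-casa> <http://example.org/lemma> "casa" .',
    '<http://example.org/word-a b> <http://example.org/lemma> "a < b" .',
]
print(find_invalid_iris(sample))  # [(2, 'http://example.org/word-a b')]
```

This is only a line-based scan, not a full N-Triples parser, so it is meant as a quick sanity check and not as a replacement for PROCESS-ALL-WORDS-WITH-INVALID-CHARS.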
==========================================================
= After verification was complete, these steps were done =
==========================================================
12. A new wn30.turtle was provided and needed to replace the one in
the triple-store (context <file://wnlite.nt>).
# First, execute the following SPARQL:
# drop graph <file://wnlite.nt>
13. Next, wn30.ttl from openWordnet-PT was imported with the
proper context.
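Steps 12 and 13 amount to dropping one context and loading a replacement file into it. The Python sketch below only builds the two command strings. It assumes that the "proper context" in step 13 is the same <file://wnlite.nt> dropped in step 12 and that the agload -g form from step 4 is reused. Neither is stated explicitly in this log, and the function name is hypothetical.

```python
import shlex


def replace_context_commands(repository, context_iri, new_file):
    """Build the SPARQL update and the agload command line as strings."""
    # SPARQL 1.1 Update statement that removes every triple in the context.
    sparql = f"DROP GRAPH <{context_iri}>"
    # Same argument order as the agload calls in step 4 (assumed to apply here).
    load = shlex.join(["agload", "-g", f"<{context_iri}>", repository, new_file])
    return sparql, load


sparql, load = replace_context_commands("wn30", "file://wnlite.nt", "wn30.ttl")
print(sparql)  # DROP GRAPH <file://wnlite.nt>
print(load)    # agload -g '<file://wnlite.nt>' wn30 wn30.ttl
```

Nothing is executed here. The strings are meant to be reviewed and then run by hand, in that order.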