Commit

Merge commit '9177918af81b6f2394e0869c704ad6267dc2584a'
rafelafrance committed Jan 4, 2024
2 parents 862f694 + 9177918 commit 4b4e380
Showing 16 changed files with 3 additions and 477 deletions.
2 changes: 1 addition & 1 deletion traiter/README.md
@@ -29,7 +29,7 @@ Some literature mined:
1. Have experts identify relevant terms and target traits.
2. We use expert identified terms to label terms using spaCy's phrase matchers. These are sometimes traits themselves but are more often used as anchors for more complex patterns of traits.
3. We then build up more complex terms from simpler terms using spaCy's rule-based matchers repeatedly until there is a recognizable trait. See the image below.
-4. We may then link traits to each other (entity relationships) using spaCy's dependency matchers.
+4. We may then link traits to each other (entity relationships) also using spaCy rules.
1. Typically, a trait gets linked to a higher level entity like SPECIES <--- FLOWER <--- {COLOR, SIZE, etc.} and not peer to peer like PERSON <---> ORG.
2. Also note that sometimes the highest level entity is assumed by its context. For instance, if a web page is a description of a newly found species then I don't need to parse the species name in the document.
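
The layering described in the README steps above can be sketched in plain Python. This is only a toy stand-in: it does not use spaCy, and the vocabulary, labels, the NUMBER + UNIT rule, and the function names are all assumptions made up for illustration, not code from this repository.

```python
# Toy stand-in for the layered approach in the list above. It uses plain Python
# instead of spaCy, and every term, label, and rule here is hypothetical.

TERMS = {"flower": "PART", "petal": "PART", "red": "COLOR", "white": "COLOR", "mm": "UNIT"}


def label_terms(tokens):
    """Step 2: tag tokens found in the expert vocabulary; numbers get NUMBER."""
    labeled = []
    for tok in tokens:
        low = tok.lower()
        if low in TERMS:
            labeled.append((tok, TERMS[low]))
        elif low.replace(".", "", 1).isdigit():
            labeled.append((tok, "NUMBER"))
        else:
            labeled.append((tok, None))
    return labeled


def build_traits(labeled):
    """Step 3: merge simpler terms into traits, here NUMBER + UNIT -> SIZE."""
    traits = []
    i = 0
    while i < len(labeled):
        text, label = labeled[i]
        nxt = labeled[i + 1] if i + 1 < len(labeled) else (None, None)
        if label == "NUMBER" and nxt[1] == "UNIT":
            traits.append({"trait": "SIZE", "text": f"{text} {nxt[0]}"})
            i += 2
            continue
        if label in ("PART", "COLOR"):
            traits.append({"trait": label, "text": text})
        i += 1
    return traits


def link_traits(traits):
    """Step 4: attach each non-PART trait to the nearest preceding PART."""
    part = None
    for trait in traits:
        if trait["trait"] == "PART":
            part = trait["text"]
        else:
            trait["part"] = part
    return traits


tokens = "Flower white , petal 5 mm".split()
linked = link_traits(build_traits(label_terms(tokens)))
for trait in linked:
    print(trait)
```

Linking to the nearest preceding PART mirrors the README's point that traits attach upward to a higher-level entity, not peer to peer. In the real pipeline each step would presumably be a spaCy matcher rather than a hand-written loop.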

4 changes: 2 additions & 2 deletions traiter/traiter/create_spell_well_from_idigbio.py
@@ -120,13 +120,13 @@ def insert_misspellings(freq, spell_well_db, deletes):
batch.append((word, word, 0, count))
hits.add(word)

-for delete in [w for w in spell_well.deletes1(word) if keep(word, w)]:
+for delete in (w for w in spell_well.deletes1(word) if keep(word, w)):
if delete not in hits:
batch.append((delete, word, 1, count))
hits.add(delete)

if deletes > 1:
-for delete in [w for w in spell_well.deletes2(word) if keep(word, w)]:
+for delete in (w for w in spell_well.deletes2(word) if keep(word, w)):
if delete not in hits:
batch.append((delete, word, 2, count))
hits.add(delete)
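
The change in this hunk swaps a list comprehension for a generator expression in each loop header. A minimal sketch of why the loop should behave the same follows. `deletes1` and `keep` below are hypothetical stand-ins, since the real `spell_well.deletes1` and `keep` are not shown in this diff, and the batch tuple is shortened for brevity.

```python
# Hypothetical stand-ins for spell_well.deletes1 and keep(); the real
# implementations are not part of this diff.
def deletes1(word):
    """Yield every string formed by deleting one character from word."""
    for i in range(len(word)):
        yield word[:i] + word[i + 1:]


def keep(word, candidate):
    """Assumed filter: keep only non-empty candidates."""
    return len(candidate) > 0


def collect(word, lazy):
    """Run the loop from the hunk with either a generator or a list."""
    hits = {word}
    batch = []
    if lazy:
        # Generator expression: candidates are filtered one at a time.
        candidates = (w for w in deletes1(word) if keep(word, w))
    else:
        # List comprehension: the full filtered list is built first.
        candidates = [w for w in deletes1(word) if keep(word, w)]
    for delete in candidates:
        if delete not in hits:
            batch.append((delete, word, 1))
            hits.add(delete)
    return batch


lazy_batch = collect("tree", lazy=True)
eager_batch = collect("tree", lazy=False)
print(lazy_batch == eager_batch)
```

Because the loop consumes each candidate exactly once and in order, both forms should produce the same batch. The generator form just avoids holding the intermediate list in memory.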
Empty file.
63 changes: 0 additions & 63 deletions traiter/traiter/pylib/reconcilers/base.py

This file was deleted.

16 changes: 0 additions & 16 deletions traiter/traiter/pylib/reconcilers/coordinate_precision.py

This file was deleted.

23 changes: 0 additions & 23 deletions traiter/traiter/pylib/reconcilers/coordinate_uncertainty.py

This file was deleted.

16 changes: 0 additions & 16 deletions traiter/traiter/pylib/reconcilers/decimal_latitude.py

This file was deleted.

16 changes: 0 additions & 16 deletions traiter/traiter/pylib/reconcilers/decimal_longitude.py

This file was deleted.

137 changes: 0 additions & 137 deletions traiter/traiter/pylib/reconcilers/event_date.py

This file was deleted.

18 changes: 0 additions & 18 deletions traiter/traiter/pylib/reconcilers/geodetic_datum.py

This file was deleted.

16 changes: 0 additions & 16 deletions traiter/traiter/pylib/reconcilers/habitat.py

This file was deleted.

57 changes: 0 additions & 57 deletions traiter/traiter/pylib/reconcilers/maximum_elevation.py

This file was deleted.
