WPP2 - Gruppe 7

Install

pip install -r requirements.txt

Usage

The program is an interactive shell. Start it with:

python main.py

Commands

search '<Query>'
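Queries can combine terms with AND, OR and parentheses, as in the benchmark query '(blood OR pressure) AND cardiovascular'. The following is a minimal sketch of how such a Boolean query can be answered with set operations on an inverted index. The toy corpus, the `build_index` helper and the whitespace tokenizer are assumptions made for illustration; the project's actual index and query parser may work differently.

```python
# Hypothetical toy corpus; the real project indexes its own document collection.
docs = {
    1: "high blood pressure raises cardiovascular risk",
    2: "blood donation guidelines",
    3: "cardiovascular exercise and diet",
    4: "tire pressure and fuel economy",
}


def build_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = {}
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index.setdefault(term, set()).add(doc_id)
    return index


index = build_index(docs)

# (blood OR pressure) AND cardiovascular
# OR is a set union, AND is a set intersection.
blood_or_pressure = index.get("blood", set()) | index.get("pressure", set())
hits = blood_or_pressure & index.get("cardiovascular", set())

print(sorted(hits))
```

Only document 1 contains "cardiovascular" together with "blood" or "pressure", so it is the only hit in this toy corpus.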

Info

For proximity queries, use a forward slash '/' rather than a backslash '\', because the "click" package does not parse the backslash correctly.

Benchmarks

  • Setup index: 3.41s

  • Query search '(blood OR pressure) AND cardiovascular': 0.001s

  • Query search '(blood OR presure) AND cardiovascular' (misspelled term, with spell check): 0.011s
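The README does not say how the spell check works. One common approach, sketched here purely as an assumption, is to replace an unknown query term with the closest vocabulary term by Levenshtein (edit) distance. The names `edit_distance`, `correct`, `vocabulary` and `max_distance` are hypothetical and not taken from the project.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings, using a rolling row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute ca with cb (or keep it)
            ))
        prev = curr
    return prev[-1]


def correct(term, vocabulary, max_distance=2):
    """Return the term itself if known, else the closest vocabulary term.

    If no vocabulary term is within max_distance edits, the term is
    returned unchanged.
    """
    if term in vocabulary:
        return term
    # Ties are broken alphabetically so the result is deterministic.
    best = min(vocabulary, key=lambda w: (edit_distance(term, w), w))
    return best if edit_distance(term, best) <= max_distance else term


# Hypothetical vocabulary; in practice this would be the set of indexed terms.
vocabulary = {"blood", "pressure", "cardiovascular"}
print(correct("presure", vocabulary))
```

Comparing a misspelled term against the whole vocabulary costs more than a plain lookup, which is one plausible reason the spell-checked query is slower than the correctly spelled one.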

New Benchmarks

  • Setup (load word2vec, index, load vectors): 1 min 43 s

Evaluation

PLAIN-121

| Method | Precision@5 | Precision@10 | Precision@20 | Precision@50 | Recall@5 | Recall@10 | Recall@20 | Recall@50 | F1@5 | F1@10 | F1@20 | F1@50 | R-Precision |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| tf-idf | 0.8 | 0.4 | 0.2 | 0.12 | 0.05714285714285714 | 0.05714285714285714 | 0.05714285714285714 | 0.08571428571428572 | 0.1066 | 0.0999 | 0.0888 | 0.0999 | 0.08571428571428572 |
| word2vec | 0.6 | 0.6 | 0.3 | 0.16 | 0.04285714285714286 | 0.08571428571428572 | 0.08571428571428572 | 0.11428571428571428 | 0.08 | 0.15 | 0.1333 | 0.1333 | 0.11428571428571428 |

PLAIN-1021

| Method | Precision@5 | Precision@10 | Precision@20 | Precision@50 | Recall@5 | Recall@10 | Recall@20 | Recall@50 | F1@5 | F1@10 | F1@20 | F1@50 | R-Precision |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| tf-idf | 0.8 | 0.9 | 0.9 | 0.82 | 0.006779661016949152 | 0.015254237288135594 | 0.030508474576271188 | 0.06949152542372881 | 0.013445378151260505 | 0.0300 | 0.05901639344262295 | 0.128125 | 0.30338983050847457 |
| word2vec | 1.0 | 1.0 | 0.9 | 0.54 | 0.00847457627118644 | 0.01694915254237288 | 0.030508474576271188 | 0.04576271186440678 | 0.01680672268907563 | 0.0333 | 0.05901639344262295 | 0.08437499999999999 | 0.2559322033898305 |

PLAIN-15

| Method | Precision@5 | Precision@10 | Precision@20 | Precision@50 | Recall@5 | Recall@10 | Recall@20 | Recall@50 | F1@5 | F1@10 | F1@20 | F1@50 | R-Precision |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| tf-idf | 0.2 | 0.2 | 0.2 | 0.1 | 0.02564102564102564 | 0.05128205128205128 | 0.10256410256410256 | 0.1282051282051282 | 0.0454 | 0.0816326530612245 | 0.13559322033898302 | 0.11235955056179775 | 0.1282051282051282 |
| word2vec | 0.0 | 0.2 | 0.15 | 0.08 | 0.0 | 0.05128205128205128 | 0.07692307692307693 | 0.10256410256410256 | 0.0 | 0.0816326530612245 | 0.1016949152542373 | 0.08988764044943821 | 0.10256410256410256 |

PLAIN-145

| Method | Precision@5 | Precision@10 | Precision@20 | Precision@50 | Recall@5 | Recall@10 | Recall@20 | Recall@50 | F1@5 | F1@10 | F1@20 | F1@50 | R-Precision |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| tf-idf | 0.6 | 0.3 | 0.2 | 0.22 | 0.08571428571428572 | 0.08571428571428572 | 0.11428571428571428 | 0.3142857142857143 | 0.15 | 0.1333 | 0.1454 | 0.25882352941176473 | 0.22857142857142856 |
| word2vec | 0.0 | 0.0 | 0.05 | 0.08 | 0.0 | 0.0 | 0.02857142857142857 | 0.11428571428571428 | 0.0 | 0.0 | 0.0363 | 0.09411764705882354 | 0.08571428571428572 |

PLAIN-1336

| Method | Precision@5 | Precision@10 | Precision@20 | Precision@50 | Recall@5 | Recall@10 | Recall@20 | Recall@50 | F1@5 | F1@10 | F1@20 | F1@50 | R-Precision |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| tf-idf | 0.2 | 0.2 | 0.1 | 0.04 | 0.125 | 0.25 | 0.25 | 0.25 | 0.15384615384615385 | 0.22222222222222224 | 0.14285714285714288 | 0.06896551724137932 | 0.125 |
| word2vec | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
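The metrics in the tables above follow the standard definitions, which can be sketched as below. This is not the project's evaluation code: the function names are hypothetical, and the example ranking is invented so that its numbers match the first method row for PLAIN-1336 (8 relevant documents, of which one is retrieved in the top 5 and a second in the top 10). Precision@k divides by k even if fewer than k results are returned, which is an assumption about the convention used.

```python
def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k results that are relevant."""
    return sum(1 for doc in ranked[:k] if doc in relevant) / k


def recall_at_k(ranked, relevant, k):
    """Fraction of all relevant documents found in the top-k results."""
    if not relevant:
        return 0.0
    return sum(1 for doc in ranked[:k] if doc in relevant) / len(relevant)


def f1_at_k(ranked, relevant, k):
    """Harmonic mean of precision@k and recall@k."""
    p = precision_at_k(ranked, relevant, k)
    r = recall_at_k(ranked, relevant, k)
    return 0.0 if p + r == 0 else 2 * p * r / (p + r)


def r_precision(ranked, relevant):
    """Precision at rank R, where R is the number of relevant documents."""
    r = len(relevant)
    return precision_at_k(ranked, relevant, r) if r else 0.0


# Invented ranking of 50 results: d1 ... d50.
ranked = [f"d{i}" for i in range(1, 51)]
# Invented relevance judgments: 8 relevant documents, only d3 and d9 retrieved.
relevant = {"d3", "d9", "x1", "x2", "x3", "x4", "x5", "x6"}

for k in (5, 10, 20, 50):
    print(k,
          precision_at_k(ranked, relevant, k),
          recall_at_k(ranked, relevant, k),
          round(f1_at_k(ranked, relevant, k), 4))
print("R-Precision", r_precision(ranked, relevant))
```

Because recall is bounded by the number of relevant documents, queries with many relevant documents (such as PLAIN-1021) show high precision but very low recall at small k.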

MAP-Score

tf-idf: 0.17417613459986342

word2vec: 0.11169926119078662
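For reference, mean average precision (MAP) is the mean over queries of each query's average precision. The sketch below uses the common definition that normalizes by the total number of relevant documents. Whether the project's scores use exactly this normalization, and which queries they average over, is not stated here, so treat the code as an illustration with hypothetical names and toy data.

```python
def average_precision(ranked, relevant):
    """Mean of precision values at each rank where a relevant doc appears,
    normalized by the total number of relevant documents."""
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank  # precision at this rank
    return total / len(relevant)


def mean_average_precision(runs):
    """runs is a list of (ranked, relevant) pairs, one pair per query."""
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


# Toy data, invented for illustration.
runs = [
    (["a", "b", "c", "d"], {"a", "c"}),  # AP = (1/1 + 2/3) / 2
    (["x", "y"], {"y"}),                 # AP = (1/2) / 1
]
print(round(mean_average_precision(runs), 4))
```

Unlike Precision@k, this score rewards placing relevant documents early in the ranking, so two methods with the same Precision@50 can still differ in MAP.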