Install the dependencies with `pip install -r requirements.txt`.

The program is an interactive shell. Start it with `python main.py`, then run a query at the prompt with `search '<Query>'`.
To use proximity queries, write them with a forward slash '/' rather than a backslash '\', because the "click" package does not parse the backslash correctly.
- Setup index: 3.41s
- Search `search '(blood OR pressure) AND cardiovascular'`: 0.001s
- Search `search '(blood OR presure) AND cardiovascular'` (with spell check): 0.011s
- Setup (load word2vec, index, load vectors): 1min 43sec
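
The second search above contains the misspelled term `presure`, which the spell check resolves before the query is run. This README does not describe how the correction works, so the following is only a minimal sketch of one possible approach, using `difflib` from the Python standard library. The `correct_term` helper, the `cutoff` value, and the tiny vocabulary are assumptions made for illustration, not the project's actual implementation.

```python
import difflib


def correct_term(term, vocabulary, cutoff=0.8):
    """Return the term itself if it is known, otherwise the closest
    vocabulary word above the similarity cutoff, otherwise the term unchanged.

    Hypothetical helper: the real spell checker may work differently.
    """
    if term in vocabulary:
        return term
    matches = difflib.get_close_matches(term, vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else term


# Toy vocabulary standing in for the terms of the index (assumption).
vocabulary = ["blood", "pressure", "cardiovascular"]

query_terms = ["blood", "presure", "cardiovascular"]
corrected = [correct_term(t, vocabulary) for t in query_terms]
print(corrected)
```

Comparing each unknown term against the whole vocabulary is slower than an exact lookup, which fits the gap between the two search timings listed above.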
Method | Precision@5 | Precision@10 | Precision@20 | Precision@50 | Recall@5 | Recall@10 | Recall@20 | Recall@50 | F1@5 | F1@10 | F1@20 | F1@50 | R-Precision |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
tf-idf | 0.8 | 0.4 | 0.2 | 0.12 | 0.05714285714285714 | 0.05714285714285714 | 0.05714285714285714 | 0.08571428571428572 | 0.1066 | 0.0999 | 0.0888 | 0.0999 | 0.08571428571428572 |
word2vec | 0.6 | 0.6 | 0.3 | 0.16 | 0.04285714285714286 | 0.08571428571428572 | 0.08571428571428572 | 0.11428571428571428 | 0.08 | 0.15 | 0.1333 | 0.1333 | 0.11428571428571428 |
Method | Precision@5 | Precision@10 | Precision@20 | Precision@50 | Recall@5 | Recall@10 | Recall@20 | Recall@50 | F1@5 | F1@10 | F1@20 | F1@50 | R-Precision |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
tf-idf | 0.8 | 0.9 | 0.9 | 0.82 | 0.006779661016949152 | 0.015254237288135594 | 0.030508474576271188 | 0.06949152542372881 | 0.013445378151260505 | 0.0300 | 0.05901639344262295 | 0.128125 | 0.30338983050847457 |
word2vec | 1.0 | 1.0 | 0.9 | 0.54 | 0.00847457627118644 | 0.01694915254237288 | 0.030508474576271188 | 0.04576271186440678 | 0.01680672268907563 | 0.0333 | 0.05901639344262295 | 0.08437499999999999 | 0.2559322033898305 |
Method | Precision@5 | Precision@10 | Precision@20 | Precision@50 | Recall@5 | Recall@10 | Recall@20 | Recall@50 | F1@5 | F1@10 | F1@20 | F1@50 | R-Precision |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
tf-idf | 0.2 | 0.2 | 0.2 | 0.1 | 0.02564102564102564 | 0.05128205128205128 | 0.10256410256410256 | 0.1282051282051282 | 0.0454 | 0.0816326530612245 | 0.13559322033898302 | 0.11235955056179775 | 0.1282051282051282 |
word2vec | 0.0 | 0.2 | 0.15 | 0.08 | 0.0 | 0.05128205128205128 | 0.07692307692307693 | 0.10256410256410256 | 0.0 | 0.0816326530612245 | 0.1016949152542373 | 0.08988764044943821 | 0.10256410256410256 |
Method | Precision@5 | Precision@10 | Precision@20 | Precision@50 | Recall@5 | Recall@10 | Recall@20 | Recall@50 | F1@5 | F1@10 | F1@20 | F1@50 | R-Precision |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
tf-idf | 0.6 | 0.3 | 0.2 | 0.22 | 0.08571428571428572 | 0.08571428571428572 | 0.11428571428571428 | 0.3142857142857143 | 0.15 | 0.1333 | 0.1454 | 0.25882352941176473 | 0.22857142857142856 |
word2vec | 0.0 | 0.0 | 0.05 | 0.08 | 0.0 | 0.0 | 0.02857142857142857 | 0.11428571428571428 | 0.0 | 0.0 | 0.0363 | 0.09411764705882354 | 0.08571428571428572 |
Method | Precision@5 | Precision@10 | Precision@20 | Precision@50 | Recall@5 | Recall@10 | Recall@20 | Recall@50 | F1@5 | F1@10 | F1@20 | F1@50 | R-Precision |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
tf-idf | 0.2 | 0.2 | 0.1 | 0.04 | 0.125 | 0.25 | 0.25 | 0.25 | 0.15384615384615385 | 0.22222222222222224 | 0.14285714285714288 | 0.06896551724137932 | 0.125 |
word2vec | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
tf-idf: 0.17417613459986342
word2vec: 0.11169926119078662
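
For readers who want to check the table columns, here is a minimal sketch of the standard definitions of Precision@k, Recall@k, F1@k and R-Precision. The function names and the toy data are assumptions made for illustration. The project's own evaluation code is not shown in this README and may differ, for example in how it handles result lists shorter than k.

```python
def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k results that are relevant (divides by k)."""
    hits = sum(1 for doc in ranked[:k] if doc in relevant)
    return hits / k


def recall_at_k(ranked, relevant, k):
    """Fraction of all relevant documents found in the top-k results."""
    if not relevant:
        return 0.0
    hits = sum(1 for doc in ranked[:k] if doc in relevant)
    return hits / len(relevant)


def f1_at_k(ranked, relevant, k):
    """Harmonic mean of Precision@k and Recall@k."""
    p = precision_at_k(ranked, relevant, k)
    r = recall_at_k(ranked, relevant, k)
    return 2 * p * r / (p + r) if (p + r) > 0 else 0.0


def r_precision(ranked, relevant):
    """Precision at rank R, where R is the number of relevant documents."""
    r = len(relevant)
    return precision_at_k(ranked, relevant, r) if r else 0.0


# Toy data (assumption): 70 relevant documents, 4 of them in the top 5.
relevant = {f"d{i}" for i in range(70)}
ranked = ["d0", "d1", "d2", "d3", "x0"]

print(precision_at_k(ranked, relevant, 5))
print(recall_at_k(ranked, relevant, 5))
print(f1_at_k(ranked, relevant, 5))
```

With 4 relevant hits in the top 5 and 70 relevant documents in total, these definitions give a Precision@5 of 4/5 and a Recall@5 of 4/70, which illustrates why recall stays low in the tables even when precision is high.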