Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Statistical feature #44

Open
wants to merge 40 commits into
base: master
Choose a base branch
from

Conversation

neelsomani
Copy link
Member

No description provided.

Sharabesh and others added 30 commits July 2, 2017 10:43
…nd finished some of the underlying implementation
@jbpoline
Copy link
Collaborator

jbpoline commented Jul 7, 2017

hey - that's great - but that's a lot of commits to review in one PR - in the future let's do pull request for smaller and more homogeneous chunck when possible - testing the stuff on my machine now !

@neelsomani
Copy link
Member Author

I can split this into backend and frontend, if that's easier! (Commented with the wrong account previously.)

@neelsomani
Copy link
Member Author

Some notes:

You can test the API request directly by querying http://localhost:5000/json/collection-significance. Parameters to send are specified at http://localhost:5000/json/collection-significance/help

I commented out some print statements in statistics.py. It might help to uncomment those to track exactly what's going on.

@jbpoline
Copy link
Collaborator

jbpoline commented Jul 7, 2017

I found that the widget does not always (actually, now never) show up after the significance test is finished - @neelsomani @Sharabesh did you observed this as well ? I created a "10-vision-papers" and a "10-audio-papers" collection and tried to test the difference, we should get "vision" blobs in the vision-audio and "audio" blob in the audio-vision. We probably need to have some kind of stupid dump (the x,y,z that are significant and their p-value / z-score) in some temporary text file to test this.

@neel-temporary-user
Copy link

Hmm, that seems to imply that the API endpoint is failing. Did you try pulling the latest version? I noticed that something was wrong shortly after I pushed - then I realized that my correction was incorrect, so I pushed again. If you don't have the latest version, that might explain it. I tested the latest version right before I left for work.

@neelsomani
Copy link
Member Author

Not going to bother deleting/readding that comment again...

@neelsomani
Copy link
Member Author

We probably don't want to dump a file to the server itself, but we can dump to a database. I think the best solution is to check your Javascript console, and/or uncomment the print statements in the Python code. Because the endpoint is asynchronous, it's not going to throw errors in the same way (i.e., on certain failures, rather than returning an invalid response code, it will just hang indefinitely).

@neelsomani
Copy link
Member Author

Alright, I just replicated what you did, and I found that there were no significant results returned by the API endpoint. (i.e., the call wasn't hanging/there was no error; it just didn't return any results.) I've added a div to indicate when no results are returned.

@Sharabesh
Copy link
Member

With regards to significance values, if you set the threshold and width to some arbitrarily high value, you can see results, but under defaults it looks like results weren't showing up for the majority of the ones I tested. @jbpoline did you think was an issue with implementation or a case where we should update the default values?

@neelsomani
Copy link
Member Author

If you do a test with 30 articles with the query "vision" against the whole database, then results will show up with the current threshold; that's why I moved it back down from the arbitrarily high value. So it's more likely an issue with the implementation rather than the threshold (or neither, and this is just the correct result).

@jbpoline
Copy link
Collaborator

jbpoline commented Jul 8, 2017

testing it again - I had pulled though... will uncomment to see what's going on...

@jbpoline
Copy link
Collaborator

jbpoline commented Jul 8, 2017

ok - I get "no significant results" - which is a bit strange when comparing the "audio" versus "vision" - will dig into this ...

@neelsomani
Copy link
Member Author

Hi @jbpoline, aside from making the endpoint, we might want to look into this issue you brought up in July (where "audio" vs. "vision" gave no significant results), if it hasn't already been resolved.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants