[WIP] Fixes #298 Add time analysis of each engine for the results retrieved #341

Closed
wants to merge 2 commits into from

Conversation

bhaveshAn
Member

Fixes #298

Checklist

  • I have read the Contribution & Best practices Guide and my PR follows them.
  • My branch is up-to-date with the Upstream master branch.
  • I have added necessary documentation (if appropriate)

Changes proposed in this pull request:

  • Add time analysis of each engine for the results retrieved

@vaibhavsingh97 @gabru-md Please review the PR.
Providing heroku deployment at https://hidden-harbor-34393.herokuapp.com/

time

@AnshulMalik
Contributor

These changes include the time taken by the network request from client to server and back.

We should display only the time taken to generate the results, that is, return the time along with the results.

@raju249
Contributor

raju249 commented Nov 27, 2017

IMHO the changes should be in the backend code, not in the front end.
Include the time along with the results and then pass it to the front end.

Member

@gabru-md gabru-md left a comment

Please move the time analysis code to the backend.
We want to measure how long the response takes to generate, not the full round trip of sending the request and receiving the response.
As it stands, network lag inflates the measured time, so it will differ from the actual generation time.

@bhaveshAn
Member Author

@AnshulMalik @raju249 @gabru-md The thing is, we cannot include the time analysis in the backend because the API GET /api/v1/search/<search-engine>?query=query&format=format would be affected. Since this API is used by SUSPER and OPEN EVENT to get their responses, we can't change it.

@raju249
Contributor

raju249 commented Nov 27, 2017

I don't think we have to include or change anything in the URL.
We can send the time in the response: just pass the time value in the JSON or XML, and whoever wants to use it can.

Others, any thoughts?

@AnshulMalik
Contributor

@bhaveshAn That's a good point.
So let's hold this and think: do we really need this time analysis feature?
If yes, we can discuss it with SUSPER, SUSI, or others who are using the API.

@AnshulMalik
Contributor

@raju249, our response is currently just an array, not a dictionary, so we cannot directly add another key to the response.
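
To illustrate the constraint, here is a rough sketch, assuming the response is a JSON array of result objects. The `title` and `link` fields, the `elapsed_seconds` key, and the timing value are all hypothetical and not taken from the project:

```python
import json

# Assumed current shape: a bare JSON array of results (field names are hypothetical).
current = [{"title": "Example", "link": "https://example.com"}]

# A list has no place for an extra key. Wrapping it in an object makes room
# for a timing field, but every client that treats the response as a list
# would then have to change.
wrapped = {"results": current, "elapsed_seconds": 0.42}  # 0.42 is a made-up value

print(json.dumps(current))
print(json.dumps(wrapped))
```

The second shape is what adding a key would require, which is why it would affect existing consumers of the API.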

@raju249
Contributor

raju249 commented Nov 27, 2017

Oh.
I thought it was a key-value pair.
No issues then, I think we can follow what you said.

@cclauss
Contributor

cclauss commented Nov 27, 2017

Scrapers currently return one parameter: the urls list. It would be possible to modify the scrapers to return two parameters instead: the urls list and the elapsed time. The first parameter would be returned to the calling program and the second parameter would only be used for display, logging, or printing.
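
A minimal sketch of that idea, assuming a hypothetical scraper entry point `search` and a stubbed `fetch_urls` helper (neither name is taken from the project's code):

```python
import time


def fetch_urls(query):
    """Hypothetical stand-in for a scraper's fetch-and-parse step."""
    return ["https://example.com/?q=" + query]


def search(query):
    """Return the urls list together with the elapsed time in seconds."""
    start = time.perf_counter()
    urls = fetch_urls(query)
    elapsed = time.perf_counter() - start
    return urls, elapsed


# The caller passes on the first value and keeps the second for logging only.
urls, elapsed = search("fossasia")
print("%d urls in %.6f s" % (len(urls), elapsed))
```

Because the caller unpacks the pair and forwards only `urls`, the shape of the API response would not need to change under this approach.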

@AnshulMalik
Contributor

That's possible, @cclauss, but when I think about its use case, I don't think this would be useful anywhere.
Why would someone want the time taken to generate the search results?

@gabru-md
Member

As @AnshulMalik stated, I don't see a definite benefit in including the elapsed time in the query-server results. I can't figure out what it would be used for, since query-server is not a search engine.
@bhaveshAn, could you please shed some light on this and explain?

@AnshulMalik
Contributor

AnshulMalik commented Dec 6, 2017

@bhaveshAn reminder

@cclauss
Contributor

cclauss commented Dec 8, 2017

I actually like this PR a lot... In modern software you should "instrument everything" and round trip time measurement is worth doing as long as at least one user is going to study the resulting data from time to time to understand if we have any bottlenecks that are worth addressing.

@cclauss cclauss mentioned this pull request Dec 10, 2017
5 tasks
@AnshulMalik
Contributor

@cclauss Relying on round-trip time is not recommended, since it can vary a lot depending on the user's connection, so it won't help. But timing the fetching and parsing on the backend could be a metric we can rely on.
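
As a rough sketch of timing the two backend steps separately, assuming stubbed `fetch` and `parse` functions and a hypothetical `timed` helper (none of these names come from the project):

```python
import time


def fetch(query):
    """Hypothetical stand-in for the HTTP request to a search engine."""
    return "<a href='https://example.com'>%s</a>" % query


def parse(html):
    """Hypothetical stand-in for extracting results from the fetched page."""
    return [html]


def timed(func, *args):
    """Call func(*args) and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


html, fetch_seconds = timed(fetch, "fossasia")
results, parse_seconds = timed(parse, html)
print("fetch: %.6f s, parse: %.6f s" % (fetch_seconds, parse_seconds))
```

Both numbers are taken on the server, so they do not depend on the connection between the user and the server.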

@cclauss
Contributor

cclauss commented Dec 10, 2017

Our code's performance is roughly:

the time to run server.search() - the time to run each scraper's search().

Once we have tuned the former, we can focus on which search engines are too slow. My bet would be that some are faster than others but without metrics, we do not know for sure. Some search engines might be so slow that we drop them.
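
The subtraction above could be sketched as follows, assuming per-engine timings have already been collected into a dictionary. The function names, engine names, and timing values are made up for illustration:

```python
def server_overhead(total_seconds, scraper_seconds):
    """Time spent in our own code: total time minus time inside the scrapers."""
    return total_seconds - sum(scraper_seconds.values())


def slowest_first(scraper_seconds):
    """Engine names ordered from slowest to fastest."""
    return sorted(scraper_seconds, key=scraper_seconds.get, reverse=True)


# Made-up timings in seconds, for illustration only.
timings = {"engine_a": 0.5, "engine_b": 0.25}
print(server_overhead(1.0, timings))
print(slowest_first(timings))
```

The first number is what we can tune ourselves, and the ordering shows which engines are candidates for dropping.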

@raju249
Contributor

raju249 commented Dec 15, 2017

I think that unless we pass the time in the response from the server, adding the time via JS is not useful.

@cclauss
Contributor

cclauss commented Dec 15, 2017

There are two things: What we want to measure #382 and how we want to measure it (this PR / discussion). I believe the “what” should come before the “how”. Is it possible for the maintainers to respond to #382 so we know what we are trying to do before we decide how we are going to do it?

@mariobehling
Member

What is the status of this? Should this be closed and reopened later, when there is a follow-up?

@bhaveshAn
Member Author

@mariobehling I think we should go ahead with this feature, since time analysis falls under the performance category. And it's a cool idea.
The open question is only where we should implement it:

  1. In the Flask backend, where the API GET /api/v1/search/<search-engine>?query=query&format=format would be affected and we would return the search results along with the time taken.
  2. In the JavaScript frontend, where we just calculate the time difference between clicking the submit button and receiving the response from the backend (as done in this PR).

Please give your opinion.

@mariobehling
Member

@vaibhavsingh97 What are your thoughts?

@vaibhavsingh97
Member

@mariobehling @bhaveshAn IMO, instead of measuring the time on the client side, we should add a time parameter to the backend API and return it along with the response. But my question is: where will the time be useful?

@vaibhavsingh97
Member

@bhaveshAn Update please

@bhaveshAn
Member Author

In the Flask backend, where the API GET /api/v1/search/?query=query&format=format would be affected and we would return the search results along with the time taken.

@vaibhavsingh97 Should I make the corresponding changes?

@vaibhavsingh97
Member

@bhaveshAn yes, just pass one dictionary context while rendering the page
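
One way to read this suggestion, sketched with the standard library's `string.Template` standing in for a real page template. In Flask, a dictionary like this could be unpacked into `render_template` as keyword arguments. The template text, key names, and values here are hypothetical:

```python
from string import Template

# Hypothetical page fragment with placeholders for the timing information.
page = Template("$count results in $elapsed seconds")

# One context dictionary carrying everything the page needs (made-up values).
context = {"count": 10, "elapsed": "0.42"}
print(page.substitute(context))
```

This keeps the timing on the rendered page only, so the JSON API used by other projects would stay as it is.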

Member

@realslimshanky realslimshanky left a comment

Please resolve conflicts.

@bhaveshAn
Member Author

This PR needs to be updated against the current codebase. Please give me some time.

@bhaveshAn bhaveshAn changed the title Fixes #298 Add time analysis of each engine for the results retrieved [WIP] Fixes #298 Add time analysis of each engine for the results retrieved Feb 4, 2018
@mariobehling
Member

I agree time should be retrieved from the server, not the client.

Work on this issue has stalled. Closing this due to inactivity. Please open a new PR if you want to resolve this.

Development

Successfully merging this pull request may close these issues.

Add time analysis of each engine for the results retrieved
9 participants