Celerify HydroShare Export #2897
Conversation
Hmm, just noticed that the above demo project doesn't have GMS files. Will look into that.
One strategy that could work for notifying users that they should re-export to HydroShare is to compare the last HydroShare export date with the most recent scenario update date. If a scenario was updated after the export date, then we know there are new changes. Might need to confirm that the scenario update date is available on the front end.
Previously we would use a User Model to get an instance of a HydroShare Service Client for them. In this effort we're offloading some of this work to Celery Tasks, which do not support passing non-serializable Models as parameters, so we are forced to use simple data types instead.
The new tasks mostly follow the create and update paths in the view, with the addition of spawning their own copies of the HydroShare Service, since that cannot be passed to them across the Celery serialization boundary.
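A minimal sketch of what such a task might look like, assuming a @shared_task setup and a hypothetical get_hydroshare_client helper that rebuilds the service client from the user's stored credentials (names and signatures are illustrative, not the actual implementation):

```python
from celery import shared_task


@shared_task
def create_resource(user_id, title, abstract, files):
    """Create a HydroShare resource from plain, serializable arguments.

    Only simple types (ints, strings, lists, dicts) cross the Celery
    boundary; the service client is rebuilt inside the task.
    """
    # Hypothetical helper that looks up the user's stored token and
    # constructs a fresh HydroShare service client for this task.
    hs = get_hydroshare_client(user_id)

    resource_id = hs.create_resource(title=title, abstract=abstract)
    for f in files:
        hs.add_file(resource_id, f['name'], f['contents'])

    return resource_id
```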
Now, in the case of POST we run a Celery job instead of doing everything within a single request. This sends back a job_id that can be polled for, and the polling behaves exactly like other job polling in the app.
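For illustration, a hedged sketch of how the POST handler might queue the task and hand back an ID for polling, assuming DRF-style views and a hypothetical export_project task (the real view's shape differs):

```python
from rest_framework.decorators import api_view
from rest_framework.response import Response


@api_view(['POST'])
def export_to_hydroshare(request, project_id):
    # Queue the export instead of running it inside the request, then
    # return the Celery task's UUID so the client can poll for status.
    result = export_project.delay(request.user.id, project_id)
    return Response({'job': result.id}, status=202)
```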
A HydroShareExportTaskModel is added to poll for results, set to poll every 6 seconds since this is a relatively slow process. A task model gives us two promises: one for when the initial request is made, and another for when polling stops. We handle errors such as connectivity problems in the start promise's failure handler, and handle the response from HydroShare in the polling promise's resolution. The exportToHydroShare method returns the two promises combined, resolving when both are ready. This isn't currently used, but keeps the API the same as before, when we returned the AJAX promise.
Since sending each file takes a few seconds, and projects with multiple scenarios have a large number of files that can take a long time to send, we increase the timeout of both the Celery tasks for creating and updating resources and the front-end polling to a corresponding degree.
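A rough sketch of how such a timeout bump could be expressed on a task, assuming the 5-minute figure mentioned in the Overview below (the task name and exact limits are assumptions):

```python
from celery import shared_task


# Allow up to ~5 minutes per export task instead of the default, since
# each file upload takes a few seconds and projects with many scenarios
# can have a large number of files.
@shared_task(soft_time_limit=300, time_limit=330)
def update_resource(user_id, resource_id, files):
    ...
```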
Now that the HydroShare export can potentially take quite a while, during which the user can keep making changes, Autosync would simply add each change to the queue, leading to situations where the app would constantly be exporting to HydroShare. Furthermore, because this export queue isn't visible to the user, they can navigate away and prevent the export from completing. Adding a visible, user-manageable queue is currently out of scope. Instead, our compromise here is to disable autosyncing completely and rely on the user to export manually when they want.
When we switched to using job UUIDs for transferring MapShed results for Subbasin, we didn't update this HydroShare export functionality, which has been broken since then. By switching to the MapShed Job UUID, we enable GMS file exports to HydroShare again.
Force-pushed from 08044c7 to 4c96a44
Rebased on |
Added a commit that indicates if a project has changed since the last export.
Taking a look.
Think you may have found a genuine bug. Looking into it now.
That was a great catch. I had only been testing with projects that had previously been exported to HydroShare. Doing so with a new project tripped on not having an existing HydroShare resource. Ready for another look.
This works for me now. Nice job. I was able to upload projects with 8 scenarios before the exports started timing out. Since these exports take a long time, consider adding a temporary message while the upload is in progress, letting the user know that the upload may take a while.
Added a final commit that only re-exports scenarios that have changed, or have not been exported before. If you could test this one last thing, I think we'll be good to go for the near future. Recommended tests:
Since re-exporting existing projects takes longer than exporting them the first time (because every file has to be checked for existence in HydroShare and, if present, deleted before being re-exported), limiting the number of files we send on the re-export significantly speeds up the process.
Nice improvements. Performance is much better.
Thanks for the multiple thorough reviews! Going to merge this now.
Will squash fixups before merging.
Since we no longer autosync projects, and the user must be prompted to export explicitly, we add a message in the MultiShare Modal under the HydroShare section saying whether the project has changed since the last export. This message is only visible if the date of the last export is less recent than the latest modified_at date of all the scenarios in the project. Thus, if any scenario is modified after an export, this message will be shown. To allow this, we now always update the client scenario model with the modified_at date from the server.
Previously we always exported all scenarios for simplicity. Unfortunately, we bump up against the limit of how many files can be exported before a timeout, especially when the resource already exists and each export must check for a file, delete it, and then export it again. Being judicious about which files we send ensures that we almost never time out.
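A hedged illustration of that filtering, assuming each scenario carries a modified_at timestamp and the project stores the time of its last HydroShare export under a hypothetical hydroshare_exported_at field:

```python
def scenarios_needing_export(project):
    """Return only scenarios changed since the last export,
    or that have never been exported at all."""
    last_export = project.hydroshare_exported_at  # hypothetical field

    if last_export is None:
        # Never exported before: everything needs to go out.
        return list(project.scenarios.all())

    return [s for s in project.scenarios.all()
            if s.modified_at > last_export]
```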
Force-pushed from 2570370 to 20261d5
These were previously not required because all the HydroShare export processing was done in Django from within App. However, in #2897 it was changed to be done in Celery from within Worker. Without these variables being initialized correctly, the export does not work.
Overview
Since exporting multiple files to HydroShare can take a long time, and projects with multiple scenarios can balloon the number of files really quickly, the exports would time out in some cases.
To mitigate that, we wrap the creation and updating methods for HydroShare resources in Celery Tasks, and return a Job ID instead. This frees the export from the constraints of a request-response cycle, while also not tying up a gunicorn worker process. Celery Tasks have their own timeouts, however, and these are set to time out after 5 minutes. The front-end which polls for these results has the same timeout.
Since these exports take so long, changes users make in the front-end can keep being queued for export. But if each export takes ~4 minutes, and each single change triggers an export, we're looking at potentially hours of exports queued up. If the browser is closed, or the user navigates away during this time, the queued exports would be dropped silently. Furthermore, there is no visibility into this queue, or any sort of progress indicator for the exports. Thus, the autosync behavior is too opaque, with potential data loss for users, which is why the final commit disables it. The autosync options are removed from the UI, all new projects default to having it disabled, and a migration turns it off for all currently-syncing HydroShare resources.
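A sketch of what such a data migration could look like, assuming a HydroShareResource model with an autosync boolean field (the app label, model, and field names are assumptions):

```python
from django.db import migrations


def disable_autosync(apps, schema_editor):
    # Turn off autosync for every HydroShare resource that currently has
    # it enabled, so no project keeps exporting automatically.
    HydroShareResource = apps.get_model('export', 'HydroShareResource')
    HydroShareResource.objects.filter(autosync=True).update(autosync=False)


class Migration(migrations.Migration):
    dependencies = [
        ('export', '0001_initial'),  # placeholder dependency
    ]

    operations = [
        migrations.RunPython(disable_autosync, migrations.RunPython.noop),
    ]
```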
To assist or prompt users to export to HydroShare, we can think about adding a "dirty" or "stale" flag to a HydroShare resource, set when any change to a project is made and unset when a HydroShare export finishes. This is left to a future task. The above has since been incorporated here by comparing the last HydroShare export date with the last scenario modification date: if a scenario has been modified more recently than the last export, we add a message to the HydroShare Export modal saying so.
Connects #2885
Demo
https://beta.hydroshare.org/resource/d1f4f20cd6344b7a809552164cc512a6
Notes
I also tried cutting every piece of the export into a Celery Task. That work is viewable in this diff hydroshare-celerification-2.patch. This was suboptimal because Celery doesn't allow sharing complex objects between tasks, so the HydroShare Rest Client could not be shared, and each task would have to create its own, which involved making OAuth requests to HydroShare and further slowed down the process. And while splitting the work into smaller tasks allowed us to leave the timeout unspecified at the Celery level (since each individual task could finish under the default 90s timeout), we would still have a timeout in the front-end, so there is no getting rid of timeouts completely.
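For context, building a client involves roughly the following hs_restclient calls, each of which performs OAuth round trips to HydroShare; credential handling here is simplified and assumed:

```python
from hs_restclient import HydroShare, HydroShareAuthOAuth2


def get_hydroshare_client(client_id, client_secret, token):
    # Every new client performs OAuth handshakes with HydroShare, which
    # is why spawning one per task adds noticeable overhead.
    auth = HydroShareAuthOAuth2(client_id, client_secret, token=token)
    return HydroShare(auth=auth, hostname='beta.hydroshare.org')
```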
HydroShare Recommendations
There are potential improvements that can be made to the HydroShare API which would help us significantly:
Support Bulk File Uploads. Since most of the time is spent uploading one file at a time, even though all the files are ready at once, the ability to upload files in bulk would speed up the process significantly, probably by an order of magnitude or more. See Support Bulk File Uploads hydroshare/hydroshare#2859 and Support Bulk File Uploads hydroshare/hs_restclient#86.
Allow Overwriting Files. Currently, if a file already exists in HydroShare, you cannot add it again; we have to first delete that file, then add it, which makes updates take twice as long as creation. If adding files supported overwriting, this could be done in half the time (a sketch of the current workaround follows this list). See Support Overwriting Files hydroshare/hydroshare#2860 and Support Overwriting Files hydroshare/hs_restclient#87.
Speed Up API Responses. Currently each API request takes ~1s to turn around. If responses were faster, the export would be too.
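To make the second point above concrete, here is roughly what the current delete-then-add workaround looks like with hs_restclient (the helper and its error handling are a simplified assumption, not the app's actual code):

```python
from hs_restclient import HydroShareNotFound


def replace_file(hs, resource_id, filename, path):
    # HydroShare rejects uploads of files that already exist, so the old
    # copy must be deleted before the new one is added: two round trips
    # where a single overwriting upload would suffice.
    try:
        hs.deleteResourceFile(resource_id, filename)
    except HydroShareNotFound:
        pass  # file was never uploaded; nothing to delete
    hs.addResourceFile(resource_id, path, resource_filename=filename)
```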
Testing Instructions
Set MMW_HYDROSHARE_SECRET_KEY in the app and worker VMs using the "HydroShare Beta" item in LastPass.
Run bundle, vagrant ssh app -c 'sudo restart mmw-app', and vagrant ssh worker -c 'sudo service celeryd restart'.