-
Notifications
You must be signed in to change notification settings - Fork 5
Enable Bulk User Creation for Larger Batches of Users #597
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
This commit allows ProfileApiClient to call the new fast validation endpoint in Profile.
This is deliberately a serial queue.
This commit creates a new operation that validates a batch. It moves all of the validation error handling out of ::CreateBatch, leaving that operation quite simple.
This commmit takes the validation and concurrency control out of create_students_job and puts it in school_students_controller. The idea is that instead of validating then committing one job of 50 students, we: 1. validate the entire batch of N students quickly by calling SchoolStudents::ValidateBatch 2. Split the batch into chunks of 50. 3. Enqueue them in the context of a GoodJob::Batch, ensuring atomic enqueue of all `N/50` jobs. 4. Control concurrency by not allowing creation of another Batch whose description field matches the school ID (this makes up for the fact that GoodJob::Batch does not have a concurrency key like Jobs do).
Switching to `filter_map` ensures that a parameter list of `[]` gets through as `[]` and not `[nil]`.
The validation functionality has been refactored and raised to the controller level that calls CreateBatch, so the tests are at a higher level now.
We require contributors to sign our Contributor License Agreement, and we don't have you on file. In order for us to review and merge your code, please complete this form and we'll get you added and review your contribution as soon as possible. |
class << self | ||
def call(school:, school_students_params:, token:, user_id:) | ||
response = OperationResponse.new | ||
response[:job_id] = create_batch(school, school_students_params, token, user_id) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is the frontend able to poll this batch job like a standard job then, i.e. there's no impact?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I completely understand this question - can you amplify this a bit for me?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the frontend uses the job id to poll for the status of the job, just checking that given it's now using a batch id it'll all work as before - I suspect it will
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Turns out that it does not work automatically. The reason being that UserJob
requires a GoodJob ID and, when you're creating the jobs in the context of a batch, they don't actually exist at the time we're trying to create the UserJob
.
I will see if I can make UserJob
relate to the batch instead of an individual job.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
spec/features/school_student/creating_a_batch_of_school_students_spec.rb
Show resolved
Hide resolved
Recent updates in profile made a small number of changes to the structure of errors and parameters in profile. This commit brings editor-api up to match it.
This commit moves the method SchoolStudentsController#batch_in_progress? to School#import_in_progress? and changes the batch identifier from "school_id:#{school.id}" to simply "#{school.id}". This allows us to expose the import_in_progress field through the API so that we can enable/disable front-end UI components to prevent user frustration.
This change includes the state of current imports for a school in the API response.
This commit changes UserJob from having a 1:1 relationship with GoodJob::Job to tracking the ID of a GoodJob::Batch. This means that the UserJobsController had to change to track the state of a batch and emulate the status messages that a Job used to provide (e.g. runnning, queued, etc.). The front end can now poll for the status of this job.
class UserJob < ApplicationRecord | ||
belongs_to :good_job, class_name: 'GoodJob::Job' | ||
|
||
validates :good_job_batch_id, presence: true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggestion: Make this belongs_to :good_job_batch, class_name: 'GoodJob::Batch'
We were only decrypting students at validation and not creation. Fix.
Status
This user story is in Experience CS #1146 - Adding Students - Increase Upload Limit.
The overall goal of this PR is to allow Code Editor to ingest CSV uploads of new users much larger than the current limit of 50 that is imposed by Profile. However, it was also a goal NOT to change the 50-row limit in the
createStudents
endpoint in Profile.Therefore the high level strategy to solving this problem is:
This PR depends on the branch in PR #1948 in Profile and PR #628 in editor-standalone, which removes the 50-row limit check on the front end.
Points for consideration:
Please could reviewers focus on concurrency safety and any issues that could arise if a batch takes a long time to run?
What's changed?
ProfileApiClient
to use the new/preflight-student-upload
endpoint in Profile.SchoolStudent::CreateBatch
and instead validates the entire upload using the newSchoolStudents::ValidateBatch
operation.GoodJob::Batch
to group individualCreateStudentsJob
and ensure that the enqueue of the entire batch of 50-user creation jobs is atomic.description
field of aGoodJob::Batch
as a kind of concurrency key, since batches don't have concurrency keys. The controller will refuse to enqueue a new batch if there exists a batch whose description matches the school ID and which has not been completed or discarded.Steps to perform after deploying to production
I don't believe deploying this PR will require any additional work.