Skip to content

Conversation

fspeirs
Copy link
Contributor

@fspeirs fspeirs commented Sep 19, 2025

Status

This user story is in Experience CS #1146 - Adding Students - Increase Upload Limit.

The overall goal of this PR is to allow Code Editor to ingest CSV uploads of new users much larger than the current limit of 50 that is imposed by Profile. However, it was also a goal NOT to change the 50-row limit in the createStudents endpoint in Profile.

Therefore the high level strategy to solving this problem is:

  1. Change Profile to allow fast validation of large batches of students (done in PR #1948 in Profile)
  2. Have Code Editor accept a large batch of students and call that endpoint to ensure the entire batch is valid.
  3. Have Code Editor split the large batch into multiples of 50 students and feed that into Profile - even if that takes substantial time to complete.

This PR depends on the branch in PR #1948 in Profile and PR #628 in editor-standalone, which removes the 50-row limit check on the front end.

Points for consideration:

Please could reviewers focus on concurrency safety and any issues that could arise if a batch takes a long time to run?

What's changed?

  • Changes ProfileApiClient to use the new /preflight-student-upload endpoint in Profile.
  • Removes batch validation from SchoolStudent::CreateBatch and instead validates the entire upload using the new SchoolStudents::ValidateBatch operation.
  • Introduces a new GoodJob queue just for user creation.
  • Uses a GoodJob::Batch to group individual CreateStudentsJob and ensure that the enqueue of the entire batch of 50-user creation jobs is atomic.
  • Uses the description field of a GoodJob::Batch as a kind of concurrency key, since batches don't have concurrency keys. The controller will refuse to enqueue a new batch if there exists a batch whose description matches the school ID and which has not been completed or discarded.

Steps to perform after deploying to production

I don't believe deploying this PR will require any additional work.

This commit allows ProfileApiClient to call the new fast validation
endpoint in Profile.
This commit creates a new operation that validates a batch. It moves
all of the validation error handling out of ::CreateBatch, leaving that
operation quite simple.
This commmit takes the validation and concurrency control out of create_students_job
and puts it in school_students_controller.

The idea is that instead of validating then committing one job of 50
students, we:

1. validate the entire batch of N students quickly by calling
   SchoolStudents::ValidateBatch
2. Split the batch into chunks of 50.
3. Enqueue them in the context of a GoodJob::Batch, ensuring atomic
   enqueue of all `N/50` jobs.
4. Control concurrency by not allowing creation of another Batch whose
   description field matches the school ID (this makes up for the fact
   that GoodJob::Batch does not have a concurrency key like Jobs do).
Switching to `filter_map` ensures that a parameter list of `[]` gets
through as `[]` and not `[nil]`.
The validation functionality has been refactored and raised to the
controller level that calls CreateBatch, so the tests are at a higher
level now.
Copy link

cla-bot bot commented Sep 19, 2025

We require contributors to sign our Contributor License Agreement, and we don't have you on file. In order for us to review and merge your code, please complete this form and we'll get you added and review your contribution as soon as possible.

@fspeirs fspeirs changed the title Batch user creation Enable Bulk User Creation for Larger Batches of Users Sep 19, 2025
class << self
def call(school:, school_students_params:, token:, user_id:)
response = OperationResponse.new
response[:job_id] = create_batch(school, school_students_params, token, user_id)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is the frontend able to poll this batch job like a standard job then, i.e. there's no impact?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I completely understand this question - can you amplify this a bit for me?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the frontend uses the job id to poll for the status of the job, just checking that given it's now using a batch id it'll all work as before - I suspect it will

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Turns out that it does not work automatically. The reason being that UserJob requires a GoodJob ID and, when you're creating the jobs in the context of a batch, they don't actually exist at the time we're trying to create the UserJob.

I will see if I can make UserJob relate to the batch instead of an individual job.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would appreciate your input on 81ed472...c32b720

@cla-bot cla-bot bot added the cla-signed label Sep 24, 2025
This commit moves the method SchoolStudentsController#batch_in_progress?
to School#import_in_progress? and changes the batch identifier from
"school_id:#{school.id}" to simply "#{school.id}".

This allows us to expose the import_in_progress field through the API so
that we can enable/disable front-end UI components to prevent user
frustration.
This change includes the state of current imports for a school in the
API response.
This commit changes UserJob from having a 1:1 relationship with
GoodJob::Job to tracking the ID of a GoodJob::Batch.

This means that the UserJobsController had to change to track the state
of a batch and emulate the status messages that a Job used to provide
(e.g. runnning, queued, etc.).

The front end can now poll for the status of this job.
class UserJob < ApplicationRecord
belongs_to :good_job, class_name: 'GoodJob::Job'

validates :good_job_batch_id, presence: true
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: Make this belongs_to :good_job_batch, class_name: 'GoodJob::Batch'

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants