Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: give users the option to run the json migration asyncly #495

Merged
merged 1 commit into from
Aug 13, 2023

Conversation

aqeelat
Copy link
Member

@aqeelat aqeelat commented Jan 12, 2023

No description provided.

@hramezani
Copy link
Member

Thanks @aqeelat for your effort!

I assume that users who want to migrate normally, just run migrate command and see a changes_text in the model and everything else is like before. right?

The general approach is ok for me we need to document it.

@alieh-rymasheuski do you have any idea here?

auditlog/diff.py Outdated
@@ -189,3 +192,25 @@ def model_instance_diff(
diff = None

return diff


def _changes_func() -> Callable[["LogEntry"], Dict]:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't it better to move these codes to models.py?

help="If provided, we will use native db operations.",
dest="db",
type=str,
choices=["postgres", "mysql", "oracle", "sqllite3"],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we don't support sqllite3

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also it's sqlite3, single l.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh. I saw it included in #407 (comment) and I thought it was supported.

@aqeelat
Copy link
Member Author

aqeelat commented Jan 17, 2023

@hramezani would you be able to write the documentation for it? Maybe in a "migration to version 3 guide" in the docs, or as a readme section.

I could write the bullet points as a comment to this PR but I'm not familiar with reStructuredText enough to write a good document.

Also, how do you recommend we write the tests for this pr?

@hramezani
Copy link
Member

@hramezani would you be able to write the documentation for it? Maybe in a "migration to version 3 guide" in the docs, or as a readme section.

I could write the bullet points as a comment to this PR but I'm not familiar with reStructuredText enough to write a good document.

Ok, Please write them down in this PR and I will take care of documentation later

Also, how do you recommend we write the tests for this pr?

  • We need some tests for the new changes_func
  • We need to test the new management command as well.

Please consider to add them in separate class to be able to remove them easily later or add some note regarding the removal

@aqeelat
Copy link
Member Author

aqeelat commented Jan 19, 2023

I added a few tests but I'll come back later to polish them and add the changelog entry

@codecov
Copy link

codecov bot commented Jan 19, 2023

Codecov Report

Merging #495 (1a3ebca) into master (f2591e4) will increase coverage by 0.37%.
The diff coverage is 98.90%.

@@            Coverage Diff             @@
##           master     #495      +/-   ##
==========================================
+ Coverage   94.56%   94.93%   +0.37%     
==========================================
  Files          30       31       +1     
  Lines         902      988      +86     
==========================================
+ Hits          853      938      +85     
- Misses         49       50       +1     
Files Changed Coverage Δ
auditlog/migrations/0015_alter_logentry_changes.py 90.00% <88.88%> (-10.00%) ⬇️
auditlog/apps.py 100.00% <100.00%> (ø)
auditlog/conf.py 100.00% <100.00%> (ø)
...uditlog/management/commands/auditlogmigratejson.py 100.00% <100.00%> (ø)
auditlog/models.py 90.37% <100.00%> (+0.60%) ⬆️

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@aqeelat aqeelat force-pushed the migration_workaround branch 2 times, most recently from a114d01 to 2fe9614 Compare January 21, 2023 20:55
@aqeelat
Copy link
Member Author

aqeelat commented Jan 21, 2023

Now that I think of it, should I attach this PR to the current changelog entry?
Also, please review the comments I added to the pr

parser.add_argument(
"-b",
"--bactch-size",
default=None,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isn't it better to have a default batch size?
I assume if the batch size is None it will process all the records in one shot am afraid it put pressure on databases with a lot of records.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Batching here is use to save memory not reduce the transaction size because bulk_update does batching by default. So, instead of storing a billion record in an array and then uploading batch by batch, passing -b will get each batch, process it, garbage collect it, then repeat.

But good comment. I added a default batch size of 50, with the option of not using batches.

"-b",
"--bactch-size",
default=None,
help="Split the migration into multiple batches",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please mention that this works just in case -d is not provided.

("auditlog", "0014_logentry_cid"),
]

atomic = not settings.AUDITLOG_TWO_STEP_MIGRATION
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the usage of this line here?

Copy link
Member Author

@aqeelat aqeelat Jan 23, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I knew was that if atomic=True (which is the default), we won't be able to rename the field type and then create another field with the same name in the same migration because they will be run in a transaction.
But after your comment, I tested it out and it worked regardless.


from auditlog import models

models.changes_func = models._changes_func()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would move it to auditlog/models.py

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I had it in the models, it was throwing an error saying that the conf is not found. Do you have any suggestions on how to fix it?

@hramezani
Copy link
Member

@aqeelat Thanks for your update.

Now that I think of it, should I attach this PR to the current changelog entry?

Yes

Also, please review the comments I added to the pr

It's because of your get_logs query. at the beginning we have L1, L2 and L3. the migration uses the first page of get_log which is L1. then in next loop, get_logs query result is L2 and L3(because L1 is migrated). So in the second loop migration picks page 2 from L2 and L3 wich is L3.
For fixing the problem I think you need to don't use paginator and always pick the first batch_size items of get_logs

@aqeelat
Copy link
Member Author

aqeelat commented Jan 23, 2023

It's because of your get_logs query. at the beginning we have L1, L2 and L3. the migration uses the first page of get_log which is L1. then in next loop, get_logs query result is L2 and L3(because L1 is migrated). So in the second loop migration picks page 2 from L2 and L3 wich is L3. For fixing the problem I think you need to don't use paginator and always pick the first batch_size items of get_logs

Good catch. I didn't know that the paginator reevaluates the qs in each iteration. I replaced it with a for-loop and it seems to be working fine now.

@aqeelat
Copy link
Member Author

aqeelat commented Jan 23, 2023

@hramezani I updated the PR with the comments you suggested.

@hramezani
Copy link
Member

Thanks @aqeelat LGTM 👍

Now, you need to write down the required steps for migration (as a comment to this PR). Then I will create a PR out of them later.

@aqeelat
Copy link
Member Author

aqeelat commented Jan 26, 2023

Version 3.0.0 introduces breaking changes. Please review the migration guide below before upgrading.
If you're new to django-auditlog, you can ignore this part.

The major change in the version is that we're finally storing changes as json instead of json-text. To convert the existing records, this version has a database migration that does just that. However, this migration will take a long time if you have a huge amount of records, causing your database and application to be out of sync until the migration is complete.

To avoid this, follow these steps:

  1. Before upgrading the package, add these two variables to settings.py:
    • AUDITLOG_TWO_STEP_MIGRATION = True
    • AUDITLOG_USE_TEXT_CHANGES_IF_JSON_IS_NOT_PRESENT = True
  2. Upgrade the package. Your app will now start storing new records as JSON, but the old records will accessible via LogEntry.changes_text.
  3. Use the newly added auditlogmigratejson command to migrate your records. Run django-admin auditlogmigratejson --help to get more information.
  4. Once all records are migrated, remove the variables listed above, or set their values to False.

@aqeelat
Copy link
Member Author

aqeelat commented Jan 26, 2023

Feel free to reword this as you see fit.
@hramezani

@hramezani
Copy link
Member

Thanks @aqeelat

I've created issue #503 for this.

I am going to test the management command in my machine and will merge it if everything works fine.
This will happen probably next week

@aqeelat
Copy link
Member Author

aqeelat commented Apr 14, 2023

Any updates here? @hramezani

@hramezani
Copy link
Member

@aqeelat Not yet!

I am too busy nowadays and as this is required for V3, it's not very urgent for me.
I also need to add documentation for this.

@aqeelat
Copy link
Member Author

aqeelat commented Apr 14, 2023 via email

@hramezani
Copy link
Member

@aqeelat I am ok to release it in V2, we need to:

  • Cherry-pick cid commit on top of V2
  • Make a new release for V2(V2.3)

Would you like to work on it?

@aqeelat
Copy link
Member Author

aqeelat commented Aug 10, 2023

@hramezani When would you be able to approve the PR?

@hramezani
Copy link
Member

@hramezani When would you be able to approve the PR?

@aqeelat we need to document how to do the migration. You already provided the required step for that and we have an issue.
My initial plan was to work on documentation for this PR but I am too busy :(
It would be great if you add the documentation if you have time, them we are good to go

@aqeelat
Copy link
Member Author

aqeelat commented Aug 10, 2023

@hramezani I'll get something out this weekend ISA.

@hramezani
Copy link
Member

@hramezani I'll get something out this weekend ISA.

Great! Please add the documentation to this PR

@aqeelat
Copy link
Member Author

aqeelat commented Aug 10, 2023

@hramezani pushed

hramezani
hramezani previously approved these changes Aug 11, 2023
log_entry.refresh_from_db()

# Assert
msg = "Not yet implemented. Run this management commad without passing a -d/--database argument."
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
msg = "Not yet implemented. Run this management commad without passing a -d/--database argument."
msg = "Not yet implemented. Run this management command without passing a -d/--database argument."

@hramezani hramezani merged commit 134ef73 into jazzband:master Aug 13, 2023
6 checks passed
@aqeelat aqeelat deleted the migration_workaround branch August 13, 2023 11:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants