Performance issues #39

Open
nmeylan opened this issue Jul 13, 2024 · 1 comment

Comments

@nmeylan
Owner

nmeylan commented Jul 13, 2024

On large JSON files, the following features are slow:

  • Insert new row: the data structure is keyed by JSON pointer, so inserting a new row means iterating over every entry after the inserted row and updating its JSON pointer, because its position has changed.
{"skills": [{"copyFlags": {"reproduce": true, "plagiarism": false}}]}

/skills/0 -> {"copyFlags": {"reproduce": true, "plagiarism": false}} 
/skills/0/copyFlags -> {"reproduce": true, "plagiarism": false}
/skills/0/copyFlags/reproduce -> true
/skills/0/copyFlags/plagiarism -> false

After inserting a new row above row 0, the old row 0 and every row after it must be updated like this:

/skills/1 -> {"copyFlags": {"reproduce": true, "plagiarism": false}} 
/skills/1/copyFlags -> {"reproduce": true, "plagiarism": false}
/skills/1/copyFlags/reproduce -> true
/skills/1/copyFlags/plagiarism -> false
  • Replace in column when the criteria match too many rows: because multiple views can be opened on the same row at different depths of the JSON object, we have to serialize the updated row and then parse it again:
    • we serialize it so that the root object of the row can be viewed in "Object table"
    • we then parse it again to update the serialized entries of the nested objects
{"skills": [{"copyFlags": {"reproduce": true, "plagiarism": false}}]}

/skills/0 -> {"copyFlags": {"reproduce": true, "plagiarism": false}}  <- we keep serialized object for depth 3 view 
/skills/0/copyFlags -> {"reproduce": true, "plagiarism": false}  <- we keep serialized object for depth 4 view
/skills/0/copyFlags/reproduce -> true
/skills/0/copyFlags/plagiarism -> false

If we update plagiarism to true:
/skills/0/copyFlags/plagiarism -> true

we also need to update:
/skills/0 <- requires serialization of the root object
/skills/0/copyFlags
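
The two costs described above can be reproduced with a small sketch. This is not the project's actual code: `flatten`, `insert_row`, `replace_value` and the `row_depth` parameter are hypothetical names, the store is assumed to be a plain dict from JSON pointer to value, and pointer escaping (`~0`, `~1`) is ignored.

```python
import json

def flatten(value, pointer, out):
    """Store pointer -> value; objects and arrays are stored as serialized text."""
    if isinstance(value, (dict, list)):
        out[pointer] = json.dumps(value)
        items = value.items() if isinstance(value, dict) else enumerate(value)
        for key, child in items:
            flatten(child, f"{pointer}/{key}", out)
    else:
        out[pointer] = value
    return out

def insert_row(store, array_pointer, index, row):
    """Insert `row` at `index`; every entry at or after it gets a new pointer."""
    prefix = array_pointer + "/"
    shifted = {}
    for pointer, value in store.items():  # full scan: this is the slow part
        if pointer.startswith(prefix):
            head, _, rest = pointer[len(prefix):].partition("/")
            if head.isdigit() and int(head) >= index:
                tail = f"/{rest}" if rest else ""
                pointer = f"{prefix}{int(head) + 1}{tail}"
        shifted[pointer] = value
    flatten(row, f"{prefix}{index}", shifted)
    return shifted

def replace_value(store, pointer, new_value, row_depth=2):
    """Set a leaf, then parse, patch and re-serialize each ancestor up to the row root.

    `row_depth` is the number of segments in a row pointer ("/skills/0" has 2).
    """
    store[pointer] = new_value
    parts = pointer.split("/")  # leading "" comes from the first "/"
    for end in range(len(parts) - 1, row_depth, -1):
        ancestor = "/".join(parts[:end])
        obj = json.loads(store[ancestor])
        target = obj
        for key in parts[end:-1]:
            target = target[int(key) if isinstance(target, list) else key]
        last = parts[-1]
        target[int(last) if isinstance(target, list) else last] = new_value
        store[ancestor] = json.dumps(obj)

doc = {"skills": [{"copyFlags": {"reproduce": True, "plagiarism": False}}]}
store = {}
for i, row in enumerate(doc["skills"]):
    flatten(row, f"/skills/{i}", store)
store = insert_row(store, "/skills", 0, {"copyFlags": {"reproduce": False, "plagiarism": False}})
replace_value(store, "/skills/1/copyFlags/plagiarism", True)
```

In this sketch `insert_row` scans the whole store for a single insertion, and `replace_value` does one parse and one serialization per ancestor, which is why both get expensive when applied to hundreds of thousands of rows.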
@nmeylan
Owner Author

nmeylan commented Jul 14, 2024

Mitigated the slow replace. Measured on a 350 MB JSON file with 310k rows, replacing the values of all 310k rows:

  • Improved serialization: from 6.5 s to 5 s
  • Enabled parallel replacement (using 8 cores): from 5 s to 1.2 s
    It is still slow, but it is starting to be usable
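
The parallel replacement can be pictured roughly as below. This is only an illustration of the chunk-per-worker idea, not the project's implementation: `replace_in_chunk`, `parallel_replace` and their parameters are hypothetical, and rows are assumed to be serialized JSON objects held in a list.

```python
import json
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor

def replace_in_chunk(args):
    """Replace `old` with `new` under `key` in each serialized row of one chunk."""
    chunk, key, old, new = args
    out = []
    for text in chunk:
        row = json.loads(text)
        if row.get(key) == old:
            row[key] = new
            text = json.dumps(row)  # re-serialize only the rows that changed
        out.append(text)
    return out

def parallel_replace(rows, key, old, new, workers=8,
                     executor_cls=ProcessPoolExecutor):
    """Split rows into contiguous chunks, one per worker, and replace in parallel.

    Pass executor_cls=ThreadPoolExecutor to run without spawning processes
    (useful for testing; threads do not speed up CPU-bound work in CPython).
    """
    size = max(1, -(-len(rows) // workers))  # ceiling division
    chunks = [(rows[i:i + size], key, old, new)
              for i in range(0, len(rows), size)]
    with executor_cls(max_workers=workers) as pool:
        results = pool.map(replace_in_chunk, chunks)  # map preserves chunk order
        return [text for chunk in results for text in chunk]
```

Each row is independent of the others, so the work splits cleanly across cores. With the default process pool, call `parallel_replace` from under an `if __name__ == "__main__":` guard.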
