State @ Aug 25 #1
Comments
@olizilla if you can pick some of this up while I'm OOO that would be amazing! If not, then no worries, I will finish on my return.
@alanshaw the resource links in the sha256it progress report above for Keys & Hashes link to
😩 I didn't get round to uploading them before I left. Step 4/5 are still doable and would be useful progress if nothing else...
No worries. You mean todos 3 & 4, right?
Yes, FML 🤦‍♂️
I think it's faster to use the input files as the source rather than a full db scan... there are ~43 billion rows in the target dynamo table. If we use the input files we can locally reduce the job down to just the set of inserts that need to happen and divide the work up between us... something like
We can share the files around and each run the cli over a subset of the inputs to fill the queue, and we can tweak the queue subscriber concurrency to make the writes faster. Or we could have the cli just write to dynamo. I wonder if there is much speed difference between writing to SQS vs writing to dynamo.
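The "reduce locally, then divide the work" idea above could be sketched roughly as follows. This is only an illustration: it assumes that "reducing" means dropping byte-identical duplicate lines across the input files, and the function name `reduce_and_divide`, the output prefix, and the worker count are all hypothetical, not part of sha256it.

```shell
# Sketch only: dedupe the input ndjson files, then deal the remaining lines
# out round-robin into one file per person.
# Assumption: identical lines are the only redundancy worth removing.
reduce_and_divide() {
  workers=$1   # how many people share the job
  prefix=$2    # output files are named <prefix>-<n>.ndjson
  shift 2      # the remaining arguments are the input files
  LC_ALL=C sort -u "$@" |
    awk -v n="$workers" -v p="$prefix" '{ print > (p "-" (NR % n) ".ndjson") }'
}

# Hypothetical usage, one subset per person:
#   reduce_and_divide 3 work inputs-a.ndjson inputs-b.ndjson
```

Each person could then pipe their own subset into the CLI, so the queue (or dynamo) only ever sees the inserts that still need to happen.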
Background
- `dotstorage-prod-0`
- `dotstorage-prod-1`
- `copyFunctionURL` and `hashFunctionURL` in the stage outputs for current lambda URLs
How to
- `ls` CLI command (in `packages/cli`)
- `hash` CLI command (in `packages/cli`)
  `cat keys-dotstorage-prod-1-d.ndjson | sha256it hash > hashes-dotstorage-prod-1-d.ndjson`
- `copy` CLI command (in `packages/cli`)
  `cat hashes-dotstorage-prod-1-a.ndjson | sha256it copy > copies-dotstorage-prod-1-a.ndjson`
Note: all CLI commands output ndjson.
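Since the keys for `dotstorage-prod-1` come in several chunks, the `hash` step can be looped over whichever chunks are present locally. The sketch below is an assumption-laden convenience wrapper, not part of the CLI: the function name `hash_chunks` and the `HASH_CMD` override are hypothetical, and it relies only on `sha256it hash` reading stdin and writing ndjson to stdout as in the commands above.

```shell
# Sketch only: run the hash step over every keys chunk found in a directory,
# skipping chunks that already have a non-empty hashes file.
# HASH_CMD is a hypothetical override (e.g. for a dry run); it defaults to
# the `sha256it hash` invocation shown above.
hash_chunks() {
  dir=$1
  for chunk in a b c d e f g; do
    src="$dir/keys-dotstorage-prod-1-$chunk.ndjson"
    dst="$dir/hashes-dotstorage-prod-1-$chunk.ndjson"
    [ -f "$src" ] || continue   # chunk not available locally
    [ -s "$dst" ] && continue   # chunk already hashed
    # Write to a temp file first so a failed run never leaves a partial output.
    if ${HASH_CMD:-sha256it hash} < "$src" > "$dst.tmp"; then
      mv "$dst.tmp" "$dst"
    else
      rm -f "$dst.tmp"
      echo "hash failed for chunk $chunk" >&2
      return 1
    fi
  done
  return 0
}
```

Running `hash_chunks .` in the directory holding the keys files would then pick up where a previous run stopped. The same pattern would apply to the `copy` step with the file names swapped.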
Resources
Complete
- `dotstorage-prod-0` AND `dotstorage-prod-1`
- `dotstorage-prod-1` are separated in 5 million line chunks (-a to -g)
- `dotstorage-prod-0`
- `a`, `b` & `c` for `dotstorage-prod-1`
- `dotstorage-prod-0`
- `a` for `dotstorage-prod-1`
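For reference, chunks named like the ones above can be produced with the standard `split` utility. How the existing chunks were actually made is not stated here, so treat this as one possible approach: the function name `split_keys` and the unsplit input file name are hypothetical.

```shell
# Sketch only: cut one large ndjson file into fixed-size line chunks named
# <prefix>a.ndjson, <prefix>b.ndjson, ...
# Single-letter suffixes allow at most 26 chunks.
split_keys() {
  src=$1                 # the large, unsplit ndjson file (hypothetical name)
  prefix=$2              # e.g. keys-dotstorage-prod-1-
  lines=${3:-5000000}    # lines per chunk, defaulting to 5 million
  split -l "$lines" -a 1 "$src" "$prefix" || return 1
  # split emits <prefix>a, <prefix>b, ...; add the .ndjson extension.
  for part in "$prefix"?; do
    mv "$part" "$part.ndjson"
  done
}

# Hypothetical usage:
#   split_keys keys-dotstorage-prod-1.ndjson keys-dotstorage-prod-1-
```

Because `split -l` only cuts on line boundaries, every chunk stays valid ndjson.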
TODO