
State @ Aug 25 #1

Open · 4 tasks

alanshaw opened this issue Aug 25, 2023 · 6 comments
Comments

@alanshaw (Member) commented Aug 25, 2023

Background

  • Deployed as sha256it with SST in AWS
  • Using the staging environment for hashing/copying from dotstorage-prod-0
  • Using the production environment for hashing/copying from dotstorage-prod-1
  • See copyFunctionURL and hashFunctionURL in the stage outputs for current lambda URLs

How to

  • Get keys
    • Use ls CLI command (in packages/cli)
  • Generate hashes
    • Use hash CLI command (in packages/cli)
    • e.g. cat keys-dotstorage-prod-1-d.ndjson | sha256it hash > hashes-dotstorage-prod-1-d.ndjson
  • Copy CARs
    • Use copy CLI command (in packages/cli)
    • e.g. cat hashes-dotstorage-prod-1-a.ndjson | sha256it copy > copies-dotstorage-prod-1-a.ndjson

Note: all CLI commands output ndjson.
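
Since every command emits ndjson, downstream scripts can stream the output line by line. A minimal sketch in Python; the field names (key, hash) are assumptions for illustration, not the actual sha256it output shape:

```python
import json

def parse_ndjson(text):
    """Parse ndjson text into a list of dicts, skipping blank lines."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]

# Hypothetical sample of what a hashes-*.ndjson file might contain.
sample = "\n".join([
    '{"key": "raw/bafy.../0.car", "hash": "sha256-abc"}',
    '{"key": "raw/bafy.../1.car", "hash": "sha256-def"}',
])

records = parse_ndjson(sample)
```

The same parser works for the key listings and the copy reports, since they share the one-JSON-object-per-line convention.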

Resources

Complete

  1. Key listings for dotstorage-prod-0 AND dotstorage-prod-1
    • Note: Keys for dotstorage-prod-1 are split into 5-million-line chunks, a-g
  2. Hashes for dotstorage-prod-0
  3. Hashes a, b & c for dotstorage-prod-1
  4. Copies for dotstorage-prod-0
  5. Copies a for dotstorage-prod-1

TODO

@alanshaw (Member, Author)

@olizilla if you can pick some of this up while I'm OOO that would be amazing! If not, then no worries I will finish on my return.

@olizilla (Contributor)

@alanshaw the resource links in sha256it progress report above for Keys & Hashes link to bafytodo! Do you have real CIDs for those?

@alanshaw (Member, Author)

😩 I didn't get round to uploading them before I left.

Steps 4/5 are still doable and would be useful progress if nothing else...

@olizilla (Contributor)

No worries. You mean todos 3 & 4, right?

  1. Script to verify data was copied. A HEAD request would suffice. I am confident that if the object is present and non-zero in size, then it is there in its entirety, and it is consistency-verified because the copy operation used ChecksumSHA256.
  2. Script to update DynamoDB carpaths with CAR CID. We should be able to walk the DB, find a carpath with complete/, lookup the CAR CID in the output files above and update it.
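
The decision in step 1 is simple once the HEAD metadata is in hand. A sketch of just that check, assuming a boto3-style head_object response dict (ContentLength field), with the AWS call itself left out:

```python
def copy_verified(head_meta):
    """Decide whether a copied CAR is verified, given HEAD metadata.

    head_meta is assumed to be a boto3-style head_object response dict,
    or None if the HEAD returned 404. Per the reasoning above, present
    and non-zero in size is enough: the copy operation itself used
    ChecksumSHA256, so content integrity was already checked.
    """
    if head_meta is None:  # object missing
        return False
    return head_meta.get("ContentLength", 0) > 0
```

In practice head_meta would come from s3.head_object(Bucket=..., Key=...), treating a 404 ClientError as None.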

@alanshaw (Member, Author)

Yes, FML 🤦‍♂️

@olizilla (Contributor) commented Sep 1, 2023

> Script to update DynamoDB carpaths with CAR CID. We should be able to walk the DB, find a carpath with complete/, lookup the CAR CID in the output files above and update it.

I think it's faster to use the input files as the source rather than a full DB scan... there are ~43 billion rows in the target DynamoDB table. If we use the input files we can locally reduce the job down to just the set of inserts that need to happen and divide the work up between us. Something like:

  • filter input files where carpath starts with /complete
  • cli to read files and write updates to a queue in batches (maybe 25, which is the max DynamoDB batch write size)
  • consumer to read from queue and update db (1 message of 25 = 1 batch write op)

we can share the files around and each run the cli over a subset of the inputs to fill the queue, and we can tweak the queue subscriber concurrency to make the writes faster.

or we could have the cli just write to dynamo. I wonder if there is much speed difference between writing to SQS vs writing to dynamo.
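
The local filter-and-batch steps above can be sketched as follows. The field names (carpath, car) and the complete/ prefix are assumptions about the input file shape, and the actual write (SQS send or DynamoDB BatchWriteItem) is left as a stub:

```python
import json

BATCH_SIZE = 25  # DynamoDB BatchWriteItem accepts at most 25 items per call

def complete_updates(lines):
    """Yield parsed ndjson records whose carpath starts with 'complete/'."""
    for line in lines:
        if not line.strip():
            continue
        rec = json.loads(line)
        if rec.get("carpath", "").startswith("complete/"):
            yield rec

def batches(items, size=BATCH_SIZE):
    """Group an iterable into lists of at most `size` items."""
    batch = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch
```

Each emitted batch would then become one SQS message (or one BatchWriteItem call, if the CLI writes to DynamoDB directly).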
