-
Notifications
You must be signed in to change notification settings - Fork 244
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Ensure chunked TOC and tar-split metadata are consistent #2035
Conversation
// This is horrible, but we don’t know how much padding to skip. (It can be computed from the previous hdr.Size for non-sparse | ||
// files, but for sparse files that is set to the logical size.) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a pretty horrible hack.
It should be possible for tar-split/archive/tar
to expose the expected padding size in Header
.
(Alternatively, the OCI spec suggests that sparse files should not be used, but I don’t expect that can be relied upon.)
The test failure is on unrealistic data in a different test in this package, I’ll update that tomorrow. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only a relatively superficial skim but generally LGTM.
pendingFiles := map[string]*internal.FileMetadata{} // Name -> an entry in toc.Entries | ||
for i := range toc.Entries { | ||
e := &toc.Entries[i] | ||
if e.Type != internal.TypeChunk { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very minor but I paused on this for a bit, uncertain; perhaps:
// Chunks are just part of files, they won't appear explicitly
// in the tar stream, so we don't validate them.
if e.Type == internal.TypeChunk {
continue
}
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this would be best documented somewhere around pkg/chunked/internal/compression.go
, this is “just” straightforwardly consuming the structure as designed.
#1939 did add the basic documentation of TypeChunk
, although the full semantics of Offset
/ChunkOffset
etc. is not written down anywhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to be a good citizen, I have added a commit documenting the format (to the extent I can reverse-engineer it from the code) to the documentation of internal.FileMetadata
.
|
||
// iterateTarSplit calls handler for each tar header in tarSplit | ||
func iterateTarSplit(tarSplit []byte, handler func(hdr *tar.Header) error) error { | ||
// This, strictly speaking, hard-codes undocumented assumptions about how github.com/vbatts/tar-split/tar/asm.NewInputTarStream |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In theory this API could land as a PR to that repo?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That would certainly be better. The timing will probably force us to carry this in c/storage at for a few days/weeks, but I should definitely prepare a tar-split PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have recorded the task to file a tar-split PR as an item on containers/image#2189 .
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Filed as vbatts/tar-split#71 .
pkg/chunked/tar_split_linux_test.go
Outdated
"github.com/vbatts/tar-split/tar/storage" | ||
) | ||
|
||
func testTarheader(index int, typeFlag byte, size int64) tar.Header { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The name made me thing this is itself a test, how about createTestTarheader
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, fixed.
983b14b
to
886c158
Compare
28222ae
to
85d7be5
Compare
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Podman tests passed in containers/podman#23307 . |
/lgtm |
In addition to the existing use when creating a TOC from tar data, we will also need it when parsing TOC and tar-split data. Should not change behavior. Signed-off-by: Miloslav Trmač <mitr@redhat.com>
We are going to be checking its consistency with the TOC. Signed-off-by: Miloslav Trmač <mitr@redhat.com>
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: cgwalters, giuseppe, mtrmac The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
This should resolve #2014. Basically untested, filing now for early design review.
We simply enforce that the TOC and tar-split have exactly the same contents. To the extent the zstd:chunked format is only being produced by this package, we can expect an equal match (right now? ), but that may become harder as the format evolves (e.g. the recent
timeIfNotZero
change).@giuseppe PTAL. Cc: @cgwalters .