Skip to content

Conversation

@annrpom
Copy link
Contributor

@annrpom annrpom commented Dec 2, 2025

This patch uses bytes.Compare to compare key versions instead of ComparePointSuffixes; the latter calls on
normalizeEngineSuffixForCompare - which could end up stripping additional bytes from the suffix (comparing only a prefix of the suffix) if the last byte of the version happens to match one of the sentinel lengths defined.

Fixes: #158463

Radu's analysis: cockroachdb/cockroach#158463 (comment)

@annrpom annrpom requested a review from a team as a code owner December 2, 2025 18:41
@annrpom annrpom requested a review from RaduBerinde December 2, 2025 18:41
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@annrpom annrpom marked this pull request as draft December 2, 2025 18:45
@annrpom annrpom marked this pull request as ready for review December 2, 2025 19:03
Copy link
Member

@RaduBerinde RaduBerinde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for fixing this! We need a test case in TestKeySchema_KeySeeker.

Reviewable status: 0 of 1 files reviewed, 2 unresolved discussions (waiting on @annrpom)


cockroachkvs/cockroachkvs.go line 792 at r1 (raw file):

	// a non-MVCC key with an untyped version.
	if firstRowWall == 0 && firstLogical == 0 {
		if len(ks.untypedVersions.At(0)) == 0 || len(version) == 0 {

[nit] I'd put the At() result in a variable


cockroachkvs/cockroachkvs.go line 794 at r1 (raw file):

		if len(ks.untypedVersions.At(0)) == 0 || len(version) == 0 {
			// Empty suffixes sort before non-empty suffixes.
			return cmp.Compare(len(ks.untypedVersions.At(0)), len(version)) >= 0

return len(version) <= len(..)

Copy link
Contributor Author

@annrpom annrpom left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Reviewable status: 0 of 1 files reviewed, all discussions resolved (waiting on @RaduBerinde)


cockroachkvs/cockroachkvs.go line 792 at r1 (raw file):

Previously, RaduBerinde wrote…

[nit] I'd put the At() result in a variable

done


cockroachkvs/cockroachkvs.go line 794 at r1 (raw file):

Previously, RaduBerinde wrote…

return len(version) <= len(..)

done

@annrpom annrpom requested a review from RaduBerinde December 3, 2025 03:24
Copy link
Member

@RaduBerinde RaduBerinde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

@jbowens PTAL as well, we will need to backport this.

Reviewable status: 0 of 2 files reviewed, 2 unresolved discussions (waiting on @annrpom)


cockroachkvs/cockroachkvs.go line 794 at r1 (raw file):

Previously, annrpom (annie pompa) wrote…

done

It might be easier to reason about if we did this (your call):

// Empty suffixes sort before non-empty suffixes.
switch {
case len(version) == 0:
  // This includes the case where both versions are empty.
  return true
case len(firstRowUntypedVer) == 0:
  return false
default:
  return bytes.Compare()
}

cockroachkvs/testdata/key_schema_key_seeker line 245 at r2 (raw file):

IsLowerBound(moo @ 02073a83c45688420eaf97824255790f1e12, "") = true

# Regression for test for cockroach/#158463; previously, we would incorrectly

Let's add a case where the first key in the block doesn't have a suffix.


cockroachkvs/testdata/key_schema_key_seeker line 270 at r2 (raw file):

moo @ 0000000000,1
----
IsLowerBound(moo @ 9000000000,2, "") = false

We should add a case where the key doesn't have a suffix, and also a case with a suffix that is expected to return true.

Copy link
Contributor Author

@annrpom annrpom left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 2 files reviewed, 1 unresolved discussion (waiting on @RaduBerinde)


cockroachkvs/cockroachkvs.go line 794 at r1 (raw file):

Previously, RaduBerinde wrote…

It might be easier to reason about if we did this (your call):

// Empty suffixes sort before non-empty suffixes.
switch {
case len(version) == 0:
  // This includes the case where both versions are empty.
  return true
case len(firstRowUntypedVer) == 0:
  return false
default:
  return bytes.Compare()
}

ah i like - done


cockroachkvs/testdata/key_schema_key_seeker line 245 at r2 (raw file):

Previously, RaduBerinde wrote…

Let's add a case where the first key in the block doesn't have a suffix.

we have one https://github.com/annrpom/pebble/blob/e119ca2263953ce04969bbf857b5c458bf6a0e36/cockroachkvs/testdata/key_schema_key_seeker#L173-L188


cockroachkvs/testdata/key_schema_key_seeker line 270 at r2 (raw file):

also a case with a suffix that is expected to return true.
done

Copy link
Member

@RaduBerinde RaduBerinde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@RaduBerinde
Copy link
Member

cockroachkvs/cockroachkvs.go line 838 at r3 (raw file):

			firstRowSuffix = buf[:13]
		}
		return ComparePointSuffixes(firstRowSuffix, version) >= 0

Claude Code found another issue here: we're calling this with version which is not a suffix. We can just replace with bytes.Compare(version, firstRowSuffix[:len(firstRowSuffix)-1]) >= 0 since neither is empty.

This patch uses bytes.Compare to compare key versions instead of
ComparePointSuffixes; the latter calls on
`normalizeEngineSuffixForCompare` - which could end up strip additional
bytes from the `suffix` (comparing only a prefix of the suffix) if the
last byte of the version happens to match one of the sentinel lengths
defined.

Fixes: #158463
Copy link
Contributor Author

@annrpom annrpom left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 2 files reviewed, 1 unresolved discussion (waiting on @RaduBerinde)


cockroachkvs/cockroachkvs.go line 838 at r3 (raw file):

Previously, RaduBerinde wrote…

Claude Code found another issue here: we're calling this with version which is not a suffix. We can just replace with bytes.Compare(version, firstRowSuffix[:len(firstRowSuffix)-1]) >= 0 since neither is empty.

done

Copy link
Collaborator

@sumeerbhola sumeerbhola left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 2 files reviewed, 5 unresolved discussions (waiting on @annrpom and @RaduBerinde)


cockroachkvs/cockroachkvs.go line 773 at r4 (raw file):

// IsLowerBound is part of the KeySeeker interface.
func (ks *cockroachKeySeeker) IsLowerBound(k []byte, syntheticSuffix []byte) bool {

Is this only for point keys?
If for both point and range keys, I'll go read all the stuff about why we call normalizeEngineSuffixForCompare only in some cases.


cockroachkvs/cockroachkvs.go line 780 at r4 (raw file):

	if v := bytes.Compare(ks.roachKeys.UnsafeFirstSlice(), roachKey); v != 0 {
		return v > 0
	}

It would be clearer if we had something like
// Reminder, DecodeEngineKey excludes the length byte at the end, and roachKey excludes the sentinel byte.
// The suffix below would be the result of calling k[Split(k):], now that we know it is a valid engine key.
suffix := k[len(roachKey)+1:]

And we should document what DecodeEngineKey returns in its declaration.


cockroachkvs/cockroachkvs.go line 784 at r4 (raw file):

	// compare the key's suffix to the synthetic suffix.
	if len(syntheticSuffix) > 0 {
		return ComparePointSuffixes(syntheticSuffix, k[len(roachKey)+1:]) >= 0

and then changed this to use suffix.


cockroachkvs/cockroachkvs.go line 791 at r4 (raw file):

	// is either (a) unversioned (and sorts before all other suffixes) or (b) is
	// a non-MVCC key with an untyped version.
	if firstRowWall == 0 && firstLogical == 0 {

why are we not simply doing

return ComparePointSuffixes(ks.untypedVersions.At(0), suffix) >= 0

Copy link
Collaborator

@sumeerbhola sumeerbhola left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sumeerbhola reviewed 1 of 2 files at r4.
Reviewable status: 1 of 2 files reviewed, 5 unresolved discussions (waiting on @annrpom and @RaduBerinde)


cockroachkvs/cockroachkvs.go line 791 at r4 (raw file):

Previously, sumeerbhola wrote…

why are we not simply doing

return ComparePointSuffixes(ks.untypedVersions.At(0), suffix) >= 0

Oh, I see cockroachdb/cockroach#158463 (comment).

And since this is an untyped version, we don't need to normalize it. We should add a comment where normalizeEngineSuffixForCompare is declared that it is a noop for a non-MVCC suffix.

But suffix could be a MVCC suffix so shouldn't we normalize it, just to be consistent with what we do in ComparePointSuffixes which allows comparing a MVCC suffix with a non-MVCC suffix and would normalize the former?
So we would do normalizedVersion := normalizeEngineSuffixForCompare(suffix) and use that below?

Copy link
Member

@RaduBerinde RaduBerinde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 1 of 2 files reviewed, 5 unresolved discussions (waiting on @annrpom and @sumeerbhola)


cockroachkvs/cockroachkvs.go line 791 at r4 (raw file):

Previously, sumeerbhola wrote…

Oh, I see cockroachdb/cockroach#158463 (comment).

And since this is an untyped version, we don't need to normalize it. We should add a comment where normalizeEngineSuffixForCompare is declared that it is a noop for a non-MVCC suffix.

But suffix could be a MVCC suffix so shouldn't we normalize it, just to be consistent with what we do in ComparePointSuffixes which allows comparing a MVCC suffix with a non-MVCC suffix and would normalize the former?
So we would do normalizedVersion := normalizeEngineSuffixForCompare(suffix) and use that below?

You're right. I think we should separate out a normalizeVersionForCompare and have a compareVersions helper.

@RaduBerinde
Copy link
Member

cockroachkvs/cockroachkvs.go line 791 at r4 (raw file):

Previously, RaduBerinde wrote…

You're right. I think we should separate out a normalizeVersionForCompare and have a compareVersions helper.

Feels like all these paths can be tested with a randomized test. We can pair every suffix with every prefix in TestCompare to build a set of keys; for each key, build a block with that key, call IsLowerBound against every key and cross-check against Compare.

Copy link
Member

@RaduBerinde RaduBerinde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 1 of 2 files reviewed, 5 unresolved discussions (waiting on @annrpom and @sumeerbhola)


cockroachkvs/cockroachkvs.go line 791 at r4 (raw file):

Previously, RaduBerinde wrote…

Feels like all these paths can be tested with a randomized test. We can pair every suffix with every prefix in TestCompare to build a set of keys; for each key, build a block with that key, call IsLowerBound against every key and cross-check against Compare.

Something like this, feel free to take this commit and put it in this PR: https://github.com/cockroachdb/pebble/compare/master...RaduBerinde:cockroachkvs-lower-bound-test?expand=1

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

storage/metamorphic: TestPebbleEquivalenceNightly failed

4 participants