Skip to content

Conversation

MMcM
Copy link
Collaborator

@MMcM MMcM commented Sep 17, 2025

… using the same serialization key manager as record serialization.

@MMcM MMcM added the enhancement New feature or request label Sep 17, 2025
@MMcM MMcM force-pushed the lucene-encrypted-serialization branch 2 times, most recently from 122976b to c94febd Compare September 24, 2025 14:49
@MMcM MMcM marked this pull request as ready for review September 24, 2025 14:50
@MMcM MMcM force-pushed the lucene-encrypted-serialization branch from c94febd to a51cfbc Compare September 25, 2025 00:02
Copy link
Collaborator

@ScottDugas ScottDugas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't get to really think about whether the encryption code is doing the right thing. It looks pretty similar to the record encryption though.

return encrypted;
public void setKeyNumber(final int keyNumber) {
this.keyNumber = keyNumber;
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does look like TransformedRecordSerializerState is a super set of this class. Would it make sense to move this out, and have TransformedRecordSerializerState have one of these.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Refactored to share the base class with just the intended / found format.

@Nonnull byte[] encodedData, int prefixLength) {
final byte[] encoded = new byte[originalData.length + prefixLength];
System.arraycopy(encodedData, 0, encoded, 0, prefixLength);
encoded[0] &= ~ENCODING_COMPRESSED;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this guaranteed to be in byte[0] (i.e. what if prefixLength>1)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I believe so. Varints are 7-bit little-endian, with a "continuation" bit to know when to stop. So, the compressed flag bit, which is in the low bits, is always in the first byte, no matter how many higher bits there are in the following bytes.

void encrypted(long seed) throws Exception {
final RollingTestKeyManager rollingTestKeyManager = new RollingTestKeyManager(seed);
keyManager = rollingTestKeyManager;
// Save in many transactions to get more segments and so more blocks.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably also a good idea to test having a single segment with multiple blocks

Copy link
Collaborator Author

@MMcM MMcM Sep 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think these documents write on the order of 9 blocks already. I see that we don't actually have a metric for that, so I added one. It's hard to assert about that, though, because the commit method in the test fixture clears the timer and that's when the work gets done.


@ParameterizedTest
@RandomSeedSource
void encrypted(long seed) throws Exception {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps worth having some of the tests in LuceneIndexMaintenanceTest chose to randomly encrypt and-or compress.

Copy link
Collaborator Author

@MMcM MMcM Sep 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added these serialization states as an argument to the existing tests that took a bunch of arguments.

try (FDBRecordContext context = openContext()) {
openRecordStore(context);
recordStore.saveRecord(doc);
commit(context);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be merging in-line when you commit. Would it be worthwhile to test merging separately? Perhaps that's best covered by LuceneIndexMaintenanceTest.

@MMcM MMcM force-pushed the lucene-encrypted-serialization branch from a51cfbc to 92ebe52 Compare September 26, 2025 14:45
}

public void setKeyNumber(final int keyNumber) {
this.keyNumber = keyNumber;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is 0 a special key number (if the setter is not called and it remains uninitialized)?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not particularly at this level; if encryption is not enabled, it does remain unset, but should also be unused. Did you have something in mind to make this clearer?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought maybe making the class immutable (values in constructor) or initializing the field to some illegal value to protect ourselves from a bug where it is not initialized.


@Nullable
public static byte[] encode(@Nullable byte[] data, boolean compress, boolean encrypt) {
public byte[] encode(@Nullable byte[] data, boolean compress, boolean encrypt) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense to move the compress and encrypt params to be fields of the class instead?

Copy link
Collaborator Author

@MMcM MMcM Sep 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I moved compressionEnabled and encryptionEnabled from FDBDirectory into the serializer.

MMcM and others added 4 commits September 29, 2025 08:59
…ecord/lucene/directory/LuceneSerializer.java

Co-authored-by: ohadzeliger <70664918+ohadzeliger@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants