Elligator2 Forward and Reverse Mappings #612

Open
wants to merge 3 commits into
base: main
from

Conversation

jmwample

@jmwample jmwample commented Jan 5, 2024

Hello,

This is an implementation of the Elligator2 forward and reverse mappings: points to representative values, and representative values to points. The specific goal is to make Elligator mapping for x25519 handshakes possible in pure Rust.

This implementation is gated behind a feature flag "elligator2" in the curve25519_dalek crate for point transforms. I have tested this against the test vectors from the Kleshni/Elligator-2 C implementation as well as (a fork of) the agl/ed25519/extra25519 golang implementation. These can be seen in the test cases for this PR.

I also added a feature flag "elligator2" to the x25519_dalek crate that exposes a PublicRepresentative type mirroring the PublicKey type, so that only Elligator representatives are exchanged when performing a Diffie-Hellman handshake.

This implementation of the Elligator2 transforms (to the best of my ability):

  • runs in constant time
  • is equivalent to agl/ed25519/extra25519
  • is equivalent to Kleshni/Elligator-2
  • supports existing backends (I have tested with u64 and u32, I am not sure how to test others.)

I have seen a couple of issues and PRs dealing with elligator2 mappings and this library, so I apologize if this is muddying the waters. This is a feature that I really needed, so I have implemented the functionality based on several existing implementations in other languages.

I am also not an expert in crypto implementations and would greatly appreciate any feedback. I hope this is helpful and potentially solid enough to include in the library. 👍

Related to:

@rozbb
Contributor

rozbb commented Jan 6, 2024

Thank you! It will take a bit for me to take a look at this (trying to make some paper deadlines).

Re the IETF standard, hasn't that officially been standardized now? RFC 9380. Would be good to have test vectors from there.

@jmwample
Author

jmwample commented Jan 6, 2024

I will get those added. I haven't spent much time looking at that RFC yet, so I can try to work in the interface there as well (and get the rest of the tests passing).

@jmwample
Author

I have added the RFC9380 test vectors for the elligator2 implementation, and an interface that should work if someone wanted to use this to implement the h2c interface for curve25519 and/or edwards25519. Not sure if the full h2c implementation belongs in this crate or a different crate with a more general / uniform interface.

The CI checks should also be passing now.

@jmwample jmwample force-pushed the elligator2-ntor branch 3 times, most recently from a972ecf to fcee93b Compare February 13, 2024 19:54
@randombit
Contributor

Any chance of this landing soon?

@jmwample
Author

jmwample commented Apr 8, 2024

I am working on adding some final tests to ensure that the bits of the elligator2 representatives appear uniformly random.

TLDR: The sqrt_ratio_i function seems to be canonical, so this library shouldn't suffer from the computational distinguisher described below. However, deriving an elligator2 representative only gives 254 bits of randomness to begin with, so the high-order bits need to be handled in some way to prevent a trivial distinguisher.

The specific issue that this is testing for can be described as:

An instantiation of Elligator is parameterized by what might be called
a “canonical” square root function, one with the property that
`√a^2 = √(−a)^2` for all field elements `a`. That is, we designate just
over half the field elements as “non-negative,” and the image of the
square root function consists of exactly those elements. A convenient
definition of “non-negative” for Curve25519, suggested by its authors,
is the lower half of the field, the elements `{0, 1, …, (q − 1)/2}`.
When there are two options for a square root, take the smaller of the two.
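
As a toy illustration of the "canonical" square root described in the quote above, here is a minimal sketch over a small stand-in prime. The modulus `P = 109` and the brute-force search are assumptions made for readability (the quote writes the field size as `q`); the sketch is neither constant time nor how `sqrt_ratio_i` works.

```rust
// Toy prime standing in for 2^255 - 19 (assumption, for illustration only).
const P: u64 = 109;

/// Canonical square root: the root that lies in the "non-negative" half
/// {0, 1, ..., (P - 1) / 2}. Brute force, so NOT constant time.
fn canonical_sqrt(x: u64) -> Option<u64> {
    (0..=(P - 1) / 2).find(|&r| r * r % P == x % P)
}

fn main() {
    for a in 0..P {
        let neg_a = (P - a) % P;
        // sqrt(a^2) and sqrt((-a)^2) agree and land in the lower half.
        let r1 = canonical_sqrt(a * a % P).unwrap();
        let r2 = canonical_sqrt(neg_a * neg_a % P).unwrap();
        assert_eq!(r1, r2);
        assert!(r1 <= (P - 1) / 2);
    }
    println!("canonical sqrt property holds for all elements mod {}", P);
}
```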

Any Elligator implementation that does not canonicalize the final square root, and instead systematically maps a given input to either its negative or its non-negative root, is vulnerable to the following computational distinguisher.

[An adversary could] observe a representative, interpret it as a field
element, square it, then take the square root using the same
non-canonical square root algorithm. With representatives produced by
an affected version of [the elligator2 implementation], the output of
the square-then-root operation would always match the input. With
random strings, the output would match only half the time.
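
The square-then-root check in the quote can be sketched over a toy field. Everything named here is hypothetical: `P = 109` is a small prime with `P % 8 == 5` (the same residue class as 2^255 - 19), and `noncanonical_sqrt` is a deterministic root formula that returns whichever root it lands on, standing in for an affected implementation.

```rust
// Toy prime with P % 8 == 5 (assumption, for illustration only).
const P: u64 = 109;

fn pow_mod(mut base: u64, mut exp: u64) -> u64 {
    let mut acc = 1;
    base %= P;
    while exp > 0 {
        if exp & 1 == 1 {
            acc = acc * base % P;
        }
        base = base * base % P;
        exp >>= 1;
    }
    acc
}

/// Deterministic but NON-canonical square root for P % 8 == 5:
/// returns whichever root the exponentiation produces.
fn noncanonical_sqrt(x: u64) -> Option<u64> {
    let x = x % P;
    let candidate = pow_mod(x, (P + 3) / 8);
    if candidate * candidate % P == x {
        return Some(candidate);
    }
    // 2 is a non-residue when P % 8 == 5, so this is a square root of -1.
    let sqrt_m1 = pow_mod(2, (P - 1) / 4);
    let other = candidate * sqrt_m1 % P;
    if other * other % P == x { Some(other) } else { None }
}

/// The adversary's test: square, take the root, compare with the input.
fn survives_square_then_root(r: u64) -> bool {
    noncanonical_sqrt(r * r % P) == Some(r % P)
}

fn main() {
    // Values produced by the non-canonical root always pass the test...
    let reps: Vec<u64> = (0..P).filter_map(noncanonical_sqrt).collect();
    assert!(reps.iter().all(|&r| survives_square_then_root(r)));
    // ...while only about half of all field elements do.
    let hits = (0..P).filter(|&a| survives_square_then_root(a)).count();
    println!("{} of {} field elements pass", hits, P);
}
```

In this toy setting the test passes for every output of the non-canonical root and for roughly half of all field elements, which is the gap the distinguisher exploits.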

The solution from agl/ed25519 is to randomize the high two bits when getting the representative, and to clear those same high-order bits when performing a map_to_curve operation. One challenge is determining how this aligns with the test vectors from other implementations (kleshni and signal specifically; the rfc9380 tests seem to be handled properly).
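
A minimal sketch of that bit handling, assuming a 32-byte little-endian representative so that the two high-order bits live in the top of byte 31. The function names and the choice to take the bits from the top of a tweak byte are hypothetical and are not the PR's actual interface.

```rust
// Mask for the two high-order bits of a little-endian 32-byte value.
const HIGH_BITS_MASK: u8 = 0b1100_0000;

/// Overwrite the two high-order bits with bits taken from `tweak`
/// (hypothetical helper; the caller supplies a random tweak byte).
fn randomize_high_bits(mut repr: [u8; 32], tweak: u8) -> [u8; 32] {
    repr[31] = (repr[31] & !HIGH_BITS_MASK) | (tweak & HIGH_BITS_MASK);
    repr
}

/// Clear the two high-order bits before mapping back to the curve.
fn clear_high_bits(mut repr: [u8; 32]) -> [u8; 32] {
    repr[31] &= !HIGH_BITS_MASK;
    repr
}

fn main() {
    let mut base = [0u8; 32];
    base[0] = 0x07;
    base[31] = 0x2a; // high two bits already zero, as for a canonical root

    for tweak in [0x00u8, 0x40, 0x80, 0xc0, 0xff] {
        let on_wire = randomize_high_bits(base, tweak);
        // Clearing the bits recovers the original representative.
        assert_eq!(clear_high_bits(on_wire), base);
    }
    println!("round trip ok");
}
```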

For a more in-depth explanation see:

This should not impact the general interface of the PR, and I am hoping to have the changes finished within the week.

@jmwample
Author

The latest commit fixes several issues.

  1. The Edwards RFC9380 test cases were not actually testing what they were meant to test. This forced some changes to the structure of the map_to_point functions, as mapping to Montgomery and then to Edwards was missing a sign bit.

    • map_to_point for Edwards RFC9380 test cases now testing properly and passing
  2. The high order two bits of the representative are always 0 by default because correctly computed elligator2 representatives always finish with a sqrt() that takes the least-square-root value, that is, a value no greater than 2^254-10 (which fits in 254 bits).

    • In order for the representative to be (optionally) indistinguishable from random we use a tweak byte to provide the extra randomness, added in when representative is created, and cleared when converting back to a point.
    • Both Kleshni & signal contain test cases that include non-least-square-root values, which is not technically in line with the spec. In order to handle this (if interop is absolutely necessary) a map_to_point_unbounded() function is added that does not clear the high order bits before mapping to the curve.
    • A statistical test showing the effect that the tweak has on the apparent distribution of the bits over many representatives can be used to look at entropy based distinguishers (this does not necessarily help with computation based distinguishers).
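
One way such a statistical check could look is a per-bit frequency count over many encoded representatives. This is a sketch under assumptions, not the test in the PR: `bit_frequencies` is a hypothetical helper, and the samples below are fixed byte patterns rather than real representatives.

```rust
/// Fraction of samples in which each of the 256 bits is set
/// (index 0 is the least significant bit of byte 0, little-endian).
fn bit_frequencies(samples: &[[u8; 32]]) -> [f64; 256] {
    let mut counts = [0u32; 256];
    for sample in samples {
        for (i, byte) in sample.iter().enumerate() {
            for bit in 0..8 {
                if (*byte >> bit) & 1 == 1 {
                    counts[i * 8 + bit] += 1;
                }
            }
        }
    }
    let n = samples.len().max(1) as f64;
    let mut freqs = [0.0f64; 256];
    for (f, c) in freqs.iter_mut().zip(counts.iter()) {
        *f = f64::from(*c) / n;
    }
    freqs
}

fn main() {
    // Stand-in samples whose two high-order bits are always clear,
    // mimicking untweaked representatives.
    let mut sample = [0xffu8; 32];
    sample[31] = 0x3f;
    let freqs = bit_frequencies(&[sample, sample]);
    // Bits 254 and 255 never appear set: a trivial distinguisher.
    println!("bit 253: {}, bit 254: {}, bit 255: {}", freqs[253], freqs[254], freqs[255]);
}
```

With real data, frequencies near 0.5 for every bit are what uniform-looking output would show; a bit stuck at 0 or 1 stands out immediately.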

I have no other changes planned for this PR without review / input.

@jmwample
Author

jmwample commented Jul 9, 2024

I have added another refactor to the elligator2 implementation motivated by feedback based on issues encountered with encoded key distinguishability in obfs4. The changes required to fix the distinguisher resulted in two versions of the elligator2 algorithm which are not interchangeable.

More information on the issue can be found here; the solution added to the Randomized variant is described in Method 1: add a random low order point. The RFC9380 variant is the default and follows the RFC, including its test vectors.

A test variant exists for legacy implementations that do not use the least-square-root value of the representative (i.e. kleshni & signal), but it is not exposed by default.


For now I have published my fork as its own crate (see curve25519-elligator2), but my intention is to hopefully get this merged here and eventually yank the forked crate.

@rozbb
Contributor

rozbb commented Jul 22, 2024

Hi, thank you for this! I played around with this and had some notes:

  1. It seems like I can't get the Elligator2 tests to fail, even when they definitely should. For example, in src/field.rs, I replaced the return value of FieldElement::ct_gt with the constant Choice::from(0u8). Running cargo test --features "alloc,elligator2" did not produce any errors. Is a KAT missing?

  2. The reason I was playing with the above is because I was wondering about the correctness of the gt function in the fiat backend in the diff. My understanding is that this does a libgmp-style bigint subtraction without the subtraction. But the difference between a bigint and a fiat_25519_tight_field_element is that the latter has nontrivial equivalences. I'm worried gt might consider an unreduced fiat_25519_tight_field_element of value, say 2²⁵⁵ - 1 (equivalent to 18) as greater than 30, for example. I was trying to test this but ran into point (1) above.
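
The concern in point 2 can be shown with a scaled-down analogue. This sketch assumes a toy prime `P = 2^7 - 19 = 109` in place of 2^255 - 19, so that the unreduced encoding `2^7 - 1 = 127` plays the role of 2²⁵⁵ - 1; it does not use the fiat backend's types and is not constant time.

```rust
// Toy analogue of 2^255 - 19 (assumption): 2^7 - 19 = 109, which is prime.
const P: u32 = 109;

/// Compare raw encodings as plain integers, with no reduction first.
fn naive_gt(a: u32, b: u32) -> bool {
    a > b
}

/// Compare canonical representatives, reducing both sides mod P first.
fn canonical_gt(a: u32, b: u32) -> bool {
    a % P > b % P
}

fn main() {
    let unreduced: u32 = (1 << 7) - 1; // 127, a valid 7-bit encoding
    assert_eq!(unreduced % P, 18); // same field element as 18

    // The naive comparison claims 18 > 30; the canonical one does not.
    assert!(naive_gt(unreduced, 30));
    assert!(!canonical_gt(unreduced, 30));
    println!("naive and canonical comparisons disagree on {}", unreduced);
}
```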

I'll be reviewing the rest of the PR, but I think these should be addressed before merging. Thanks again!

jmwample and others added 3 commits July 26, 2024 10:52
with agl/ed25519/extra25519, the kleshni C implementation, and rfc9380.
Edwards rfc9380 tests and elligator representative randomness using tweaks.
fix for subgroup based computational distinguisher and updates / simplifications to the elligator2 interface as a result
@jmwample
Author

jmwample commented Aug 2, 2024

  1. It seems like I can't get the Elligator2 tests to fail, even when they definitely should. Is a KAT missing?
  2. I was wondering about the correctness of the gt function in the fiat backend in the diff. [fiat_25519_tight_field_element] has nontrivial equivalences. I'm worried gt might consider an unreduced fiat_25519_tight_field_element of value, say 2²⁵⁵ - 1 (equivalent to 18) as greater than 30, for example.

You are definitely correct about the implementation of the gt function. I was relying on the idea that other implementations were using a reliable compare, but it turns out they fall victim to the exact issue that you describe. Really what they are trying to check is whether the value is negative, but they do so by subtracting $(p-1)/2$ and checking for overflow. However, a direct negative check doesn't work for the Legacy implementations, because the signal and kleshni implementations use subtract-and-check-for-overflow, which gives a different result from a proper greater-than comparison (or is_negative()) on the canonical representation of the FieldElement. I am trying to decide on the best way to handle the separate cases. I have a proper ct_gt() implementation, but I am not sure it is even useful, as the RFC9380 and Randomized variants can use is_negative() and Legacy can't use a proper greater-than.

I replaced the return value of FieldElement::ct_gt with the constant Choice::from(0u8). Running cargo test --features "alloc,elligator2" did not produce any errors.

For the RFC9380 and Randomized variants the effect is less obvious, as it only takes effect when $\sqrt{\frac{-(a+A)}{2a}} \ne \sqrt{\frac{-a}{2*(a+A)}}$ (code ref), which for some reason is not the case for the test vectors from RFC9380 or my randomly selected test vectors. However, a simple test where they are different is the identity value (which I have added in my work branch looking into this).


As an aside, is the implementation of is_negative() flawed in a similar way? In the elligator2 implementations the check for negative values is to determine if the FieldElement is less than $(p-1)/2 = 2^{254}-10$, while the FieldElement::is_negative() function only checks if the low bit is set, meaning that values between $2^{254}-9$ and $2^{254}-1$ would be considered positive?
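
To make the two sign conventions in the question concrete, here is a toy comparison over a small stand-in prime. `P = 109` and both function names are hypothetical; the sketch only shows that "low bit set" and "above $(p-1)/2$" each pick exactly one of `a` and `-a` yet disagree on individual elements. It does not settle which convention a given elligator2 variant requires.

```rust
// Toy prime standing in for 2^255 - 19 (assumption, for illustration only).
const P: u32 = 109;

/// "Negative" means the canonical value is odd (low bit set).
fn is_negative_parity(a: u32) -> bool {
    (a % P) & 1 == 1
}

/// "Negative" means the canonical value is above (P - 1) / 2.
fn is_negative_upper_half(a: u32) -> bool {
    a % P > (P - 1) / 2
}

fn main() {
    for a in 1..P {
        let neg_a = P - a;
        // Each convention marks exactly one of a and -a as negative.
        assert_ne!(is_negative_parity(a), is_negative_parity(neg_a));
        assert_ne!(is_negative_upper_half(a), is_negative_upper_half(neg_a));
    }
    // The conventions still disagree element by element.
    let disagreements = (1..P)
        .filter(|&a| is_negative_parity(a) != is_negative_upper_half(a))
        .count();
    println!("conventions disagree on {} of {} nonzero elements", disagreements, P - 1);
}
```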

3 participants