-
Notifications
You must be signed in to change notification settings - Fork 82
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feat: Reveal Substring Template #207
Conversation
Can we add an option for someone to constrain 'only one' or something, meaning that if its on it checks the rest of the string, and if any other substring matches then it returns an argument of 0? Can you also calculate r automatically with fiat shamir on all the inputs? So like the query string, the large string, and the index? |
I think these are two different use cases:
The current template is optimised to follow Case 1. What I suggest we can do is, we can have two variations for the two cases. For Case 2, we can use Zac's string search algorithm (https://github.com/noir-lang/noir_string_search) which is efficient for that case. |
/// @input revealedString An array of ASCII values representing the substring to be matched | ||
/// @input r A random value used for the RLC calculation | ||
/// @output isValid A signal that is 1 if the substring matches at the given index, 0 otherwise | ||
template SubstringMatch(maxLength, maxSubstringLength) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if we VarLeftShift in
by startIndex
and then compare first endIndex - startIndex
bytes of both shiftedIn
and revealedString
? (Or even simply reveal that as output, instead of revealString as an input). We can do this mostly with existing circuits like PackByteSubarray
.
(Not sure which is more constraint efficient but something to compare if you are down)
But PR is good otherwise 🚀
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for pointing this out @saleel. The SelectSubArray
template is indeed much more efficient than our current approach.
And I assume that it has been audited before too. I believe that using SelectSubArray
is better for revealing substrings. We can delete this PR.
cc: @Divide-By-0
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice. Thank you for the benchmarks.
Yes, if there is no explicit need for comparing values for equality in circuit we can simply use SelectSubArray
. Otherwise this can be modified to do "SelectSubArray
+ Compare packed bytes
"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, we just want to reveal the substring.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Divide-By-0 FYI ^
I think can close this PR if you also agree.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@saleel made some changes, can you have a look?
Fixed a bug in retrieving the DKIM public key: For some emails, the DKIM public key is not stored TXT record directly in `selector._domainkey.domain`, but instead points to a CNAME. For example, when retrieving the DKIM public key for the following domain: `protonmail._domainkey.proton.me`, we first need to resolve the CNAME for this domain: `protonmail.domainkey.drfeyjwh4gwlal4e2rhajsytrp6auv2nhenecpzigu7muak6lw6ya.domains.proton.ch`. However, using the "dns" npm package, the CNAME cannot be resolved. With this update, all DNS resolutions will be performed using google DNS-over-HTTPS (DoH).
Co-authored-by: saleel <saleel@saleel.xyz>
* Fix SelectRegexReveal * Add a new test for the SelectRegexReveal template.
}).rejects.toThrow("Assert Failed"); | ||
}); | ||
|
||
it("should not match when substring is not at the beginning", async () => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wait what, it should match in the test on line 65 right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is test if for the helper circuit CheckSubstringMatch which is used to check for matches from each starting index - so I think it correct.
Btw, why cant we reveal data by taking a startIndex and length as input? I think that would be still cheaper. Is there a use-case we need to enforce there is only one substring?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for answering that @saleel, and you are correct. CheckSubstringMatch
checks only a part of the input (basically a window), so this template is actually used in CountSubstringOccurrences
.
Btw, why cant we reveal data by taking a startIndex and length as input? I think that would be still cheaper. Is there a use-case we need to enforce there is only one substring?
Yes, there is a use-case where the json data might be like:
{
command: "do this",
other_string: "command: \"do this instead\""
}
And we want to reveal n
characters after the substring command
. Someone can spoof it like this.
|
||
signal output substring[maxSubstringLength]; | ||
|
||
// substringStartIndex should be less than maxLength |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this really necessary given the check on substringStartIndex + substringLength
? Additionally we have been assuming valid index on util circuits, and asks the consuming circuit to do these range checks. This is to avoid redundant checks when main circuit use many of these utils.
Though I think its ok to do substringStartIndex + substringLength < maxLength
in here.
We can also assert substringLength is not Zero?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, makes sense. I'll add a note above the template regarding this 🙌
Description
This PR introduces a new circom template called
SubstringMatch
that verifies if a given substring exists within a larger string at a specified index.The
SubstringMatch
template uses a Random Linear Combination (RLC) approach to perform efficient substring comparisons. This method allows for constant-time verification regardless of the input string length, making it suitable for use in zk-SNARKs.Type of Change
Functionality and Usage Example
The
SubstringMatch
template takes the following parameters and inputs:Parameters:
maxLength
: The maximum length of the input stringmaxSubstringLength
: The maximum length of the substring to be matchedInputs:
in
: An array of ASCII values representing the input stringstartIndex
: The starting index of the substring in the input stringrevealedString
: An array of ASCII values representing the substring to be matchedr
: A random value used for the RLC calculationOutput:
isValid
: A signal that is 1 if the substring matches at the given index, 0 otherwiseExample usage:
This example demonstrates how to use the
SubstringMatch
template to verify that the substring "World" exists within the string "Hello, World!" starting at index 7.Checklist: