depends: commit relic-toolkit/relic@4140f28e to source tree #93

Closed
wants to merge 5 commits into from

Conversation

kwvg
Collaborator

@kwvg kwvg commented Sep 25, 2024

Motivation

relic-toolkit/relic@4140f28e comes with optimizations for aarch64 (ARM64) hosts. As hosting providers are starting to provide ARM64-based Linux hosting and Apple has transitioned most of their product line to the Apple Silicon platform, it is beneficial to update relic to a version that includes them.

Additional Information

@kwvg kwvg marked this pull request as ready for review September 26, 2024 16:28
Relic uses definitions that mean different things depending on the
context. We only need to define them once and make sure they don't
collide with another option for the same param. So, to account for where
all these unique definitions have come from, we will annotate them and
mention the other available options as comments for completeness' sake.

If a param doesn't define any new values, it has been omitted for
brevity's sake.
@kwvg kwvg force-pushed the bump_relic branch 2 times, most recently from 5412974 to 001a556 Compare October 4, 2024 19:59
kwvg added 4 commits October 4, 2024 20:00
4140f28e Indentation.
15f82369 Do not ignore GM8 curves.
0e9da713 Fixing bad batch-rename.
3810062a Now it works!
9c03fcc8 Another try.
26c0d147 More fixes.
60f3dab1 Fix.
d0564f82 Cycle counter for Apple M1.
94fc34cc Add support to AArch64.
b915b6e7 Merge pull request Chia-Network#249 from fominok/fix-unnecessary-setjmp-include
19b2a6c4 fix: include order
5764793a fix: add ifdef for setjmp include
c81ba67f Fix date and formatting.
01ee0d0c Typo.
234a0347 Make lazy reduction less agressive.
9a328da1 Add script to iterate through all pairing configurations.
472ee41b Limit extension degree to test/benchmark, generalize extension field arithmetic a bit.
2ffceecc Reach a better middle-ground.
4674dbd3 Simplify code further by removing corner cases.
1226cf51 Remove deprecated piece of ASM.
8082337c One missing place.
dc5ec29f Remove fp2_mulc_low() to simplify code and obtain performance.
72657a6c Fix silly bug in reading char 2 polys.
249bfa0e Reduce code duplication in string parsing.
97556e0d Small speedup.
d3ea4ceb Bug fix.
109d4422 Make CMLHS more flexible and more const for signature schemes.
f506551d Adjust CMLHS benchmarking.
15f035ee Fix padding issue in benchmarks.
877a7b97 Remove printing artifact.
d57ec586 Update demo program.
258a52f0 Add type casts.
c0d6e3bf Adjust API update.
e605e8b9 Make membership test in G2 more conservative.
8ec2d24a Add SSS benchmarks and adjust Lagrange prototypes.
329c83f5 Handle case when g2 == 0, next do it in constant time.
fdd77ad4 Throw error when trying to invert 0.
83502818 Now fix for encryption.
be9c2632 Bump library version.
eb559cd8 Fix BC sign error.
40f24f01 Fix compile error.
a73e819c Add extra checks for division by zero.
d7dcb228 Better comment.
df6c55ee Faster subgroup membership tests.
ce57d38b More const stuff.
f4333169 Fix prototype.
f1ab6d9e Fix const issues.
d5d86887 Use const and size_t more.
978420c2 Merge pull request Chia-Network#226 from jgdumas/rand_constmod
b15b94d8 Update makefile.
a955e8cf Add preset and fix comments.
e36acde8 Merge branch 'main' into rand_constmod
830506d2 Merge pull request Chia-Network#240 from tarakby/permute-optimize
695ed3f1 fix the loop end
3fe3bb8f merge array initialization with permutation
4fdc29ba Reverting due to API issues with compression.
1e420db4 Added BLS12-317 curve.
ceb475bc One more test for BGN.
cd5ded73 Fix referencing on EP4.
260c379c Remove dependence on fp4_sub_dig().
4dd709b6 Minor touches to preset and EP4.
62fc88dc Add missing stuff to LABEL support.
be47a13b Add missing function for LABEL support.
26cb4d1b Faster final exponentiation for BLS24 curves.
f2c65889 Minor adjustments.
7797e082 Added BLS24-315 curve and refactoring of backends.
77880ada Fix comment for precision.
700d0d3c Define missing macro.
c2c2ad25 Slightly faster final exponentiation in BLS12.
c65ae21a Add BLS12-377 curve.
417c5c55 Handle negative values better and add tests (closes Chia-Network#237 again)
70874a48 Reduce restrictions for modular reduction.
24226360 Handle corner cases with negative moduli better (closes Chia-Network#237)
f7ea6d64 Handle negative inputs correctly in Lehmer's GCD (close Chia-Network#236).
277a4f02 Parse negative zero correctly (closes Chia-Network#235).
77c37c65 Make linking hacks Mac-specific.
c71eaf25 Hardcoded link library path (Mac OS X is weird).
5e5f2c63 Restore OpenMP library linking.
b0356fed Add hardcoded include dir.
22969590 Update multi for MacOS.
821d03a8 One more fix.
8c35f31c Fix silly bug in memory alloc.
6d776c73 Fixed memory allocation problem.
8956634c Revert bad idea of moving asm macro here.
1f349b1d Faster `ep2_mul_sim()` using signed representation.
a54febf0 PSI optimizaitons.
3e57eba8 Fix free.
90de7630 Switch to dynamic allocation.
c34f9709 Better formatting.
6f0de20b Simpler implementation by hardcoding CRS in both sender/receiver.
e535105e Only warmup if running multiple benchmarks.
128a11a3 Include parameters from file.
f2461dd1 Add one-shot benchmarks.
dffc0a86 Further adjustment to the PSI API.
051c7768 Fix Makefile.
b10c4761 Add Makefile for demo.
544a0dff Better formatting for PSI benchmarks and new demo.
32eadbdc Avoid cheating on the receiver side.
49bfcaf7 Restoring.
eee15016 Trying brew now.
d8c85b1f Typo
81365298 Change MacOS environment.
432842d8 Fix compiler switch for Mac OS X.
36f6d932 Trying again.
8d77cf3f Windows is fixed, now Mac OS X.
99a20984 Another attempt at fixing OpenMP in Windows+MAC OS X.
fab8e8f2 Fix MULTI=PTHREAD, remove unnecessary linking libs, try to fix MACOSX.
70f2cd66 Uniformize CMake syntax.
f4007317 Try again to fix MULTI on Windows.
dd3beabc Avoid segmentation faults when core is not initialized.
a4e2f865 Better formatting.
3563cb20 Another corner case.
f52f80eb Add multithreading target.
384468a2 Fix allocation of intermediate value in Barrett.
3465cfbb Renamed rlc_thread for Windows compatibility.
87cb5953 Fall back to long division if input to Barreto reduction is out of bounds.
69cdd9f2 Fix corner case for reduction modulo 0 in Chia-Network#221.
67590b0e Align preset with the rest.
7cc79d68 Merge branch 'main' into rand_constmod
b12da5fa Improve multithreading configuration a bit.
2225e601 Move global pointer to context.
4d2c2b40 Uniformize SHI-PSI and RSA-PSI rounds and statistical security.
7ef3af9e Minor speedups.
0594e94c Refactor CRT support.
7a9ec0f8 Simplify code when e = NULL.
bfa06d2f Another try.
94e5653d Minor refactoring of GCD config/algorithms.
47bb7e61 Avoid overflow in bn_rand_mod().
280b1425 Memory bugs.
1c3d0e92 Memory fix.
180b6892 Update label support.
a483dfc7 Restore tests.
e3a1af19 Add SHI-PSI protocol, fix tests/benchmarks.
4b7ea245 Implement SHI-PSI protocol.
0e2c4099 Implement permutation within PSI protocols.
58e5c939 Add permutation generation algorithm.
f4f840db Better error handling.
1738af46 Better error handling.
b15ba275 More accumulator-based PSI protocols.
b4088e48 Remove compilation warnings.
fa52ec13 put bac kconst on _add
239ee3ae Merge branch 'main' into rand_constmod
a019a5ef Measure new function as well.
c72e80ec Add homomorphic addition function to PHPE.
9db54243 Allow SSS with threshold of 2.
e927c68e Merge branch 'main' into rand_constmod
af77aefa Identation.
0a193b72 Restore benchmarks.
70ca8223 Swap G1 and G2 in Laconic PSI protocol.
8355662d Make WITH=ALL.
6d6d289f Merge branch 'main' into rand_constmod
02913670 Minor polish.
addd1f26 Fix Shamir's Secret Sharing and memory alloc issue in ETRS.
6e05783c Merge branch 'main' into rand_constmod
526e295c Large rename to match library standard.
37714c3f Minor adjustments to tests.
a3875035 Reorganize implementation to use the crt_t type and save some lines of code.
1b782d01 Added first version of Shamir's Secret Sharing.
0a03f11a _print/_size const
e3ad5def _low const
f822fbeb oups one const too much
d762fe6b _fpx const !!!
00200c9e _enc /_dec const
75b6cbf0 print cmp const
fce082f1 is_infty const
48e6f001 more const _mul
a223bc4e const _mul
434b1ad5 const _add
f377c5a6 const in _map
0e6218cf const in cmp/copy
a7d78dc6 more const in g?_mul/exp
bfcc28fe more const
be5c72a6 modulus should be const in rand_mod?
211c3b9d Merge pull request Chia-Network#225 from jgdumas/shpe
ba1806ea forgot ->g
6bb15578 inverted p <--> q
1975412f test divisibility in gen factor prime
cea524e3 lower case
dce585ef put back "protocols from ECC"
6ce0271c Merge branch 'main' into shpe
19bea45f Subgroup Paillier Cryptosystem, bench and test
b3776b79 Merge pull request Chia-Network#224 from jgdumas/phpe
0e8509f1 start cp_shpe
198dbec9 comment in include now
3433feb2 bn_gen_factor_prime in header
74209819 Prime generation with prime factor of (p-1)
b969060e comment
3fc3232e If g=1+n, save one of the two exponentiations of phpe encryption
394b3d78 With g=1+n, dp and dq can be computed without any exponentiations This should be faster
058a73c9 Test also gt_exp_gen
6309986f Adjust Bezout coefficients and tweak the algorithm (closes Chia-Network#223)
fe61f2f9 Hope it's correct now.
9e523429 Fix endos again.
0775da6e Memory issue.
8736a422 Fix endormorphisms.
1f30423e Fix presets.
35ab6d73 Remove these, as they are not needed anymore.
98e4f129 Refactor of Lagrande interpolation.
3350fbfd Minor speedup with simultaneous inversion.
6c18368b Handle upper/lower case in backends a bit better.
1009a1c2 Remove warnings.
dea2a3a4 Fix compilation issue.
cffbdae9 Simplify subgroup membership tests.
202d9fd2 Several optimizations for ED module and ETRS scheme.
bf371297 Merge pull request Chia-Network#219 from Fuzzbawls/2022_remove-redundant-decls
ddaeb28a Remove redundant declarations in relic_fpx.h
97081d11 Minor speedup in LaPSI with cool trick on the exponent.
4107d14c Update LABEL support.
499a65a1 Better test and fixed memm alloc issue for PSI.
bbe8963e Memory allocation issue.
a3533dde Addind pairing-based PSI protocol.
e1b388a0 Trim bn_t at the end of copy.
8930b261 Restore benchmarks.
e9c681ab Fix bugs in EXTND coordinate system.
49b2a0d1 Respect STRIP.
a953d385 Fix symbol name.
5260f210 Build with stripping enabled.
4848f27d Fix problems with compression and blinding on EDDIE.
e4dd23d3 Remove redundant assignment.
50935c18 One more fix for EDDIE.
01433d7f Do not zero buffers without need.
2601df77 Make EDDIE compile with STRIP again.
6d29b274 Share more code between jumpdivstep inversion/symbol.
0f4a481f Refactoring of jumpdivstep inversion/symbol to share code.
f333841c Add missing present.
cb488ee4 Fix preset.
842cb1f8 Giving up again, problem is with GNU as and our macros.
970fac17 Trying again...
5d26c6a8 Fix typo.
eb09222d Another attempt at MacOS integration.
83090950 Merge pull request Chia-Network#216 from feandalo/patch-1
64a5e5e0 Update README.md
c56f2cc3 Remove MacOS for a while due to GCC shenanigans.
4a5b2527 Trying another compiler for MacOS.
052c1bf6 Fix artifact of the previous merge.
f8431f32 Fix syntax.
7d269435 Reorder CMake commands to see if config works.
bb278393 Merge pull request Chia-Network#215 from relic-toolkit/symbol
30f5b659 Fix signal on symbol computation as well.
9a001ec7 Fix for small limb sizes.
896022b3 Formatting.
2f081928 Fixed another bug in bn_mul2_low().
0c50b590 Another bug fixed.
a54c32c5 Fix carry issue in shifting functions.
06cff1fa Update LABEL support with new symbol functions.
37af1f4a Clarify this little hack.
0fce7d3f Merge branch 'main' into symbol
5ffc7f8c Fix bugs with divstep config and return value.
536422ad Update interface of the BN symbol functions.
c274e2b2 Update presets to complete configuration.
6c046a07 Better formatting.
e7a15a36 Remove compilation warning.
11592073 Add more missing files for Legendre.
6a2d8b9b Benchmark missing algorithm.
f5f36c07 Add another missing file, fix dates.
b7bbfa11 Fix default low-level implementation of Legendre.
a545d806 Add missing file.
f21d15ca Refactor configuration to include Legendre symbol, improve tests/bench.
d4747b69 Remove warning.
0b49c334 Fixed bug when prime is close to power of 2.
b06d27e5 Fix documentation of EP methods.
63aa2e29 Bump version for new future release.
a269b6db Fix allocation issue.
4ae9c899 Adjust batch size.
66993770 Simplify API and optimize constant-time execution.
e3f82e36 Rename workflow run.
0be5e47d Remove warning.
df5ce77a Restore commented out function.
f1688dd4 Replace dbl_t type with macros.
ab40d2f6 Fix typo.
24602b1b Add missing tests, label and benchmarks.
bb5a5dba Refactor divstep-based inversion.
1c6ef422 Fix type.
975573ee New jacobi symbol algorithm.
acdcf436 Merge pull request Chia-Network#213 from lorenzcat/main
3ab3a08f Fix NETBSD definition
260c9f8b More bug fixes involving inputs/outputs and corner cases for scalars.
ecaf98f9 Fix regression.
72e5652a Improve coverage of corner cases in tests.
a001e3e0 Simplify handling of scalars out of bounds.
7bc1d747 Make outputs independent of inputs.
156156a5 Fix sign bug in ep2_mul_sim_lot().
729bbc18 Fix LABEL and add GH Action for detecting regressions.
a33d45bd Add SMLERS.
dc851e38 Better printing.
cb2075dd Simplify code.
06f30d89 Update demo.
edd1aa0c Fix more.
89609c9e Fix compile error.
cf3a82a7 Add SMLERS and generalize SoK protocol.
64523296 Add bn_mod_inv_sim(), tests and benchmarks.
42790a02 Fix for ED module.
7a80c3e2 Improve documentation and flexibility of ep_mul_sim_lot().
edf5bd94 Merge branch 'main' of github.com:relic-toolkit/relic
804e5b7f Fix compile error with LABEL.
46742aef Add new algorithm.
2f0273c5 Minor fixes.
f49ae035 More conservative optimization.
7fb584fa Add new symbol computations, more work to be done on the API.

git-subtree-dir: depends/relic
git-subtree-split: 4140f28e9acb19081f522fe5595d3ddd769ee686
@PastaPastaPasta
Member

develop

> $ ./build/src/runbench                                                                                                 [±develop ●]

Signing
Total: 5000 runs in 3553 ms
Avg: 0.7106 ms

Verification
Total: 10000 runs in 24035 ms
Avg: 2.4035 ms

Public key validation
Total: 100000 runs in 19129 ms
Avg: 0.19129 ms

Signature validation
Total: 100000 runs in 21508 ms
Avg: 0.21508 ms

Aggregation
Total: 100000 runs in 211 ms
Avg: 0.00211 ms

Batch verification
Total: 100000 runs in 98340 ms
Avg: 0.9834 ms

PopScheme Aggregation
Total: 5000 runs in 10 ms
Avg: 0.002 ms

PopScheme Proofs verification
Total: 5000 runs in 12127 ms
Avg: 2.4254 ms

PopScheme verification
Total: 5000 runs in 7 ms
Avg: 0.0014 ms

branch

> $ ./build/src/runbench                                                                                              [±bump_relic ●]
wrong fixed counters count
kpc_set_config failed

Signing
Total: 5000 runs in 4005 ms
Avg: 0.801 ms

Verification
Total: 10000 runs in 25657 ms
Avg: 2.5657 ms


Public key validation
Total: 100000 runs in 18703 ms
Avg: 0.18703 ms


Signature validation
Total: 100000 runs in 22096 ms
Avg: 0.22096 ms

Aggregation
Total: 100000 runs in 217 ms
Avg: 0.00217 ms


Batch verification
Total: 100000 runs in 101176 ms
Avg: 1.01176 ms

PopScheme Aggregation
Total: 5000 runs in 9 ms
Avg: 0.0018 ms

PopScheme Proofs verification
Total: 5000 runs in 11859 ms
Avg: 2.3718 ms

PopScheme verification
Total: 5000 runs in 7 ms
Avg: 0.0014 ms
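
For comparing runs like the two above, the per-benchmark averages can be recomputed from the `Total:` lines. The following is a minimal Python sketch, assuming the output layout shown above (a benchmark name line followed by `Total: N runs in M ms`). `parse_runbench` and `SAMPLE` are hypothetical names for illustration and are not part of bls-signatures.

```python
import re

# Excerpt of the branch output above, used as sample input.
SAMPLE = """\
wrong fixed counters count
kpc_set_config failed

Signing
Total: 5000 runs in 4005 ms
Avg: 0.801 ms

Verification
Total: 10000 runs in 25657 ms
Avg: 2.5657 ms
"""

TOTAL_RE = re.compile(r"^Total: (\d+) runs in (\d+) ms$")


def parse_runbench(text):
    """Map each benchmark name to its average time per run in ms."""
    results = {}
    name = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("Avg:"):
            continue
        match = TOTAL_RE.match(line)
        if match:
            if name is not None:
                runs = int(match.group(1))
                total_ms = int(match.group(2))
                results[name] = total_ms / runs
            continue
        # Any other non-empty line is treated as the name of the next
        # benchmark; stray warnings are overwritten by the real name.
        name = line
    return results


averages = parse_runbench(SAMPLE)
```

Warning lines such as `kpc_set_config failed` are picked up as candidate names but replaced by the benchmark name that follows, so they do not end up in the result.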

@UdjinM6

UdjinM6 commented Oct 4, 2024

can confirm that this branch is ~10% slower at Signing for me too

@kwvg
Collaborator Author

kwvg commented Oct 5, 2024

Summary of Pasta's test results (source) conducted on an Apple Silicon MacBook running macOS Sequoia, built using CMake.

Note: Differences are calculated from values taken to 4 decimal places and are reported to 2 decimal places.

| Test | develop (adbd094) | #93 (a9d20ce) | Difference |
| --- | --- | --- | --- |
| Signing | 0.7106 ms | 0.801 ms | -12.72% |
| Verification | 2.4035 ms | 2.5657 ms | -6.74% |
| Public key validation | 0.19129 ms | 0.18703 ms | 2.19% |
| Signature validation | 0.21508 ms | 0.22096 ms | -2.74% |
| Aggregation | 0.00211 ms | 0.00217 ms | negligible |
| Batch verification | 0.9834 ms | 1.01176 ms | -2.87% |
| PopScheme Aggregation | 0.002 ms | 0.0018 ms | 10.00% |
| PopScheme Proofs verification | 2.4254 ms | 2.3718 ms | 2.20% |
| PopScheme verification | 0.0014 ms | 0.0014 ms | none |
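
The Difference column above appears consistent with truncation rather than rounding: each average is cut to 4 decimal places and the resulting percentage is cut to 2. The exact method is not stated in this thread, so the following Python sketch is an assumption, and the helper names are hypothetical. A positive result means the branch is faster than the baseline.

```python
from decimal import Decimal, ROUND_DOWN


def truncate(value, places):
    """Cut a decimal string to `places` decimal places without rounding."""
    quantum = Decimal(10) ** -places
    return Decimal(value).quantize(quantum, rounding=ROUND_DOWN)


def percent_difference(baseline_ms, candidate_ms):
    """Percentage change of candidate against baseline, truncated to 2 places.

    Assumption: inputs are truncated to 4 decimal places first, as the
    note above the table suggests.
    """
    base = truncate(baseline_ms, 4)
    cand = truncate(candidate_ms, 4)
    diff = (base - cand) / base * 100
    return diff.quantize(Decimal("0.01"), rounding=ROUND_DOWN)


signing = percent_difference("0.7106", "0.801")
pubkey = percent_difference("0.19129", "0.18703")
```

Under these assumptions `signing` is `-12.72` and `pubkey` is `2.19`, matching the Signing and Public key validation rows.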

@kwvg
Collaborator Author

kwvg commented Oct 5, 2024

Testing was conducted with a Ryzen 5 5600G on Debian 12 on develop (adbd094) (i.e. with Relic version aecdcae7), built using GNU Autotools. Each Difference column is the percentage change relative to the first column's measurements.

Note: Optimization is disabled by manually setting CPU_ARCH and ARCH to none and RELIC_NONE respectively, to avoid the performance degradation that debug builds or disabled compiler optimizations would introduce.

| Test | GMP disabled, optim. disabled | GMP disabled, optim. enabled | Difference | GMP enabled, optim. disabled | Difference | GMP enabled, optim. enabled | Difference |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Signing | 1.6758 ms | 1.6974 ms | -1.28% | 0.7504 ms | 55.22% | 0.672 ms | 59.89% |
| Verification | 5.1007 ms | 5.1095 ms | -0.17% | 2.2031 ms | 56.80% | 2.2009 ms | 56.85% |
| Public key validation | 0.27441 ms | 0.2747 ms | -0.10% | 0.17649 ms | 35.71% | 0.17536 ms | 36.11% |
| Signature validation | 0.41288 ms | 0.41454 ms | -0.41% | 0.18649 ms | 54.84% | 0.18708 ms | 54.69% |
| Aggregation | 0.004 ms | 0.00402 ms | negligible | 0.00188 ms | 55.00% | 0.00188 ms | 55.00% |
| Batch verification | 2.27048 ms | 2.27429 ms | -0.16% | 1.01539 ms | 55.28% | 1.00102 ms | 55.91% |
| PopScheme Aggregation | 0.004 ms | 0.0038 ms | 5.00% | 0.0018 ms | 55.00% | 0.0018 ms | 55.00% |
| PopScheme Proofs verification | 5.1004 ms | 5.1162 ms | -0.30% | 2.203 ms | 56.80% | 2.203 ms | 56.80% |
| PopScheme verification | 0.0026 ms | 0.0026 ms | none | 0.0014 ms | 46.15% | 0.0014 ms | 46.15% |

Testing under similar conditions as above, but with #93 (a9d20ce) (i.e. with Relic version 4140f28e).

| Test | GMP disabled, optim. disabled | GMP disabled, optim. enabled | Difference | GMP enabled, optim. disabled | Difference | GMP enabled, optim. enabled | Difference |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Signing | 1.6292 ms | 1.6652 ms | -2.20% | 0.6454 ms | 60.38% | 0.6246 ms | 61.66% |
| Verification | 5.0248 ms | 5.0388 ms | -0.27% | 2.1847 ms | 56.52% | 2.1895 ms | 56.42% |
| Public key validation | 0.26308 ms | 0.26396 ms | -0.34% | 0.16588 ms | 36.95% | 0.16592 ms | 36.92% |
| Signature validation | 0.40091 ms | 0.40593 ms | -1.24% | 0.18407 ms | 54.10% | 0.18358 ms | 54.22% |
| Aggregation | 0.00391 ms | 0.00392 ms | negligible | 0.00189 ms | 53.84% | 0.00186 ms | 53.84% |
| Batch verification | 2.22878 ms | 2.24061 ms | -0.53% | 0.98194 ms | 55.94% | 0.98237 ms | 55.92% |
| PopScheme Aggregation | 0.0038 ms | 0.0038 ms | none | 0.0018 ms | 52.63% | 0.0018 ms | 52.63% |
| PopScheme Proofs verification | 5.0188 ms | 5.033 ms | -0.28% | 2.1854 ms | 56.45% | 2.1854 ms | 56.45% |
| PopScheme verification | 0.0026 ms | 0.0026 ms | none | 0.0014 ms | 46.15% | 0.0014 ms | 46.15% |

@kwvg kwvg mentioned this pull request Oct 6, 2024
5 tasks
PastaPastaPasta added a commit to dashpay/dash that referenced this pull request Oct 8, 2024
9dad525 build: set `-march` irrespective of target operating system (Kittywhiskers Van Gogh)
82b4405 build: update gmp to 6.3.0 (Kittywhiskers Van Gogh)

Pull request description:

  ## Additional Information

  After [bls-signatures#92](dashpay/bls-signatures#92), GMP is re-enabled for Apple Silicon macOS targets so long as GMP 6.3.0 or higher is used. GMP significantly contributes to performance improvements in bls-signatures, generally to the tune of ~50% ([source](dashpay/bls-signatures#93 (comment))).

  The URL has been changed based on guidance from the Homebrew recipe for `gmp` ([source](https://github.com/Homebrew/homebrew-core/blob/51c899140c84d38dfa4c4fe623c859e6241504a0/Formula/g/gmp.rb#L44-L45)).

  ## Breaking Changes

  None expected.

  ## Checklist:

  - [x] I have performed a self-review of my own code
  - [x] I have commented my code, particularly in hard-to-understand areas **(note: N/A)**
  - [x] I have added or updated relevant unit/integration/functional/e2e tests **(note: N/A)**
  - [x] I have made corresponding changes to the documentation **(note: N/A)**
  - [x] I have assigned this pull request to a milestone _(for repository code-owners and collaborators only)_

ACKs for top commit:
  PastaPastaPasta:
    utACK [9dad525](9dad525)

Tree-SHA512: fbab727b9aa331f3eadd0573b925bc222380732782642cd4e12d670162cc0c45bf14edc8f99227960dc894f968f1d3f22496f0da7aca898ecb8db41d3a504f2b
@kwvg
Collaborator Author

kwvg commented Oct 8, 2024

The changes don't seem to benefit us all that much: the Relic-based optimizations make no difference outside the margin of error, and the Apple Silicon performance regression was reproduced. Closing.

@kwvg kwvg closed this Oct 8, 2024