@hero78119 hero78119 commented Jan 23, 2026

Extract common factors in monomial arithmetic to improve GPU performance.
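The idea can be illustrated with a minimal sketch. This is not the code from this PR: the term representation (a tuple of witness indices), the greedy grouping strategy, and all function names are assumptions made for illustration, and plain Python integers stand in for field elements. It rewrites a sum such as `a*b + a*c + a*d` as `a*(b + c + d)` and counts how many multiplications each form needs.

```python
from collections import Counter
from math import prod

def eval_naive(terms, w):
    """Evaluate a sum of monomials term by term; each term is a tuple of indices into w."""
    return sum(prod(w[i] for i in t) for t in terms)

def group_by_common_factor(terms):
    """Greedy grouping (illustrative only): repeatedly pull the most frequent
    factor out of every remaining term that contains it."""
    groups = []  # list of (common_factor_or_None, residual_terms)
    remaining = list(terms)
    while remaining:
        counts = Counter(i for t in remaining for i in set(t))
        top = counts.most_common(1)
        if not top or top[0][1] < 2:
            # No factor is shared by two or more terms: keep the rest unfactored.
            groups.append((None, remaining))
            break
        factor = top[0][0]
        inside, outside = [], []
        for t in remaining:
            if factor in t:
                residual = list(t)
                residual.remove(factor)  # remove one occurrence of the shared factor
                inside.append(tuple(residual))
            else:
                outside.append(t)
        groups.append((factor, inside))
        remaining = outside
    return groups

def eval_factored(groups, w):
    """Evaluate each group as factor * (sum of residual monomials)."""
    total = 0
    for factor, residuals in groups:
        inner = sum(prod(w[i] for i in r) for r in residuals)
        total += inner if factor is None else w[factor] * inner
    return total

def mults_naive(terms):
    """A term with k factors needs k - 1 multiplications."""
    return sum(max(len(t) - 1, 0) for t in terms)

def mults_factored(groups):
    """Residual products plus one multiplication per extracted factor."""
    n = 0
    for factor, residuals in groups:
        n += sum(max(len(r) - 1, 0) for r in residuals)
        if factor is not None:
            n += 1
    return n

# w[0]*w[1] + w[0]*w[2] + w[0]*w[3] + w[4]
w = [2, 3, 5, 7, 11]
terms = [(0, 1), (0, 2), (0, 3), (4,)]
groups = group_by_common_factor(terms)
assert eval_naive(terms, w) == eval_factored(groups, w)
print(mults_naive(terms), mults_factored(groups))
```

The multiplication counts here follow the sketch's own convention and are not meant to reproduce the "Naive Mul" and "Factored Mul" columns reported under circuit stats.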

benchmark

This PR only modifies prove_main_constraint.
The table below shows keccak prove_main_constraint timings for the first 3 shards of 23817600.

| name | Before | After | Improvement |
|---|---|---|---|
| e2e | 161 s | 158 s | |
| Shard 0 keccak prove_main_constraint | 115.0 ms | 73.4 ms | 36.17% |
| Shard 1 keccak prove_main_constraint | 119.0 ms | 76.8 ms | 35.46% |
| Shard 2 keccak prove_main_constraint | 119.0 ms | 74.2 ms | 37.65% |
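For reference, the Improvement column is consistent with the relative reduction in runtime, (before - after) / before. A short sketch that recomputes it from the per-shard timings above (the function name is illustrative):

```python
def improvement_pct(before_ms, after_ms):
    """Relative runtime reduction, as a percentage of the 'before' time."""
    return (before_ms - after_ms) / before_ms * 100.0

# (before, after) in milliseconds, taken from the benchmark table
shards = {
    "shard 0": (115.0, 73.4),
    "shard 1": (119.0, 76.8),
    "shard 2": (119.0, 74.2),
}

for name, (before, after) in shards.items():
    print(f"{name}: {improvement_pct(before, after):.2f}%")
```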

circuit stats

| Layer | Total Terms | Total Factors | Min | Max | Avg Factors | Groups | Factored % | Naive Mul | Factored Mul |
|---|---|---|---|---|---|---|---|---|---|
| ADD_main | 25 | 51 | 1 | 3 | 2.04 | 3 | 88.00% | 51 | 27 |
| SUB_main | 25 | 51 | 1 | 3 | 2.04 | 3 | 88.00% | 51 | 27 |
| AND_main | 29 | 57 | 1 | 2 | 1.97 | 1 | 96.55% | 57 | 29 |
| OR_main | 29 | 57 | 1 | 2 | 1.97 | 1 | 96.55% | 57 | 29 |
| XOR_main | 29 | 57 | 1 | 2 | 1.97 | 1 | 96.55% | 57 | 29 |
| SLL_main | 108 | 286 | 1 | 4 | 2.65 | 23 | 78.70% | 286 | 134 |
| SRL_main | 117 | 317 | 1 | 4 | 2.71 | 24 | 79.49% | 317 | 145 |
| SRA_main | 117 | 317 | 1 | 4 | 2.71 | 24 | 79.49% | 317 | 145 |
| SLT_main | 57 | 149 | 1 | 4 | 2.61 | 13 | 77.19% | 149 | 73 |
| SLTU_main | 57 | 149 | 1 | 4 | 2.61 | 13 | 77.19% | 149 | 73 |
| MUL_main | 26 | 54 | 1 | 3 | 2.08 | 3 | 88.46% | 54 | 28 |
| MULH_main | 37 | 83 | 1 | 3 | 2.24 | 5 | 86.49% | 83 | 41 |
| MULHSU_main | 37 | 83 | 1 | 3 | 2.24 | 5 | 86.49% | 83 | 41 |
| MULHU_main | 37 | 83 | 1 | 3 | 2.24 | 5 | 86.49% | 83 | 41 |
| DIVU_main | 151 | 450 | 1 | 4 | 2.98 | 43 | 71.52% | 450 | 218 |
| REMU_main | 151 | 450 | 1 | 4 | 2.98 | 43 | 71.52% | 450 | 218 |
| DIV_main | 151 | 450 | 1 | 4 | 2.98 | 43 | 71.52% | 450 | 218 |
| REM_main | 151 | 450 | 1 | 4 | 2.98 | 43 | 71.52% | 450 | 218 |
| ADDI_main | 21 | 43 | 1 | 3 | 2.05 | 3 | 85.71% | 43 | 23 |
| ANDI_main | 25 | 49 | 1 | 2 | 1.96 | 1 | 96.00% | 49 | 25 |
| ORI_main | 25 | 49 | 1 | 2 | 1.96 | 1 | 96.00% | 49 | 25 |
| XORI_main | 25 | 49 | 1 | 2 | 1.96 | 1 | 96.00% | 49 | 25 |
| SLLI_main | 101 | 272 | 1 | 4 | 2.69 | 23 | 77.23% | 272 | 127 |
| SRLI_main | 110 | 303 | 1 | 4 | 2.75 | 24 | 78.18% | 303 | 138 |
| SRAI_main | 110 | 303 | 1 | 4 | 2.75 | 24 | 78.18% | 303 | 138 |
| SLTI_main | 53 | 141 | 1 | 4 | 2.66 | 13 | 75.47% | 141 | 69 |
| SLTIU_main | 53 | 141 | 1 | 4 | 2.66 | 13 | 75.47% | 141 | 69 |
| LUI_main | 17 | 33 | 1 | 2 | 1.94 | 1 | 94.12% | 33 | 17 |
| AUIPC_main | 69 | 184 | 1 | 3 | 2.67 | 11 | 84.06% | 184 | 79 |
| BEQ_main | 30 | 69 | 1 | 3 | 2.30 | 5 | 83.33% | 69 | 34 |
| BNE_main | 30 | 69 | 1 | 3 | 2.30 | 5 | 83.33% | 69 | 34 |
| BLT_main | 54 | 144 | 1 | 4 | 2.67 | 14 | 74.07% | 144 | 71 |
| BLTU_main | 54 | 144 | 1 | 4 | 2.67 | 14 | 74.07% | 144 | 71 |
| BGE_main | 54 | 144 | 1 | 4 | 2.67 | 14 | 74.07% | 144 | 71 |
| BGEU_main | 54 | 144 | 1 | 4 | 2.67 | 14 | 74.07% | 144 | 71 |
| JAL_main | 14 | 27 | 1 | 2 | 1.93 | 1 | 92.86% | 27 | 14 |
| JALR_main | 46 | 114 | 1 | 3 | 2.48 | 9 | 80.43% | 114 | 54 |
| LW_main | 45 | 110 | 1 | 3 | 2.44 | 7 | 84.44% | 110 | 51 |
| LHU_main | 50 | 123 | 1 | 3 | 2.46 | 8 | 84.00% | 123 | 57 |
| LH_main | 52 | 128 | 1 | 3 | 2.46 | 9 | 82.69% | 128 | 60 |
| LBU_main | 56 | 138 | 1 | 3 | 2.46 | 11 | 80.36% | 138 | 66 |
| LB_main | 58 | 143 | 1 | 3 | 2.47 | 12 | 79.31% | 143 | 69 |
| SW_main | 45 | 110 | 1 | 3 | 2.44 | 7 | 84.44% | 110 | 51 |
| SH_main | 50 | 124 | 1 | 3 | 2.48 | 8 | 84.00% | 124 | 57 |
| SB_main | 59 | 146 | 1 | 3 | 2.47 | 13 | 77.97% | 146 | 71 |
| ECALL_HALT_main | 9 | 17 | 1 | 2 | 1.89 | 1 | 88.89% | 17 | 9 |
| Ecall_Keccak | 2785 | 5595 | 1 | 3 | 2.01 | 34 | 98.89% | 5595 | 2816 |
| weierstrass_add | 4512 | 11599 | 1 | 3 | 2.57 | 65 | 98.56% | 11599 | 4576 |
| weierstrass_double | 5297 | 13697 | 1 | 3 | 2.59 | 97 | 98.17% | 13697 | 5393 |
| fp_add | 3192 | 10242 | 1 | 4 | 3.21 | 1156 | 63.78% | 10242 | 5371 |
| fp_mul | 3192 | 10242 | 1 | 4 | 3.21 | 1156 | 63.78% | 10242 | 5371 |
| fp2_add | 1938 | 5124 | 1 | 3 | 2.64 | 194 | 89.99% | 5124 | 2131 |
| fp2_mul | 6656 | 18527 | 1 | 3 | 2.78 | 129 | 98.06% | 18527 | 6784 |
| weierstrass_decompress | 5332 | 14681 | 1 | 4 | 2.75 | 291 | 94.54% | 14681 | 5686 |
| secp256k1_scalar_invert | 1877 | 5337 | 1 | 3 | 2.84 | 65 | 96.54% | 5337 | 1941 |
| secp256r1_scalar_invert | 1877 | 5337 | 1 | 3 | 2.84 | 65 | 96.54% | 5337 | 1941 |
| uint256_mul | 5298 | 17558 | 1 | 4 | 3.31 | 726 | 86.30% | 17558 | 6615 |
| DYNAMIC_RANGE_18 | 4 | 7 | 1 | 2 | 1.75 | 1 | 75.00% | 7 | 4 |
| DOUBLE_RANGE_DoubleU8 | 4 | 7 | 1 | 2 | 1.75 | 1 | 75.00% | 7 | 4 |
| And_OPS_ROM_TABLE | 5 | 9 | 1 | 2 | 1.80 | 1 | 80.00% | 9 | 5 |
| Or_OPS_ROM_TABLE | 5 | 9 | 1 | 2 | 1.80 | 1 | 80.00% | 9 | 5 |
| Xor_OPS_ROM_TABLE | 5 | 9 | 1 | 2 | 1.80 | 1 | 80.00% | 9 | 5 |
| Ltu_OPS_ROM_TABLE | 5 | 9 | 1 | 2 | 1.80 | 1 | 80.00% | 9 | 5 |
| RAM_Register_RegTable | 4 | 7 | 1 | 2 | 1.75 | 1 | 75.00% | 7 | 4 |
| RAM_Memory_StaticMemTable | 4 | 7 | 1 | 2 | 1.75 | 1 | 75.00% | 7 | 4 |
| RAM_Memory_PubIOTable | 4 | 7 | 1 | 2 | 1.75 | 1 | 75.00% | 7 | 4 |
| HintsTable_Memory_RAM | 4 | 7 | 1 | 2 | 1.75 | 1 | 75.00% | 7 | 4 |
| StackTable_Memory_RAM | 2 | 3 | 1 | 2 | 1.50 | 1 | 50.00% | 3 | 2 |
| HeapTable_Memory_RAM | 2 | 3 | 1 | 2 | 1.50 | 1 | 50.00% | 3 | 2 |
| LocalRAMTableFinal | 6 | 11 | 1 | 2 | 1.83 | 1 | 83.33% | 11 | 6 |
| ShardRamCircuit_main | 957 | 2778 | 1 | 4 | 2.90 | 542 | 43.36% | 2778 | 1778 |
| SHA256_EXTEND_main | 530 | 1059 | 1 | 2 | 2.00 | 1 | 99.81% | 1059 | 530 |
| LOG_PC_CYCLE_main | 26 | 51 | 1 | 2 | 1.96 | 1 | 96.15% | 51 | 26 |
| PROGRAM | 9 | 17 | 1 | 2 | 1.89 | 1 | 88.89% | 17 | 9 |
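One way to read the table is sketched below, using three rows copied from it. Two things here are inferred from the numbers rather than stated in the PR: that "Avg Factors" equals Total Factors divided by Total Terms, and that comparing "Naive Mul" with "Factored Mul" gives the reduction in multiplication count. The function and field names are illustrative.

```python
def derive(total_terms, total_factors, naive_mul, factored_mul):
    """Derive two summary figures from one row of the circuit stats table."""
    avg_factors = total_factors / total_terms
    mul_reduction_pct = (naive_mul - factored_mul) / naive_mul * 100.0
    return avg_factors, mul_reduction_pct

# (Total Terms, Total Factors, Naive Mul, Factored Mul), copied from the table
rows = {
    "ADD_main": (25, 51, 51, 27),
    "Ecall_Keccak": (2785, 5595, 5595, 2816),
    "ShardRamCircuit_main": (957, 2778, 2778, 1778),
}

for layer, row in rows.items():
    avg, reduction = derive(*row)
    print(f"{layer}: avg factors {avg:.2f}, multiplications reduced by {reduction:.2f}%")
```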

@hero78119 hero78119 changed the title finish most of impl monomial term factorized Jan 23, 2026
@hero78119 hero78119 changed the title monomial term factorized [performance] monomial term arithmetics common term factorized Jan 23, 2026
@hero78119 hero78119 force-pushed the feat/monomial_common_term branch from 80f5567 to bdfb1d1 Compare January 23, 2026 02:51
@hero78119 hero78119 force-pushed the feat/monomial_common_term branch from bdfb1d1 to c96c218 Compare January 23, 2026 02:53
@hero78119 hero78119 force-pushed the feat/monomial_common_term branch from 8c34c4c to da20132 Compare January 23, 2026 03:20