Split `simplifyFormulas` to sub-functions by Tomaqa · Pull Request #888 · usi-verification-and-security/opensmt

Tomaqa · 2025-11-18T15:58:41Z

This is just a refactoring. Neither the behavior nor the performance should change.

This makes the preprocessing more maintainable, allows overriding particular parts, and also allows preprocessing single formulas without necessarily giving them to the SMT solver with giveToSolver.

Also added preprocess as an alias to simplifyFormulas.

Tomaqa · 2025-11-18T16:33:24Z

Also renamed the internal MainSolver::firstNotSimplifiedFrame to firstNotPreprocessedFrame and insertedFormulasCount to insertedAssertionsCount.

Tomaqa · 2025-11-21T16:03:35Z

Comparison with master, 2-minute timeout, QF_LRA, (non-|)incremental + unsat cores (i.e. unsat non-incremental and producing resolution proofs, hence tracking partitions).

2025-v2.9.2-21-gfe95a928.scatter.pdf

Tomaqa · 2025-11-24T10:18:54Z

To check also the case when trackPartitions() is true, I relied on unsat cores. However, they are only a subset of non-incremental benchmarks, and we also do not really care about the unsat core computation itself. Hence, I instead add more comparisons of only (non-|)incremental benchmarks, but using a hacked implementation that forces trackPartitions() to true. In the scatter plots below, these correspond to the versions with the -dirty suffix.

I again compare the new (v2.9.2-23-g883d2889) against master (v2.9.2-21-gfe95a928), using the 2-minute timeout, QF_LRA, (non-|)incremental.

Master -dirty vs. new -dirty:
2025-v2.9.2-21-gfe95a928-dirty_v2.9.2-23-g883d2889-dirty.scatter.pdf
That is, the behavior does not change when tracking partitions. In the previous scatter plot, I showed that it also does not change when not tracking partitions.

For completeness, I show that tracking partitions matters.
This is a comparison of master vs. master -dirty:
2025-v2.9.2-21-gfe95a928_v2.9.2-21-gfe95a928-dirty.scatter.pdf
It is almost the same when comparing the new vs. the new -dirty:
2025-v2.9.2-23-g883d2889_v2.9.2-23-g883d2889-dirty.scatter.pdf
As expected, tracking partitions overall harms the performance, although sometimes it is better, probably due to instability.

BritikovKI · 2025-12-09T09:31:41Z

src/api/MainSolver.cc

+    return processed;
+}
+
+void MainSolver::preprocessFormulaDoFinalTheoryPreprocessing(PreprocessingContext const &) {


This function only calls one other function, so why not use the original function call directly?
(Also, imho the name is a bit too long.)

This way, it is clearer that this part must not be omitted. Without the auxiliary function, it could be missed. Besides, connecting the theory and the preprocessor is not completely useless: it prevents code duplication.

I'm not sure I understand 🤔 Why would one assume this part can be omitted? It can also be highlighted by a comment in the code explaining what theory->afterPreprocessing does...

Can you elaborate on the code duplication point? As I understand it, a single function call is being replaced by a different function call plus additional lines of code in MainSolver.cc and MainSolver.h.

BritikovKI · 2025-12-09T09:32:55Z

src/api/MainSolver.cc

+    return processed;
+}
+
+PTRef MainSolver::preprocessFormulaBeforeFinalTheoryPreprocessing(PTRef fla, PreprocessingContext const & context) {


I think the name is too long... I see that there is a naming convention, but it can be a pain to use inside the code...
The functionality can be described with a comment or documentation, but using function names this long feels a bit too much 🤔

Why not
initialPreprocessing
corePreprocessing
finalPreprocessing
or smth shorter...

I wanted to use something like that, but I do not like that corePreprocessing (or body, or whatever) would be just that one call, which is actually only a minor step within the whole process. The reason it is separated is that in preprocessFormulasPerPartition, both preprocessFormulaBeforeFinalTheoryPreprocessing and preprocessFormulaAfterFinalTheoryPreprocessing are called on each particular frame formula, while preprocessFormulaDoFinalTheoryPreprocessing is called just once. In preprocessFormulasDefault, i.e., in preprocessFormula, each of them is called just once (because we mix all formulas into one conjunction).

What could potentially work is preprocessFormulaBegin, preprocessFormulaMiddle, preprocessFormulaEnd. This way, "middle" does not give the impression that it is the main part.
What do you think?

I like your idea here!
Middle looks a little bit off, but I don't have a better idea for the name.
There is also the option to ask an LLM for a recommendation))
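
The call structure described in this thread can be sketched roughly as below. This is only an illustration under stated assumptions: `Formula`, `Pipeline`, `begin`, `middle`, `end`, `perPartition`, and `defaultMode` are hypothetical names, not the OpenSMT API, and the exact ordering of the steps in the per-partition variant (all "begin" steps, then the single "middle" step, then all "end" steps) is an assumption, since the thread only states how often each step runs.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a formula handle; not an OpenSMT type.
using Formula = std::string;

struct Pipeline {
    int middleCalls = 0;

    // Per-formula step that runs before the final theory preprocessing.
    Formula begin(Formula const & f) const { return "begin(" + f + ")"; }
    // Final theory preprocessing: takes no formula, only touches shared state.
    void middle() { ++middleCalls; }
    // Per-formula step that runs after the final theory preprocessing.
    Formula end(Formula const & f) const { return "end(" + f + ")"; }

    // Partition-tracking variant: begin/end once per formula, middle once overall.
    std::vector<Formula> perPartition(std::vector<Formula> const & frame) {
        std::vector<Formula> out;
        for (auto const & f : frame) { out.push_back(begin(f)); }
        middle();
        for (auto & f : out) { f = end(f); }
        return out;
    }

    // Default variant: conjoin all formulas first, then run every step once.
    Formula defaultMode(std::vector<Formula> const & frame) {
        Formula conj = "and(";
        for (std::size_t i = 0; i < frame.size(); ++i) {
            if (i > 0) { conj += ","; }
            conj += frame[i];
        }
        conj += ")";
        Formula f = begin(conj);
        middle();
        return end(f);
    }
};
```

In both variants the middle step runs exactly once per frame, which is why it cannot simply be folded into either of the per-formula steps.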

BritikovKI · 2025-12-09T10:00:17Z

src/api/MainSolver.cc

    ts.setClauseCallBack(&callBack);
    ts.Cnfizer::cnfize(root, push_id);
    bool const keepPartitionsSeparate = trackPartitions();
-    Lit frameLit = push_id == 0 ? Lit{} : term_mapper->getOrCreateLit(frameTerms[push_id]);


Why not lambda?
(Not important, but tbh I just like lambdas 😊)

To not use the variable at all if push_id == 0 (i.e., to not even initialize it on the stack). Maybe it would also be worth adding the [[maybe_unused]] attribute to the variable, to make this more explicit.

Hmm, wouldn't it get optimized by the compiler?
The main reason I like a lambda is that it looks a little more compact here, but it is not a major point of contention...
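
For reference, the options discussed in this thread could look roughly as follows. This is a sketch only: `Lit` and `TermMapper` are simplified hypothetical stand-ins for the solver types, the frame term is replaced by a plain `int`, and the third variant is merely one way to read "do not even initialize it", not necessarily what the PR does.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-ins; not the OpenSMT definitions.
struct Lit {
    int x;
    Lit() : x(-1) {}              // default-constructed literal is "undefined"
    explicit Lit(int v) : x(v) {}
};

struct TermMapper {
    int calls = 0;                // counts how often a literal was requested
    Lit getOrCreateLit(int term) { ++calls; return Lit(term); }
};

// Variant 1: the ternary from the removed line; the variable always exists.
Lit frameLitTernary(TermMapper & tm, int push_id) {
    Lit frameLit = push_id == 0 ? Lit{} : tm.getOrCreateLit(push_id);
    return frameLit;
}

// Variant 2: an immediately invoked lambda; same behavior, allows a const variable.
Lit frameLitLambda(TermMapper & tm, int push_id) {
    Lit const frameLit = [&] {
        if (push_id == 0) { return Lit{}; }
        return tm.getOrCreateLit(push_id);
    }();
    return frameLit;
}

// Variant 3: declare the literal only inside the branch that needs it.
std::vector<int> addFrameLit(TermMapper & tm, std::vector<int> clause, int push_id) {
    if (push_id != 0) {
        Lit const frameLit = tm.getOrCreateLit(push_id);
        clause.push_back(frameLit.x);
    }
    return clause;
}
```

None of the three variants requests a literal from the mapper when `push_id == 0`; they differ only in whether a default-constructed `Lit` is still created in that case.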

BritikovKI · 2025-12-09T10:05:14Z

Overall, 6 functions might be a little bit complex to navigate 🤔
To fully understand the code, one needs to go back and forth, scrolling to the right function and then scrolling back...
This is a software engineering discussion, but I'm supportive of using a function if:
(1.) there is a piece of code that is used in many places, or
(2.) there is a complex piece of code which serves one functional purpose...

Tomaqa · 2025-12-12T13:53:27Z

I need to use preprocessFormula later as a separate function in a different PR (#873).
Maybe encapsulating all related stuff into a class (e.g. PreprocessFormula) would help readability?

BritikovKI · 2025-12-15T16:02:32Z

> I need to use preprocessFormula later as a separate function in a different PR (#873). Maybe encapsulating all related stuff into a class (e.g. PreprocessFormula) would help readability?

I think it actually can improve readability, especially if separated and split into private/public, to see which functions will be used outside and which are purely internal...
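
A minimal sketch of what such an encapsulation might look like, assuming a single public entry point and private sub-steps. Everything here is hypothetical: `Formula` stands in for the real formula handle, and the step names `begin`, `middle`, and `end` are placeholders rather than names taken from the PR.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a formula handle; not an OpenSMT type.
using Formula = std::string;

class PreprocessFormula {
public:
    // The only function meant to be used from outside the class.
    Formula operator()(Formula const & fla) {
        Formula f = begin(fla);
        middle();
        return end(f);
    }

    // Exposed only so the sketch can be observed; counts runs of the middle step.
    int middleCalls() const { return middleCalls_; }

private:
    // Purely internal steps; callers cannot skip or reorder them.
    Formula begin(Formula const & f) const { return "begin(" + f + ")"; }
    void middle() { ++middleCalls_; }
    Formula end(Formula const & f) const { return "end(" + f + ")"; }

    int middleCalls_ = 0;
};
```

With this split, the header shows at a glance that only `operator()` is part of the outside interface. If overriding particular steps is desired, the private steps could instead be made `protected` and `virtual`.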

Tomaqa mentioned this pull request Nov 24, 2025

Added counters of already preprocessed assertions #889

Draft

Tomaqa requested review from BritikovKI and blishko December 2, 2025 16:18

Tomaqa added 2 commits December 2, 2025 17:20

Split simplifyFormulas to sub-functions

e575257

This makes the preprocessing more maintainable, allows overriding particular parts and also allows preprocessing single formulas without necessarily giving them to the SMT solver.
Also added `preprocess` as an alias to `simplifyFormulas`.

Renamed internal MainSolver::firstNotSimplifiedFrame and insertedFormulasCount

118750e

Tomaqa force-pushed the preprocess branch from 883d288 to 118750e December 2, 2025 16:21

Placed the loop body of simplifyFormulas into another separate function

fcbdc29

BritikovKI reviewed Dec 9, 2025
