Skip to content

Comments

Split simplifyFormulas to sub-functions#888

Open
Tomaqa wants to merge 3 commits intomasterfrom
preprocess
Open

Split simplifyFormulas to sub-functions#888
Tomaqa wants to merge 3 commits intomasterfrom
preprocess

Conversation

@Tomaqa
Copy link
Member

@Tomaqa Tomaqa commented Nov 18, 2025

This is just a refactoring. The behavior or performance should not change.

This makes the preprocessing more maintainable, allows overriding particular parts, and also allows preprocessing single formulas without necessarily giving them to the SMT solver with giveToSolver.

Also added preprocess as an alias to simplifyFormulas.

@Tomaqa
Copy link
Member Author

Tomaqa commented Nov 18, 2025

Also renamed internal MainSolver::firstNotSimplifiedFrame->firstNotPreprocessedFrame and insertedFormulasCount->insertedAssertionsCount

@Tomaqa
Copy link
Member Author

Tomaqa commented Nov 21, 2025

Comparison with master, 2-minute timeout, QF_LRA, (non-|)incremental + unsat cores (i.e. unsat non-incremental and producing resolution proofs, hence tracking partitions).

2025-v2.9.2-21-gfe95a928.scatter.pdf

@Tomaqa
Copy link
Member Author

Tomaqa commented Nov 24, 2025

To check also the case when trackPartitions() is true, I relied on unsat cores. However, they are only a subset of non-incremental benchmarks, and we also do not really care about the unsat core computation itself. Hence, I instead add more comparisons of only (non-|)incremental benchmarks, but using a hacked implementation that forces trackPartitions() to true. In the scatter plots below, these correspond to the versions with the -dirty suffix.

I again compare the new (v2.9.2-23-g883d2889) against master (v2.9.2-21-gfe95a928), using the 2-minute timeout, QF_LRA, (non-|)incremental.

Master -dirty vs. new -dirty:
2025-v2.9.2-21-gfe95a928-dirty_v2.9.2-23-g883d2889-dirty.scatter.pdf
That is, the behavior does not change when tracking partitions. In the previous scatter plot, I showed that it also does not change when not tracking partitions.

For completeness, I show that tracking partitions matters.
This is a comparison of master vs. master -dirty:
2025-v2.9.2-21-gfe95a928_v2.9.2-21-gfe95a928-dirty.scatter.pdf
It is almost the same when comparing the new vs. the new -dirty:
2025-v2.9.2-23-g883d2889_v2.9.2-23-g883d2889-dirty.scatter.pdf
As expected, tracking partitions overall harms the performance, although sometimes it is better, probably due to instability.

@Tomaqa Tomaqa requested review from BritikovKI and blishko December 2, 2025 16:18
This makes the preprocessing more maintainable, allows overriding particular parts and also allows preprocessing single formulas without necessarily giving them to the SMT solver

Also added `preprocess` as an alias to `simplifyFormulas`
return processed;
}

void MainSolver::preprocessFormulaDoFinalTheoryPreprocessing(PreprocessingContext const &) {
Copy link
Member

@BritikovKI BritikovKI Dec 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is a function which calls one function, why not use original function call?
(Also imho name's a bit too long)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This way, it is clearer that this part must not be omitted. Without the auxiliary function, it could be missed. Besides, connecting the theory and preprocessor is not completely useless, prevents code duplication.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I understand 🤔 Why would one assume this part can be ommited? It can also be highlighted by the comment in the code, what theory->afterPreprocessing does...

Can you elaborate on code duplication point? Because as I understand a single function function call is being replaced by a different function call + additional lines of code in MainSolver.cc and MainSolver.h

return processed;
}

PTRef MainSolver::preprocessFormulaBeforeFinalTheoryPreprocessing(PTRef fla, PreprocessingContext const & context) {
Copy link
Member

@BritikovKI BritikovKI Dec 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's too big of a name... I see that there is a naming convention, but that can be a pain to use inside the code...
Functionality can be described with a comment or documentation, but using this long func names feels a bit too much 🤔

Why not
initialPreprocessing
corePreprocessing
finalPreprocessing
or smth shorter...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted to use sth. like that, but I do not like that corePreprocessing (or body or whatever) would be just that one call, which is actually just a minor step within the whole process. The whole reason why it is separated is that in preprocessFormulasPerPartition, both preprocessFormulaBeforeFinalTheoryPreprocessing and preprocessFormulaAfterFinalTheoryPreprocessing are called on each particular frame formula, while preprocessFormulaDoFinalTheoryPreprocessing is called just once. In preprocessFormulasDefault, i.e., in preprocessFormula, all is called just once (because we mix all formulas into one conjunction).

What could potentially work is preprocessFormulaBegin, preprocessFormulaMiddle, preprocessFormulaEnd. This way, middle does not make the impression that it is the main part.
What do you think?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like your idea here!
Middle looks a little bit off, but I don't have a better idea for the name
There is also an option to ask LLM for the recommendation))

ts.setClauseCallBack(&callBack);
ts.Cnfizer::cnfize(root, push_id);
bool const keepPartitionsSeparate = trackPartitions();
Lit frameLit = push_id == 0 ? Lit{} : term_mapper->getOrCreateLit(frameTerms[push_id]);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not lambda?
(Not important, but tbh I just like lambdas 😊)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To not use the variable at all if push_id == 0 (i.e., do not even initialize it on the stack). Maybe it would be worth to also add the [[maybe_unused]] attribute to the variable, to make it more explicit.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, wouldn't it get optimized by compiler?
Main reason I like lambda is because it looks a little bit more compact here, it is not a major point of contention though...

@BritikovKI
Copy link
Member

Overall 6 functions might be a little bit complex to navigate 🤔
To fully understand the code one needs to go back and forth, scrolling to the right func, then scrolling back...
This is the Software Engineering discussion but I'm supportive of function use if:
(1.) There is a piece of code that is used in many places or
(2.) There is a complex piece of code which serves one functional purpose...

@Tomaqa
Copy link
Member Author

Tomaqa commented Dec 12, 2025

I need to use preprocessFormula later as a separate function in a different PR (#873).
Maybe encapsulating all related stuff into a class (e.g. PreprocessFormula) would help readability?

@BritikovKI
Copy link
Member

I need to use preprocessFormula later as a separate function in a different PR (#873). Maybe encapsulating all related stuff into a class (e.g. PreprocessFormula) would help readability?

I think it actually can improve readability, especially if separated and split into private/public, to see which functions will be used outside and which are purely internal...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants