-
Notifications
You must be signed in to change notification settings - Fork 143
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HolSmt: add support for Z3 v4.12.4 proof reconstruction #1194
Conversation
Great work. (I ever spent some time searching inside Z3 4.x documentation to try to figure out how it does proof logging but didn't succeed. Now I think I used wrong keywords.) For your question, it's not enforced anywhere in HOL4's source code. Instead, I simply meant that, only Z3 4.x provided release downloads for arm64 Linux (and it's already installed in the docker CI image). The other two supported SMT solvers (CVC5 and Yices 1.x) seem to have no publicly available source code, nor they have available arm64 Linux binaries. |
I'll give @tjark a chance to weigh in before merging, but many thanks for the ongoing work on this! |
Totally agree on this. Now on macOS (both x86_64 and arm64), HOL4 users can freely use HolSmt (with Z3 4.x as the backend) in their proofs, and there's no worry about the potential mistakes caused by untrusted oracles. |
These test cases were verified to work properly for the SMT-LIB translation but I had forgotten to enable cvc5 for them.
Otherwise, when certain tests fail, it's impossible to distinguish them from others that have the exact same formula but have different types, e.g. nums vs ints.
Support for old-style Z3 v2.19 proofs is retained, since the differences are minor. That said, in the future it might make sense to remove support for Z3 v2.19 proofs due to such an old version being obsolete, although it'd be desirable for that to happen only after Z3 v4 proof replay reaches feature parity.
These are used by Z3 v4.12.x (unlike Z3 v2.19) to declare variables of a specified type. Apparently, Z3 expects us to unify these variables with arbitrary terms as part of `rewrite` proof rules. Z3 also assigns formulas / terms to these variables as part of `intro-def` rules.
This rule basically introduces a name for a term, i.e. a definition of the form: ``var = term`` However, Z3 often defines such a variable by providing a term in a more complicated form such as (for example): ``(var \/ ~term) /\ (~var \/ term)`` (All such schematic forms provided by Z3 are tautologies given that `var = term`). We track all such definitions, which accumulate in the set of hypotheses of all theorems that depend on them, and at the end they are all removed from the final theorem. We can remove them from the final theorem because these variables do not appear in the remaining hypotheses nor in the conclusion of the theorem, i.e. all such variable names are invented by Z3 for the purposes of facilitating intermediate proofs and are thus irrelevant with regards to the final theorem.
While reconstructing a proof of `(if x < y then x else y) <= x`, Z3 v4.12.4 asks us to prove the following rewrite rule as one of the proof steps: (let ((@x91 (rewrite (= (not (>= (+ x (* (- 1) (ite $x36 y x))) 0)) $x86)))) ... corresponding to this term: ~(x + -1 * (if x + -1 * y >= 0 then y else x) >= 0) <=> ~(x + -1 * $var$(z3name!0) >= 0) ... where z3name!0 is a variable declared by Z3 at the beginning of its proof certificate, but which we know nothing about at this point. To make the proof go through, we unify the two sides of the equivalence, such that we obtain: z3name!0 = (if x + -1 * y >= 0 then y else x) We then prove the theorem by substituting the variable(s) in the goal and add the above definition to the same list of Z3-provided definitions as that used by the `z3_intro_def` handler, so that it gets removed at the end of the proof.
Z3 describes it as being the same as `bvsdiv`, "but created in a context where the second operand can be assumed to be non-zero". Currently we process it in the same way as `bvsdiv`.
Some terms, particularly in proofs containing word division, are huge in size (for word32 division, dozens of MB when printed out, due to combinatory explosion), so profiling is useful to figure out where the proof replay is stuck or taking too much time.
They uncovered a bug during development of support for Z3 v4.12.x proof replay, so it's probably useful to add them to the test suite to avoid future regressions.
This proof rule is supposed to handle multiple symmetry and transitivity rules. Z3 provides the following example: A1 |- R a b A2 |- R c b A3 |- R c d --------------------------------------- trans* A1 u A2 u A3 |- R a d Although more generally, the proof rule is supposed to handle any number of theorems passed as arguments and any path between the elements. R must be a symmetric and transitive relation. This rule came up in the Z3 v4.12.4 proof reconstruction of the following self-test: f (f x) = x /\ f (f (f (f (f x)))) = x ==> f x = x The only relation that came up in this proof was for equality. It is unknown if currently Z3 also uses this proof rule for other relations. However, it is worth noting that the related `symm` and `trans` proof rules, as currently implemented in HolSmt, can only handle the equality relation, even though Z3 v4.12.4 also describes them as proof rules for arbitrary relations. For the initial implementation, we rely on metisLib.METIS_TAC to find a proof. This is more than fast enough for the small proof of the above self-test. However, if METIS_TAC becomes a bottleneck, a more specialized proof handler could be implemented to improve performance.
This commit adds new rewrite theorems for: ~(p ==> q) <=> ~(~p \/ q) (~p <=> q) <=> (p <=/=> q) In this commit they were inserted as theorems r032 and r010 (resp.), others being renumbered as appropriate. They come up in the Z3 v4.12.4 proof reconstruction for the following self-tests, respectively: (f (f x) = x) /\ (f (f (f (f (f x)))) = x) ==> (f x = x) ((x, y) = (y, x)) = (x = y)
This file was introduced in the following commit: 18c8b76 However, in that commit I forgot to add a description of the file to the README (which contains a description of the other HolSmt files).
This commit fixes the parsing of the `bvand`, `bvor`, `bvadd`, `bvmul`, `bvxor` and `bvxnor` operators in Z3 v4 proof reconstruction. `bvxor` and `bvxnor` were already implemented but only as binary operators. However, Z3 v4.12.4 declares `bvxor` and `bvxnor` as being "flat associative" and in (at least) one of the Z3 proofs in the self-tests `bvxor` does appear with 3 arguments. Therefore, the parser was changed to parse these operators using a left-associative application of the arguments. The other operators were missing, i.e. they were not being parsed. In z3 v4.12.4 they are similarly declared as being flat associative. Strangely, this version of Z3 does not declare `bvnand` and `bvnor` as being flat associative, so we keep parsing them as binary operators.
HolSmt now supports proof reconstruction for Z3 v4.12.4, so we can now use it by default to run the HolSmt self-tests.
Now that proof reconstruction for Z3 v4.12.4 works, we can deprecate Z3 v2.19. Using a modern version of Z3 has many advantages, since Z3 is now open-source and also, binaries exist (or can be compiled) for platforms other than x86 and x86-64. Also, Z3 v4.12.x is in active development, which means there's a greater chance that bugs that are encountered will be fixed (unlike Z3 v2.19 which is a very old, deprecated version) and other improvements made. Modern versions of Z3 are also easier to install in current Linux distributions, since usually they are already part of their package collection. Furthermore, if we can ignore Z3 v2.19 then we can start enabling proof reconstruction for those self-tests which can only be reconstructed in Z3 v4 (once the reconstruction is fixed, that is).
Please go ahead! I haven't reviewed the code, but from the description of the PR, this sounds amazing. |
A good way to get more examples would be to run Z3 on the benchmarks in (relevant logics of) SMT-LIB. (Depending on how far you want to push this, you could basically re-do the evaluation from the "Fast LCF-Style Proof Reconstruction for Z3" paper with Z3 v4.12.4.) |
Minor update:
Also, I should say that I'm not too happy with the implementation of z1 = x + 1
z2 = y + 1
z3 = z2 + (y + 1) + z1
z3 = (y + 1) + z2 + (x + 1)
z4 = z3 + (z2 + z2 + z1) + z1
z4 = (z2 + z2 + (x + 1)) + z3 + (x + 1)
(...)
z64 = z63 + (z62 + z62 + z1) + z1
z64 = (z62 + z62 + (x + 1)) + z63 + (x + 1) In particular, this example would fail with the tactic I implemented since it would require more than 2^64 expansions, due to the deep nesting. I'm thinking of doing a different implementation, where first I would gather all the definitions of z64 = z63 + z63 + z1
... assuming the following:
z63 = z62 + z62 + (x + 1)
z63 = z62 + z62 + z1
z1 = x + 1 I would then somehow remove all other definitions of Finally, I would perform the exact same thing for I think this new alternative implementation could be a lot slower if Z3 provides us many duplicate definitions that are already fully or almost fully expanded (since we'd be doing a lot of unnecessary term unifications with huge terms), but the advantage is that it'd avoid the exponential blow-up for these worst-case scenarios. I am also suspecting that these worst-case scenarios might actually show up in practice because I suspect that the entire reason why Z3 is inventing these definitions is indeed to avoid this exponential blow-up in proofs! |
That's a great idea, thanks! I would love to do this after I try fixing as many self-tests as possible to work with proof reconstruction! |
So, I've been discussing this issue with @tjark, who gave me great suggestions. @tjark's idea is similar to what I was proposing, but @tjark's implementation suggestions lead to a simpler and much faster solution that what I was proposing, I think -- probably as fast as the current implementation in this PR in the straightforward cases, but completely avoiding the exponential blow-up in the hard ones! I'm hoping I'll be able to do the new implementation within the next few days. |
I'll leave this open for the "new implementation" so we can catch all the work in one PR. Thanks! |
This new implementation was developed to address concerns about exponential term blow-up. Basically, if we have definitions of the form: z1 = 1 z2 = z1 + z1 z3 = z2 + z2 ... z64 = z63 + z63 ... then the old implementation would have fully expanded the definition of `z64` in order to remove it, which would not fit in memory due to the 2^64 variable substitutions which would be required. This new implementation avoids fully expanding definitions. It was developed with valuable suggestions and support from @tjark.
Added two commits:
That said, @tjark seems to have come up with another tricky case which I'm not sure the new implementation can handle. However, I did not understand the exact specifics. @tjark, can you provide more details about the example that you mentioned for which unification might not save us? I would like to add a unit test for that case as well, if possible. Thank you! |
Update: @tjark came up with a new clever scenario which 1) I hadn't anticipated and 2) is leading to circular definitions (due to term unification) even though they weren't circular to begin with. I have added a unit test for this new case (in my local repository) and I've confirmed that the new implementation is failing due to this issue. @tjark has also suggested a straightforward fix for this new issue which I'm planning to implement soon. Thanks! |
e72e7e5
to
29fd902
Compare
I've fixed this PR to handle a kind of definitions which might appear and which I wasn't originally anticipating. These are definitions of the form These definitions were problematic for various reasons, especially the issue that @tjark identified which could cause circular definitions to arise even though they weren't circular to begin with. To fix all the issues in a principled way, I've added code to keep track of which variables Z3 has defined in its proof log, so that we can refer to them in the proof replay procedure. I wasn't originally planning to do this as it didn't seem necessary, but in hindsight I believe it makes the code that handles these definitions clearer and more robust, especially when handling definitions of this kind. The commit messages and comments in the code contain more details. I've also added a couple of unit tests to exercise the issue that @tjark identified with circular definitions. Thanks! |
These are run as part of the self-tests. Currently only unit tests for `Z3_ProofReplay.remove_definitions` have been added. Two of them are meant to test tricky cases which would cause the previous implemention of `remove_definitions` to fail.
It turns out that there are multiple reasons for why keeping track of Z3-defined variables is desirable (or perhaps even needed) during proof replay. One of the issues is that when removing definitions, term unification may end up creating new definitions of this form: var1 = var2 ... where `var1` was not in the previous set of variables that we were keeping track of for removal. This would cause such definitions to be ignored when calculating the new set of definitions to remove. Another reason is that we want to avoid ending up with circular definitions such as: var1 = var2 var2 = var1 ... where `var1` and `var2` are both Z3-defined variables. To prevent this, we can orient such definitions so that `var2 = var1` is always translated into `var1 = var2` (where `var1` <= `var2`, for some definition of `<=`), i.e. we can always create them in a canonical orientation. Keeping track of Z3-defined variables also allows us to orient definitions created due to term unification (e.g. as part of `rewrite` proof rules) such that they end up as `var = x` instead of `x = var`, where `x` is a user-defined variable and `var` is a Z3-defined variable. Therefore, this commit adds code to keep track of which variables have been defined by Z3. As a side effect, it fixes the first issue mentioned above. A subsequent commit will use this functionality to orient the definitions appropriately during proof replay, which will fix the remaining two issues.
In Z3 proof replay, sometimes we may end up with definitions of the form: var1 = var2 This kind of definition may theoretically arise in three different places: 1. As an explicit Z3-provided definition in `intro-def` proof rules 2. As an implicit Z3-provided definition which we instantiate as part of `rewrite` proof rules (after term unification) 3. When unifying terms to add new definitions in the process of removing the definitions from the final theorem. This can be problematic if `var1` is not a Z3-defined variable (which means `var2` is, and thus the definition should be reversed) or if both `var1` and `var2` are Z3-defined variables. If both vars are Z3-defined variables, we may end up with the following two hypotheses in the final theorem, which `remove_definitions` can't handle due to the circularity: var1 = var2 var2 = var1 To avoid ending up with such problematic definitions, this commit introduces code to change `var1 = var2` into `var2 = var1`, so that the left-hand side is always a Z3-defined variable and, if the right-hand side is also a Z3-defined variable, the former is not "greater" than the latter (for some definition of "greater").
29fd902
to
00ac518
Compare
Sounds like the epic story has come to a fittingly successful conclusion! Thanks for all the work on this one. |
This PR adds support for reconstructing Z3 v4.12.4 proof certificates in HOL4.
It contains 16 commits -- a (more) detailed description is available in each of them.
So far it has only been tested with the HolSmt self-tests. All the self-tests that were passing with Z3 v2.19 proof reconstruction now also pass with Z3 v4.12.4 proof reconstruction.
As a consequence, I'm deprecating HolSmt support for Z3 v2.19 (see motivation in commit b539306). The HolSmt documentation was changed appropriately. However, for the moment this PR still retains support for parsing and reconstructing Z3 v2.19 proofs, since these are mostly just a subset of Z3 v4.12.4 proofs.
A few other minor, unrelated improvements to HolSmt were also made.
Note that this is only an initial implementation: it is almost guaranteed that HolSmt is still missing support for some Z3 v4.12.4 proof rules, but it is difficult to implement them until I run across examples where these rules appear in the proof certificates. I expect this gap will decrease over time, as I intent to keep fixing more HolSmt self-tests to work with proof reconstruction and add support for more features (to the extent that I can). It's also worth noting that the existing HolSmt support for Z3 v2.19 proof reconstruction is similarly incomplete, in part due to bugs in Z3 v2.19 which cannot be fixed, since this version is not being actively developed anymore and the source code is not available.
I've also attempted to update the Docker CI files to use Z3 v4.12.4 by default, although I'm not sure these changes are correct (help would be appreciated, esp. @binghe as he added this feature 😄). Related question: @binghe you mentioned in #1179 that for
arm64
only Z3 4.12.4 is available: is this enforced anywhere?The above change to the Docker CI files should be the only change in this PR that affects anything outside HolSmt.
As a side effect of this PR, it should now be possible to reconstruct Z3 proofs in architectures other than x86 and x86-64 (without using an emulator) 🎉
cc @tjark