Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HolSmt: add support for Z3 v4.12.4 proof reconstruction #1194

Merged
merged 21 commits into from
Feb 21, 2024

Conversation

someplaceguy
Copy link
Contributor

@someplaceguy someplaceguy commented Feb 7, 2024

This PR adds support for reconstructing Z3 v4.12.4 proof certificates in HOL4.

It contains 16 commits -- a (more) detailed description is available in each of them.

So far it has only been tested with the HolSmt self-tests. All the self-tests that were passing with Z3 v2.19 proof reconstruction now also pass with Z3 v4.12.4 proof reconstruction.

As a consequence, I'm deprecating HolSmt support for Z3 v2.19 (see motivation in commit b539306). The HolSmt documentation was changed appropriately. However, for the moment this PR still retains support for parsing and reconstructing Z3 v2.19 proofs, since these are mostly just a subset of Z3 v4.12.4 proofs.

A few other minor, unrelated improvements to HolSmt were also made.

Note that this is only an initial implementation: it is almost guaranteed that HolSmt is still missing support for some Z3 v4.12.4 proof rules, but it is difficult to implement them until I run across examples where these rules appear in the proof certificates. I expect this gap will decrease over time, as I intent to keep fixing more HolSmt self-tests to work with proof reconstruction and add support for more features (to the extent that I can). It's also worth noting that the existing HolSmt support for Z3 v2.19 proof reconstruction is similarly incomplete, in part due to bugs in Z3 v2.19 which cannot be fixed, since this version is not being actively developed anymore and the source code is not available.

I've also attempted to update the Docker CI files to use Z3 v4.12.4 by default, although I'm not sure these changes are correct (help would be appreciated, esp. @binghe as he added this feature 😄). Related question: @binghe you mentioned in #1179 that for arm64 only Z3 4.12.4 is available: is this enforced anywhere?

The above change to the Docker CI files should be the only change in this PR that affects anything outside HolSmt.

As a side effect of this PR, it should now be possible to reconstruct Z3 proofs in architectures other than x86 and x86-64 (without using an emulator) 🎉

cc @tjark

@binghe
Copy link
Member

binghe commented Feb 7, 2024

I've also attempted to update the Docker CI files to use Z3 v4.12.4 by default, although I'm not sure these changes are correct (help would be appreciated, esp. @binghe as he added this feature 😄). Related question: @binghe you mentioned in #1179 that for arm64 only Z3 4.12.4 is available: is this enforced anywhere?

Great work. (I ever spent some time searching inside Z3 4.x documentation to try to figure out how it does proof logging but didn't succeed. Now I think I used wrong keywords.)

For your question, it's not enforced anywhere in HOL4's source code. Instead, I simply meant that, only Z3 4.x provided release downloads for arm64 Linux (and it's already installed in the docker CI image). The other two supported SMT solvers (CVC5 and Yices 1.x) seem to have no publicly available source code, nor they have available arm64 Linux binaries.

@mn200
Copy link
Member

mn200 commented Feb 8, 2024

I'll give @tjark a chance to weigh in before merging, but many thanks for the ongoing work on this!

@binghe
Copy link
Member

binghe commented Feb 8, 2024

As a side effect of this PR, it should now be possible to reconstruct Z3 proofs in architectures other than x86 and x86-64 (without using an emulator) 🎉

Totally agree on this. Now on macOS (both x86_64 and arm64), HOL4 users can freely use HolSmt (with Z3 4.x as the backend) in their proofs, and there's no worry about the potential mistakes caused by untrusted oracles.

These test cases were verified to work properly for the SMT-LIB
translation but I had forgotten to enable cvc5 for them.
Otherwise, when certain tests fail, it's impossible to distinguish
them from others that have the exact same formula but have different
types, e.g. nums vs ints.
Support for old-style Z3 v2.19 proofs is retained, since the
differences are minor.

That said, in the future it might make sense to remove support for Z3
v2.19 proofs due to such an old version being obsolete, although it'd
be desirable for that to happen only after Z3 v4 proof replay reaches
feature parity.
These are used by Z3 v4.12.x (unlike Z3 v2.19) to declare variables
of a specified type. Apparently, Z3 expects us to unify these
variables with arbitrary terms as part of `rewrite` proof rules.

Z3 also assigns formulas / terms to these variables as part of
`intro-def` rules.
This rule basically introduces a name for a term, i.e. a definition
of the form:

``var = term``

However, Z3 often defines such a variable by providing a term in a
more complicated form such as (for example):

``(var \/ ~term) /\ (~var \/ term)``

(All such schematic forms provided by Z3 are tautologies given that
`var = term`).

We track all such definitions, which accumulate in the set of
hypotheses of all theorems that depend on them, and at the end they
are all removed from the final theorem.

We can remove them from the final theorem because these variables do
not appear in the remaining hypotheses nor in the conclusion of the
theorem, i.e. all such variable names are invented by Z3 for the
purposes of facilitating intermediate proofs and are thus irrelevant
with regards to the final theorem.
While reconstructing a proof of `(if x < y then x else y) <= x`,
Z3 v4.12.4 asks us to prove the following rewrite rule as one of the
proof steps:

(let ((@x91 (rewrite (= (not (>= (+ x (* (- 1) (ite $x36 y x))) 0))
  $x86))))

... corresponding to this term:

~(x + -1 * (if x + -1 * y >= 0 then y else x) >= 0) <=>
~(x + -1 * $var$(z3name!0) >= 0)

... where z3name!0 is a variable declared by Z3 at the beginning of
its proof certificate, but which we know nothing about at this point.

To make the proof go through, we unify the two sides of the
equivalence, such that we obtain:

z3name!0 = (if x + -1 * y >= 0 then y else x)

We then prove the theorem by substituting the variable(s) in the goal
and add the above definition to the same list of Z3-provided
definitions as that used by the `z3_intro_def` handler, so that it
gets removed at the end of the proof.
Z3 describes it as being the same as `bvsdiv`, "but created in a
context where the second operand can be assumed to be non-zero".

Currently we process it in the same way as `bvsdiv`.
Some terms, particularly in proofs containing word division, are huge
in size (for word32 division, dozens of MB when printed out, due to
combinatory explosion), so profiling is useful to figure out where the
proof replay is stuck or taking too much time.
They uncovered a bug during development of support for Z3 v4.12.x
proof replay, so it's probably useful to add them to the test suite to
avoid future regressions.
This proof rule is supposed to handle multiple symmetry and
transitivity rules. Z3 provides the following example:

A1 |- R a b   A2 |- R c b   A3 |- R c d
--------------------------------------- trans*
           A1 u A2 u A3 |- R a d

Although more generally, the proof rule is supposed to handle any
number of theorems passed as arguments and any path between the
elements.

R must be a symmetric and transitive relation.

This rule came up in the Z3 v4.12.4 proof reconstruction of the
following self-test:

f (f x) = x /\ f (f (f (f (f x)))) = x ==> f x = x

The only relation that came up in this proof was for equality.
It is unknown if currently Z3 also uses this proof rule for other
relations.

However, it is worth noting that the related `symm` and `trans` proof
rules, as currently implemented in HolSmt, can only handle the
equality relation, even though Z3 v4.12.4 also describes them as proof
rules for arbitrary relations.

For the initial implementation, we rely on metisLib.METIS_TAC to find
a proof. This is more than fast enough for the small proof of the
above self-test.

However, if METIS_TAC becomes a bottleneck, a more specialized proof
handler could be implemented to improve performance.
This commit adds new rewrite theorems for:

~(p ==> q) <=> ~(~p \/ q)

(~p <=> q) <=> (p <=/=> q)

In this commit they were inserted as theorems r032 and r010 (resp.),
others being renumbered as appropriate.

They come up in the Z3 v4.12.4 proof reconstruction for the following
self-tests, respectively:

(f (f x) = x) /\ (f (f (f (f (f x)))) = x) ==> (f x = x)

((x, y) = (y, x)) = (x = y)
This file was introduced in the following commit:
18c8b76

However, in that commit I forgot to add a description of the file to
the README (which contains a description of the other HolSmt files).
This commit fixes the parsing of the `bvand`, `bvor`, `bvadd`,
`bvmul`, `bvxor` and `bvxnor` operators in Z3 v4 proof reconstruction.

`bvxor` and `bvxnor` were already implemented but only as binary
operators.

However, Z3 v4.12.4 declares `bvxor` and `bvxnor` as being "flat
associative" and in (at least) one of the Z3 proofs in the self-tests
`bvxor` does appear with 3 arguments. Therefore, the parser was
changed to parse these operators using a left-associative application
of the arguments.

The other operators were missing, i.e. they were not being parsed.
In z3 v4.12.4 they are similarly declared as being flat associative.

Strangely, this version of Z3 does not declare `bvnand` and `bvnor` as
being flat associative, so we keep parsing them as binary operators.
HolSmt now supports proof reconstruction for Z3 v4.12.4, so we can
now use it by default to run the HolSmt self-tests.
Now that proof reconstruction for Z3 v4.12.4 works, we can deprecate
Z3 v2.19.

Using a modern version of Z3 has many advantages, since Z3 is now
open-source and also, binaries exist (or can be compiled) for
platforms other than x86 and x86-64.

Also, Z3 v4.12.x is in active development, which means there's a
greater chance that bugs that are encountered will be fixed (unlike
Z3 v2.19 which is a very old, deprecated version) and other
improvements made.

Modern versions of Z3 are also easier to install in current Linux
distributions, since usually they are already part of their package
collection.

Furthermore, if we can ignore Z3 v2.19 then we can start enabling
proof reconstruction for those self-tests which can only be
reconstructed in Z3 v4 (once the reconstruction is fixed, that is).
@tjark
Copy link

tjark commented Feb 8, 2024

I'll give @tjark a chance to weigh in before merging, but many thanks for the ongoing work on this!

Please go ahead! I haven't reviewed the code, but from the description of the PR, this sounds amazing.

@tjark
Copy link

tjark commented Feb 8, 2024

Note that this is only an initial implementation: it is almost guaranteed that HolSmt is still missing support for some Z3 v4.12.4 proof rules, but it is difficult to implement them until I run across examples where these rules appear in the proof certificates.

A good way to get more examples would be to run Z3 on the benchmarks in (relevant logics of) SMT-LIB. (Depending on how far you want to push this, you could basically re-do the evaluation from the "Fast LCF-Style Proof Reconstruction for Z3" paper with Z3 v4.12.4.)

@someplaceguy
Copy link
Contributor Author

someplaceguy commented Feb 8, 2024

Minor update:

  • Fixed some typos and slightly inaccurate terminology.
  • Fixed trans* handler to perform a union of all the hypotheses of all input theorems, which then becomes the set of assumptions used for proving. Otherwise, if any of the input theorems had any hypotheses, the theorem we are trying to prove would become unprovable. I had already realized that this was necessary (hence the A1 u A2 u A3 in the example I provided in the comment) but I had forgotten to implement it.
  • Added one more argument for deprecating Z3 v2.19.

Also, I should say that I'm not too happy with the implementation of remove_definitions. I think it should be very fast in most cases, i.e. if the nesting of the definitions isn't too deep, but otherwise it might cause an exponential term blow-up. The problem is that while proving the theorems necessary to remove those definitions from the set of hypotheses, the tactic I implemented will fully expand all definitions. This can be a problem in cases like this, which seem to be quite tricky to handle:

z1 = x + 1
z2 = y + 1

z3 =   z2    + (y + 1) +   z1
z3 = (y + 1) +    z2   + (x + 1)

z4 =          z3         + (z2 + z2 + z1) +   z1
z4 = (z2 + z2 + (x + 1)) +       z3       + (x + 1)

(...)

z64 =          z63          + (z62 + z62 + z1) +   z1
z64 = (z62 + z62 + (x + 1)) +        z63       + (x + 1)

In particular, this example would fail with the tactic I implemented since it would require more than 2^64 expansions, due to the deep nesting.

I'm thinking of doing a different implementation, where first I would gather all the definitions of z64 and would perform term unification between them, to arrive at a minimal expression that defines z64:

z64 = z63 + z63 + z1

... assuming the following:

z63 = z62 + z62 + (x + 1)
z63 = z62 + z62 + z1
z1 = x + 1

I would then somehow remove all other definitions of z64 (I guess by proving with substitution, using these assumptions).
Once I'd have a single minimal definition for z64 it would be trivial to remove it.

Finally, I would perform the exact same thing for z63, z62, z61 ... z1 (in order). It'd be important to do this in order, using a topological sort, to avoid doing a lot of extra work (since e.g. when dealing with z64 we'd be adding new assumptions for z63 ... z1, so if we process them in that order, they would deduplicate automatically when stored in the set of hypotheses).

I think this new alternative implementation could be a lot slower if Z3 provides us many duplicate definitions that are already fully or almost fully expanded (since we'd be doing a lot of unnecessary term unifications with huge terms), but the advantage is that it'd avoid the exponential blow-up for these worst-case scenarios.

I am also suspecting that these worst-case scenarios might actually show up in practice because I suspect that the entire reason why Z3 is inventing these definitions is indeed to avoid this exponential blow-up in proofs!

@someplaceguy
Copy link
Contributor Author

A good way to get more examples would be to run Z3 on the benchmarks in (relevant logics of) SMT-LIB. (Depending on how far you want to push this, you could basically re-do the evaluation from the "Fast LCF-Style Proof Reconstruction for Z3" paper with Z3 v4.12.4.)

That's a great idea, thanks! I would love to do this after I try fixing as many self-tests as possible to work with proof reconstruction!

@someplaceguy
Copy link
Contributor Author

someplaceguy commented Feb 8, 2024

Also, I should say that I'm not too happy with the implementation of remove_definitions.

I'm thinking of doing a different implementation (...)

So, I've been discussing this issue with @tjark, who gave me great suggestions. @tjark's idea is similar to what I was proposing, but @tjark's implementation suggestions lead to a simpler and much faster solution that what I was proposing, I think -- probably as fast as the current implementation in this PR in the straightforward cases, but completely avoiding the exponential blow-up in the hard ones!

I'm hoping I'll be able to do the new implementation within the next few days.

@mn200
Copy link
Member

mn200 commented Feb 8, 2024

I'll leave this open for the "new implementation" so we can catch all the work in one PR. Thanks!

This new implementation was developed to address concerns about
exponential term blow-up.

Basically, if we have definitions of the form:

z1 = 1
z2 = z1 + z1
z3 = z2 + z2
...
z64 = z63 + z63

... then the old implementation would have fully expanded the
definition of `z64` in order to remove it, which would not fit in
memory due to the 2^64 variable substitutions which would be
required.

This new implementation avoids fully expanding definitions. It was
developed with valuable suggestions and support from @tjark.
@someplaceguy
Copy link
Contributor Author

someplaceguy commented Feb 15, 2024

Added two commits:

  1. 67a8163 Replaces the old implementation of remove_definitions with the new one, which should properly handle the tricky cases that me and @tjark came up with.
  2. e72e7e5 Adds unit tests for remove_definitions. The two tricky unit tests were verified to fail with the old implementation of remove_definitions (i.e. if the previous commit is reverted).

That said, @tjark seems to have come up with another tricky case which I'm not sure the new implementation can handle. However, I did not understand the exact specifics. @tjark, can you provide more details about the example that you mentioned for which unification might not save us? I would like to add a unit test for that case as well, if possible.

Thank you!

@someplaceguy someplaceguy marked this pull request as draft February 20, 2024 10:38
@someplaceguy
Copy link
Contributor Author

Update: @tjark came up with a new clever scenario which 1) I hadn't anticipated and 2) is leading to circular definitions (due to term unification) even though they weren't circular to begin with. I have added a unit test for this new case (in my local repository) and I've confirmed that the new implementation is failing due to this issue. @tjark has also suggested a straightforward fix for this new issue which I'm planning to implement soon. Thanks!

@someplaceguy someplaceguy marked this pull request as ready for review February 21, 2024 17:12
@someplaceguy
Copy link
Contributor Author

I've fixed this PR to handle a kind of definitions which might appear and which I wasn't originally anticipating. These are definitions of the form var1 = var2, where at least one of the variables is defined by Z3.

These definitions were problematic for various reasons, especially the issue that @tjark identified which could cause circular definitions to arise even though they weren't circular to begin with.

To fix all the issues in a principled way, I've added code to keep track of which variables Z3 has defined in its proof log, so that we can refer to them in the proof replay procedure. I wasn't originally planning to do this as it didn't seem necessary, but in hindsight I believe it makes the code that handles these definitions clearer and more robust, especially when handling definitions of this kind. The commit messages and comments in the code contain more details.

I've also added a couple of unit tests to exercise the issue that @tjark identified with circular definitions.

Thanks!

These are run as part of the self-tests.

Currently only unit tests for `Z3_ProofReplay.remove_definitions`
have been added.

Two of them are meant to test tricky cases which would cause the
previous implemention of `remove_definitions` to fail.
It turns out that there are multiple reasons for why keeping track of
Z3-defined variables is desirable (or perhaps even needed) during
proof replay.

One of the issues is that when removing definitions, term unification
may end up creating new definitions of this form:

var1 = var2

... where `var1` was not in the previous set of variables that we were
keeping track of for removal. This would cause such definitions to be
ignored when calculating the new set of definitions to remove.

Another reason is that we want to avoid ending up with circular
definitions such as:

var1 = var2
var2 = var1

... where `var1` and `var2` are both Z3-defined variables. To prevent
this, we can orient such definitions so that `var2 = var1` is always
translated into `var1 = var2` (where `var1` <= `var2`, for some
definition of `<=`), i.e. we can always create them in a canonical
orientation.

Keeping track of Z3-defined variables also allows us to orient
definitions created due to term unification (e.g. as part of
`rewrite` proof rules) such that they end up as `var = x` instead
of `x = var`, where `x` is a user-defined variable and `var` is a
Z3-defined variable.

Therefore, this commit adds code to keep track of which variables have
been defined by Z3. As a side effect, it fixes the first issue
mentioned above.

A subsequent commit will use this functionality to orient the
definitions appropriately during proof replay, which will fix the
remaining two issues.
In Z3 proof replay, sometimes we may end up with definitions of the
form:

var1 = var2

This kind of definition may theoretically arise in three different
places:

1. As an explicit Z3-provided definition in `intro-def` proof rules
2. As an implicit Z3-provided definition which we instantiate as part
   of `rewrite` proof rules (after term unification)
3. When unifying terms to add new definitions in the process of
   removing the definitions from the final theorem.

This can be problematic if `var1` is not a Z3-defined variable (which
means `var2` is, and thus the definition should be reversed) or if
both `var1` and `var2` are Z3-defined variables.

If both vars are Z3-defined variables, we may end up with the
following two hypotheses in the final theorem, which
`remove_definitions` can't handle due to the circularity:

var1 = var2
var2 = var1

To avoid ending up with such problematic definitions, this commit
introduces code to change `var1 = var2` into `var2 = var1`, so that
the left-hand side is always a Z3-defined variable and, if the
right-hand side is also a Z3-defined variable, the former is not
"greater" than the latter (for some definition of "greater").
@mn200
Copy link
Member

mn200 commented Feb 21, 2024

Sounds like the epic story has come to a fittingly successful conclusion! Thanks for all the work on this one.

@mn200 mn200 merged commit 525694a into HOL-Theorem-Prover:develop Feb 21, 2024
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants