-
Notifications
You must be signed in to change notification settings - Fork 409
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
pkg: Use filtered_formula
to represent dependencies
#10918
pkg: Use filtered_formula
to represent dependencies
#10918
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some tests expectedly fail with hash incongruences, but some others fail with this:
- File "x.opam", line 1, characters 0-0:
- Error: Expected formula to be a conjunction of atoms but encountered non-atom
- term 'a | b'
+ Error: Unable to solve dependencies for the following lock directories:
+ Lock directory dune.lock:
+ Can't find all required versions.
+ Selected: x.dev a
+ - a -> (problem)
+ No known implementations at all
Shouldn't these tests now pass with a proper build?
@ElectreAAS The failure you quoted makes sense, because the test in The hash changes are an unfortumate side-effect as the non-local dependency set is calculated from a different representation and bar of implementing the OPAM filters into Dune's condition language and then hashing that there is no good way of preserving the previous hashes (or running a hybrid approach of encoding them one way if they can and another if they cannot be represented using dune-lang conditions). I don't think we guarantee that the hashes are stable between versions of Dune. |
3183e7e
to
31229fb
Compare
I've replaced the failing test with one that shows that disjunctions in opam files lead to valid lock directory solutions using either side of the disjunction. |
Oh nice! Thanks for updating the test |
b3969c7
to
5728b11
Compare
src/dune_pkg/opam_solver.ml
Outdated
let deps = | ||
List.map pkg.dependencies ~f:(fun (d : Package_dependency.t) -> d.name) | ||
in | ||
let deps = Local_package.For_solver.dependency_names pkg in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't this wrong? You should not consider packages that haven't been selected for the reachability check. This would mean that you're including packages in the lock directory that are not needed for the actual build. This is a good thing to add a test for as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, you're completely right, thanks for pointing this out. This code needs to be changed because it uses the input to the solver as the output instead of using the solution that the solver outputs.
I'll add a test that checks that dependencies that don't get picked by the solver don't end up in the lock directory.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm looking what the semantics of reachable
is and according to #10814 it is about removing post
dependencies (and also post-deps-solving.t
is the only test failing if I comment out the reachability test).
Since this is basically about filtering the solver output via the possible deps from the input, how about still collecting all possible packages (since they are potentially reachable) except those that are post
and using that for reachability?
That way packages that the solver is not picking will not be part of the lock dir (as the solution just does not contain them) and post
packages won't be picked either (as they will be filtered out by the reachability check). This makes post
a bit special but post
is already a magic variable in opam
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a test showing an unnecessary post dependencies being eliminated.
how about still collecting all possible packages (since they are potentially reachable) except those that are post and using that for reachability?
I think it's just simpler to do this without any hacks. We've already resolved the or
deps when writing the dependency list in the lock directory. Might as well just use that here instead of the formula
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In particular, I'm thinking of just using this snippet to evaluate the formula:
Resolve_opam_formula.filtered_formula_to_package_names
~with_test:false
(add_self_to_filter_env opam_package (Solver_env.to_env solver_env))
version_by_package_name
what
We already do this for external packages. No reason why we can't do it for local packages for the reachability check. You might want to wait until #11022 is merged though, as I'm going to move this snippet somewhere where it can be shared.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I missed the comment. Ok, I'll try to evaluate the formula instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking at the documentation of filtered_formula_to_package_names
makes me question whether this is the right thing to use:
Given a map associating package names with the version of that package in a solution, this function evaluates an
OpamFormula.t
to a concrete list of package names all contained in the solution with versions which satisfy the formula, or an error if the the version constraints or packages in the formula can't be satisfied with the packages in the solution. If the formula contains disjunctions (the|
operator in opam) then there may be multiple sets of packages that satisfy the formula. In this case the choice of which set to return is made arbitrarily.
This sounds to me that if the solver chooses one side of the disjunction and filtered_formula_to_package_names
the other, this will lead to all the solver-selected legitimate dependencies being filtered out. Or, if I pass in the result from the solver as the package name map, the function will fail to find a solution.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This sounds to me that if the solver chooses one side of the disjunction and filtered_formula_to_package_names the other, this will lead to all the solver-selected legitimate dependencies being filtered out. Or, if I pass in the result from the solver as the package name map, the function will fail to find a solution.
If this is an issue, then it's already present with packages from opam repository. Working around it for local packages defined using the opam syntax is going to be an insufficient work around anyway.
But I think this isn't a huge issue. Rephrased less pessimistically, it tell us that solver can find at least one way to satisfy the disjunction. If there's only one way, we will find it. If there's more than one way, we will pick some arbitrary way - but it will still satisfy the formula. We're only planning to execute one build plan using this solution, so we indeed need only one solution.
I agree that It's rather unsatisfying that we can't tell the user there's more than one solution. I would really prefer to tell the user to write more constraints so that we don't have to choose things arbitrarily. But in the end, only one build plan is executed, and therefore only one side of any disjunction needs to be reachable.
Looked over the approach again. I agree that emitting a dependency that wasn't selected for the hash check as an example isn't so bad. It's also an improvement to have the dependency hash reflect the formula. There's an issue introduced to the reachability check though. That one should most definitely only consider effective dependency sets. Doing otherwise would include useless packages in the lock dir |
48312d3
to
6848f33
Compare
In my tests this seems to work, as the reachability is tested by filtering over a solver solution, so if the solver doesn't pick it, it isn't ever checked for reachability. That works even in the presence of |
8b2c551
to
4cfe9e4
Compare
src/dune_pkg/dependency_formula.ml
Outdated
Dune_sexp.List [ Dune_sexp.atom "or"; l; r ] | ||
;; | ||
|
||
let rec to_sexp v = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A bit of a shakem to introduce this much boilerplate. I just noticed we have some converters in Opam_dyn
. How about we try using that instead of the sexp?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That works too, I've replaced it with converting to Dyn.t
and hashing that. I currently hash it via to_string
but maybe a nicer solution would be using Dyn.hash
and converting the int
into a hex digest?
8c479b4
to
64ba323
Compare
ac42c54
to
408bd26
Compare
The `Dependency_set.t` representation can't deal with disjunctions but in most cases that is not even necessary as the set gets turned into a `filtered_formula` again. Thus it might be easier to keep the original representation and implement the necessary dependency set functionality on top of that. Signed-off-by: Marek Kubica <marek@tarides.com>
Signed-off-by: Marek Kubica <marek@tarides.com>
Signed-off-by: Marek Kubica <marek@tarides.com>
Signed-off-by: Marek Kubica <marek@tarides.com>
Signed-off-by: Marek Kubica <marek@tarides.com>
Signed-off-by: Marek Kubica <marek@tarides.com>
Signed-off-by: Marek Kubica <marek@tarides.com>
Signed-off-by: Marek Kubica <marek@tarides.com>
Signed-off-by: Marek Kubica <marek@tarides.com>
Signed-off-by: Marek Kubica <marek@tarides.com>
Signed-off-by: Rudi Grinberg <me@rgrinberg.com> Signed-off-by: Marek Kubica <marek@tarides.com>
Signed-off-by: Marek Kubica <marek@tarides.com>
Signed-off-by: Marek Kubica <marek@tarides.com>
Signed-off-by: Marek Kubica <marek@tarides.com>
Signed-off-by: Marek Kubica <marek@tarides.com>
Signed-off-by: Marek Kubica <marek@tarides.com>
24f3a62
to
440d328
Compare
972e02d
to
ce95b47
Compare
LGTM. I just replaced the recursion with some existing opam library functions. |
At the moment when an OPAM package in the local packages we only accept a list of
Atom
s, so if it has an disjunction in the dependencies we fail when loading as we attempt to parse the list ofAtom
s into aDependency_set.t
structure. This structure only supports a sequence of conjunctions (as that is the only syntax thatdune-project
files support at the moment).This PR changes the internal representation of the dependencies to be
OpamTypes.filtered_formula
as when reading OPAM files this is what we get from the OPAM file parser. When we pass the data to the solver it is alsoOpamTpyes.filtered_formula
that we have to encode back to. Thus I suggest keepingOpamTypes.filtered_formula
as dependency specification and compute the information we need from it directly. The information is fairly limited as we only need one example package name for an error message and a hash of the dependencies.The downside of this is that all existing dependency hashes change (see tests), as the representation of the package constraints the hash is calculated over is changing, but I consider this an acceptable compromise.
Fixes #10837