Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: OrderedAssocList #556

Open
wants to merge 19 commits into
base: main
Choose a base branch
from
Open

feat: OrderedAssocList #556

wants to merge 19 commits into from

Conversation

kim-em
Copy link
Collaborator

@kim-em kim-em commented Jan 23, 2024

Ordered wrapper around AssocList, and basic functions find?, insert?, filterMapVal, and merge. An extensionality lemma (∀ a, l₁.find? a = l₂.find? a) → l₁ = l₂.

I will later use this to implement sparse coefficients in omega.

@kim-em kim-em added the awaiting-review This PR is ready for review; the author thinks it is ready to be merged. label Jan 23, 2024
@joehendrix
Copy link
Contributor

I'm assuming this is useful for kernel execution efficiency. Is that the case? Could you test performance against RBMap?

@kim-em
Copy link
Collaborator Author

kim-em commented Jan 30, 2024

I'm assuming this is useful for kernel execution efficiency. Is that the case? Could you test performance against RBMap?

No, actually, there's no need to run any of this in the kernel. I want this because I think it will be easier to write the theorems about algebraic operations on finitely supported functions Nat \to Int that I need in this representation than in RBMap.

Co-authored-by: Mario Carneiro <di.gama@gmail.com>
@digama0
Copy link
Member

digama0 commented Jan 30, 2024

In that case I'm a bit dubious of adding a new data structure like this to std, since it has neither good kernel performance nor good runtime performance, and we already have other ways to write finitely supported functions for proving purposes.

@kim-em
Copy link
Collaborator Author

kim-em commented Jan 30, 2024

Hmm, okay. I will see how proofs go on top of RBMap + a wrapper asserting there are no zero values.

@kim-em
Copy link
Collaborator Author

kim-em commented Jan 30, 2024

Currently RBMap.mergeWith is O(n₂ * log (n₁ + n₂)), whereas the merge here is O(n₁ + n₂). (I hope. :-)

@digama0
Copy link
Member

digama0 commented Jan 30, 2024

Hmm, okay. I will see how proofs go on top of RBMap + a wrapper asserting there are no zero values.

What are you trying to do more specifically? Why do you need proofs in this case? If this is an omega data structure I would expect proofs to not be necessary.

Currently RBMap.mergeWith is O(n₂ * log (n₁ + n₂)), whereas the merge here is O(n₁ + n₂). (I hope. :-)

Wikipedia describes a better algorithm for RBMap merge with complexity O(n₂ * log (n₁ / n₂ + 1)), which is strictly better than O(n₁ + n₂) (it wins mainly in the unbalanced case). The implementation here is a bit lazy, but with some proof work I think it wouldn't be too bad to implement the more efficient algorithm.

@kim-em
Copy link
Collaborator Author

kim-em commented Jan 30, 2024

Hmm, okay. I will see how proofs go on top of RBMap + a wrapper asserting there are no zero values.

What are you trying to do more specifically? Why do you need proofs in this case? If this is an omega data structure I would expect proofs to not be necessary.

This is very slightly inaccurate (because of subsequent laziness taking some shortcuts), but the functions and theorems I need on the representation of coefficients are given in https://github.com/leanprover/std4/blob/main/Std/Tactic/Omega/Coeffs/IntList.lean

Currently all these functions are implemented on List Int, i.e. a dense representation of coefficients, only trimming trailing zeros as needed to make the theorems true!

As you can see the intention is that I should be able to swap out any representation satisfying that API.

This is not particularly urgent, as right now now one is actually calling omega with more than ~10 variables... But Leo assures me that that someone will want to eventually.

@digama0
Copy link
Member

digama0 commented Jan 30, 2024

So why not just use AssocList there? Or Array (Nat x Int) for that matter.

@kim-em
Copy link
Collaborator Author

kim-em commented Jan 30, 2024

So why not just use AssocList there? Or Array (Nat x Int) for that matter.

Because then the merge operation (which is actually run, not just proved about) is O(n1 * n2), and I wanted to avoid that.

@digama0
Copy link
Member

digama0 commented Jan 30, 2024

So why not just use AssocList there? Or Array (Nat x Int) for that matter.

Because then the merge operation (which is actually run, not just proved about) is O(n1 * n2), and I wanted to avoid that.

I mean, you can still implement a O(n1 + n2) merge operation on AssocList which works provided the inputs are sorted. That is, you are working with ordered AssocLists, just not as a distinct type.

@kim-em
Copy link
Collaborator Author

kim-em commented Jan 31, 2024

So why not just use AssocList there? Or Array (Nat x Int) for that matter.

Because then the merge operation (which is actually run, not just proved about) is O(n1 * n2), and I wanted to avoid that.

I mean, you can still implement a O(n1 + n2) merge operation on AssocList which works provided the inputs are sorted. That is, you are working with ordered AssocLists, just not as a distinct type.

That's what this PR does! AssocList.orderedMerge. It seems really cumbersome to then state all the theorems about it (e.g. how it relates to find?, associativity, etc) if you forbid yourself from mentioning the bundled type that wraps up the witness of ordered-ness.

@digama0
Copy link
Member

digama0 commented Feb 1, 2024

Is it? I would expect it to just be a simple predicate, e.g.

theorem orderedMerge_ordered : Ordered l1 -> Ordered l2 -> Ordered (l1.orderedMerge l2) := ...

Actually it might not even be Ordered, it could be Pairwise R or something, at least for the low level primitive. You can therefore manipulate the ordered-ness independently of the list itself, and this features quite prominently in sorting functions for lists, for example, so I don't think it's a particularly bad design.

@kim-em kim-em added WIP work in progress and removed awaiting-review This PR is ready for review; the author thinks it is ready to be merged. labels Feb 3, 2024
@kim-em
Copy link
Collaborator Author

kim-em commented Feb 3, 2024

Is it?

I've started refactoring, and agree this is better. Thanks for talking me around. :-)

Std/Data/AssocList.lean Outdated Show resolved Hide resolved
@fgdorais
Copy link
Collaborator

fgdorais commented Feb 6, 2024

I think all my concerns have been addressed. Just one additional remark: the design of AssocList so far is that defs are specced using toList, correctness and other lemmas are then proved using the toList translation and the List library. I think this makes good sense and should be considered in the revisions.

@kim-em
Copy link
Collaborator Author

kim-em commented Feb 7, 2024

the design of AssocList so far is that defs are specced using toList, correctness and other lemmas are then proved using the toList translation and the List library.

This is still WIP, I will probably eventually remove the bundled OrderedAssocList entirely when I get back to this PR.

However, I am not intending to use toList for specification: the great property of an ordered assoc list is that find? suffices for specifications, and this is intentionally different from what we have to do for AssocList.

@leanprover-community-mathlib4-bot leanprover-community-mathlib4-bot added the merge-conflict This PR has merge conflicts with the `main` branch which must be resolved by the author. label Mar 5, 2024
@leanprover-community-mathlib4-bot leanprover-community-mathlib4-bot removed the merge-conflict This PR has merge conflicts with the `main` branch which must be resolved by the author. label Nov 23, 2024
@kim-em
Copy link
Collaborator Author

kim-em commented Nov 23, 2024

I've cleaned this up. I decided to keep the bundled structure because the whole point is the nice extensionality lemma available there.

@leanprover-community-bot
Copy link
Collaborator

Mathlib CI status (docs):

Copy link
Collaborator

@fgdorais fgdorais left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I only made it part way but I don't know when I will come back to it :/

The predicate that the keys of an `AssocList` are
in strictly increasing order according to the comparator `cmp`.
-/
def keysOrdered (cmp : α → α → Ordering) : AssocList α β → Prop
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make Bool valued. It's nicely tail recursive too!

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm skeptical. What is the pay-off? It's kind of annoying to adapt to this.

Copy link
Collaborator

@fgdorais fgdorais Nov 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True, it's a major refactor. My concern is that the decision procedure for the current version is not necessarily as efficient. Can we have both? Also, keysOrdered should be KeysOrdered if it's a proposition.

Have you considered using ltHeadKey? here to save some cases?

def KeysOrdered (cmp : α → α → Ordering) : AssocList α β → Prop
  | .nil => True
  | .cons a _ t => ltHeadKey? cmp a t ∧ KeysOrdered cmp t

/--
The condition that an element is less than the first key of an `AssocList`, or that list is empty.
-/
abbrev ltHeadKey? (cmp : α → α → Ordering) (a : α) (t : AssocList α β) : Prop :=
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes more sense as Bool valued. Something about the ? makes me unhappy here but it's hard to think of a more accurate alternative.

| .cons _ _ .nil => True
| .cons a _ (.cons x y t) => cmp a x = .lt ∧ keysOrdered cmp (.cons x y t)

instance instKeysOrderedDecidablePred : DecidablePred (keysOrdered cmp : AssocList α β → Prop) := by
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Won't be necessary if keysOrdered returns Bool.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
builds-mathlib WIP work in progress
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants