-
Notifications
You must be signed in to change notification settings - Fork 54
Maps
We've done enough with lists. Often, we want not just a collection of elements but to associate every element with some other value, and to be able to look up those values quickly.
The Mercury standard library provides a map abstract data type map(K, V)
that holds a set of key-value pairs. The keys are of type K, which can be
any type for which builtin.compare/3
provides an ordering (not predicate or
function types, for example).
The values associated with each key are of any type V.
You may wish to open the documentation for the map module as you read this chapter.
Start by importing the map
module:
:- import_module map.
There is no special support for maps in the compiler. To create an empty map,
you call either the map.init
predicate or function:
% Initialize an empty map.
%
:- func init = (map(K, V)::uo) is det.
:- pred init(map(_, _)::uo) is det.
You can also create an map of a single key-value pair with this function:
% Initialize a map containing the given key-value pair.
%
:- func singleton(K, V) = map(K, V).
Map0 = map.singleton("some key", "some value")
You can insert another key-value pair into a map with the insert
predicate
or function:
% Insert a new key and corresponding value into a map.
% Fail if the key already exists.
%
:- func insert(map(K, V), K, V) = map(K, V) is semidet.
:- pred insert(K::in, V::in, map(K, V)::in, map(K, V)::out) is semidet.
Notice that insert is careful not to overwrite an existing key-value pair already in the map.
If you are sure the key doesn't exist, or you would rather catch an exception
if it does exist, you can use the det_insert
predicate or function:
% Insert a new key and corresponding value into a map.
% Abort if the key already exists.
%
:- func det_insert(map(K, V), K, V) = map(K, V).
:- pred det_insert(K::in, V::in, map(K, V)::in, map(K, V)::out) is det.
"Abort" in the documentation means throwing an exception (we should probably change the wording).
map.init(Map0),
map.det_insert("some key", "some value", Map0, Map1)
To check if a map already contains a key, use the contains/2
predicate:
% Succeed iff the map contains the given key.
%
:- pred contains(map(K, _V)::in, K::in) is semidet.
More often, you'll also want to get the value associated with the key.
Then use search
or lookup
:
% Search map for key.
%
:- func search(map(K, V), K) = V is semidet.
:- pred search(map(K, V)::in, K::in, V::out) is semidet.
% Search map for key, but abort if search fails.
%
:- func lookup(map(K, V), K) = V.
:- pred lookup(map(K, V)::in, K::in, V::out) is det.
This naming convention, where "search" fails if something does not exist but "lookup" throws an exception, is common in the standard library.
( if map.search(Map, "some key", Value) then
/* Value is associated with the key */
else
/* key not found */
)
To delete a key-value pair from a map, you would (surprise)
call the delete
predicate or function:
% Delete a key-value pair from a map.
% If the key is not present, leave the map unchanged.
%
:- func delete(map(K, V), K) = map(K, V).
:- pred delete(K::in, map(K, V)::in, map(K, V)::out) is det.
There are also predicates named remove
or det_remove
that succeed
only if the key actually exists in the map. That allows them to also
return the value that was associated with the key.
% Delete a key-value pair from a map and return the value.
% Fail if the key is not present.
%
:- pred remove(K::in, V::out, map(K, V)::in, map(K, V)::out) is semidet.
% Delete a key-value pair from a map and return the value.
% Abort if the key is not present.
%
:- pred det_remove(K::in, V::out, map(K, V)::in, map(K, V)::out) is det.
The delete/remove distinction is another naming convention used in the standard library.
( if map.remove("some key", Value, Map0, Map1) then
/* Value was associated with the key */
else
/* key not found */
)
You can change the value associated with a key within a map. If you expect
the key to exist in the map, you would use the update
or det_update
predicate or function:
% Update the value corresponding to a given key
% Fail if the key doesn't already exist.
%
:- func update(map(K, V), K, V) = map(K, V) is semidet.
:- pred update(K::in, V::in, map(K, V)::in, map(K, V)::out) is semidet.
% Update the value corresponding to a given key
% Abort if the key doesn't already exist.
%
:- func det_update(map(K, V), K, V) = map(K, V).
:- pred det_update(K::in, V::in, map(K, V)::in, map(K, V)::out) is det.
If you do not care whether the key already exists in the map, then you
can use the set
predicate or function:
% Update value if the key is already present, otherwise
% insert new key and value.
%
:- func set(map(K, V), K, V) = map(K, V).
:- pred set(K::in, V::in, map(K, V)::in, map(K, V)::out) is det.
Using det_insert
or det_update
can potentially turn up logical errors
in your code that set
would hide.
Just as we could "fold" over a list, there are predicates to perform in-order or reverse order traversals over a map, i.e. call a higher-order term for each key-value pair in the map.
The most basic one is foldl with one accumulator pair.
The ordering is as determined by builtin.compare/3
.
% Perform an inorder traversal of the map, applying
% an accumulator predicate for each key-value pair.
%
:- func foldl(func(K, V, A) = A, map(K, V), A) = A.
:- pred foldl(pred(K, V, A, A), map(K, V), A, A).
:- mode foldl(pred(in, in, in, out) is det, in, in, out) is det.
% more modes omitted
Or, if you want to perform the traversal in reverse:
:- func foldr(func(K, V, A) = A, map(K, V), A) = A.
:- pred foldr(pred(K, V, A, A), map(K, V), A, A).
:- mode foldr(pred(in, in, in, out) is det, in, in, out) is det.
% more modes omitted
As with lists, there are more predicates that do that same thing but accept
more accumulator pairs, up to map.foldl5
and map.foldr5
.
There are also analogues of list.map
to transform each value in the map
(map_values
, map_values_only
, etc.) and map-foldl fusions.
It can be convenient to convert a map into a list representation then work with the key-value pairs as a list. Or you build up a list of keys and values and need to convert it to a map for efficient value lookup, or other reasons.
An association list has the type assoc_list(K, V)
, defined in the
assoc_list
module as:
:- type assoc_list(K, V) == list(pair(K, V)).
And pair(K, V)
is a type defined in the pair
module:
:- type pair(T1, T2)
---> (T1 - T2).
So ["one" - 1, "two" - 2, "three" - 3]
is an association list of type
assoc_list(string, int)
. Association lists are not necessarily sorted,
and may contain duplicate keys.
The map
module provides predicates and functions to convert a map to an
association list:
% Convert a map to an association list.
%
:- func to_assoc_list(map(K, V)) = assoc_list(K, V).
:- pred to_assoc_list(map(K, V)::in, assoc_list(K, V)::out) is det.
% Convert a map to an association list which is sorted on the keys.
%
:- func to_sorted_assoc_list(map(K, V)) = assoc_list(K, V).
:- pred to_sorted_assoc_list(map(K, V)::in, assoc_list(K, V)::out) is det.
to_assoc_list
could, in theory, return elements in any order.
In the other direction, we have:
% Convert an association list to a map.
%
:- func from_assoc_list(assoc_list(K, V)) = map(K, V).
:- pred from_assoc_list(assoc_list(K, V)::in, map(K, V)::out) is det.
from_assoc_list
has to work correctly even if its input list is unsorted or
contains duplicate keys. If the input is already sorted and has no duplicates,
it is possible to construct the map more efficiently using these predicates
and functions:
% Convert a sorted association list with no duplicated keys to a map.
%
:- func from_sorted_assoc_list(assoc_list(K, V)) = map(K, V).
:- pred from_sorted_assoc_list(assoc_list(K, V)::in, map(K, V)::out)
is det.
% Convert a reverse sorted association list with no duplicated keys
% to a map.
%
:- func from_rev_sorted_assoc_list(assoc_list(K, V)) = map(K, V).
:- pred from_rev_sorted_assoc_list(assoc_list(K, V)::in, map(K, V)::out)
is det.
You have to be careful, though, because "sorted" means as ordered by
builtin.compare/3
. In the following example, AssocList
is sorted
according to the Swedish sort order, which might be employed if you
filtered the list through the Unix sort
utility, but it would be
invalid to pass the list to map.from_sorted_assoc_list
.
AssocList = [
"piña" - 1,
"pint" - 2
],
map.from_sorted_assoc_list(AssocList, Map) % invalid
That should cover the most common operations on maps. The map
module
provides other operations that might come in handy so have a look at its
documentation.
In a pure language like Mercury, when we modify a map M0 to produce M1, we don't want the value of the old map to change. Unless we can guarantee that M0 can never be referenced ever again, we need to create a new map instead of destroying the old one. Tree structures are often employed to implement persistent data structures (data structures which preserve old versions) because a new tree can be created fairly cheaply by sharing most of its sub-trees with the old tree.
The map(K, V)
type is implemented using balanced 2-3-4 trees. A 2-3-4
tree has interior nodes with 2, 3 or 4 children so will have a smaller
height than an equivalent binary tree with the same number of
elements. When you "change" a node in the tree, there will be fewer
nodes leading up to the root of the tree that need to be recreated.
I believe that is why the 2-3-4 tree representation is preferred over,
say, a red-black tree. Either that, or because keeping 2-3-4 trees
balanced is more efficient. (The rbtree
module contains an
implementation of red-black trees if you wish to compare.)
As map(K, V)
is represented with a balanced tree, you can know that
the insert/search/delete operations have O(log n) time complexity.
That may change if the representation ever changes.
This program reads from standard input then prints out the number
of occurrences of each unique word (a "word" being simply whatever
string.word
considers a word). It uses a map to keep track of the
number of occurrences of each word seen so far.
./count_words < animals.txt
1 gnu
2 liger
2 lion
1 tiger
:- module count_words.
:- interface.
:- import_module io.
:- pred main(io::di, io::uo) is det.
:- implementation.
:- import_module int.
:- import_module list.
:- import_module map.
:- import_module string.
main(!IO) :-
map.init(Counts0),
read_and_count_words(Counts0, Counts, !IO),
map.foldl(print_count, Counts, !IO).
:- pred read_and_count_words(map(string, int)::in, map(string, int)::out,
io::di, io::uo) is det.
read_and_count_words(!Counts, !IO) :-
io.read_line_as_string(ReadRes, !IO),
(
ReadRes = ok(Line),
list.foldl(count_string, words(Line), !Counts),
read_and_count_words(!Counts, !IO)
;
ReadRes = eof
;
ReadRes = error(Error),
io.write_string("Error: ", !IO),
io.write_string(error_message(Error), !IO),
io.nl(!IO)
).
:- pred count_string(string::in,
map(string, int)::in, map(string, int)::out) is det.
count_string(String, !Counts) :-
( if map.search(!.Counts, String, Count) then
map.det_update(String, Count + 1, !Counts)
else
map.det_insert(String, 1, !Counts)
).
:- pred print_count(string::in, int::in, io::di, io::uo) is det.
print_count(String, Count, !IO) :-
io.write_int(Count, !IO),
io.write_char('\t', !IO),
io.write_string(String, !IO),
io.nl(!IO).
-
Make
count_words
print out words ordered by the number of occurrences. -
Think of a more interesting example program.