Skip to content
wangp edited this page Aug 27, 2017 · 1 revision

We've done enough with lists. Often, we want not just a collection of elements but to associate every element with some other value, and to be able to look up those values quickly.

The Mercury standard library provides a map abstract data type map(K, V) that holds a set of key-value pairs. The keys are of type K, which can be any type for which builtin.compare/3 provides an ordering (not predicate or function types, for example). The values associated with each key are of any type V.

You may wish to open the documentation for the map module as you read this chapter.

Creating a map

Start by importing the map module:

:- import_module map.

There is no special support for maps in the compiler. To create an empty map, you call either the map.init predicate or function:

    % Initialize an empty map.
    %
:- func init = (map(K, V)::uo) is det.
:- pred init(map(_, _)::uo) is det.

You can also create an map of a single key-value pair with this function:

    % Initialize a map containing the given key-value pair.
    %
:- func singleton(K, V) = map(K, V).

Example

Map0 = map.singleton("some key", "some value")

Inserting elements

You can insert another key-value pair into a map with the insert predicate or function:

    % Insert a new key and corresponding value into a map.
    % Fail if the key already exists.
    %
:- func insert(map(K, V), K, V) = map(K, V) is semidet.
:- pred insert(K::in, V::in, map(K, V)::in, map(K, V)::out) is semidet.

Notice that insert is careful not to overwrite an existing key-value pair already in the map.

If you are sure the key doesn't exist, or you would rather catch an exception if it does exist, you can use the det_insert predicate or function:

    % Insert a new key and corresponding value into a map.
    % Abort if the key already exists.
    %
:- func det_insert(map(K, V), K, V) = map(K, V).
:- pred det_insert(K::in, V::in, map(K, V)::in, map(K, V)::out) is det.

"Abort" in the documentation means throwing an exception (we should probably change the wording).

Example

map.init(Map0),
map.det_insert("some key", "some value", Map0, Map1)

Searching for elements

To check if a map already contains a key, use the contains/2 predicate:

    % Succeed iff the map contains the given key.
    %
:- pred contains(map(K, _V)::in, K::in) is semidet.

More often, you'll also want to get the value associated with the key. Then use search or lookup:

    % Search map for key.
    %
:- func search(map(K, V), K) = V is semidet.
:- pred search(map(K, V)::in, K::in, V::out) is semidet.

    % Search map for key, but abort if search fails.
    %
:- func lookup(map(K, V), K) = V.
:- pred lookup(map(K, V)::in, K::in, V::out) is det.

This naming convention, where "search" fails if something does not exist but "lookup" throws an exception, is common in the standard library.

Example

( if map.search(Map, "some key", Value) then
    /* Value is associated with the key */
else
    /* key not found */
)

Deleting elements

To delete a key-value pair from a map, you would (surprise) call the delete predicate or function:

    % Delete a key-value pair from a map.
    % If the key is not present, leave the map unchanged.
    %
:- func delete(map(K, V), K) = map(K, V).
:- pred delete(K::in, map(K, V)::in, map(K, V)::out) is det.

There are also predicates named remove or det_remove that succeed only if the key actually exists in the map. That allows them to also return the value that was associated with the key.

    % Delete a key-value pair from a map and return the value.
    % Fail if the key is not present.
    %
:- pred remove(K::in, V::out, map(K, V)::in, map(K, V)::out) is semidet.

    % Delete a key-value pair from a map and return the value.
    % Abort if the key is not present.
    %
:- pred det_remove(K::in, V::out, map(K, V)::in, map(K, V)::out) is det.

The delete/remove distinction is another naming convention used in the standard library.

Example

( if map.remove("some key", Value, Map0, Map1) then
    /* Value was associated with the key */
else
    /* key not found */
)

Updating elements

You can change the value associated with a key within a map. If you expect the key to exist in the map, you would use the update or det_update predicate or function:

    % Update the value corresponding to a given key
    % Fail if the key doesn't already exist.
    %
:- func update(map(K, V), K, V) = map(K, V) is semidet.
:- pred update(K::in, V::in, map(K, V)::in, map(K, V)::out) is semidet.

    % Update the value corresponding to a given key
    % Abort if the key doesn't already exist.
    %
:- func det_update(map(K, V), K, V) = map(K, V).
:- pred det_update(K::in, V::in, map(K, V)::in, map(K, V)::out) is det.

If you do not care whether the key already exists in the map, then you can use the set predicate or function:

    % Update value if the key is already present, otherwise
    % insert new key and value.
    %
:- func set(map(K, V), K, V) = map(K, V).
:- pred set(K::in, V::in, map(K, V)::in, map(K, V)::out) is det.

Using det_insert or det_update can potentially turn up logical errors in your code that set would hide.

Map traversals

Just as we could "fold" over a list, there are predicates to perform in-order or reverse order traversals over a map, i.e. call a higher-order term for each key-value pair in the map.

The most basic one is foldl with one accumulator pair. The ordering is as determined by builtin.compare/3.

    % Perform an inorder traversal of the map, applying
    % an accumulator predicate for each key-value pair.
    %
:- func foldl(func(K, V, A) = A, map(K, V), A) = A.
:- pred foldl(pred(K, V, A, A), map(K, V), A, A).
:- mode foldl(pred(in, in, in, out) is det, in, in, out) is det.
% more modes omitted

Or, if you want to perform the traversal in reverse:

:- func foldr(func(K, V, A) = A, map(K, V), A) = A.
:- pred foldr(pred(K, V, A, A), map(K, V), A, A).
:- mode foldr(pred(in, in, in, out) is det, in, in, out) is det.
% more modes omitted

As with lists, there are more predicates that do that same thing but accept more accumulator pairs, up to map.foldl5 and map.foldr5. There are also analogues of list.map to transform each value in the map (map_values, map_values_only, etc.) and map-foldl fusions.

Converting between map and list representations

It can be convenient to convert a map into a list representation then work with the key-value pairs as a list. Or you build up a list of keys and values and need to convert it to a map for efficient value lookup, or other reasons.

An association list has the type assoc_list(K, V), defined in the assoc_list module as:

:- type assoc_list(K, V) == list(pair(K, V)).

And pair(K, V) is a type defined in the pair module:

:- type pair(T1, T2)
    --->    (T1 - T2).

So ["one" - 1, "two" - 2, "three" - 3] is an association list of type assoc_list(string, int). Association lists are not necessarily sorted, and may contain duplicate keys.

The map module provides predicates and functions to convert a map to an association list:

    % Convert a map to an association list.
    %
:- func to_assoc_list(map(K, V)) = assoc_list(K, V).
:- pred to_assoc_list(map(K, V)::in, assoc_list(K, V)::out) is det.

    % Convert a map to an association list which is sorted on the keys.
    %
:- func to_sorted_assoc_list(map(K, V)) = assoc_list(K, V).
:- pred to_sorted_assoc_list(map(K, V)::in, assoc_list(K, V)::out) is det.

to_assoc_list could, in theory, return elements in any order.

In the other direction, we have:

    % Convert an association list to a map.
    %
:- func from_assoc_list(assoc_list(K, V)) = map(K, V).
:- pred from_assoc_list(assoc_list(K, V)::in, map(K, V)::out) is det.

from_assoc_list has to work correctly even if its input list is unsorted or contains duplicate keys. If the input is already sorted and has no duplicates, it is possible to construct the map more efficiently using these predicates and functions:

    % Convert a sorted association list with no duplicated keys to a map.
    %
:- func from_sorted_assoc_list(assoc_list(K, V)) = map(K, V).
:- pred from_sorted_assoc_list(assoc_list(K, V)::in, map(K, V)::out)
    is det.

    % Convert a reverse sorted association list with no duplicated keys
    % to a map.
    %
:- func from_rev_sorted_assoc_list(assoc_list(K, V)) = map(K, V).
:- pred from_rev_sorted_assoc_list(assoc_list(K, V)::in, map(K, V)::out)
    is det.

You have to be careful, though, because "sorted" means as ordered by builtin.compare/3. In the following example, AssocList is sorted according to the Swedish sort order, which might be employed if you filtered the list through the Unix sort utility, but it would be invalid to pass the list to map.from_sorted_assoc_list.

AssocList = [
    "piña" - 1,
    "pint" - 2
],
map.from_sorted_assoc_list(AssocList, Map)  % invalid

Other operations

That should cover the most common operations on maps. The map module provides other operations that might come in handy so have a look at its documentation.

Implementation

In a pure language like Mercury, when we modify a map M0 to produce M1, we don't want the value of the old map to change. Unless we can guarantee that M0 can never be referenced ever again, we need to create a new map instead of destroying the old one. Tree structures are often employed to implement persistent data structures (data structures which preserve old versions) because a new tree can be created fairly cheaply by sharing most of its sub-trees with the old tree.

The map(K, V) type is implemented using balanced 2-3-4 trees. A 2-3-4 tree has interior nodes with 2, 3 or 4 children so will have a smaller height than an equivalent binary tree with the same number of elements. When you "change" a node in the tree, there will be fewer nodes leading up to the root of the tree that need to be recreated. I believe that is why the 2-3-4 tree representation is preferred over, say, a red-black tree. Either that, or because keeping 2-3-4 trees balanced is more efficient. (The rbtree module contains an implementation of red-black trees if you wish to compare.)

As map(K, V) is represented with a balanced tree, you can know that the insert/search/delete operations have O(log n) time complexity. That may change if the representation ever changes.

Example program

This program reads from standard input then prints out the number of occurrences of each unique word (a "word" being simply whatever string.word considers a word). It uses a map to keep track of the number of occurrences of each word seen so far.

Sample run

./count_words < animals.txt
1	gnu
2	liger
2	lion
1	tiger

Source Code

:- module count_words.
:- interface.

:- import_module io.

:- pred main(io::di, io::uo) is det.

:- implementation.

:- import_module int.
:- import_module list.
:- import_module map.
:- import_module string.

main(!IO) :-
    map.init(Counts0),
    read_and_count_words(Counts0, Counts, !IO),
    map.foldl(print_count, Counts, !IO).

:- pred read_and_count_words(map(string, int)::in, map(string, int)::out,
    io::di, io::uo) is det.

read_and_count_words(!Counts, !IO) :-
    io.read_line_as_string(ReadRes, !IO),
    (
        ReadRes = ok(Line),
        list.foldl(count_string, words(Line), !Counts),
        read_and_count_words(!Counts, !IO)
    ;
        ReadRes = eof
    ;
        ReadRes = error(Error),
        io.write_string("Error: ", !IO),
        io.write_string(error_message(Error), !IO),
        io.nl(!IO)
    ).

:- pred count_string(string::in,
    map(string, int)::in, map(string, int)::out) is det.

count_string(String, !Counts) :-
    ( if map.search(!.Counts, String, Count) then
        map.det_update(String, Count + 1, !Counts)
    else
        map.det_insert(String, 1, !Counts)
    ).

:- pred print_count(string::in, int::in, io::di, io::uo) is det.

print_count(String, Count, !IO) :-
    io.write_int(Count, !IO),
    io.write_char('\t', !IO),
    io.write_string(String, !IO),
    io.nl(!IO).

Exercises

  • Make count_words print out words ordered by the number of occurrences.

  • Think of a more interesting example program.