HashTable 源码阅读 #1

lfkdsk · 2017-07-13T01:23:31Z

HashTable 源码解析

HashTable 是我们经常会使用的一种数据结构，是一种哈希表的具体实现，提供键值对的存储，对于哈希表我们了解的是比较多的。其中的比较重要的问题是如何处理其中的哈希冲突的问题，在系统中的具体实现是使用了 链地址法 和 re-hash 的结合方式处理冲突。另外还有就是 HashTable 和我们更为常用的 HashMap 的一个主要的区别就是 HashTable 是支持多线程操作的，在操作 HashTable 的很多的方法里都是同步的代码块，另外就是 HashTable 不允许 null 值的设置。

加载因子和扩容

我们首先来看看我们是如何进行数据的具体存储的，和很久之前写的那篇文章中分析 ArrayList 和 LinkedList 的一样，HashTable 本身的存储也依赖于泛型数组：

    /**
     * The hash table data.
     */
    private transient Entry<?,?>[] table;

    /**
     * The total number of entries in the hash table.
     */
    private transient int count;

这里我们使用了 Entry 里面填充的双通配符的模式来存储我们的 Key-Value 的键值对，当然这个坑主要还是因为 Java 本身没办法实现泛型数组的声明，只能靠泛型转换来搞定，另外 count 属性本身是用来存储 table 的本身的长度。

另外我们还有一些的数据对 HashTable 也同样重要的数据：

    /**
     * The table is rehashed when its size exceeds this threshold.  (The
     * value of this field is (int)(capacity * loadFactor).)
     *
     * @serial
     */
    private int threshold;

    /**
     * The load factor for the hashtable.
     *
     * @serial
     */
    private float loadFactor;

    /**
     * The number of times this Hashtable has been structurally modified
     * Structural modifications are those that change the number of entries in
     * the Hashtable or otherwise modify its internal structure (e.g.,
     * rehash).  This field is used to make iterators on Collection-views of
     * the Hashtable fail-fast.  (See ConcurrentModificationException).
     */
    private transient int modCount = 0;

threshold 是我们 HashTable 扩容的一个阈值和门限，只有超过了这个门限的 table 才会对数据进行扩容，另外还有 loadFactor 是加载因子，加载因子本身应该是一个小于 1 的数字：

table = new Entry<?,?>[initialCapacity];
threshold = (int)Math.min(initialCapacity * loadFactor, MAX_ARRAY_SIZE + 1);

我们可以看到 threshold 这个门限参数本身来自于 capacity （扩容的大小），和 loadFactor 加载因子的乘积共同决定的，所以说乘出来的门限，就是我们的扩容的阈值。因此在源码的注释之中，也提到了优化的方式，就是不要有过高的初始化的容量和过小的加载因子，因为这两者都会造成 table 的过早扩容，造成浪费的情况出现。

数据存储

我们在该类的实际的使用之中采用了一个内部类 Entry 存储数据：

    private static class Entry<K,V> implements Map.Entry<K,V> {
        final int hash;
        final K key;
        V value;
        Entry<K,V> next;
    }

每个 Entry 对象本身是一个单向列表的节点，也就是说我们在用 链地址法 使用的时候结成的那个单向链表的节点，我们可以用如下的一幅图来表示：

处理数据冲突

/**
     * Returns the value to which the specified key is mapped,
     * or {@code null} if this map contains no mapping for the key.
     *
     * <p>More formally, if this map contains a mapping from a key
     * {@code k} to a value {@code v} such that {@code (key.equals(k))},
     * then this method returns {@code v}; otherwise it returns
     * {@code null}.  (There can be at most one such mapping.)
     *
     * @param key the key whose associated value is to be returned
     * @return the value to which the specified key is mapped, or
     *         {@code null} if this map contains no mapping for the key
     * @throws NullPointerException if the specified key is null
     * @see     #put(Object, Object)
     */
    @SuppressWarnings("unchecked")
    public synchronized V get(Object key) {
        Entry<?,?> tab[] = table;
        int hash = key.hashCode();
        int index = (hash & 0x7FFFFFFF) % tab.length;
        for (Entry<?,?> e = tab[index] ; e != null ; e = e.next) {
            if ((e.hash == hash) && e.key.equals(key)) {
                return (V)e.value;
            }
        }
        return null;
    }

我们首先来看 get 方法的处理是如何进行的，首先获得 key 的 hashCode 的值，通过和 0x7FFFFFFF 相与这个步骤

The text was updated successfully, but these errors were encountered:

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

HashTable 源码阅读 #1

HashTable 源码阅读 #1

lfkdsk commented Jul 13, 2017

HashTable 源码阅读 #1

HashTable 源码阅读 #1

Comments

lfkdsk commented Jul 13, 2017

HashTable 源码解析

加载因子和扩容

数据存储

处理数据冲突