Jdk1.8以后的HashMap存储数据结构链表长度超过8且数组长度大于64时数据结构改为了红黑树

Jdk1.8之前HashMap中存储数据结构

HashMap的高效率

HashMap是将key做hash算法,然后将hash值映射到内存地址,直接获取key所对应的数据。HashMap中数据存储的结构是数组+链表,链表是为了解决hash碰撞问题。HashMap为什么快?主要是以下三点:

  1. hash算法是高效的;
  2. hash值到内存地址(数组索引)的算法是快速的;
  3. 根据内存地址(数组索引)可以直接获取对应的数据。

负载因子和阀值

负载因子又叫填充比,是一个介于0和1之间的浮点数,他决定了HashMap在扩容之前内部数组的填充度。默认HashMap初始大小16,负载因子0.75;

负载因子=元素个数/内部数组总数

Jdk1.8HashMap中存储数据结构的调整

在Jdk1.8以及以后的版本中,HashMap的底层数据结构由原来的数组+链表,调整为数组+链表/红黑树。结合源码分析一下:
首先明确链表的时间复杂度是O(n),红黑树时间复杂度是O(Logn)。既然红黑树的复杂度优于链表,那么为为什么HashMap不直接使用红黑树代替链表呢?
进入HashMap源码可以看到这段描述:

1
2
3
4
* Because TreeNodes are about twice the size of regular nodes, we
* use them only when bins contain enough nodes to warrant use
* (see TREEIFY_THRESHOLD). And when they become too small (due to
* removal or resizing) they are converted back to plain bins.

大概意思是说树的节点占的空间是普通节点的两倍,在节点足够多的时候才会使用树形数据结构,如果节点变少了还是会变回普通节点。总的来说就是节点太少的时候没必要转换、不仅转换后的数据结构占空间而且转换也需要花费时间。
小括号中有个see TREEIFY_THRESHOLD. 这里让提示上文说的足够大是多少:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
In
* usages with well-distributed user hashCodes, tree bins are
* rarely used. Ideally, under random hashCodes, the frequency of
* nodes in bins follows a Poisson distribution
* (http://en.wikipedia.org/wiki/Poisson_distribution) with a
* parameter of about 0.5 on average for the default resizing
* threshold of 0.75, although with a large variance because of
* resizing granularity. Ignoring variance, the expected

/**
* The bin count threshold for using a tree rather than list for a
* bin. Bins are converted to trees when adding an element to a
* bin with at least this many nodes. The value must be greater
* than 2 and should be at least 8 to mesh with assumptions in
* tree removal about conversion back to plain bins upon
* shrinkage.
*/
static final int TREEIFY_THRESHOLD = 8;

描述信息为了使hashCode分布良好,树形结构很少使用。并且在理想状态下受随机分布的hashCode影响,链表中的节点遵循泊松分布,据统计链表中的节点数是8的概率已经接近千分之一且此时链表的性能已经很差。所以在这种比较罕见的和极端的情况下才会把链表转变为红黑树。
就是说大部分情况下HashMap还是使用链表,如果理想的均匀分布节点数不到8就已经自动扩容了。源码如下:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
/**
* The smallest table capacity for which bins may be treeified.
* (Otherwise the table is resized if too many nodes in a bin.)
* Should be at least 4 * TREEIFY_THRESHOLD to avoid conflicts
* between resizing and treeification thresholds.
*/
static final int MIN_TREEIFY_CAPACITY = 64;

/**
* Replaces all linked nodes in bin at index for given hash unless
* table is too small, in which case resizes instead.
*/
final void treeifyBin(Node<K,V>[] tab, int hash) {
int n, index; Node<K,V> e;
// 数组长度小于64会扩容而不是转为红黑树。
if (tab == null || (n = tab.length) < MIN_TREEIFY_CAPACITY)
resize();
else if ((e = tab[index = (n - 1) & hash]) != null) {
TreeNode<K,V> hd = null, tl = null;
do {
TreeNode<K,V> p = replacementTreeNode(e, null);
if (tl == null)
hd = p;
else {
p.prev = tl;
tl.next = p;
}
tl = p;
} while ((e = e.next) != null);
if ((tab[index] = hd) != null)
hd.treeify(tab);
}
}