Sometimes it is important to know how much memory you need in order to store the data in a Java collection.

Here is a little overview of the memory overhead of the most important Java collections.

In some cases I also added my own implementations to see how they shape up.

All collections where filled with 10000 Integer elements and then the memory was measured by producing a heap dump and then analyzing the heap dump with MemoryAnalyzer.
Executed in:

Java(TM) SE Runtime Environment (build 1.6.0_23-b05)
Java HotSpot(TM) 64-Bit Server VM (build 19.0-b09, mixed mode)

on a

Intel(R) Core(TM) i7 CPU       M 620  2.67GHz (4 CPUs), ~2.7GHz

Set

The data stored in the sets is always the same: 10000 instances of Integer:

Class Name	Objects	Bytes
java.util.Integer	10,000	240,000
Total	10,000	240,000

Memory Footprint

java.util.HashSet

Class Name	Objects	Bytes
java.util.HashMap$Entry	10,000	480,000
java.util.HashMap$Entry[]	1	131,096
java.util.HashMap	1	64
java.util.HashSet	1	24
java.util.HashMap$KeySet	1	24
Total	10,004	611,208

java.util.TreeSet

Class Name	Objects	Bytes
java.util.TreeMap$Entry	10,000	640,000
java.util.TreeMap	1	80
java.util.TreeSet	1	24
Total	10,002	640,104

Collections.synchronizedSet(java.util.HashSet)

A synchronized HashSet by wrapping it with the Collections.synchronizedSet() method.

Class Name	Objects	Bytes
java.util.HashMap$Entry	10,000	480,000
java.util.HashMap$Entry[]	1	131,096
java.util.HashMap	1	64
java.util.Collections$SynchronizedSet	1	32
java.util.HashSet	1	24
Total	10,004	611,208

Collections.newSetFromMap(ConcurrentHashMap)

Java does not provide a ConcurrentHashSet out of the box, but you can create an equivalent by wrapping a ConcurrentHashMap with Collections.newSetFromMap().

Class Name	Objects	Bytes
java.util.concurrent.ConcurrentHashMap$HashEntry	10,000	480,000
java.util.concurrent.ConcurrentHashMap$HashEntry[]	16	131,456
java.util.concurrent.ConcurrentHashMap$Segment	16	768
java.util.concurrent.ConcurrentHashMap$NonfairSync	16	768
java.util.concurrent.ConcurrentHashMap$Segment[]	1	152
java.util.concurrent.ConcurrentHashMap	1	72
java.util.Collections$SetFromMap	1	32
java.util.concurrent.ConcurrentHashMap$KeySet	1	24
Total	10,052	613,272

ch.obermuhlner.collection.ArraySet

This is an experimental implementation of an array-based mutable Set that was designed to have minimal memory footprint.

Access with contains() is O(n).

Class Name	Objects	Bytes
java.lang.Object[]	1	80,024
ch.obermuhlner.collection.ArraySet	1	24
Total	2	80,048

ch.obermuhlner.collection.ImmutableArraySet

Similar to the ArraySet above but immutable. Designed for minimal memory footprint.

Access with contains() is O(n).

Class Name	Objects	Bytes
java.lang.Object[]	1	80,024
ch.obermuhlner.collection.ImmutableArraySet	1	24
Total	2	80,048

ch.obermuhlner.collection.ImmutableSortedArraySet

Another experimental implementation of an array-based immutable Set.
The array is sorted by hash code and contains() uses a binary search.

Access with contains() is O(log(n)).

The ImmutableSortedArraySet has the option to store the hash codes of all elements in a separate int[] which trading improved performance with additional memory footprint.

Class Name	Objects	Bytes
java.lang.Object[]	1	80,024
int[]	1	40,024
ch.obermuhlner.collection.ImmutableSortedArraySet	1	32
Total	3	120,080

Performance

The performance of the different Sets was measured by running contains() with an existing random key against a set of a specific size.

Measuring with up to 20 elements shows that ArraySet and ImmutableArraySet really are linear. With less than 10 elements they are actually faster than the O(log(n)) ImmutableSortedArraySet.

In the next chart we see the performance with up to 1000 elements. The linear Sets are no longer shown because they out-scale everything else.

2 Responses to Memory Impact of Java Collections

Alojamiento web says:

2016-08-28 at 22:31

If overall memory utilization is increasing continuously despite garbage collection, there is a memory leak, which will inevitably lead to an out-of-memory error. In this case, a memory heap analysis is necessary.

- admin says:
  
  2017-03-04 at 15:42
  
  Sorry for answering so late, there was so much spam that I did not see your comment.
  
  You are of course correct. Memory utilization will increase if you keep references to your object, for example by having added it to a collection.
  
  What I am measuring here is how much memory is needed to maintain the data structure of the collection and how much heap memory can be freed by storing the object data in non-heap memory.
  
  This has nothing to do with memory leaks.

Memory Impact of Java Collections

Set