The Collision Problem


Ideal World
In an ideal world with a perfect hash function we would have no collisions.

Every key would be converted perfectly to it's own unique index:

  • key1 is converted to index 0
  • key2 to index 1,
  • etc.
Real World
However, in the real world, even the best hash functions are not perfect.

Thus we end up with collisions like the one shown.

Here two keys both end up at index 3.

We would end up keeping one customer object and losing one!
Example Collisions
Here is an example using Java's hash function for a small array of size 5.

Three of five keys collide on index position 4!

More Collision Examples...
     key               hashCode  index (mod 5)
44000999oakst1        741744416       1
55000999elmave195     397815034       4
55555999maplelane295  533223409       4
12345999sprucehwy935  258691609       4
10001999sequoia402    525797612       2
Collision Problem
We need to figure out how to resolve this "collision problem"


Collision Problem

Solving the Collision Problem


Overview


We ran into a show-stopper with the hash index "collision problem". It caused us to lose data.

Resolving this problem is surprisingly simple.

Solution


Collision Revisited
Here is a reminder of what a collision looks like.

Remember how it caused us to lose a customer.
Use 'Buckets'
A small tweak to our approach will solve the collision problem.

Rather than storing one object at each index, we store a "bucket".

When a collision occurs, we simply add the object into the appropriate bucket (at the hash index).
What Is a Bucket?
Bucket properties:

Concept Design 2 💡


HashTable Object Diagram Try #2
We just tweak a sub-component name here.

A small but very important change indeed.
HashTable
  |
 elements buckets
  |
  +---- Array