CISC 3130
Data Structures
Associative Implementation Containers — I
Hashing
What's the most efficient way (in terms of time) of:
a) reading in 50 integers all of which are in the range of 0 ... 99 and determining if there are any duplicates?
b) reading in 5,000 integers all of which are in the range of 0 ... 99 and determining if there are any duplicates?
c) reading in 50 integers all of which are in the range of 100 ... 199 and determining if there are any duplicates?
d) reading in 50 integers all of which are in the range of 1,000,000 ... 1,000,099 (still only 100 possible value) and determining if there are any duplicates?
e) reading in 50 integers all of which are in the range of 0 ... 99 and printing them out in sorted order?
f) reading in 5,000 integers all of which are in the range of 0 ... 99 and printing them out in sorted order?
g) reading in 50 integers all of which are in the range of 0 ... 99, and then reading in a number and determining if it was one of the original 50?
h) reading in 50 integers all of which are in the range of 100 ... 199, and then reading in a number and determining if it was one of the original 50?
i) reading in 50 integers (no range restriction), and then reading in a number and determining if it was one of the original 50?
Hash Tables — An O(1) Implementation
- Potentially constant (
O(1)
lookup time
- hash clash / collision
- rehash - open addressing / closed hashing
- Linear / quadratic probing; double hashing
- chaining / bucketing - closed addressing / open hashing
- load factor: how filled the table gets before resizing
Hash Table
Closed Hashing / Open Addressing — Linear Probe
- Closed Hashing: everything stays in one container — the hash table
- Open Addressing: the final address where the key resides may vay
class HashTable<E>{
…
private:
E [] table;
};
- collisions resolved by linear search from target of hash function
Closed Hashing / Open Addressing — Quadratic Probe
Open Hashing / Closed Addressing — Chaining
- Open Hashing: there is the hash table proper and then there is the bucket
- Closed Addressing: the final address where the key resides may is fixed
class HashTable<E>{
…
private:
Collection<E> [] table; // array, list, vector, set, tree, etc
};
- collisions resolved by maintaining a container (outside the table) for all value targeting to the entry by the hash function