Hashing and Natural Parallism
Hashing and Natural Parallism Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit
Sequential Closed Hash Map
Art of Multiprocessor Programming 2 2 Sequential Closed Hash Map 0 1 2 3 16 9 h(k) = k mod 4 2 Items buckets
Add an Item
Art of Multiprocessor Programming 3 Add an Item 0 1 2 3 16 9 7 h(k) = k mod 4 3 Items
Add Another: Collision
Art of Multiprocessor Programming 4 4 Add Another: Collision 0 1 2 3 16 4 9 7 h(k) = k mod 4 4 Items
More Collisions
Art of Multiprocessor Programming 5 5 More Collisions 0 1 2 3 16 4 9 7 15 h(k) = k mod 4 5 Items
More Collisions
Art of Multiprocessor Programming 6 6 More Collisions 0 1 2 3 16 4 9 7 15 h(k) = k mod 4 5 Items Problem: buckets getting too long
Resizing
Art of Multiprocessor Programming 7 7 Resizing 0 1 2 3 16 4 9 7 15 4 5 6 7 Grow the array 5 Items h(k) = k mod 8
Resizing
Art of Multiprocessor Programming 8 8 5 Items Resizing 0 1 2 3 16 4 9 7 15 4 5 6 7 h(k) = k mod 8 Adjust hash function
Resizing
Art of Multiprocessor Programming 9 9 Resizing 0 1 2 3 16 9 7 15 h(4) = 0 mod 8 4 5 6 7 4 h(k) = k mod 8
Resizing
Art of Multiprocessor Programming 10 10 Resizing 0 1 2 3 16 4 9 7 15 4 5 6 7 h(k) = k mod 8 h(4) = 4 mod 8
Resizing
Art of Multiprocessor Programming 11 11 Resizing 0 1 2 3 16 4 9 7 15 4 5 6 7 h(k) = k mod 8 h(15) = 7 mod 8
Resizing
Art of Multiprocessor Programming 12 12 Resizing 0 1 2 3 16 4 9 4 5 6 7 h(k) = k mod 8 h(15) = 7 mod 8 15 7
Fields
Art of Multiprocessor Programming 13 13 Fields public class SimpleHashSet { protected LockFreeList[] table; public SimpleHashSet(int capacity) { table = new LockFreeList[capacity]; for (int i = 0; i < capacity; i++) table[i] = new LockFreeList(); } Array of lock-free lists
Constructor
Art of Multiprocessor Programming 14 14 Constructor public class SimpleHashSet { protected LockFreeList[] table; public SimpleHashSet ( int capacity) { table = new LockFreeList[capacity]; for (int i = 0; i < capacity; i++) table[i] = new LockFreeList(); } Initial size
Constructor
Art of Multiprocessor Programming 15 15 Constructor public class SimpleHashSet { protected LockFreeList[] table; public SimpleHashSet(int capacity) { table = new LockFreeList[capacity]; for (int i = 0; i < capacity; i++) table[i] = new LockFreeList(); } Allocate memory
Constructor
Art of Multiprocessor Programming 16 16 Constructor public class SimpleHashSet { protected LockFreeList[] table; public SimpleHashSet(int capacity) { table = new LockFreeList[capacity]; for ( int i = 0; i < capacity; i++) table[i] = new LockFreeList(); } Initialization
Add Method
Art of Multiprocessor Programming 17 17 Add Method public boolean add(Object key) { int hash = key.hashCode() % table.length; return table[hash].add(key); }
Add Method
Art of Multiprocessor Programming 18 18 Add Method public boolean add(Object key) { int hash = key.hashCode() % table.length; return table[hash].add(key); } Use object hash code to pick a bucket
Add Method
Art of Multiprocessor Programming 19 19 Add Method public boolean add(Object key) { int hash = key.hashCode() % table.length; return table[hash].add(key); } Call bucket’s add() method
No Brainer?
Art of Multiprocessor Programming 20 20 No Brainer? We just saw a Simple Lock-free Concurrent hash-based set implementation What’s not to like?
No Brainer?
Art of Multiprocessor Programming 21 21 No Brainer? We just saw a Simple Lock-free Concurrent hash-based set implementation What’s not to like? We don’t know how to resize …
Is Resizing Necessary?
Art of Multiprocessor Programming 22 22 Is Resizing Necessary? Constant-time method calls require Constant-length buckets Table size proportional to set size As set grows, must be able to resize
Set Method Mix
Art of Multiprocessor Programming 23 23 Set Method Mix Typical load 90% contains() 9% add () 1% remove() Growing is important Shrinking not so much
When to Resize?
Art of Multiprocessor Programming 24 When to Resize? Many reasonable policies. Here’s one. Pick a threshold on num of items in a bucket Global threshold When ≥ ¼ buckets exceed this value Bucket threshold When any bucket exceeds this value
Coarse-Grained Locking
Art of Multiprocessor Programming 25 Coarse-Grained Locking Good parts Simple Hard to mess up Bad parts Sequential bottleneck
Fine-grained Locking
Art of Multiprocessor Programming 26 Fine-grained Locking 0 1 2 3 4 8 9 7 11 17 Each lock associated with one bucket
Resize This
Art of Multiprocessor Programming 27 Resize This Make sure table reference didn’t change between resize decision and lock acquisition 0 1 2 3 4 8 9 7 11 17 Acquire locks in ascending order
Resize This
Art of Multiprocessor Programming 28 Resize This 0 1 2 3 4 8 9 7 11 17 0 1 2 3 4 5 6 7 Allocate new super -sized table
Resize This
Art of Multiprocessor Programming 29 Resize This 0 1 2 3 9 7 4 8 17 0 1 2 3 4 5 6 7 8 4 9 17 7 11 11
Resize This
Art of Multiprocessor Programming 31 Resize This 0 1 2 3 0 1 2 3 4 5 6 7 8 4 9 17 7 11 Striped Locks: each lock now associated with two buckets
Observations
Art of Multiprocessor Programming 32 Observations We grow the table, but not locks Resizing lock array is tricky … We use sequential lists Not LockFreeList lists If we’re locking anyway, why pay?
Read/Write Locks
Art of Multiprocessor Programming 48 Read/Write Locks public interface ReadWriteLock { Lock readLock(); Lock writeLock(); }
Read/Write Locks
Art of Multiprocessor Programming 49 Read/Write Locks public interface ReadWriteLock { Lock readLock(); Lock writeLock(); } Returns associated read lock
Read/Write Locks
Art of Multiprocessor Programming 50 Read/Write Locks public interface ReadWriteLock { Lock readLock(); Lock writeLock(); } Returns associated read lock Returns associated write lock
Lock Safety Properties
Art of Multiprocessor Programming 51 Lock Safety Properties Read lock: Locks out writers Allows concurrent readers Write lock Locks out writers Locks out readers
Read/Write Lock
Art of Multiprocessor Programming 52 Read/Write Lock Safety If readers > 0 then writer == false If writer == true then readers == 0 Liveness? Will a continual stream of readers … Lock out writers?
FIFO R/W Lock
Art of Multiprocessor Programming 53 FIFO R/W Lock As soon as a writer requests a lock No more readers accepted Current readers “drain” from lock Writer gets in
The Story So Far
Art of Multiprocessor Programming 54 The Story So Far Resizing is the hard part Fine-grained locks Striped locks cover a range (not resized) Read/Write locks FIFO property tricky
Optimistic Synchronization
Art of Multiprocessor Programming 55 Optimistic Synchronization Let the contains() method Scan without locking If it finds the key OK to return true Actually requires a proof …. What if it doesn’t find the key?
Optimistic Synchronization
Art of Multiprocessor Programming 56 Optimistic Synchronization If it doesn’t find the key May be victim of resizing Must try again Getting a read lock this time Makes sense if Keys are present Resizes are rare
Stop The World Resizing
Art of Multiprocessor Programming 57 Stop The World Resizing Resizing stops all concurrent operations What about an incremental resize? Must avoid locking the table A lock-free table + incremental resizing?
Lock-Free Resizing Problem
Art of Multiprocessor Programming 58 Lock-Free Resizing Problem 0 1 2 3 4 8 9 7 15
Lock-Free Resizing Problem
Art of Multiprocessor Programming 59 4 12 Lock-Free Resizing Problem 0 1 2 3 8 9 7 15 4 5 6 7 4 12 Need to extend table
Lock-Free Resizing Problem
Art of Multiprocessor Programming 60 Lock-Free Resizing Problem 0 1 2 3 8 4 9 7 15 4 5 6 7 12 4 12
Lock-Free Resizing Problem
Art of Multiprocessor Programming 61 Lock-Free Resizing Problem 0 1 2 3 9 7 15 4 5 6 7 12 to remove and then add even a single item single location CAS not enough 4 8 4 12 We need a new idea…
Don’t move the items
Art of Multiprocessor Programming 62 Don’t move the items Move the buckets instead Keep all items in a single lock-free list Buckets become “shortcut pointers” into the list 16 4 9 7 15 0 1 2 3
Recursive Split Ordering
Art of Multiprocessor Programming 63 Recursive Split Ordering 0 0 4 2 6 1 5 3 7
Recursive Split Ordering
Art of Multiprocessor Programming 64 Recursive Split Ordering 0 1/2 1 0 4 2 6 1 5 3 7
Recursive Split Ordering
Art of Multiprocessor Programming 65 Recursive Split Ordering 0 1/2 1 1/4 3/4 2 3 0 4 2 6 1 5 3 7
Recursive Split Ordering
Art of Multiprocessor Programming 66 Recursive Split Ordering 0 1/2 1 1/4 3/4 2 3 0 4 2 6 1 5 3 7 List entries sorted in order that allows recursive splitting. How?
Recursive Split Ordering
Art of Multiprocessor Programming 67 67 Recursive Split Ordering 0 0 4 2 6 1 5 3 7
Recursive Split Ordering
Art of Multiprocessor Programming 68 68 Recursive Split Ordering 0 1 0 4 2 6 1 5 3 7 LSB = Least significant Bit LSB 0 LSB 1
Recursive Split Ordering
Art of Multiprocessor Programming 69 69 Recursive Split Ordering 0 1 2 3 0 4 2 6 1 5 3 7 LSB 00 LSB 10 LSB 01 LSB 11
Split-Order
Art of Multiprocessor Programming 70 70 Split-Order If the table size is 2 i , Bucket b contains keys k k = b (mod 2 i ) bucket index consists of key's i LSBs
When Table Splits
Art of Multiprocessor Programming 71 When Table Splits Some keys stay b = k mod(2 i+1 ) Some move b+2 i = k mod(2 i+1 ) Determined by (i+1) st bit Counting backwards Key must be accessible from both Keys that will move must come later
A Bit of Magic
Art of Multiprocessor Programming 72 A Bit of Magic 0 4 2 6 1 5 3 7 Real keys:
A Bit of Magic
Art of Multiprocessor Programming 73 A Bit of Magic 0 4 2 6 1 5 3 7 Real keys: 0 1 2 3 4 5 6 7 Split-order: Real key 1 is in the 4 th location
A Bit of Magic
Art of Multiprocessor Programming 74 A Bit of Magic 0 4 2 6 1 5 3 7 Real keys: 0 1 2 3 4 5 6 7 Split-order: 000 100 010 110 001 101 011 111 000 001 010 011 100 101 110 111 Real key 1 is in 4 th location
A Bit of Magic
Art of Multiprocessor Programming 75 A Bit of Magic Real keys: Split-order: 000 100 010 110 001 101 011 111 000 001 010 011 100 101 110 111
A Bit of Magic
Art of Multiprocessor Programming 76 A Bit of Magic Real keys: Split-order: 000 100 010 110 001 101 011 111 000 001 010 011 100 101 110 111 Just reverse the order of the key bits
Split Ordered Hashing
Art of Multiprocessor Programming 77 Split Ordered Hashing 0 1 2 3 0 4 2 6 1 5 3 7 000 001 010 011 100 101 110 111 Order according to reversed bits
Parent Always Provides a Short Cut
Art of Multiprocessor Programming 78 Parent Always Provides a Short Cut 0 1 2 3 0 4 2 6 1 5 3 7 search
Sentinel Nodes
Art of Multiprocessor Programming 79 Sentinel Nodes 0 1 2 3 16 4 9 7 15 Problem: how to remove a node pointed by 2 sources using CAS
Sentinel Nodes
Art of Multiprocessor Programming 80 Sentinel Nodes 0 1 2 3 16 4 9 7 15 3 Solution: use a Sentinel node for each bucket 0 1
Sentinel vs Regular Keys
Art of Multiprocessor Programming 81 Sentinel vs Regular Keys Want sentinel key for i ordered before all keys that hash to bucket i after all keys that hash to bucket (i-1)
Splitting a Bucket
Art of Multiprocessor Programming 82 Splitting a Bucket We can now split a bucket In a lock-free manner Using two CAS() calls ... One to add the sentinel to the list The other to point from the bucket to the sentinel
Initialization of Buckets
Art of Multiprocessor Programming 83 Initialization of Buckets 0 1 16 4 9 7 15 0 1
Initialization of Buckets
Art of Multiprocessor Programming 84 Initialization of Buckets 0 1 2 3 16 4 9 7 15 0 1 3 Need to initialize bucket 3 to split bucket 1
Adding 10
Art of Multiprocessor Programming 85 Adding 10 0 1 2 3 16 4 9 3 7 0 1 2 10 = 2 mod 4 2 Must initialize bucket 2 Before adding 10
Recursive Initialization
Art of Multiprocessor Programming 86 Recursive Initialization 0 1 2 3 8 12 0 7 = 3 mod 4 To add 7 to the list 3 Must initialize bucket 3 Must initialize bucket 1 = 1 mod 2 1 Could be log n depth But expected depth is constant
Lock-Free List
Art of Multiprocessor Programming 87 Lock-Free List int makeRegularKey( int key) { return reverse(key | 0x80000000); } int makeSentinelKey( int key) { return reverse(key); }
Lock-Free List
Art of Multiprocessor Programming 88 Lock-Free List int makeRegularKey( int key) { return reverse(key | 0x80000000); } int makeSentinelKey(int key) { return reverse(key); } Regular key: set high-order bit to 1 and reverse
Lock-Free List
Art of Multiprocessor Programming 89 Lock-Free List int makeRegularKey(int key) { return reverse(key | 0x80000000); } int makeSentinelKey( int key) { return reverse(key); } Sentinel key: simply reverse (high-order bit is 0)
Main List
Art of Multiprocessor Programming 90 Main List Lock-Free List from earlier class With some minor variations
Lock-Free List
Art of Multiprocessor Programming 91 Lock-Free List public class LockFreeList { public boolean add(Object object, int key) {...} public boolean remove( int k) {...} public boolean contains( int k) {...} public LockFreeList(LockFreeList parent, int key) {...}; }
Lock-Free List
Art of Multiprocessor Programming 92 Lock-Free List public class LockFreeList { public boolean add(Object object, int key) {...} public boolean remove(int k) {...} public boolean contains(int k) {...} public LockFreeList(LockFreeList parent, int key) {...}; } Change: add takes key argument
Lock-Free List
Art of Multiprocessor Programming 93 Lock-Free List public class LockFreeList { public boolean add(Object object, int key) {...} public boolean remove(int k) {...} public boolean contains(int k) {...} public LockFreeList(LockFreeList parent, int key) {...}; } Inserts sentinel with key if not already present …
Lock-Free List
Art of Multiprocessor Programming 94 Lock-Free List public class LockFreeList { public boolean add(Object object, int key) {...} public boolean remove(int k) {...} public boolean contains(int k) {...} public LockFreeList(LockFreeList parent, int key) {...}; } … returns new list starting with sentinel (shares with parent)
Split-Ordered Set: Fields
Art of Multiprocessor Programming 95 Split-Ordered Set: Fields public class SOSet { protected LockFreeList[] table; protected AtomicInteger tableSize; protected AtomicInteger setSize; public SOSet( int capacity) { table = new LockFreeList[capacity]; table[0] = new LockFreeList(); tableSize = new AtomicInteger(2); setSize = new AtomicInteger(0); }
Fields
Art of Multiprocessor Programming 96 Fields public class SOSet { protected LockFreeList[] table; protected AtomicInteger tableSize; protected AtomicInteger setSize; public SOSet(int capacity) { table = new LockFreeList[capacity]; table[0] = new LockFreeList(); tableSize = new AtomicInteger(2); setSize = new AtomicInteger(0); } For simplicity treat table as big array …
Fields
Art of Multiprocessor Programming 97 Fields public class SOSet { protected LockFreeList[] table; protected AtomicInteger tableSize; protected AtomicInteger setSize; public SOSet(int capacity) { table = new LockFreeList[capacity]; table[0] = new LockFreeList(); tableSize = new AtomicInteger(2); setSize = new AtomicInteger(0); } In practice, want something that grows dynamically
Fields
Art of Multiprocessor Programming 98 Fields public class SOSet { protected LockFreeList[] table; protected AtomicInteger tableSize; protected AtomicInteger setSize; public SOSet(int capacity) { table = new LockFreeList[capacity]; table[0] = new LockFreeList(); tableSize = new AtomicInteger(2); setSize = new AtomicInteger(0); } How much of table array are we actually using?
Fields
Art of Multiprocessor Programming 99 Fields public class SOSet { protected LockFreeList[] table; protected AtomicInteger tableSize; protected AtomicInteger setSize; public SOSet(int capacity) { table = new LockFreeList[capacity]; table[0] = new LockFreeList(); tableSize = new AtomicInteger(2); setSize = new AtomicInteger(0); } Track set size so we know when to resize
Fields
Art of Multiprocessor Programming 100 Fields public class SOSet { protected LockFreeList[] table; protected AtomicInteger tableSize; protected AtomicInteger setSize; public SOSet( int capacity) { table = new LockFreeList[capacity]; table[0] = new LockFreeList(); tableSize = new AtomicInteger(1); setSize = new AtomicInteger(0); } Initially use single bucket, and size is zero
add()
Art of Multiprocessor Programming 101 add() public boolean add(Object object) { int hash = object.hashCode(); int bucket = hash % tableSize.get(); int key = makeRegularKey(hash); LockFreeList list = getBucketList(bucket); if (!list.add(object, key)) return false ; resizeCheck(); return true ; }
add()
Art of Multiprocessor Programming 102 add() public boolean add(Object object) { int hash = object.hashCode(); int bucket = hash % tableSize.get(); int key = makeRegularKey(hash); LockFreeList list = getBucketList(bucket); if (!list.add(object, key)) return false; resizeCheck(); return true; } Pick a bucket
add()
Art of Multiprocessor Programming 103 add() public boolean add(Object object) { int hash = object.hashCode(); int bucket = hash % tableSize.get(); int key = makeRegularKey(hash); LockFreeList list = getBucketList(bucket); if (!list.add(object, key)) return false; resizeCheck(); return true; } Non-Sentinel split-ordered key
add()
Art of Multiprocessor Programming 104 add() public boolean add(Object object) { int hash = object.hashCode(); int bucket = hash % tableSize.get(); int key = makeRegularKey(hash); LockFreeList list = getBucketList(bucket); if (!list.add(object, key)) return false; resizeCheck(); return true; } Get reference to bucket’s sentinel, initializing if necessary
add()
Art of Multiprocessor Programming 105 add() public boolean add(Object object) { int hash = object.hashCode(); int bucket = hash % tableSize.get(); int key = makeRegularKey(hash); LockFreeList list = getBucketList(bucket); if (! list.add(object, key) ) return false; resizeCheck(); return true; } Call bucket’s add() method with reversed key
add()
Art of Multiprocessor Programming 106 add() public boolean add(Object object) { int hash = object.hashCode(); int bucket = hash % tableSize.get(); int key = makeRegularKey(hash); LockFreeList list = getBucketList(bucket); if (!list.add(object, key)) return false ; resizeCheck(); return true; } No change? We’re done.
add()
Art of Multiprocessor Programming 107 add() public boolean add(Object object) { int hash = object.hashCode(); int bucket = hash % tableSize.get(); int key = makeRegularKey(hash); LockFreeList list = getBucketList(bucket); if (!list.add(object, key)) return false; resizeCheck(); return true ; } Time to resize?
Resize
Art of Multiprocessor Programming 108 Resize Divide set size by total number of buckets If quotient exceeds threshold Double tableSize field Up to fixed limit
Initialize Buckets
Art of Multiprocessor Programming 109 Initialize Buckets Buckets originally null If you find one, initialize it Go to bucket’s parent Earlier nearby bucket Recursively initialize if necessary Constant expected work
Recall: Recursive Initialization
Art of Multiprocessor Programming 110 Recall: Recursive Initialization 0 1 2 3 8 12 0 7 = 3 mod 4 To add 7 to the list 3 Must initialize bucket 3 Must initialize bucket 1 = 1 mod 2 1 Could be log n depth expected depth is constant
Initialize Bucket
Art of Multiprocessor Programming 111 Initialize Bucket void initializeBucket( int bucket) { int parent = getParent(bucket); if (table[parent] == null ) initializeBucket(parent); int key = makeSentinelKey(bucket); LockFreeList list = new LockFreeList(table[parent], key); }
Initialize Bucket
Art of Multiprocessor Programming 112 Initialize Bucket void initializeBucket(int bucket) { int parent = getParent(bucket); if (table[parent] == null ) initializeBucket(parent); int key = makeSentinelKey(bucket); LockFreeList list = new LockFreeList(table[parent], key); } Find parent, recursively initialize if needed
Initialize Bucket
Art of Multiprocessor Programming 113 Initialize Bucket void initializeBucket(int bucket) { int parent = getParent(bucket); if (table[parent] == null) initializeBucket(parent); int key = makeSentinelKey(bucket); LockFreeList list = new LockFreeList(table[parent], key); } Prepare key for new sentinel
Initialize Bucket
Art of Multiprocessor Programming 114 Initialize Bucket void initializeBucket(int bucket) { int parent = getParent(bucket); if (table[parent] == null) initializeBucket(parent); int key = makeSentinelKey(bucket); LockFreeList list = new LockFreeList(table[parent], key); } Insert sentinel if not present, and get back reference to rest of list
Correctness
Art of Multiprocessor Programming 115 Correctness Linearizable concurrent set Theorem: O(1) expected time No more than O(1) items expected between two dummy nodes on average Lazy initialization causes at most O(1) expected recursion depth in initializeBucket()
Closed (Chained) Hashing
Closed (Chained) Hashing Advantages: with N buckets, M items, Uniform h retains good performance as table density ( M/N ) increases less resizing Disadvantages: dynamic memory allocation bad cache behavior (no locality) Oh, did we mention that cache behavior matters on a multicore?
Linear Probing*
Linear Probing* 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 contains (x ) – search linearly from h(x) to h(x) + H recorded in bucket. H z x h(x) =7 z *Attributed to Amdahl…
Linear Probing
Linear Probing 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 add (x ) put in first empty bucket, and update H . H z z h(x) =3 z z z z z z z z z z z z x =6
Linear Probing
Linear Probing Open address means M · N Expected items in bucket same as Chaining Expected distance till open slot: ½ (1+(1/(1-M/N)) 2 M/N = 0.5 search 2.5 buckets M/N = 0.9 search 50 buckets
Linear Probing
Linear Probing Advantages: Good locality fewer cache misses Disadvantages: As M/N increases more cache misses searching 10s of unrelated buckets “Clustering” of keys into neighboring buckets As computation proceeds “Contamination” by deleted items more cache misses
Cuckoo Hashing
Cuckoo Hashing 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 Add(x) if h 1 (x) and h 2 (x) full evict y and move it to h 2 (y) h 2 (x) . Then place x in its place. z h 1 (x) z z z z z z z z z z z 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 z h 2 (x) z z z w z z z z z z z z h 2 (y) y x But cycles can form
Cuckoo Hashing
Cuckoo Hashing Advantages: contains () : deterministic 2 buckets No clustering or contamination Disadvantages: 2 tables h i (x) are complex As M/N increases relocation cycles Above M/N = 0.5 Add() does not work!
Hopscotch Hashing
Hopscotch Hashing Single Array, Simple hash function Idea: define neighborhood of original bucket In neighborhood items found quickly Use sequences of displacements to move items into their neighborhood
Hopscotch Hashing
Hopscotch Hashing