- `head` (`tail`) is incremented by the producer (consumer) after every insert (remove).
- the cache line containing `head` and `tail` is frequently invalidated.
- only works under a sequentially consistent memory model.
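The points above describe the classic Lamport single-producer/single-consumer queue. A minimal single-threaded sketch in C (the names `QSIZE`, `enqueue`, `dequeue` are illustrative, and the `volatile` accesses merely stand in for the sequentially consistent loads/stores the algorithm assumes):

```c
#include <stddef.h>

#define QSIZE 8  /* small for illustration; one slot is wasted to tell full from empty */

/* Both operations read the other side's index and write their own, so the
   cache line holding head and tail bounces between the two cores. */
static int buffer[QSIZE];
static volatile size_t head = 0;  /* next slot to write; written by producer */
static volatile size_t tail = 0;  /* next slot to read; written by consumer */

/* Returns 1 on success, 0 if the queue is full. */
int enqueue(int value) {
    size_t next = (head + 1) % QSIZE;
    if (next == tail)             /* producer must read the consumer's tail */
        return 0;
    buffer[head] = value;
    head = next;                  /* publish; relies on sequential consistency */
    return 1;
}

/* Returns 1 on success, 0 if the queue is empty. */
int dequeue(int *value) {
    if (tail == head)             /* consumer must read the producer's head */
        return 0;
    *value = buffer[tail];
    tail = (tail + 1) % QSIZE;
    return 1;
}
```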
- buffer/queue is initially all `NULL`s.
- control variables `head` (`tail`) are local to the producer (consumer), and comparison to `NULL` is used to enqueue (dequeue).
- If the slot at `head` is `NULL`, the producer can insert. If the slot at `tail` is full, the consumer can remove.
- if the consumer and producer are very close to each other, `buffer[head]` (accessed by the producer to enqueue) and `buffer[tail]` (accessed by the consumer to dequeue) could be located on the same cache line, causing that cache line to bounce between producer and consumer.
- `NULL` is used for control, so it cannot be used as data.
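A minimal sketch of this `NULL`-based scheme (a single-threaded approximation; the names `enqueue`/`dequeue` are illustrative, and the memory fences a real implementation needs on weakly ordered hardware are omitted):

```c
#include <stddef.h>

#define QSIZE 1024

/* The slot contents themselves are the only shared state consulted on
   the fast path; head is producer-private, tail is consumer-private. */
static void *buffer[QSIZE];   /* all slots start out NULL (empty) */
static size_t head = 0;       /* producer-local */
static size_t tail = 0;       /* consumer-local */

/* data must be non-NULL, since NULL is reserved for control. */
int enqueue(void *data) {
    if (buffer[head] != NULL)     /* slot still occupied: queue full */
        return 0;
    buffer[head] = data;
    head = (head + 1) % QSIZE;
    return 1;
}

void *dequeue(void) {
    void *data = buffer[tail];
    if (data == NULL)             /* slot empty: queue empty */
        return NULL;
    buffer[tail] = NULL;          /* mark the slot free for the producer */
    tail = (tail + 1) % QSIZE;
    return data;
}
```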
- `actualHead`, `actualTail` are shared global variables between producer and consumer.
- producer uses `localHead`, `tailEstimate` for most insert operations. producer writes to `localHead` only. only after `BATCH_SIZE` inserts is `actualHead` made equal to the current `localHead` (`actualHead` catches up once every `BATCH_SIZE` inserts, and is generally always behind `localHead`).
- While inserting, if `NEXT(localHead) == tailEstimate`, the queue is possibly full. So check if `NEXT(localHead) == actualTail`. If yes, the queue is actually full. If not, update the estimate (`tailEstimate = actualTail`) and continue the insert.
- reduces the frequency with which the producer reads the actual tail, `actualTail`, and writes to the actual head, `actualHead`.
- if the producer is less than `BATCH_SIZE` ahead of the consumer, the consumer will be blocked, even if the buffer is not empty.
- control variables are not on the queue, so `NULL` can be used as data.
- batching improves performance, but makes the algorithm prone to deadlock (a t1 <=> t2 synchronization situation, when t1 generates less than `BATCH_SIZE` data and t2 generates less than `BATCH_SIZE` feedback).
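The producer-side logic above might look like the following sketch (variable names follow the notes; `mc_enqueue`, `batchCount`, and the concrete sizes are assumptions, and the cache-line padding that keeps shared and private variables on separate lines in the real algorithm is omitted):

```c
#include <stddef.h>

#define QSIZE 1024
#define BATCH_SIZE 64
#define NEXT(i) (((i) + 1) % QSIZE)

/* Shared between producer and consumer. */
static volatile size_t actualHead = 0;
static volatile size_t actualTail = 0;
static int buffer[QSIZE];

/* Producer-private. */
static size_t localHead = 0;     /* real insertion point */
static size_t tailEstimate = 0;  /* stale copy of actualTail */
static size_t batchCount = 0;    /* inserts since actualHead was last published */

/* Returns 1 on success, 0 if the queue is actually full. */
int mc_enqueue(int value) {
    if (NEXT(localHead) == tailEstimate) {    /* possibly full */
        if (NEXT(localHead) == actualTail)    /* actually full */
            return 0;
        tailEstimate = actualTail;            /* refresh the estimate */
    }
    buffer[localHead] = value;
    localHead = NEXT(localHead);
    if (++batchCount == BATCH_SIZE) {         /* publish once per batch */
        actualHead = localHead;
        batchCount = 0;
    }
    return 1;
}
```

Note how `actualHead` stays at its old value for `BATCH_SIZE - 1` consecutive inserts, which is exactly why a consumer can stall behind a slow producer.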
- only local control variables are used: `head` and `batchHead` by the producer, `tail` and `batchTail` by the consumer.
- Producer:
    - `batchHead` is usually `BATCH_SIZE` ahead of `head`. Slots between `head` and `batchHead` are safe to insert into.
    - Only if there are `BATCH_SIZE` slots ahead of `head` that are empty (`NULL`) does the producer start inserting at `buffer[head]` (and incrementing `head`). It continues inserting until `head == batchHead`.
    - Once `head == batchHead`, probe `BATCH_SIZE` slots ahead (make `batchHead = batchHead + BATCH_SIZE`, and check if `buffer[batchHead] == NULL`) to see if that slot is empty. Since the consumer removes in order, this means all slots between `head` and `batchHead` are empty. Only then start inserting. `head` catches up with `batchHead` once every `BATCH_SIZE` inserts => slow path. Otherwise it is always on the fast path. If `BATCH_SIZE` is a multiple of the cache-line size, cache thrashing never occurs on the fast path.
    - batching allows the producer and consumer to detect a batch of available slots at a time, reducing the no. of shared memory accesses.
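The producer's fast/slow path split can be sketched as follows (a single-threaded approximation; `bq_enqueue` and the concrete sizes are assumptions of this sketch):

```c
#include <stddef.h>

#define QSIZE 1024
#define BATCH_SIZE 64   /* ideally a multiple of the cache-line capacity */

static void *buffer[QSIZE];   /* NULL marks an empty slot */

/* Producer-private control variables only; no shared index is ever written. */
static size_t head = 0;       /* next slot to fill */
static size_t batchHead = 0;  /* limit of the batch known to be empty */

/* data must be non-NULL. Returns 1 on success, 0 if no empty batch is found. */
int bq_enqueue(void *data) {
    if (head == batchHead) {                     /* slow path: probe ahead */
        size_t probe = (batchHead + BATCH_SIZE) % QSIZE;
        if (buffer[probe] != NULL)               /* batch not yet free */
            return 0;
        batchHead = probe;          /* consumer drains in order, so every
                                       slot from head up to probe is empty */
    }
    buffer[head] = data;            /* fast path: no shared-variable access */
    head = (head + 1) % QSIZE;
    return 1;
}
```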
- Consumer:
    - Slots between `tail` and `batchTail` are safe to remove.
    - Symmetrically, we would do: only if there are `BATCH_SIZE` slots ahead of `tail` that are full does the consumer start removing. However, note that this is exactly the blocking problem of MCSRingBuffer. So instead, if the slot at `tail + BATCH_SIZE` is not full, then 1) wait for a few ticks, to allow the producer to get ahead, 2) check if `buffer[tail + BATCH_SIZE/2]` is full. If not, check if `buffer[tail + BATCH_SIZE/4]` is full, and so on.
    - Guarantees that the consumer progresses as long as the producer has produced some data, taking `log2(BATCH_SIZE)` shared memory accesses in the worst case.
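The consumer's binary backtracking probe can be sketched like this (a single-threaded approximation; `bq_dequeue` and the exact probe indices are assumptions of this sketch, and the "wait a few ticks" step is omitted):

```c
#include <stddef.h>

#define QSIZE 1024
#define BATCH_SIZE 64

static void *buffer[QSIZE];   /* NULL marks an empty slot */

/* Consumer-private control variables only. */
static size_t tail = 0;       /* next slot to drain */
static size_t batchTail = 0;  /* exclusive limit of the batch known to be full */

void *bq_dequeue(void) {
    if (tail == batchTail) {                 /* slow path: probe with backtracking */
        size_t d = BATCH_SIZE;
        /* Halve the probe distance until a full slot is found, so the consumer
           progresses even when the producer is < BATCH_SIZE ahead.  At most
           log2(BATCH_SIZE) + 1 probes; d == 1 checks buffer[tail] itself. */
        while (buffer[(tail + d - 1) % QSIZE] == NULL) {
            d /= 2;
            if (d == 0)
                return NULL;                 /* even buffer[tail] is empty */
        }
        /* Producer fills in order, so slots tail .. tail+d-1 are all full. */
        batchTail = (tail + d) % QSIZE;
    }
    void *data = buffer[tail];               /* fast path */
    buffer[tail] = NULL;                     /* mark the slot free for the producer */
    tail = (tail + 1) % QSIZE;
    return data;
}
```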