`@shutterstock/chunker` calls a blocking async callback before adding an item that would exceed a user-defined size limit or when the limit on the number of items is reached.
A common use case for `@shutterstock/chunker` is as a "batch accumulator" that gathers items to be processed in batches with specific count and size constraints. For example, sending batches to an AWS Kinesis Data Stream requires at most 500 records totaling no more than 5 MB per request (see AWS Kinesis `PutRecords`). The record count limit is easy to enforce; checking and handling the record size limit is more difficult.
The package is available on npm as `@shutterstock/chunker`:

```
npm i @shutterstock/chunker
```

```typescript
import { Chunker } from '@shutterstock/chunker';
```
After installing the package, you might want to look at our API Documentation to learn about all the features available.
`Chunker` has a `BlockingQueue` that it uses to store items until the size or count limits are reached. When a limit is reached, the `Chunker` calls the user-provided callback with the items in the queue. The callback is expected to return a `Promise` that resolves when the items have been processed; the `Chunker` waits for the `Promise` to resolve before continuing.
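The snippet below is a minimal sketch of that flow. The constructor option names (`countLimit`, `sizeLimit`, `sizer`, `writer`), the generic type parameter, and the final flush call are assumptions made for illustration; see the API Documentation for the exact options and signatures.

```typescript
// Minimal sketch of the flow described above. Option names and the
// flush call are assumptions for illustration -- check the API
// Documentation for the actual constructor options and method names.
import { Chunker } from '@shutterstock/chunker';

type Item = { payload: string };

const chunker = new Chunker<Item>({
  countLimit: 500,              // max items per batch (assumed option name)
  sizeLimit: 5 * 1024 * 1024,   // max total size per batch (assumed option name)
  // Tells Chunker how large each item is, in the same units as sizeLimit.
  sizer: (item: Item): number => Buffer.byteLength(item.payload, 'utf8'),
  // Called with the accumulated items when a limit is reached; Chunker
  // waits for the returned Promise before accepting the next item.
  writer: async (items: Item[]): Promise<void> => {
    console.log(`processing batch of ${items.length} items`);
  },
});

async function main(): Promise<void> {
  for (let i = 0; i < 2000; i++) {
    // enqueue blocks (awaits the writer callback) when adding this item
    // would exceed the count or size limit.
    await chunker.enqueue({ payload: `item ${i}` });
  }
  await chunker.onIdle(); // flush remaining items (method name is an assumption)
}

void main();
```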
See below for an example of using `Chunker` to write batches of records to an AWS Kinesis Data Stream.
```
nvm use
npm i
npm run build
npm run lint
npm run test
```
- Create Kinesis Data Stream using AWS Console or any other method
  - Example: `aws kinesis create-stream --stream-name chunker-test-stream --shard-count 1`
  - Default name is `chunker-test-stream`
  - 1 shard is sufficient
  - 1 day retention is sufficient
  - No encryption is sufficient
  - On-demand throughput is sufficient
- Example: `npm run example:aws-kinesis-writer`
  - If the stream name was changed: `KINESIS_STREAM_NAME=my-stream-name npm run example:aws-kinesis-writer`
- Observe in the log output that the `enqueue` method intermittently blocks when the count or size constraints would be breached. During the block the accumulated records are written to the Kinesis Data Stream, after which the block is released and the new item is added to the next batch.
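The core of such a Kinesis writer can be sketched as follows. As above, the `Chunker` constructor options and flush method are assumptions based on the description in this README; the AWS SDK v3 calls (`KinesisClient`, `PutRecordsCommand`) are standard. The repository's example remains the authoritative version.

```typescript
// Condensed sketch of a Kinesis batch writer built on Chunker.
// Chunker option names (countLimit, sizeLimit, sizer, writer) and the
// flush call are assumptions -- see the repository example for the real code.
import { Chunker } from '@shutterstock/chunker';
import {
  KinesisClient,
  PutRecordsCommand,
  PutRecordsRequestEntry,
} from '@aws-sdk/client-kinesis';

const streamName = process.env.KINESIS_STREAM_NAME ?? 'chunker-test-stream';
const kinesis = new KinesisClient({});

const chunker = new Chunker<PutRecordsRequestEntry>({
  countLimit: 500,              // Kinesis PutRecords accepts at most 500 records
  sizeLimit: 5 * 1024 * 1024,   // and at most 5 MB per request
  // Approximate each record's contribution to the request size.
  sizer: (record) => (record.Data?.byteLength ?? 0) + (record.PartitionKey?.length ?? 0),
  writer: async (records) => {
    // Invoked with the accumulated records when adding another record
    // would exceed a limit; enqueue() blocks until this Promise resolves.
    await kinesis.send(
      new PutRecordsCommand({ StreamName: streamName, Records: records }),
    );
  },
});

async function main(): Promise<void> {
  for (let i = 0; i < 10_000; i++) {
    await chunker.enqueue({
      Data: Buffer.from(JSON.stringify({ id: i, body: `record ${i}` })),
      PartitionKey: `key-${i % 10}`,
    });
  }
  await chunker.onIdle(); // flush the final partial batch (method name is an assumption)
}

void main();
```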