This document provides answers to some common questions about Chronicle Queue.
Single Chronicle Queue is designed to be easier to work with compared with the previous vanilla and indexed Chronicle Queue versions. It supports:
- concurrent readers in different JVMs
- concurrent writers in different JVMs
- writing raw bytes, or using a wire format to make schema changes and to simplify dumping the data
- rolling files; weekly, daily, or even more often
- storing key information in the header, so that the writers and readers no longer need to be configured the same
- not needing to know the size of the message
Chronicle Queue is designed to record everything of interest in a logger and persisted IPC. A key requirement of low latency systems is transparency, and you can record enough information to enable a monitoring system to recreate the state of the monitored system. This allows downstream systems to record any information they need, and perform queries, without having to touch the critical system.
Chronicle Queue works best in SEDA-style event-driven systems, where latency is critical, and you need a record of exactly what was performed, and when. All this can be done without expensive network sniffing/recording systems.
The original design was for a low-latency trading system which required persistence of everything in, and out, for a complete record of what happened, when, and for deterministic testing. The target round-trip time for persisting the request, processing, and persisting the response, was 1 micro-second.
Key principles are:
- ultra-low garbage collection; less than one object per event
- lock-less
- cache-friendly data structures
The marshalling, de-marshalling, and handling of thread-safe off-heap memory have been added more recently, as well as the move into the Java-Lang module.
This library now supports low-latency/GC-less writing, reading/parsing of text, as well as binary.
Our largest Chronicle Queue client pulls in up to 100 TB into a single JVM using an earlier version of Chronicle Queue.
Chronicle Queue is compelling as it uses a no-flow-control model.
Chronicle Queue is designed to not slow down the producer if you have a slow consumer. Instead, you need to give it plenty of disk space as a buffer. Disk space is cheaper than main memory, and is cheaper than heap space. You can buy a system with multiple 16 TB SSD drives today. No one would consider having a JVM heap with 100 TB.
A couple of prime examples are:
- Market data consumers. You cannot use flow control with an exchange.
- Compliance. It is something you have to have, but the systems which send data to compliance never want to be slowed down by it.
It is possible to use Chronicle Queue to invoke a method on the other JVM, and wait for the return value. However, this could be overkill, especially if you do not have to keep the history of the request and responses.
Imagine a simple scenario with two processes: C (client) and S (server). Create two `SingleChronicleQueue`s:
- Q1 for sending requests from C to S
- Q2 for sending responses from S to C
The server has a thread that is polling (busy spin with back-off) on Q1. When it receives a request with id=x, it does whatever is needed, and writes out a response to Q2 with id=x. C polls Q2 with some policy, and reads out responses as they appear. It uses the id to tie responses to requests.
The main task would be in devising a wire-level protocol for serialising your commands (equivalent to the method calls) from the client. This is application-specific, and can be done efficiently with the Chronicle tools.
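As an illustration only (the queue paths "q1" and "q2", and the id/request/response field names, are made up for this sketch, not part of any Chronicle API), the id-tagged request/response flow over two queues could look something like this:

import net.openhft.chronicle.queue.ChronicleQueue;
import net.openhft.chronicle.queue.ExcerptAppender;
import net.openhft.chronicle.queue.ExcerptTailer;
import net.openhft.chronicle.wire.DocumentContext;

public class RequestResponseSketch {
    public static void main(String[] args) {
        try (ChronicleQueue q1 = ChronicleQueue.singleBuilder("q1").build();   // requests C -> S
             ChronicleQueue q2 = ChronicleQueue.singleBuilder("q2").build()) { // responses S -> C

            // client: write a request tagged with an id to Q1
            ExcerptAppender clientAppender = q1.acquireAppender();
            try (DocumentContext dc = clientAppender.writingDocument()) {
                dc.wire().write("id").int64(1).write("request").text("ping");
            }

            // server: poll Q1, process the request, and write the response with the same id to Q2
            ExcerptTailer serverTailer = q1.createTailer();
            ExcerptAppender serverAppender = q2.acquireAppender();
            try (DocumentContext dc = serverTailer.readingDocument()) {
                if (dc.isPresent()) {
                    long id = dc.wire().read("id").int64();
                    String request = dc.wire().read("request").text();
                    try (DocumentContext rdc = serverAppender.writingDocument()) {
                        rdc.wire().write("id").int64(id).write("response").text("pong: " + request);
                    }
                }
            }

            // client: poll Q2 and tie the response back to its request by the id
            ExcerptTailer clientTailer = q2.createTailer();
            try (DocumentContext dc = clientTailer.readingDocument()) {
                if (dc.isPresent())
                    System.out.println(dc.wire().read("id").int64() + " -> " + dc.wire().read("response").text());
            }
        }
    }
}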
Another issue to consider is what the client should do with the historical responses at startup. Some form of heartbeat system helps, so that the client knows the server is alive. Archiving the old queues (VanillaChronicle) makes this easier, but at some cost.
For more details on how to do this, read this series of posts.
Chronicle Queue v4.x does not currently support a synchronous mode. The best approach is to wait for a replicated message to be acknowledged.
A key assumption is that disk space is cheap, or at least it should be. Some organizations have unrealistic internal charging rates, but you should be able to get 100 GB for about one hour of your time. This assumes retail costs for disks compared with minimum wage. The organizational cost of disk is often 10-100x the real cost, but so is your cost to the business.
In essence, disk space should be cheap, and you can record between a week and a month of continuous data, on one cheap drive. Nevertheless, there is less maintenance overhead if the Chronicle logs rotate themselves, and there is work being done to implement this for Chronicle 2.1. Initially, Chronicle files will be rotated when they reach a specific number of entries.
Single Chronicle Queue (single file per cycle) supports sub-microsecond latencies. If you use Wire, the typical latencies tend to be around one micro-second. You can still use raw writing of bytes if you need maximum performance.
With a tight reader loop, I see 100% utilization. Will there be processing capability left for anything else?
Two approaches for reducing CPU usage are:
- Combine tasks into the same thread. EventGroup in Chronicle Threads helps to do this dynamically.
- Use a Pauser, such as a LongPauser, to control how a thread backs off if there is nothing to do. There is a PauseMonitor to allow you to periodically print how busy each thread is.
Chronicle Queue is designed to use virtual memory which can be much larger than main memory (or the heap). This is done without a significant impact on your system, and allows you to access the data at random, quickly. See Article with an example of a process writing 1 TB in 3 hours. This example shows how much slower it gets as the queue grows. Even after it is 1 TB in size, on a machine with 128 GB, it can still consistently write 1 GB in under 2 seconds.
While this does not cause a technical problem, we are aware that it concerns people who find this out of the ordinary. We plan to have a mode which reduces virtual memory use, even if it is a little slower for some use cases.
Chronicle Queue is designed to persist messages, and replay them in micro-second time. Simple messages take as low as 0.4 micro-seconds. Complex messages might take 10 micro-seconds to write and read.
Chronicle Queue is also designed to sustain millions of inserts and updates per second. For bursts of up to 10% of your main memory, you can sustain rates of 1 - 3 GB/second written.
For example, a laptop with 8 GB of memory might handle bursts of 800 MB at a rate of 1 GB per second. A server with 64 GB of memory might handle a burst of 6.5 GB at a rate of 3 GB per second.
If your key system is not measuring latency in micro-seconds, and throughput in thousands-per-second, then it is not that fast.
It scales vertically. Many distributed systems can scale by adding more boxes. They are designed to handle between 100 and 1,000 transactions per second, per node.
Chronicle is designed to handle more transactions per node, in the order of 100K to 1M transactions per second. This means that you need far fewer nodes; typically between 10 and 100 times fewer.
Vertical scalability is essential for low latency, as having more nodes usually increases latency. Having one node which can handle the load of a data centre can also save money, and power consumption.
Chronicle recommends lots of high-speed RAM. This is because Chronicle uses the page cache and RAM is in effect a cache to the disk. There are two cases where having a high-speed disk will give you a real benefit:
The first case is when the rate of data that you are writing exceeds the disk write speed. In most applications this is unlikely to occur.
For Chronicle queues, which write and read messages linearly across memory, we mitigate this situation with the use of the Chronicle pre-toucher. The pre-toucher ensures that the page is loaded into the page cache before being written into the queue.
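A minimal sketch of one way to run the pre-toucher follows; the queue path, the one-second interval, and the use of a scheduled executor are assumptions for illustration, not recommendations.

import net.openhft.chronicle.queue.ChronicleQueue;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class PretoucherSketch {
    public static void main(String[] args) {
        ChronicleQueue queue = ChronicleQueue.singleBuilder("queue-dir").build();
        // a dedicated thread touches upcoming pages ahead of the writer,
        // so that appends do not stall on page faults
        ScheduledExecutorService pretoucher = Executors.newSingleThreadScheduledExecutor();
        pretoucher.scheduleAtFixedRate(() -> queue.acquireAppender().pretouch(), 0, 1, TimeUnit.SECONDS);
        // ... the application appends to the queue from its own threads ...
    }
}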
For Chronicle Map, it is somewhat more complicated because Chronicle Map reads and writes your entries with random access across both the memory and disk. In this situation, if the entire map can be held within the page cache, then a read, or write, to the map will not have to access the disk. The operating system will work in the background ensuring that entries written to the page cache are propagated to the disk, but this is done via the operating system and is not on the critical path.
It follows that if you have quite a few maps, especially large maps, and your page cache is not large enough to hold all of them, then a read, or write, to a random entry may cause a cache miss. This in turn would cause a disk read or write. If you were going to install high-speed SSDs, Chronicle recommends that you use them to store the Chronicle maps, and leave the slower, cheaper disks for the Chronicle queues. In addition, you should avoid using network-attached storage, as this usually offers worse performance than local disks.
Use the following command:
$ for i in 0 1 2 3 4 5 6 7 8 9; do dd bs=65536 count=163840 if=/dev/zero of=deleteme$i ; done
163840+0 records in
163840+0 records out
10737418240 bytes (11 GB) copied, 5.60293 s, 1.9 GB/s
163840+0 records in
163840+0 records out
10737418240 bytes (11 GB) copied, 6.08841 s, 1.8 GB/s
163840+0 records in
163840+0 records out
10737418240 bytes (11 GB) copied, 5.64981 s, 1.9 GB/s
163840+0 records in
163840+0 records out
10737418240 bytes (11 GB) copied, 5.77591 s, 1.9 GB/s
163840+0 records in
163840+0 records out
10737418240 bytes (11 GB) copied, 5.59537 s, 1.9 GB/s
163840+0 records in
163840+0 records out
10737418240 bytes (11 GB) copied, 5.74398 s, 1.9 GB/s
163840+0 records in
163840+0 records out
10737418240 bytes (11 GB) copied, 8.24996 s, 1.3 GB/s
163840+0 records in
163840+0 records out
10737418240 bytes (11 GB) copied, 11.1431 s, 964 MB/s
163840+0 records in
163840+0 records out
10737418240 bytes (11 GB) copied, 12.2505 s, 876 MB/s
163840+0 records in
163840+0 records out
10737418240 bytes (11 GB) copied, 12.7551 s, 842 MB/s
To our knowledge, Spark Streaming is designed for real-time, but is looking to support a much lower message rate, and does not attempt to be ultra-low GC (minor GC less than once a day). We have not heard of any one using Spark in the core of a trading system. It tends to be used for downstream monitoring and reporting.
Chronicle has three types of excerpt, each optimised for different purposes.
ChronicleQueue queue = ChronicleQueue.singleBuilder(basePath).build();
// For sequential writes
ExcerptAppender appender = queue.acquireAppender();
// For sequential reads ideally, but random reads/writes are also possible
ExcerptTailer tailer = queue.createTailer();
You can write using a try-with-resources block:
try (DocumentContext dc = wire.writingDocument(false)) {
dc.wire().writeEventName("msg").text("Hello world");
}
You can write using a lambda which describes the message:
appender.writeDocument(wire -> wire.write("FirstName").text("Steve")
.write("Surname").text("Jobs"));
For example, you may want to write different types of messages to a Chronicle Queue, and process messages in consumers depending on their types. Chronicle Queue provides low-level building blocks so that you can write any kind of message; it is up to you to choose the right data structure. For example, you can prefix the data that you write to a Chronicle Queue with a small header, and some meta-data, and then use it as a discriminator for data processing. You can also write/read a generic object. This will be slightly slower than using your own schema, but it is a simple way to always read the type you wrote.
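For illustration only (the event names "quote" and "trade", and the payloads, are made up; appender and tailer are the ones created in the snippet above), one simple discriminator is the event name itself:

try (DocumentContext dc = appender.writingDocument()) {
    dc.wire().writeEventName("quote").text("EURUSD 1.0842");
}

StringBuilder eventName = new StringBuilder();
try (DocumentContext dc = tailer.readingDocument()) {
    if (dc.isPresent()) {
        String payload = dc.wire().readEventName(eventName).text();
        if ("quote".contentEquals(eventName))
            System.out.println("quote: " + payload);   // dispatch to your quote handler here
        else if ("trade".contentEquals(eventName))
            System.out.println("trade: " + payload);   // dispatch to your trade handler here
    }
}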
When you read an excerpt, it first checks that the index entry is there; the index entry is the last thing that is written.
try (DocumentContext context = tailer.readingDocument()) {
if (context.isPresent()) {
Type t = tailer.read("message").object(Type.class);
process(t);
}
}
WireStore wireStore = queue.storeForCycle(queue.cycle(), 0, false);
System.out.println(wireStore.file().getAbsolutePath());
Chronicle Queue messages are immutable. In other words, once written, you should not attempt to modify existing messages. If you require existing messages to be modified, we would have to work with you on a consultancy arrangement to achieve this.
For the tailer, either replicated or not replicated, you can assume you are up-to-date when either isPresent() is false, or your read method returns false.
The limit is about 1 GB, as of Chronicle 4.x. The practical limit without tuning the configuration is about 16 MB. At this point you get significant inefficiencies, unless you increase the data allocation chunk size.
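As an illustrative sketch, not a recommendation (the 128 MB figure is an arbitrary example, and basePath is the queue directory from the earlier snippets), the data allocation chunk size can be increased via the builder's blockSize() setting:

ChronicleQueue queue = ChronicleQueue.singleBuilder(basePath)
        .blockSize(128 << 20) // chunk size in bytes; 128 MB is an arbitrary example value
        .build();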
You can access the bytes in wire as follows:
try (DocumentContext dc = appender.writingDocument()) {
Wire wire = dc.wire();
Bytes<?> bytes = wire.bytes();
// write to bytes
}
try (DocumentContext dc = tailer.readingDocument()) {
Wire wire = dc.wire();
Bytes<?> bytes = wire.bytes();
// read from the bytes
}
You can use isPresent()
to check that there is data to read.
try (DocumentContext dc = tailer.readingDocument()) {
if(!dc.isPresent()) // this will tell you if there is any data to read
return;
Bytes<?> bytes = dc.wire().bytes();
// read from the bytes
}
You can access native memory:
try (DocumentContext dc = appender.writingDocument()) {
Wire wire = dc.wire();
Bytes<?> bytes = wire.bytes();
long address = bytes.address(bytes.readPosition());
// write to native memory
bytes.writeSkip(lengthActuallyWritten);
}
try (DocumentContext dc = tailer.readingDocument()) {
Wire wire = dc.wire();
Bytes<?> bytes = wire.bytes();
long address = bytes.address(bytes.readPosition());
long length = bytes.readRemaining();
// read from native memory
}
If you are writing bytes to a Chronicle Queue, you will find that it occasionally adds padding to the end of each message. This is to ensure that each message starts on a 4-byte boundary, which is a requirement for ARM architectures. NOTE: Intel requires that messages do not straddle 64-byte cache lines, but aligning to 4 bytes also helps here, and allows your Chronicle Queues to be shared between various different platforms.
For Chronicle Queue, the 4-byte alignment is now enforced, so there is no way to turn this feature on or off. This behaviour was changed on 21 April 2020 as part of #656
The writingDocument() should be performed as quickly as possible, because a write lock is held until the DocumentContext is closed by the try-with-resources. This blocks other appenders and tailers.
More dangerously, if something keeps the thread busy for long enough (more than the recovery timeout, which is 20 seconds by default) between the call to appender.writingDocument() and the code that actually writes something into the bytes, it can cause recovery to kick in from other appenders (potentially in another process), which will rewrite the message header. If your thread subsequently continues writing its own message, it will then corrupt the queue file.
try (DocumentContext dc = appender.writingDocument()) {
// this should be performed as quickly as possible because a write lock is held until the
// DocumentContext is closed by the try-with-resources, this blocks other appenders and tailers.
}
The recommended pattern for implementing a listener is to use a methodReader/methodWriter, which can also take care of timestamps when required.
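As an illustrative sketch (the Greeter interface, the queue path, and the greeting message are made up), a methodWriter writes each interface call to the queue as a message, and a methodReader on the tailer side replays those calls onto your listener implementation:

import net.openhft.chronicle.bytes.MethodReader;
import net.openhft.chronicle.queue.ChronicleQueue;

public class ListenerSketch {
    // hypothetical listener interface; use your own event interface here
    interface Greeter {
        void greet(String name);
    }

    public static void main(String[] args) {
        try (ChronicleQueue queue = ChronicleQueue.singleBuilder("greetings").build()) {
            // writer side: each call to greet(...) is written to the queue as a message
            Greeter writer = queue.methodWriter(Greeter.class);
            writer.greet("world");

            // reader side: replay messages onto a listener implementation
            MethodReader reader = queue.createTailer()
                    .methodReader((Greeter) name -> System.out.println("Hello " + name));
            reader.readOne(); // returns true if a message was read
        }
    }
}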
For example, you may want a built-in Chronicle Queue mechanism for asynchronous 'appender → tailer' notifications, where, upon receipt of a notification event, a given tailer is guaranteed to have at least one entry, posted by the appender, ready to read. For the tailer, the only way it knows there is a message is by reading/polling the end of the queue. If the appender and tailer are in the same process, you can use a different mechanism of your choice.
We suggest that you read this series of posts, https://vanilla-java.github.io/tag/Microservices/, from the bottom up, starting with part 1.
A given Chronicle queue can safely have many readers, both inside and outside of the process creating it.
To have multiple readers of a Chronicle queue, you should generally create a new Chronicle queue per reader, pointing at the same underlying journal. On each of these Chronicle queues, you call createTailer and get a new tailer that can be used to read it. These tailers should never be shared.
A less performant option is to share a single Chronicle queue and tailer, and lock access with synchronized or a ReentrantLock. Only one tailer should ever be active at the same time.
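As a sketch of the recommended approach above (the path "shared-queue", the "msg" field name, and the two-reader setup are arbitrary examples), each reader thread opens its own queue instance over the same directory and creates its own tailer:

import net.openhft.chronicle.queue.ChronicleQueue;
import net.openhft.chronicle.queue.ExcerptTailer;
import net.openhft.chronicle.wire.DocumentContext;

public class MultiReaderSketch {
    public static void main(String[] args) {
        Runnable reader = () -> {
            // one queue instance and one tailer per reader thread, never shared
            try (ChronicleQueue queue = ChronicleQueue.singleBuilder("shared-queue").build()) {
                ExcerptTailer tailer = queue.createTailer();
                while (!Thread.currentThread().isInterrupted()) {
                    try (DocumentContext dc = tailer.readingDocument()) {
                        if (dc.isPresent())
                            System.out.println(dc.wire().read("msg").text());
                        // a real reader would back off with a Pauser when nothing is present
                    }
                }
            }
        };
        new Thread(reader, "reader-1").start();
        new Thread(reader, "reader-2").start();
    }
}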
You can have any number of writers. However, you may get higher throughput if you have only one writer at a time. Having multiple writers increases contention, but works as you would expect.
Not implicitly. We didn’t want to assume whether the appenders or tailers:
- were running at the same time
- were in the same process
- wanted to block on the queue for either writing or reading.
If both the appender and tailer are in the same process, the tailer can use a pauser when not busy.
pauser = balanced ? Pauser.balanced() : Pauser.millis(1, 10);
while (!closed) {
if (reader.readOne())
pauser.reset();
else
pauser.pause();
}
In another thread you can wake the reader with:
pauser.unpause();
Chronicle has an advantage over other queuing systems, in that the consumer can be any amount behind the producer; up to the free space on your disk.
Chronicle has been tested where the consumer was more than the whole of main memory behind the producer. This reduced the maximum throughput by about half. Most systems, in Java, where the queue exceeds the size of main memory, cause the machine to become unusable.
Note: The consumer can stop, restart, and continue, with minimal impact to the producer, if the data is still in main memory.
Having a faster disk sub-system helps in extreme conditions like these. Chronicle has been tested on a laptop with an HDD with a write speed of 12 MB/s, and an over-clocked hex-core i7 PCI-SSD card which sustained a write speed of 900 MB/s.
They are almost the same, except that the file directory-listing.cq4t was used in earlier versions of Chronicle Queue, while metadata.cq4t applies to Chronicle Queue 5.0 onwards.
The time at which Chronicle Queue rolls is based on UTC time; it uses System.currentTimeMillis().
When using daily-rolling, Chronicle Queue will roll at midnight UTC. If you wish to change the time it rolls, you have to change Chronicle Queue’s epoch() time. This time is a milliseconds offset; in other words, if you set the epoch to epoch(1), then Chronicle will roll at 1 millisecond past midnight.
Path path = Files.createTempDirectory("rollCycleTest");
SingleChronicleQueue queue = ChronicleQueue.singleBuilder(path).epoch(0).build();
We do not recommend that you change the epoch() on an existing system which already has .cq4 files created using a different epoch() setting.
If you were to set .epoch(System.currentTimeMillis()), this would make the current time the roll time, and the cycle numbers would start from zero.
You should try to avoid abruptly killing Chronicle Queue, especially if it is in the middle of writing a message.
try (DocumentContext dc = appender.writingDocument()) {
// killing chronicle queue here will leave the file in a locked state
}
If you kill Chronicle Queue when it is halfway through writing a document, this can leave your Chronicle Queue in a locked state, which could later prevent other appenders from writing to the queue file.
Although we do not recommend that you kill -9 your process, in the event that your process terminates abruptly we have added recovery code that should recover from this situation.
The number of messages that you can store depends on the roll-cycle; the roll-cycle determines how often you create a new Chronicle Queue data file. Most people use a new file each day, and we call this daily-rolling. The Chronicle index is a unique index that is given to each message. You can use the index to retrieve any message that you have stored.
When using daily-rolling, each message stored to the Chronicle Queue will increase the index by 1. The high bytes in the 64-bit index are used to store the cycle number, and the low bits to store the sequence number.
The index is broken down into two numbers:
- cycle number - when using daily-rolling, the first file from the epoch has a cycle number of 1, the next day’s file has a cycle number of 2, and so on
- message sequence number - within a cycle, when using daily-rolling, the first message of each day has a message sequence number of 1, the next message within that day has a message sequence number of 2, and so on
Different roll-cycles have a different balance between how many bits are allocated to the message sequence number, and how many of the remaining bits are allocated to the cycle number. In other words, different roll-cycles allow us to trade off the maximum number of cycles against the maximum number of messages within each cycle.
With daily-rolling we use:
- a 32-bit message sequence number, which gives us 4 billion messages per day, and
- a 31-bit cycle number (reserving the high bit for the sign), which allows us to store messages up to the year 5,881,421. With hourly rolling we can store messages up to the year 246,947.
If you have more than 4 billion messages per cycle, you can increase the number of bits used for the message sequence number, and thus the number of messages per cycle, though this reduces the number of available cycles. For example, you may allow up to 1 trillion messages per day, with 23-bit cycles still allowing dates up to the year 24,936. If we had rolled every second with a 32-bit sequence number (4 billion messages per second), we would run out of cycles in about a decade.
With hourly and daily-rolling it is pretty limitless. Also, by changing the epoch, you can extend the dates further, shifting the window covered by the first and last cycle, whether that is 31 bits or 23 bits of cycles.
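As an illustrative sketch (the FAST_HOURLY roll cycle and the queue path are arbitrary choices), the roll cycle is chosen on the builder, and the RollCycle in use can split an index into its cycle and sequence-number parts:

import net.openhft.chronicle.queue.ChronicleQueue;
import net.openhft.chronicle.queue.ExcerptTailer;
import net.openhft.chronicle.queue.RollCycle;
import net.openhft.chronicle.queue.RollCycles;

public class IndexSketch {
    public static void main(String[] args) {
        try (ChronicleQueue queue = ChronicleQueue.singleBuilder("hourly-queue")
                .rollCycle(RollCycles.FAST_HOURLY)
                .build()) {
            ExcerptTailer tailer = queue.createTailer();
            long index = tailer.index();                   // 0 if nothing has been read yet
            RollCycle rollCycle = queue.rollCycle();
            int cycle = rollCycle.toCycle(index);          // high bits of the index
            long seq = rollCycle.toSequenceNumber(index);  // low bits of the index
            System.out.println("cycle=" + cycle + ", seq=" + seq);
        }
    }
}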
The following table shows the maximum number of messages per roll cycle.
RollCycle Name | Max. messages per cycle (decimal) | Max. messages per cycle (hexadecimal) | Max. messages per second over the length of the cycle (on average)
---|---|---|---
RollCycles.FIVE_MINUTELY | 1,073,741,824 | 0x40000000 | 3,579,139
RollCycles.TEN_MINUTELY | 1,073,741,824 | 0x40000000 | 1,789,569
RollCycles.TWENTY_MINUTELY | 1,073,741,824 | 0x40000000 | 894,784
RollCycles.HALF_HOURLY | 1,073,741,824 | 0x40000000 | 596,523
RollCycles.FAST_HOURLY | 4,294,967,295 | 0xFFFFFFFF | 1,193,046
RollCycles.TWO_HOURLY | 4,294,967,295 | 0xFFFFFFFF | 596,523
RollCycles.FOUR_HOURLY | 4,294,967,295 | 0xFFFFFFFF | 298,261
RollCycles.SIX_HOURLY | 4,294,967,295 | 0xFFFFFFFF | 198,841
RollCycles.FAST_DAILY | 4,294,967,295 | 0xFFFFFFFF | 49,710
LegacyRollCycles.MINUTELY | 67,108,864 | 0x4000000 | 1,118,481
LegacyRollCycles.HOURLY | 268,435,456 | 0x10000000 | 74,565
LegacyRollCycles.DAILY | 4,294,967,295 | 0xFFFFFFFF | 49,710
LargeRollCycles.LARGE_HOURLY | 4,294,967,295 | 0xFFFFFFFF | 1,193,046
LargeRollCycles.LARGE_DAILY | 137,438,953,471 | 0x1FFFFFFFFF | 1,590,728
LargeRollCycles.XLARGE_DAILY | 274,877,906,943 | 0x3FFFFFFFFF | 3,181,457
LargeRollCycles.HUGE_DAILY | 1,099,511,627,775 | 0xFFFFFFFFFF | 12,725,829
SparseRollCycles.SMALL_DAILY | 536,870,912 | 0x20000000 | 6,213
SparseRollCycles.LARGE_HOURLY_SPARSE | 17,179,869,183 | 0x3FFFFFFFF | 4,772,185
SparseRollCycles.LARGE_HOURLY_XSPARSE | 4,398,046,511,103 | 0x3FFFFFFFFFF | 1,221,679,586
SparseRollCycles.HUGE_DAILY_XSPARSE | 281,474,976,710,655 | 0xFFFFFFFFFFFF | 3,257,812,230
TestRollCycles.TEST_SECONDLY | 4,294,967,295 | 0xFFFFFFFF | 4,294,967,295
TestRollCycles.TEST4_SECONDLY | 4,096 | 0x1000 | 4,096
TestRollCycles.TEST_HOURLY | 1,024 | 0x400 | 0
TestRollCycles.TEST_DAILY | 64 | 0x40 | 0
TestRollCycles.TEST2_DAILY | 512 | 0x200 | 0
TestRollCycles.TEST4_DAILY | 4,096 | 0x1000 | 0
TestRollCycles.TEST8_DAILY | 131,072 | 0x20000 | 1
See Replication
The message will be lost, as it is truncated.
Most likely your read code does not match your write code. Using Wire means it can handle changes to fields, and to data types, transparently.
In Chronicle Queue v4, will an error such as IllegalStateException appear when there is a high number of messages to write?
Chronicle Queue v4+ does not have the limitation of using just one thread. It supports any number of threads, with a single file per cycle.
It could be a race condition. When a memory mapping is truly freed, it cannot be accessed, or it will trigger a segmentation fault. The reason to suspect this is that the mapping should be freed on a roll from one cycle to the next.
If you see the following error:
Caused by: java.lang.OutOfMemoryError: Map failed
at sun.nio.ch.FileChannelImpl.map0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at net.openhft.chronicle.core.OS.map0(OS.java:292)
at net.openhft.chronicle.core.OS.map(OS.java:280)
... 54 more
the problem is that you are running out of virtual memory. You are more likely to see this if you are running a 32-bit JVM on a 64-bit machine. One workaround is to use a 64-bit JVM.
If an exception is thrown while you are holding the writingDocument(), then the close() method will be called on the DocumentContext, which will release the lock, set the length in the header, and allow writing to continue.
If the exception was thrown halfway through writing your data, then you will end up with your data half-written in the Chronicle Queue.
If there is a possibility of an exception during writing, you should use something like the following. This calls the DocumentContext.rollbackOnClose() method to tell the DocumentContext to roll back the data.
@NotNull DocumentContext dc = appender.writingDocument();
try {
// perform the write which may throw
} catch (Throwable t) {
dc.rollbackOnClose();
throw Jvm.rethrow(t);
} finally {
dc.close();
}
Chronicle Queue supports Linux container technology. The notes below relate to our testing on Docker.
You need to ensure that:
- containers share the IPC namespace (run with --ipc=host)
- containers share the PID namespace (run with --pid=host)
- queues are mounted on bind-mounted folders from the host (i.e. -v /host/dir/1/:/container/dir)
Our performance tests have shown minimal performance degradation when compared to running directly on the host.
If your containers are running on separate hosts, or your queues are not bind-mounted from the host, then you will need to use Queue Replication.
Yes. They use different packages. Chronicle Queue v4 is a complete re-write so there is no problem using it at the same time as Chronicle Queue v3. The format of how the data is stored is slightly different, so they are not interoperable on the same queue data file.
Chronicle Queue supports CharSequence, Appendable, OutputStream and InputStream APIs. It also has a fast copy to/from a byte[] and ByteBuffer.
Chronicle Queue is designed to be faster 'with persistence' than other serialization libraries are even 'without persistence'. Chronicle Queue supports YAML, JSON, Binary YAML, and CSV.
To date, we have not found a faster library for serialization without a standardized format. Chronicle does not yet support XML.
Where XML is needed downstream, we suggest writing in a binary format, and having the reader incur the overhead of the conversion, rather than slowing down the producer.
The byte[] methods on StringUtils are designed to work only on those Java 9+ VMs that have the compact strings feature enabled, but not on ones that have non-compact strings. This is not specific to OpenJ9, and HotSpot should fail with Java9 (but it doesn’t because compact strings are enabled by default). Conversely, OpenJ9 should be able to run Chronicle Queue with compact strings. We can confirm (with limited testing) that Chronicle Queue is able to work on OpenJ9 VM with the -XX:+CompactStrings option enabled.
In summary, Chronicle Queue can be considered compatible with OpenJ9, provided the -XX:+CompactStrings option is used.
OpenJ9 version 0.12.1 and earlier requires the file descriptor limit to be manually adjusted to a higher value; for example, using the command ulimit -Sn 500.
One of the challenges with NFS is the variation in behaviour/functionality from version to version, across Operating Systems, and also differences between how individual deployments have been configured. The main concerns with NFS relevant to a Chronicle Queue are:
- support and behaviour of file locks (including fine-grained locking of distinct byte ranges within a file)
- support and behaviour of memory mapping, and visibility of changes across separate mappings
Crucially, if you are only using the file on one host, then the OS layer for memory mapping comes before the file system, so the fact that the backing file resides on an NFS mount is less of a concern. There are however some nuances with NFS around the behaviour of syncs to disk (both in normal operation and importantly if an application crashes) which in turn requires slightly more defensive programming and consequently some degradation in performance is likely.
File locking will generally work provided you are using an up to date version of NFS (4+) and it has been suitably configured. There are additional elements to file locking through NFS which need to be handled (e.g. dealing with network interruptions with an NFS server which can cause locks to go stale/lost), but the basic mechanism works as needed.
So in summary using an NFS mount locally is fine, however care is needed to ensure the environment is correctly configured. The "local" caveat is very important.
Note: We would certainly recommend running a series of tests to confirm lock and memory-map behaviour. Chronicle Software can work with you to set this up.
In brief, our recommendation is that using local disk rather than NFS is always preferable in cases where appenders and tailers are on the same host. One option to consider may be using local disk for the shared appender/tailer data, then replicate this onto an NFS mount if passive copies are needed elsewhere. If you only have NFS available then using NFS locally can be fine subject to some caveats as above. Tests would be required to confirm correct behaviour in your specific environment.
Note: The above explanation assumes C++ Queue on Linux. Java Queue has similar functional requirements, which should ultimately use similar native calls to C++ via the JVM, but this would need to be confirmed in your environment.