Low-latency programming is poorly documented, partly because of its commercial value: high-frequency trading is one of many professional fields in which firms strive for latency reductions on the order of nanoseconds. Fragments of information are available about what programmers can do to make their programs faster, but the strategies that give trading firms their competitive edge are closely guarded secrets. The purpose of this work is to provide an educational resource for software engineers who want to design and write low-latency applications. It presents two closely related products to that end. The first is the Low-Latency Programming Repository, which documents the theory behind various programming strategies that can improve an application's performance. Each theory is backed by empirical evidence in the form of reproducible microbenchmarks that demonstrate the technique in action. The common goal of every technique is to reduce program execution time, whether through careful management of the memory cache or through compiler transformations performed by hand. The second product is the Java Modular Packet Processor, a multi-threaded application and library that processes packets using a network of components, each of which performs an individual operation. Packets are passed between components using the LMAX Disruptor, a well-regarded data structure that serves as an alternative to queues. This product is used in conjunction with the Low-Latency Programming Repository to quantify the performance improvement gained by using lock-free alternatives to general-purpose data structures.
Lee et al. (Thu,) studied this question.