A typical kdb+ architecture (in a market data environment) is composed of multiple q processes sharing data with each other. These processes are usually feed handlers, ticker plants, real-time subscribers (rdbs, pdbs, bars etc.), historical databases (hdbs), and gateways. Here is what that looks like:
While this is a popular architecture that has been deployed in many production environments, it raises several architectural challenges that should be considered:
Messaging – when a sender has a large and growing number of consumers, the data must be handled efficiently to reduce the load on the producer and any associated configuration (e.g. producing data once to be consumed by many receivers).
Slow consumers – downstream applications may struggle with a data load that kdb+ handles with little effort.
Data resilience – what happens if the bar stats process in the architecture above crashes due to heavy load or network/equipment failure? How will your applications fail over? Will in-flight data be lost?
Sharing data with other teams – there are many methods and APIs for exposing data to teams that use different programming languages, and any in-house developed APIs must then be supported.
Tightly coupled processes – if a bad design choice is made, the solution may become tightly coupled, making it harder to modify and deploy one process without impacting all the others.
Cloud migration – as companies expand to the cloud, there is a need to transmit data securely to processes (written in q or another language) in the cloud.
Lack of guaranteed delivery – when dealing with critical data such as order/execution flow, you want to guarantee that the published data was consumed by interested consumers.
What is Solace PubSub+?
Recently, Kx released a Solace interface for kdb+, which brings the capabilities of pub/sub messaging, and much more, to kdb+! For those who don’t know, Solace has been serving the financial industry with its industry-leading event broker, PubSub+, for almost two decades. PubSub+ can be deployed in three different ways:
appliance – popular for high-throughput/low-latency scenarios such as market data distribution
software – easy to deploy as a Docker container or machine image, on-prem or in the cloud. Standard Edition is free to use even in production.
cloud – managed service with support for deployment in popular public clouds such as AWS, Azure and GCP. There is a free 60-day trial available!
If you are looking to try PubSub+, I highly recommend trying the free 60-day trial on PubSub+ Cloud. With a few clicks, you will have a PubSub+ instance running.
How can PubSub+ help kdb+ developers?
PubSub+ is loaded with tons of advanced features kdb+ developers can start leveraging right away. Here are some of the ways PubSub+ solves the challenges mentioned above:
Efficiently distribute and consume data – with PubSub+, kdb+ developers can use pub/sub messaging to efficiently fan out kdb+ updates at scale to both front-end and server-side applications. For example, q processes only need to publish data once, and all other processes and clients can subscribe to the data they are interested in.
Slow consumers – PubSub+ has been specifically designed to handle slow consumers in such a way that they do not impact the performance of publishers or fast consumers.
Data resilience – kdb+ developers can use guaranteed messaging to ensure delivery of messages between applications even in cases where the receiving application is offline, or there is a failure of a piece of network equipment.
Super easy integration – PubSub+ supports a variety of open APIs and protocols, which makes it extremely easy for applications to interact with PubSub+. Forget having to manage all the APIs yourself: simply publish data to PubSub+, and downstream applications can use one of many available APIs to consume that data.
Decoupling q processes – instead of having your q processes talk to each other directly in a tightly coupled manner, you can have them share data through PubSub+. This will help decouple your processes and allow you to deploy code faster.
Replay – PubSub+ comes with a powerful replay feature that allows you to replay your messages hours or days after their original delivery. You can choose to replay messages delivered to specific topics only. (Note: this only applies to messages delivered via guaranteed messaging.)
Cloud migration – with PubSub+ you can build an event mesh that lets brokers in different environments and regions be linked together dynamically. With an event mesh, you can have a q process publish data to a PubSub+ Event Broker deployed on-prem, link that broker to another PubSub+ Event Broker deployed in the cloud, and have a different process consume that data locally in that cloud.
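To make the "publish once, subscribe by topic" idea from the list above a bit more concrete, here is a minimal in-memory sketch in Python. It is a toy model, not the Solace API or the Kx interface: the `ToyBroker` class, the topic names, the queue name, and the sample values are all hypothetical. The wildcard matching is also simplified, with `*` standing for one whole topic level and a trailing `>` for one or more remaining levels. The queue stands in for guaranteed messaging, holding messages until a consumer that was offline comes back to collect them.

```python
from collections import defaultdict, deque


def topic_matches(subscription: str, topic: str) -> bool:
    """Simplified topic matching: '*' is one level, trailing '>' is one or more levels."""
    sub_levels = subscription.split("/")
    topic_levels = topic.split("/")
    for i, sub in enumerate(sub_levels):
        if sub == ">" and i == len(sub_levels) - 1:
            return len(topic_levels) > i  # at least one level must remain
        if i >= len(topic_levels):
            return False
        if sub != "*" and sub != topic_levels[i]:
            return False
    return len(sub_levels) == len(topic_levels)


class ToyBroker:
    """Hypothetical in-memory broker used only to illustrate the pattern."""

    def __init__(self):
        self.direct = defaultdict(list)  # subscription -> callbacks (online consumers)
        self.queues = {}                 # queue name -> (subscription, backlog)

    def subscribe(self, subscription, callback):
        self.direct[subscription].append(callback)

    def add_queue(self, name, subscription):
        self.queues[name] = (subscription, deque())

    def publish(self, topic, payload):
        # The publisher sends each message once; the broker does the fan-out.
        for subscription, callbacks in self.direct.items():
            if topic_matches(subscription, topic):
                for callback in callbacks:
                    callback(topic, payload)
        # Queues retain matching messages until a consumer drains them.
        for subscription, backlog in self.queues.values():
            if topic_matches(subscription, topic):
                backlog.append((topic, payload))

    def drain(self, name):
        _, backlog = self.queues[name]
        messages = list(backlog)
        backlog.clear()
        return messages


broker = ToyBroker()
rdb, bars = [], []  # stand-ins for two subscribing processes

broker.subscribe("marketdata/trade/>", lambda t, p: rdb.append((t, p)))
broker.subscribe("marketdata/trade/*/AAPL", lambda t, p: bars.append((t, p)))
broker.add_queue("orders_q", "orders/>")  # its consumer is offline for now

# Made-up sample messages, each published exactly once.
broker.publish("marketdata/trade/NASDAQ/AAPL", {"price": 190.5, "size": 100})
broker.publish("marketdata/trade/NYSE/IBM", {"price": 170.1, "size": 50})
broker.publish("orders/exec/AAPL", {"qty": 10})

backlog = broker.drain("orders_q")  # the consumer reconnects and catches up
print(len(rdb), len(bars), len(backlog))
```

In this sketch the publisher never needs to know who is listening: adding another consumer is just another `subscribe` or `add_queue` call, which is the decoupling described above.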
As you can see, PubSub+ brings a lot of advantages to your powerful kdb+ stack. And since the Standard Edition of PubSub+ is free to use, even in production, there’s no excuse not to use it. I highly recommend checking it out!
Himanshu Gupta is a Solutions Architect at Solace. He has experience working on both the buy side and the sell side as a tick data developer. In these roles, he worked with the popular time-series databases kdb+ and OneTick to store and analyze real-time and historical financial market data across asset classes.