You’re out of control. I hate to be the bearer of bad news, but sometimes we need to hear the truth. You know Apache Kafka, you love Apache Kafka, but as your projects and architecture have evolved, it has left you in an uncomfortable situation. Despite its real-time streaming benefits, the lack of tooling for Kafka service discovery, a reliable audit tool, or a topology visualizer has led you to a place I call “Kafka Hell”. Let me explain how you got here in 4 simple, detrimental, and unfortunately unavoidable steps.
As we moved away from SOAP-based web services and REST became the predominant methodology for application interactions, there was a moment when many organizations faced the same challenges we face today with EDA and Apache Kafka.
Back then, SOA’s maturity brought about tooling which supported the ability to author, manage, and govern your SOAP/WSDL-based APIs. The tooling was generally categorized as “Service Registry and Repository.” The user experience sucked, but I bet you know that already!
Enter REST. Organizations which were/are technical pioneers quickly adopted the RESTful methodology; but since the tooling ecosystem was immature, they faced challenges as they moved from a handful of RESTful services to a multitude of them.
Sound like what we face with Kafka today?
Without these functions, we would have had chaos. I truly believe the momentum behind RESTful APIs would have died a slow, agonizing death without a way to manage and govern the overwhelming quantity of APIs. This reality would have led to constantly breaking API clients, security leaks, loss of sensitive information, and interested parties generally flying blind with respect to existing services. It would have been a dark and gloomy time.
I bet if we all had a dollar for every time our parents said, “You need to grow up,” when we were younger, we would all be millionaires. But that is exactly what we need to do as it relates to event streams, whether you are using Apache Kafka, Confluent, MSK, or any other streaming technology. If we take our queues (no pun intended) from the success of API management – and the API-led movement in general – we have a long way to go in the asynchronous, event streaming space.
Over the last few years, I have poured a lot of my professional energy into working with organizations who have deployed Apache Kafka into production, and who I would consider to be technical leaders within their space. What I have heard time and time again is that the use of Apache Kafka has spread like wildfire to the point where they no longer know what they have, and the stream consumption patterns are nearly 1 to 1. This means that while data is being processed in real time (which is great), they are not getting a good return on their investment. A stream only being consumed once is literally a 1 to 1 exchange, but the real value of EDA lies in being able to easily reuse existing real-time data assets, and that can only be done if they are managed and governed appropriately.
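To make the "nearly 1 to 1" observation concrete, here is a minimal sketch of how reuse could be measured once you know which consumer groups read each topic. The function name `reuse_report`, the input shape, and the topic and group names are all hypothetical, chosen only for illustration.

```python
from statistics import mean


def reuse_report(consumers_by_topic):
    """Summarize how often each event stream is reused.

    `consumers_by_topic` maps a topic name to the consumer groups reading it
    (an assumed input shape, not something Kafka hands you in this form).
    """
    counts = {topic: len(set(groups)) for topic, groups in consumers_by_topic.items()}
    average = mean(list(counts.values())) if counts else 0.0
    return {
        "average_consumers_per_topic": average,
        # Topics with at most one consumer are the "1 to 1" streams.
        "single_consumer_topics": sorted(t for t, n in counts.items() if n <= 1),
    }


# Hypothetical sample data.
streams = {
    "orders": ["billing"],
    "payments": ["ledger"],
    "shipments": ["tracking", "notifications", "analytics"],
}
report = reuse_report(streams)
print(report["single_consumer_topics"])  # ['orders', 'payments']
```

An average close to 1 would suggest that streams are being produced for a single consumer rather than reused as shared data assets.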
Another common complaint about Apache Kafka is the inability to understand and visualize the way in which event streams are flowing. Choreographing the business processes and functions with Apache Kafka has become difficult without a topology visualizer. One architect described it as the “fog of war” – events are being fired everywhere, but nobody knows where they are going or what they are doing.
Events in large enterprises rarely originate from a Kafka-native application; they usually come from a variety of legacy applications (systems of record, old JEE apps, etc.) or from new, modern, IoT sensors and web apps. Thus, we need end-to-end visibility in order to properly understand the event-driven enterprise.
We need to adopt the key capabilities of an API management platform, but for the Kafka event streaming paradigm. We already have the equivalent of the API gateway, which is your Kafka broker, but we are sorely lacking stream authoring and publishing tools, external/developer portals, and the reporting and analytics capabilities found in API management solutions today. Ironically, I would claim that the decoupling you find in a large organization’s EDA/Kafka ecosystem makes it more complex and harder to manage than synchronous APIs, which is why we need an “event management” capability now more than ever!
I hope by now you’ve bought into the idea that you need to govern and manage your Kafka event streams like you do your RESTful APIs. Your next question is most likely, “Sounds great, Jonathan, but I don’t know what I even have, and I surely don’t want to have to figure it out myself!” And to that, I say, “Preach!” I have walked in your shoes and recognize that technical documentation always gets out of date and is too often forgotten as an application continues to evolve. This is the technical debt problem that can spiral out of control as your use of EDA and Kafka grows over time.
That is exactly why you need to automate Kafka service discovery: by introspecting which topics, partitions, consumer groups, and connectors are configured, you can start down the road to managing them like you do your other APIs. Without the ability to determine the reality (what’s going on at runtime is reality, whether you like it or not), you can document what you think you have, but it will never be a source of truth you can depend on.
A reliable Kafka service discovery tool with the requirements I listed above will be that source of truth you need.
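As a rough illustration of what automated discovery could look like at its simplest, the sketch below builds a topic inventory from cluster metadata. It is not the tool described here, only an assumption-laden example: it assumes the `confluent-kafka` Python package for the live query, the function names `build_inventory` and `discover` are hypothetical, and it covers only topics and partition counts. Consumer groups and connectors would need further calls and are left out.

```python
from types import SimpleNamespace


def build_inventory(cluster_metadata):
    """Flatten cluster metadata into {topic_name: partition_count}.

    Expects an object with a `.topics` dict whose values carry a `.partitions`
    dict, which is the shape AdminClient.list_topics() returns in confluent-kafka.
    """
    return {
        name: len(topic.partitions)
        for name, topic in sorted(cluster_metadata.topics.items())
        if not name.startswith("__")  # skip internal topics such as __consumer_offsets
    }


def discover(bootstrap_servers):
    """Query a live cluster (assumes the confluent-kafka package is installed)."""
    from confluent_kafka.admin import AdminClient  # imported lazily: optional dependency

    admin = AdminClient({"bootstrap.servers": bootstrap_servers})
    return build_inventory(admin.list_topics(timeout=10))


# Stand-in metadata with hypothetical topic names, so the sketch runs without a broker.
fake_metadata = SimpleNamespace(
    topics={
        "orders": SimpleNamespace(partitions={0: None, 1: None, 2: None}),
        "payments": SimpleNamespace(partitions={0: None}),
        "__consumer_offsets": SimpleNamespace(partitions={0: None}),
    }
)
inventory = build_inventory(fake_metadata)
print(inventory)  # {'orders': 3, 'payments': 1}
```

Against a real cluster you would call something like `discover("localhost:9092")`, where the address is a placeholder for your own bootstrap servers.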
Once you have discovered what you have with a Kafka service discovery tool, you’ll need to find a way to keep it consistent as things inevitably change. There needs to be continuous probing to ensure that as the applications and architecture change, the documentation is kept up to date and continues to reflect the runtime reality. This means that on a periodic basis, the Kafka service discovery tool needs to be run in order to audit and find changes, enabling you to decide if the change was intended or not. This will ensure the Kafka event streams documentation (which applications are producing and consuming each event stream) and the schemas are always consistent.
Thus, the path to solving the technical debt dilemma and design consistency problem with Apache Kafka is a Kafka service discovery tool.
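The periodic audit described above boils down to comparing what is documented with what discovery finds at runtime. Here is a minimal sketch of that comparison, assuming both sides are reduced to a simple mapping of topic name to partition count. The function name `audit` and the sample data are hypothetical, and a real audit would also compare producers, consumers, and schemas.

```python
def audit(documented, discovered):
    """Compare the documented inventory against a fresh discovery snapshot.

    Both arguments map topic name -> partition count (a deliberate simplification).
    """
    added = sorted(set(discovered) - set(documented))
    removed = sorted(set(documented) - set(discovered))
    changed = sorted(
        name
        for name in set(documented) & set(discovered)
        if documented[name] != discovered[name]
    )
    return {"added": added, "removed": removed, "changed": changed}


# Hypothetical snapshots: what the docs say versus what the cluster reports.
documented = {"orders": 3, "payments": 6, "legacy-audit": 1}
discovered = {"orders": 3, "payments": 12, "shipments": 4}

report = audit(documented, discovered)
for kind, names in report.items():
    print(f"{kind}: {names}")
# added: ['shipments']
# removed: ['legacy-audit']
# changed: ['payments']
```

Each entry in the report is a prompt for a human decision: was the change intended, in which case the documentation is updated, or not, in which case the runtime needs fixing.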
I hope I’ve given you a little insight into why you are struggling to manage and understand your Kafka streams and what kind of tools the industry will need to solve these particular pain points. Recognizing the problem is the first step in solving it!
Solace has been taking a proactive role in developing the capabilities I outlined above, specifically for Kafka users: authoring, developer portal, metrics, service discovery, audit tool, etc. I encourage you to stay tuned and let us know if you agree that this type of capability is sorely needed! I am confident that soon you will be enabled to manage and govern your Apache Kafka event streams like you do your APIs. And won’t that be exciting!
Jonathan Schabowsky is a senior architect in Solace's Office of the CTO. His expertise includes architecting large-scale, mission-critical enterprise systems in various domains, such as for the FAA, satellite ground systems (GOES-R), and healthcare. Recently, Jonathan has been focused on the use of event-driven architectures for microservices and their deployment into platform-as-a-service (PaaS) environments running within public clouds.