The purpose of this blog is to summarize and demystify the best practices in creating a sound event topic hierarchy. If you are interested in more detailed documentation on the subject (complete with examples), you can check out this link.
Event-driven architecture, and event-driven microservices have proven to be valuable application design patterns. To achieve the promise of governed, enterprise-wide distribution of reusable events, the events need to be addressable and routable to the interested applications.
The core concept is that applications should collaborate by asynchronously reacting to event stimuli, versus polling for events or relying on an orchestration layer. To be able to get the right stimuli to the applications that need them, you need a way to asynchronously deliver events.
Beyond simple publish/subscribe messaging, eventing requires other attributes to fulfill the intended design concept. One of these attributes is a well-architected topic hierarchy that lets applications identify, attract and consume events they need.
What is an “Event”?
An event can broadly be described as a change notification. These changes can have a variety of forms, but all have the common structure of an action that has occurred on an object. An event could be a notification that the temperature of a sensor has crossed a threshold, or a change or result from one microservice consumed by another.
These change notifications or events should be able to be distributed to all interested systems – independently and asynchronously. An event producer does not need to know which applications specifically need to receive a notification, but the producer needs to explicitly describe which object has changed and how.
This is where event topic design comes in.
Event Topics and Subscriptions
Intermediary nodes (event brokers) of a system do not need to understand an entire event to take action, they just need to know how to act on topic information. This is similar to how IP Routers route internet traffic without inspecting the payload or in most cases not even the upper level headers. However, the event topic does not describe a destination like a URL or IP address, but instead it describes the content of the message payload in metadata.
The event topic allows data movement and data processing systems to make intelligent decisions without deserializing, decoding and interpreting or mapping the entire event.
Upon matching event topics to consumer subscriptions, dynamic forwarding paths can be woven across an event mesh (a network of event brokers) to deliver the correct events to consumers. An event topic is different from REST style point-to-point routing where the sender specifies the destination, in that the event producer only specifies the routing topic and the event brokers route to all eligible destinations.
Systems can either have managed subscriptions or consumer self-served subscriptions. These subscriptions add fine grained filtering so consumers can get exactly and only the data they need from the event mesh. Without fine grain filtering, intermediary processes would be responsible for receiving, filtering and re-publishing data making a frail lineage of intermediary processes. Adding intermediary processes for the specific purpose of applying business logic filtering can add unnecessary complexity and rigidity to the overall system.
With the ability to administratively control which producers are authorized to publish on a topic and which consumers are authorized tosubscribe to a topic subscription, the movement of data can be secured based on data access control policies.
Based on topics and subscriptions, policies can dictate how to handle events within the event broker with respect to priority, availability, replication and disaster recovery. Routing from producer to consumer can also be influenced by statically defined bridging policies.
Benefits of a Well-Designed Topic Architecture
Here are some key benefits that result from a well-designed topic architecture:
- Routing, filtering and governance can occur at multiple levels. As data flows across the event mesh, a decision can be made on which events are allowed to cross geographical boundaries. And once an event is in a location, secondary decisions can be made on which applications can receive them.
- Easier gathering of information across events. Wildcards simplify the ability to receive all event types pertaining to an order ID, for example, or all new orders regardless of originator or location.
- More easily understandable relationship between event and business functional domain. A topic hierarchy can contain information on the business unit down to action hierarchy so you can see things like finance/payroll/pay-adjustment in a topic hierarchy.
- Remove explicit need for orchestration. With proper eventing and flow of data, complex tasks can be choreographed and with eventual consistency, completed in a highly scalable, robust way.
- Remove locality and source silos. Barriers to data movement can be removed while maintaining positive data governance. An enterprise-wide topic architecture could allow events to flow across application domains.
- Data versioning. Reducing the regression risks of regression and allowing canary application instances and blue/green rollout scenarios.
- Fine grain filtering. Adding metadata properties to unique topics allows the event mesh to filter the events so consumers just get the events they need.
- Fine grain access control. Wildcard access control rules simplify defining entitlements by defining simple access control rules at some point in the topic hierarchy.
Best Practices for Designing a Topic Hierarchy
Topic hierarchies are built in a very similar manner to the science of taxonomy. Though taxonomy is mostly thought of as the biological classifications and subclassifications of living entities, the principals can be applied to any large data set. The main outcome is the ability to have a large berth of data and the ability to simply and logically drill down to the exact element for which you have an interest. This is achieved through taxonomy rank, where a species is described through a series of increasingly specific set of levels or ranks.
Without using messaging, you are already using hierarchical addressing today. If you go to a web page on a web site the URL is a hierarchical address of the form: application://sub-domain.domain/root-directory/sub-directorty/page (for instance: https://solace.com/solutions/industries/retail/).
Keeping with these concepts, we want to produce hierarchical topics that rank from least specific to most specific.
Does it need to be in the topic? If so, in which order?
Two tests should give you the answer.
The first test on whether to add a field into a topic is to consider whether it provides topic functionality. Would it be used for routing, filtering, ACL, or handling policy? If it is not likely to be used for any of these purposes it likely does not need to be in the topic.
Secondly, it is a good test when trying to determine which field is more specific than another to think about variability. The most variable are usually the most specific. This will naturally lead to a topic hierarchy that has most variable items at the end of the topic string.
For example, if we are creating a topic hierarchy for an event system that updates city bus positions, we would have objectId/location, where objectId would be the bus route number and location would be bus stop or lat/long. There are many more bus stops or possible lat/longs than bus routes.
Another example is a topic hierarchy for an event system that updates purchase orders per city. Here we would have objectId be either the purchase order or the SKU of the purchase and location be picked from a number of possible cities. Hopefully in a successful business there will be more actual sales than sales locations. So the topic would have location/objectId.
Below you will find a common topic structure and hierarchy, but this might have to be modified to meet your actual use case. It is important though to remain consistent within an application domain once you’ve decided on your hierarchy.
The Two Parts of Event Topic Structure: Root and Properties
An event topic structure consists of two parts: root and properties.
The “event topic root” contains enough information to describe the type of event that has occurred in static fields. A list of event topic roots would form a catalogue of events that can be produced and consumed. An event portal would contain the list of event type along with details about the event. Each event topic root would describe the event in as much detail as to be able to be mapped to a single data schema. But be careful; getting too specific too quickly will lead to topics that seem out of context to read and make merging of systems very difficult.
The event topic root of an event should be of the form: Domain/Object-type/Verb/Version/
|Organizational element responsible for these systems, being careful that these values need to survive internal organizational changes. Examples of Domain could be:
- CompanyName/BusinessUnit/ApplicationDomain: for multi-vendor systems
- BusinessUnit/ApplicationDomain: for inter BU integrations
- ApplicationDomain: for simple or intra BU integrations
|This field is for routing purposes. Suitable for distinguishing major and non-backward compatible changes to the event. It allows blue/green or canary deployments. The regular production consumers might subscribe to version 1, while canary consumers subscribe to version 2. There would be a separate version number within the event schema itself that would likely follow semantic versioning to denote any backward compatible changes of the actual event payload structure.
|Object - type
|Similar to the REST resource object. The ObjectId below would be similar to the href or id. Examples are: customer, customer accounts, product, payments.
|Actual actions taken on the object type - typically a past tense verb. Very similar to REST Verbs [POST, GET, PUT, PATCH, DELETE]. Examples are: created, deleted, exceeded, rejected
The second part of the topic are optional fields that further describe a particular event. The “event topic properties” part of the topic has fields that are dynamically filled when the producer publishes the event and are used to describe the specifics or unique attributes of this instance that would be used for routing and filtering.
The event topic properties part of a topic should be of the form: Locality/SourceId/ObjectId
|Geographical or structural location at which the event occurred. This may require multiple levels, such as latitude/longitude or region/location or just single level location.
|ID of the person or application that produced the event. This can be important for validating the authenticity of the event or to capture last message sent by a redundant sender. This ID could be as broad as a whole venue or as narrow as a specific IoT device.
|Anything that uniquely identifies the object instance that was acted on, for example a product SKU or Order ID. As discussed above the Object-type may describe the object type and then the ObjectId would identify the specific object.
Putting together an event topic root and event topic properties creates an event topic that describes the event with a series of fields from least specific to most specific:
Keep in mind that these are just best practices that need to be tailored to unique development and that each level might be skipped or expanded to multiple levels. The key point is that once you have made a decision on topic architecture it must be applied consistently across the object-type domain. The levels of the topic must be used consistently and have consistent meaning, otherwise routing and filtering will provide inconsistent results.
A well thought out topic hierarchy provides a simple description of the data within the event mesh. This simple description makes it easier to understand, use, and re-use an event.
Since the event topic describes the data available and the event topic subscription details which users or applications are interested in which pieces of data, a well-designed topic hierarchy leads to, or is derived from, a good data model. Understanding which applications consume data from which sources reveals a dependency graphs that is critical in understanding how an event mesh moves data.
Having well thought out topic space with human readable topic fields can provide a great visualization about data movement. PubSub+ Event Portal is a great tool to help with this visualization and to help create and manage not only your topic data model, but also your payload schemas, application relationships, and more.
In order to take full advantage of an event driven architecture which implements an event mesh, the events themselves must be routable, filterable, and subject to predefined governance rules. Hierarchical event topics allow these three properties to be applied at various points within the event mesh while always having the ability to deliver the raw events to the interested applications. Proper topic hierarchy also decouples the producer applications and the interested consumer applications, allowing the consumers to receive exactly the information they want without having to add re-publish steps into application based on business logic filtering.