Event-Driven Architecture definition:
Event-driven architecture is a way of building enterprise IT systems that lets information flow between applications, microservices and connected devices in real time, as events occur throughout your business, instead of having systems periodically poll for updates.
What is an “event” in event-driven architecture?
All of the things that happen within and to your enterprise are “events” – customer requests, inventory updates, sensor readings, the works. The value of knowing about a given event, and being able to react to it, degrades over time. The more quickly you can get information about events where they need to be, the more effectively your business can react to opportunities to delight a customer, shift production, and re-allocate resources.
That’s why event-driven architecture, which pushes information as events happen, is a better architectural approach than waiting for systems to periodically poll for updates, as is the case with the API-led approach most companies take today.
Event-driven architecture ensures that when an event occurs, information about that event is sent to all of the systems and people that need it. It’s a simple concept, but these events have quite the journey. They need to efficiently move through a multitude of applications written in many different languages, using different APIs, leveraging different protocols, and arriving at many endpoints such as applications, analytics engines, and user interfaces.
If they don’t get there? Connected “things” can’t connect, the applications and systems you rely on fail, and people across your company can’t react to situations that need their attention.
By introducing a middleman known as an event broker, event-driven architecture enables what’s called loose coupling of applications, meaning applications and devices don’t need to know where they are sending information, or where information they’re consuming comes from. Loose coupling doesn’t come without some difficulties, but can dramatically improve the agility and scalability of your system. In event-driven architecture, an event is sent without the expectation of anything in the way of a reply, except optionally an acknowledgment from the event broker. Why is this a good idea? Well, it means there is no dependency between the sender and receiver; they are decoupled.
There’s a good chance your enterprise is already leaning into this event-driven future, perhaps because it’s being pushed into it. The degree and rate of disruption we’re facing as technology leaders makes event-driven architecture a necessity, and all signs point to it becoming the primary way businesses build their IT systems over the next three to five years.
Let’s dive into a few commonly-asked questions about event-driven architecture.
What does an event-driven architecture look like?
Consider event-driven architecture from a retail perspective. No systems (inventory, finance, customer support) poll to ask whether there are any new events. Instead, events are filtered and routed in real time to the services and applications that have registered their interest.
Where does an event broker fit in?
An event broker is middleware (which can be software, an appliance, or SaaS) that routes events between systems using the publish-subscribe messaging pattern. All applications connect to the event broker, which is responsible for accepting events from senders and delivering them to all systems subscribed to receive them.
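The publish-subscribe pattern described above can be sketched in a few lines. This is a minimal in-memory illustration, not a real broker: the class name `EventBroker`, the topic `inventory/updated`, and the payload fields are all assumptions made for the example, and delivery is synchronous with exact-match topics only.

```python
from collections import defaultdict
from typing import Callable, Dict, List

Handler = Callable[[str, dict], None]


class EventBroker:
    """Toy in-memory broker (hypothetical): exact-match topics, synchronous delivery."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        """Register a handler's interest in one topic."""
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, payload: dict) -> int:
        """Deliver to every subscriber of the topic; return how many received it."""
        handlers = list(self._subscribers.get(topic, []))
        for handler in handlers:
            handler(topic, payload)
        return len(handlers)


broker = EventBroker()
received = []

# Two independent consumers register interest; the publisher knows neither of them.
broker.subscribe("inventory/updated", lambda t, p: received.append(("finance", p["sku"])))
broker.subscribe("inventory/updated", lambda t, p: received.append(("support", p["sku"])))

# The publisher sends once; the broker fans the event out.
delivered = broker.publish("inventory/updated", {"sku": "W-100", "qty": 7})
```

The publisher only ever talks to the broker, which is what makes the sender and receivers loosely coupled: adding a third subscriber requires no change to the publishing code.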
It takes good system design and governance to ensure that events end up where they are needed, and effective communication between those sending events and those who need to respond. This is where tooling – such as an event portal – can help capture, communicate, document and govern the event-driven architecture.
Where does an event portal fit in?
Event portals let people design, create, discover, catalog, share, visualize, secure, and manage events and event-driven applications. Event portals serve three primary audiences:
- Architects use an event portal to define, discuss and review the events, data definitions, and application relationships.
- Developers use an event portal to discover, understand and reuse events across applications, lines of business and between external organizations.
- Data scientists use an event portal to understand event-driven data and discover new insights by combining events.
How do you make sure events are routed to the right applications?
Events are tagged with metadata that describes the event, called a “topic.” A topic is a hierarchical text string that describes what’s in the event. Publishers just need to know what topic to send an event to, and the event broker takes care of delivery to the systems that need it. Applications register their interest in events with a given topic by subscribing to that topic. They can use wildcards to subscribe to a group of topics that have similar topic strings. By using the correct topic taxonomy and subscriptions, you can fulfill two rules of event-driven architecture:
- A subscriber should subscribe only to the events it needs. The subscription should do the filtering, not the business logic.
- A publisher should only send an event once, to one topic, and the event broker distributes it to any number of recipients.
Note: Always use topic hierarchy best practices to ensure these rules are obeyed.
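To make wildcard subscriptions concrete, here is a small sketch of hierarchical topic matching. The wildcard rules are an assumption for illustration (`*` for exactly one level, a trailing `>` for one or more remaining levels); real brokers define their own syntax. The topic strings and subscriber names are hypothetical and are not taken from the article.

```python
def topic_matches(subscription: str, topic: str) -> bool:
    """Return True if a hierarchical topic satisfies a subscription.

    Assumed wildcard rules (illustrative, not a specific broker's spec):
      '*' matches exactly one level
      '>' as the last level matches one or more remaining levels
    """
    sub_levels = subscription.split("/")
    topic_levels = topic.split("/")
    for i, level in enumerate(sub_levels):
        if level == ">" and i == len(sub_levels) - 1:
            # Trailing wildcard: the topic must have at least one more level.
            return len(topic_levels) > i
        if i >= len(topic_levels):
            return False
        if level != "*" and level != topic_levels[i]:
            return False
    return len(sub_levels) == len(topic_levels)


# Hypothetical subscriptions; the subscription does the filtering.
subscriptions = {
    "erp": "mdm/bom/>",
    "quality": "mdm/*/changed/>",
    "finance": "finance/>",
}

topic = "mdm/bom/changed/widget/oring"
interested = sorted(
    name for name, sub in subscriptions.items() if topic_matches(sub, topic)
)
```

The publisher sends the event once to one topic, and the matching step decides which subscribers receive it, so neither rule requires filtering in the consumers' business logic.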
What’s a real-world example of this?
Let’s take the example of an update to a bill of materials (BoM) for a product being manufactured. A widget has had some quality problems because the rubber O-ring is too soft, so a more expensive and harder O-ring must be used. The master data management (MDM) system emits an event:
This has a payload that reflects the details of the change, for instance the old and new O-ring details such as part numbers. The enterprise resource planning (ERP) system needs to know about any BoM changes, so it will subscribe to these events:
Notice the wildcard />. This ensures that the ERP will receive all events under this string, not just the change to the widget or O-ring.
What other event-driven architecture concepts do I need to know about?
A few key architectural concepts underpin event-driven architecture: the event mesh, deferred execution, eventual consistency, choreography, and command query responsibility segregation (CQRS).
An event mesh is a configurable and dynamic infrastructure layer for distributing events among decoupled applications, cloud services and devices. It enables event communications to be governed, flexible, reliable and fast. An event mesh is created and enabled through a network of interconnected event brokers. In other words, an event mesh dynamically routes events to any application no matter where these applications are deployed in the world, in any cloud, on-premises or IoT environment. Technically speaking, an event mesh is a network of interconnected event brokers that share consumer topic subscription information, and route messages amongst themselves so they can be passed along to subscribers.
Applications may be hosted in different environments (on-premises, in the cloud) and in different geographies. In the case of the widget example above, the ERP could be in the cloud, while the MDM system is located on-premises in the USA. To avoid sending all events everywhere, we have to decide which event should be sent where – event routing. Fortunately, there’s an easy way of doing this – we just look at the topic. All we need is for brokers at the various locations to share information about what topic is being consumed where, and the brokers can decide on how and where events should be routed. That’s the role of the event mesh.
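The routing idea can be sketched as a shared table that records which broker has a subscriber for which topic. Everything here is a simplification under stated assumptions: `MeshBroker`, `EventMesh`, and the broker names are hypothetical, subscriptions are exact-match, and the "network" is just in-process method calls.

```python
from typing import Dict, List, Set, Tuple


class MeshBroker:
    """One broker in a toy mesh (hypothetical); records what it delivers locally."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.delivered: List[Tuple[str, dict]] = []

    def deliver_locally(self, topic: str, payload: dict) -> None:
        self.delivered.append((topic, payload))


class EventMesh:
    """Brokers share subscription information and route events by topic."""

    def __init__(self) -> None:
        self.brokers: Dict[str, MeshBroker] = {}
        # Shared routing table: topic -> names of brokers that have a subscriber.
        self.routes: Dict[str, Set[str]] = {}

    def add_broker(self, name: str) -> MeshBroker:
        broker = MeshBroker(name)
        self.brokers[name] = broker
        return broker

    def subscribe(self, broker_name: str, topic: str) -> None:
        self.routes.setdefault(topic, set()).add(broker_name)

    def publish(self, origin: str, topic: str, payload: dict) -> List[str]:
        """Forward the event only to brokers with a matching subscription."""
        if origin not in self.brokers:
            raise KeyError(f"unknown broker: {origin}")
        targets = sorted(self.routes.get(topic, set()))
        for name in targets:
            self.brokers[name].deliver_locally(topic, payload)
        return targets


mesh = EventMesh()
on_prem = mesh.add_broker("on-prem-usa")   # where the MDM system publishes
cloud_eu = mesh.add_broker("cloud-eu")     # where the ERP subscribes
cloud_asia = mesh.add_broker("cloud-asia")  # no interested subscribers

mesh.subscribe("cloud-eu", "mdm/bom/changed")
routed = mesh.publish("on-prem-usa", "mdm/bom/changed", {"part": "o-ring"})
```

Because the routing table follows the subscriptions, moving the ERP to another location would only mean subscribing from a different broker; the publishing side stays unchanged.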
If you’re used to REST-based APIs, the concept of deferred execution can be tricky to comprehend. The essence of event-driven architecture is that when you publish an event you don’t wait for a response. The event broker “holds” (persists) the event until all interested consumers accept/receive it, which may be some time later. Acting on the original event may then cause other events to be emitted, which are similarly persisted.
So event-driven architecture leads to cascades of events, which are temporally and functionally independent of each other, but caused in a sequence. All we know is that event A will at some point cause something to happen. The execution of the logic consuming event A isn’t necessarily instant – its execution is deferred.
Imagine the widget example where the O-ring has been changed. The quality management team wants to know about component churn because it’s a good proxy for product quality, so they want to be notified of all changes to BoMs and all quality reports from suppliers. However, there’s a large number of BoM updates for every product, and you wouldn’t want to slow down O-ring updates while the quality system churns through those.
So you tackle this by sending all the BoM updates from MDM to the quality system and ERP – not caring in which order they are accepted or how long they take to be accepted.
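A rough sketch of deferred execution: the broker holds a separate backlog for each consumer, so a slow consumer never delays the publisher or a fast consumer. The class `PersistentBroker` is hypothetical, and the in-memory queues stand in for whatever durable storage a real broker would use.

```python
from collections import deque
from typing import Deque, Dict, Optional


class PersistentBroker:
    """Toy broker (hypothetical) that holds events per consumer until fetched."""

    def __init__(self) -> None:
        self._queues: Dict[str, Deque[dict]] = {}

    def register(self, consumer: str) -> None:
        self._queues[consumer] = deque()

    def publish(self, event: dict) -> None:
        # Returns immediately; the publisher never waits for any consumer.
        for queue in self._queues.values():
            queue.append(event)

    def receive(self, consumer: str) -> Optional[dict]:
        """Hand over the oldest pending event, or None if nothing is pending."""
        queue = self._queues[consumer]
        return queue.popleft() if queue else None

    def pending(self, consumer: str) -> int:
        return len(self._queues[consumer])


broker = PersistentBroker()
broker.register("erp")
broker.register("quality")

# MDM publishes three BoM updates without waiting for anyone.
for revision in (1, 2, 3):
    broker.publish({"bom": "widget", "revision": revision})

# The ERP processes its backlog right away...
erp_seen = []
while (event := broker.receive("erp")) is not None:
    erp_seen.append(event["revision"])

# ...while the quality system has not started yet; its events are still held.
quality_backlog = broker.pending("quality")
```

The ERP finishing early has no effect on the quality system's backlog, and vice versa, which is the independence the widget example relies on.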
Following on from this idea of deferred execution – where you expect something to happen later, but don’t wait for it – is the idea of eventual consistency. Since you don’t know when an event will be consumed and you’re not waiting for confirmation, you can’t say with certainty that a given database has caught up with everything that needs to happen to it, and you don’t know when that will be the case. If you have multiple stateful entities (in the widget case: the quality database, MDM and ERP), you can’t say they will have exactly the same state at any given moment, so you can’t assume they are consistent. However, for a given object (our BoM for our widget) we know that it will become consistent eventually.
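The following sketch shows two stateful consumers that receive the same events but apply them at different speeds. It assumes each consumer sees events in the order they were published; the `Replica` class and the part numbers `OR-SOFT` and `OR-HARD` are made up for illustration.

```python
from collections import deque


class Replica:
    """A stateful consumer (hypothetical) that applies events from its own inbox."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.inbox: deque = deque()
        self.state: dict = {}

    def apply_next(self) -> bool:
        """Apply one pending event; return False when the inbox is empty."""
        if not self.inbox:
            return False
        event = self.inbox.popleft()
        self.state[(event["bom"], event["part"])] = event["part_number"]
        return True

    def drain(self) -> None:
        while self.apply_next():
            pass


# Two changes to the same BoM entry, in publication order.
events = [
    {"bom": "widget", "part": "o-ring", "part_number": "OR-SOFT"},
    {"bom": "widget", "part": "o-ring", "part_number": "OR-HARD"},
]

erp, quality = Replica("erp"), Replica("quality")
for event in events:
    erp.inbox.append(event)
    quality.inbox.append(event)

erp.drain()            # the ERP has caught up completely
quality.apply_next()   # the quality database has applied only the first event

consistent_midway = erp.state == quality.state

quality.drain()        # later, the quality database catches up
consistent_eventually = erp.state == quality.state
```

Midway through, the two stores disagree about the O-ring; once both inboxes are empty they agree again. Nothing in the code says when that happens, only that it does.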
Deferred execution and eventual consistency lead us to the concept of choreography. To coordinate a sequence of actions being taken by different services, you could choose to introduce a master service dedicated to keeping track of all the other services, taking action if there’s an error. This approach, called orchestration, offers a single point of reference when tracing a problem, but also a single point of failure and a bottleneck.
With event-driven architecture, services are relied upon to understand what to do with an incoming event, frequently generating new events. This leads to a “dance” of individual services doing their own things, but when combined, producing an implicitly coordinated response – hence the term choreography. The blog post Microservices Choreography vs Orchestration: The Benefits of Choreography talks about the concept in detail.
If you’re wondering about error conditions, well, errors are also events – so services that would be affected by such an error would be set up to consume them and react accordingly.
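Choreography can be sketched with a handful of services that only know which events they consume and which they emit. The scenario below (orders, stock, shipping) and all topic names are hypothetical; the point is that no central coordinator exists, and that the error path is just another event.

```python
from collections import defaultdict, deque


class Bus:
    """Toy event bus (hypothetical): queues events and dispatches them in order."""

    def __init__(self) -> None:
        self.handlers = defaultdict(list)
        self.queue: deque = deque()
        self.log: list = []

    def on(self, topic: str, handler) -> None:
        self.handlers[topic].append(handler)

    def emit(self, topic: str, payload: dict) -> None:
        self.queue.append((topic, payload))

    def run(self) -> None:
        while self.queue:
            topic, payload = self.queue.popleft()
            self.log.append(topic)
            for handler in self.handlers.get(topic, []):
                handler(payload)


bus = Bus()
stock = {"W-100": 1}


def reserve_stock(order: dict) -> None:
    # Inventory service: emits either a success event or an error event.
    if stock.get(order["sku"], 0) >= order["qty"]:
        stock[order["sku"]] -= order["qty"]
        bus.emit("stock/reserved", order)
    else:
        bus.emit("stock/error", order)


bus.on("order/placed", reserve_stock)
# Shipping service reacts to successful reservations.
bus.on("stock/reserved", lambda order: bus.emit("shipment/scheduled", order))
# Support service consumes error events and reacts accordingly.
bus.on("stock/error", lambda order: bus.emit("customer/notified", order))

# Two orders compete for the single unit in stock.
bus.emit("order/placed", {"sku": "W-100", "qty": 1})
bus.emit("order/placed", {"sku": "W-100", "qty": 1})
bus.run()
```

Each service does its own thing, yet the combined log shows a coordinated outcome: one order ships and the other triggers a customer notification. The trade-off against orchestration is that this flow is only visible by tracing the events.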
CQRS – Command Query Responsibility Segregation
A common way of scaling microservices is to separate the service responsible for doing something (command) from the service responsible for answering queries. Typically you have to answer many more queries than you have to process updates or inserts, so separating responsibilities this way makes scaling the query service easier.
Using event-driven architecture makes this easy since the topic should contain the verb – so you simply create more instances of the query service and have it listen to the topics with the query verb.
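As a rough illustration, the sketch below routes on the verb in the topic: updates go to a single command service, and queries are spread across a pool of query service instances. The topic layout `bom/<verb>/<product>`, the class names, and the round-robin distribution are assumptions for the example. For brevity both sides share one in-memory store, whereas a real CQRS setup often keeps a separate read model.

```python
from itertools import cycle


class CommandService:
    """Handles writes (hypothetical)."""

    def __init__(self, store: dict) -> None:
        self.store = store

    def handle(self, payload: dict) -> None:
        self.store[payload["part"]] = payload["part_number"]


class QueryService:
    """Handles reads (hypothetical); add more instances to scale reads."""

    def __init__(self, name: str, store: dict) -> None:
        self.name = name
        self.store = store
        self.handled = 0

    def handle(self, payload: dict):
        self.handled += 1
        return self.store.get(payload["part"])


store: dict = {}
command = CommandService(store)
q1, q2 = QueryService("q1", store), QueryService("q2", store)
query_pool = cycle([q1, q2])  # simple round-robin across query instances


def route(topic: str, payload: dict):
    """Dispatch on the verb, assumed to be the second topic level."""
    verb = topic.split("/")[1]
    if verb == "update":
        return command.handle(payload)
    if verb == "query":
        return next(query_pool).handle(payload)
    raise ValueError(f"unsupported verb in topic: {topic}")


route("bom/update/widget", {"part": "o-ring", "part_number": "OR-HARD"})
answers = [route("bom/query/widget", {"part": "o-ring"}) for _ in range(4)]
```

Scaling reads here means adding another `QueryService` to the pool; the command side and the publishers are untouched.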
Event-driven architecture summary
There we have it, some main principles of event-driven architecture:
- Use event brokers to make sure the right things get the right events (creating an event mesh);
- Use topics to make sure you only send once and only receive what you need (event filtering);
- Use event broker persistence to allow consumers to process events when they’re ready (deferred execution);
- Remember this means not everything is up to date (eventual consistency);
- Use topics again to separate out different parts of a service (command query responsibility segregation).
Why do all this?
Responsiveness. Since everything happens as soon as possible and nobody is waiting on anyone else, event-driven architecture provides the fastest possible response time.
Scalability. Since you don’t have to consider what’s happening downstream, you can add service instances to scale. Topic routing and filtering can divide up services quickly and easily – as in command query responsibility segregation.
Agility. If you want to add another service, you can just have it subscribe to an event and have it generate new events of its own. The existing services don’t know or care that this has happened, so there’s no impact on them.
Agility again. By using an event mesh you can deploy services wherever you want: cloud, on premises, in a different country, etc. Since the event mesh learns where subscribers are, you can move services around without the other services knowing.
These advantages are especially relevant in use cases where a single change can have huge consequences, rippling all the way down the chain. Being able to react to real-time information, and to add new services and analytics quickly and easily, considerably enhances business processes and customer experience.