Governance in the World of Event-Driven APIs

Key Considerations and Requirements when Applying the Principles of API Management to Event-Driven Architecture

Application programming interfaces (APIs) have been around for a long time, but API management (APIM) per se only arrived on the scene in the last two decades. APIM aimed to give organizations control of development processes that often saw the uncontrolled proliferation of APIs. For example, developers would often build APIs without following enterprise standards or specifications, write redundant logic, and deploy APIs without security considerations in mind.

This became a liability as the more unsecured APIs you have, the more internal data and services you’re putting at risk. APIM introduced a design-first approach to building APIs, offered security policies one could apply to their APIs, and gave developers better control over where APIs were deployed. In short, APIM enabled the effective governance of enterprise assets and processes, and added a layer of security to the enterprise’s digital façade.

APIM also gave people better visibility into existing assets. This increased the reuse of APIs, which reduced duplication of work and accelerated time to value.

Although APIM has evolved, it has done so primarily in the realm of synchronous communications, benefiting mostly SOAP and, more recently, RESTful microservices and APIs. The realm of asynchronous communications, by contrast, has lagged considerably in adopting an API management strategy. Even though event-driven APIs have garnered a lot of interest in the market in recent years, not much has been done to bring them the same type of governance that exists in the synchronous space.

In the following sections I will explore why the need for governance is critical for the event-driven world, and what solutions need to evolve to fill that gap.

Challenges Governing Event-Driven APIs

Before we explore what APIM and governance could look like for the realm of event-driven APIs, it is important to understand the differences between RESTful APIs and their event-driven counterparts.

Semantic Differences Between RESTful and Event-Driven APIs

RESTful APIs use a variety of verbs called methods, chiefly GET, POST, PUT, PATCH, and DELETE.
Event-driven APIs only use two verbs: publish and subscribe.

Structural Differences Between RESTful and Event-Driven APIs

RESTful APIs have resources that map into one endpoint, and only have to support one protocol: HTTP.
Event-driven APIs have channels (topics or queues) with two endpoints: a producer and a consumer. They need to support a variety of transport protocol bindings, e.g. AMQP, Apache Kafka, JMS, and MQTT.

It’s these structural differences that make the governance of event-driven APIs more difficult than that of RESTful APIs. In the world of RESTful APIs you need to apply security policies to one endpoint, but for event-driven interactions you need to apply policies at the producer and consumer ends of the channel. And since there is only one protocol for REST (HTTP), there are no binding or implementation implications. For event-driven interactions, different transport protocols/brokers have different capabilities, which impacts the implementation and enforcement of security policies. For instance, Kafka only supports topics, whereas JMS supports topics and queues. Similarly, they offer varying degrees of support for authentication and authorization that need to be considered. Also, only some protocols/brokers support throttling and rate limitation.
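The difference in enforcement points can be sketched as a small data model. This is only an illustrative sketch: the class names `RestApi` and `EventChannel`, the `#publish`/`#subscribe` suffixes, and the policy labels are hypothetical and do not come from any particular APIM product.

```python
from dataclasses import dataclass, field


@dataclass
class RestApi:
    """A RESTful resource: one endpoint, one protocol (HTTP)."""
    path: str
    policies: list[str] = field(default_factory=list)

    def enforcement_points(self) -> list[tuple[str, str]]:
        # A single endpoint means a single place to enforce each policy.
        return [(self.path, policy) for policy in self.policies]


@dataclass
class EventChannel:
    """A channel (topic or queue) with a producer end and a consumer end."""
    name: str
    protocol: str  # e.g. "kafka", "mqtt", "amqp", "jms"
    producer_policies: list[str] = field(default_factory=list)
    consumer_policies: list[str] = field(default_factory=list)

    def enforcement_points(self) -> list[tuple[str, str]]:
        # Policies have to be enforced at both ends of the channel.
        points = [(f"{self.name}#publish", p) for p in self.producer_policies]
        points += [(f"{self.name}#subscribe", p) for p in self.consumer_policies]
        return points
```

With the same number of policies per end, the channel yields twice as many enforcement points as the REST resource, and each of them still has to be mapped onto whatever the chosen protocol can actually enforce.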

It’s clear that governing event-driven APIs requires a new approach and toolset.
Here is a high-level rendition of the components and flow:

Potential Solution for Governing Event-Driven APIs

An effective governance blueprint for RESTful APIs requires four things:

API Manager: responsible for facilitating the definition of policies and their assignment to APIs.
Gateways: responsible for the enforcement of policies on the API endpoints.
Clients: the systems that actually invoke the APIs.
Auditing Tool: responsible for ensuring that the artifacts designed are the same as those deployed.

The governance of event-driven APIs requires the same participants, but they play different roles. The API manager and the gateway have similar capabilities, but differ in the types of policies available and their application (i.e. two endpoints vs. one), and clients can be producers or consumers.

There are a couple of ways to approach the challenge of governing event-driven APIs:

Intermediated/Indirect: Follow the RESTful APIM model of having an API manager and a gateway.
Dis-intermediated/Direct: Keep the API manager but drop the gateway, and enforce policies directly at the producer and consumer endpoints.

In the following sections I will analyze and explain both options.

Intermediated Governance of Event-Driven APIs (Gateway based)

Intermediated governance refers to an architecture in which a component intercepts the traffic between a client and an API, and in the process applies governance rules (aka policies). That component is an API gateway. Essentially the gateway hosts and manages API proxies and enforces policies on those proxies. That is, before routing the traffic from the client to the required API, it applies the necessary security and traffic-shaping constraints.

As previously indicated, this process is rather simple for synchronous RESTful APIs. For the asynchronous event-driven domain, things are more involved. There is still a gateway in the middle, but instead of a single client whose access to an API must be restricted, you now have a producer and a consumer, and you need to restrict access for both. Another consideration for an event-driven API gateway is the need to support multiple protocols. The producer and consumer on a given channel will use the same transport protocol (Solace/Kafka/JMS/MQTT/AMQP etc.), but since those protocols differ in their capabilities, the gateway needs to know which policies apply to which protocol.
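A minimal sketch of such a gateway follows, assuming an in-memory access control list and a per-protocol policy table. The names `Request`, `EventGateway`, and `no_multilevel_wildcard` are hypothetical, and the only protocol-specific detail relied on is that `#` is the multi-level wildcard in MQTT topic filters.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Request:
    client_id: str
    operation: str  # "publish" or "subscribe"
    channel: str
    protocol: str   # e.g. "mqtt", "kafka", "amqp"


def no_multilevel_wildcard(req: Request) -> bool:
    """Example MQTT-only policy: reject subscriptions that use the '#' wildcard."""
    return not (req.operation == "subscribe" and "#" in req.channel)


class EventGateway:
    def __init__(self,
                 acl: set[tuple[str, str, str]],
                 protocol_policies: dict[str, list[Callable[[Request], bool]]]):
        self.acl = acl                              # permitted (client, operation, channel)
        self.protocol_policies = protocol_policies  # policies keyed by protocol

    def admit(self, req: Request) -> bool:
        # Access is restricted for producers and consumers alike.
        if (req.client_id, req.operation, req.channel) not in self.acl:
            return False
        # Only the policies registered for this request's protocol are applied.
        policies = self.protocol_policies.get(req.protocol, [])
        return all(policy(req) for policy in policies)
```

A producer that is allowed to publish on a channel is still refused when it tries to subscribe to it, and the wildcard rule only fires for requests that arrive over MQTT.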

This diagram shows some of the differences between gateways in RESTful and event-driven environments:

Pros and Cons of Intermediated Governance

The key benefit of a gateway is that it adds a layer of security between the consumer and the resource; a layer which can be scaled independently from the rest of the infrastructure. Its capabilities can also be evolved without impacting either producer or consumer code. There is also a separation of concerns in terms of enabling security architects to design both the governance model and the gateways that enforce it.

There are two disadvantages of using gateways in event-driven APIM:

They add a layer of operational management, and additional cost.
They increase latency, even if it’s not always noticeable, due to the interception process.

A gateway-based architecture is ideal for scenarios in which cost and latency aren’t key considerations. Gateway-based systems are also easier to adopt, since enforcement lives in the gateway and requires no additional development work in producers or consumers.

Dis-Intermediated Governance of Event-Driven APIs (non-Gateway based)

Alternatively, you can implement governance of event-driven APIs without a gateway. You still need a “manager” to handle the definition of policies and their assignment to various channels, but may not need a gateway if producers and consumers can:

Access the API manager to download the policies associated with the channels they are connecting to.
Enforce those policies locally.

This diagram shows a disintermediated governance model:

In this instance, you are essentially delegating policy enforcement to producers and consumers. For instance, when a developer generates the code scaffolding for a given AsyncAPI, they would have at their disposal a “Governance SDK” or library they could incorporate into their implementation. The SDK would take the form of an embedded agent that could then perform the tasks of:

Interacting with the API manager to get the policies that the producer/consumer needs to apply to the channels they interact with.
Emitting stats/metrics to the API manager to visualize the behavior/performance of the producer/consumer.

In essence the governance agents would enable a seamless interaction with the event-driven infrastructure without really impacting the development process.
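What such an embedded agent might look like can be sketched as follows. This is an assumption-laden sketch, not a real SDK: `GovernanceAgent`, `guarded_publish`, and `flush_metrics` are hypothetical names, and the connection to the API manager is abstracted as two injected callables rather than a concrete network API.

```python
from collections import Counter
from typing import Callable

Policy = Callable[[dict], bool]  # returns True when the event is allowed


class GovernanceAgent:
    def __init__(self,
                 fetch_policies: Callable[[str], list[Policy]],
                 report_metrics: Callable[[dict], None]):
        self._fetch = fetch_policies    # stands in for a call to the API manager
        self._report = report_metrics   # stands in for the metrics upload
        self._cache: dict[str, list[Policy]] = {}
        self._stats: Counter = Counter()

    def _policies_for(self, channel: str) -> list[Policy]:
        # Download the policies for a channel once, then reuse them locally.
        if channel not in self._cache:
            self._cache[channel] = self._fetch(channel)
        return self._cache[channel]

    def guarded_publish(self, channel: str, event: dict,
                        send: Callable[[str, dict], None]) -> bool:
        # Enforce locally: the event reaches the broker only if every policy passes.
        allowed = all(policy(event) for policy in self._policies_for(channel))
        self._stats[(channel, "allowed" if allowed else "rejected")] += 1
        if allowed:
            send(channel, event)
        return allowed

    def flush_metrics(self) -> None:
        # Emit the accumulated counters to the manager and reset them.
        self._report({f"{ch}:{outcome}": n for (ch, outcome), n in self._stats.items()})
        self._stats.clear()
```

The application code calls `guarded_publish` instead of the broker client's own send function, so the governance logic stays in one place and the rest of the producer is unchanged.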

Pros and Cons of Disintermediated Governance

As with the intermediated model, this approach achieves separation of concerns, but enforcement is done by the clients (producers/consumers) rather than by a gateway. The advantage is a smaller operational footprint: with no gateways to deploy and provision, there is less infrastructure overhead and cost, and no added latency in the data flow.

The downside of the direct governance model is that it puts the onus on developers to incorporate governance into their applications and microservices. Depending on how the governance SDK is implemented (e.g. if it is configuration driven), one option could be to add the governance agent in the CI/CD pipeline by leveraging an aspect-oriented programming (AOP) framework. This would remove the need for developers to manually add the governance logic to their codebase.

Policy Types & Implementation Considerations

The types of policies applicable to event-driven APIs are similar to their counterparts in the RESTful arena. Here are some policies that could be applied to the governance of event-driven APIs:

Authentication: Ensure a client is who they say they are when trying to access a broker or event mesh.
Authorization: Ensure that authenticated parties have been granted access to specific resources like channels and events. E.g. a producer or consumer may only have access to a certain set of channels, and may only be able to publish/consume a certain subset of events.
Volumetric (Rate Limitation/Throttling): Limiting the number of events per second a given channel allows.
Circuit Breaker: Block access to a channel if a client violates an SLA a certain number of times.
Segmentation: Limit access to a subset of events on a given channel.
Content Filtering: Only allow events with certain payload patterns on a channel.
Validation: Ensure that events on a channel follow a certain structure/schema.
White/Black Listing: Control which clients can connect to a given broker/node in the event mesh.
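Two of these policy types, volumetric and circuit breaker, can be illustrated with a short sketch. The code below assumes a simple fixed one-second window for rate limiting (other algorithms such as token buckets are equally valid) and a plain violation counter for the circuit breaker. The class names and parameters are hypothetical.

```python
import time


class RateLimit:
    """Volumetric policy: allow at most `max_events` per one-second window."""

    def __init__(self, max_events: int, clock=time.monotonic):
        self.max_events = max_events
        self.clock = clock              # injectable so the policy can be tested
        self.window_start = clock()
        self.count = 0

    def allow(self) -> bool:
        now = self.clock()
        if now - self.window_start >= 1.0:
            # A new window begins: reset the counter.
            self.window_start, self.count = now, 0
        if self.count < self.max_events:
            self.count += 1
            return True
        return False


class CircuitBreaker:
    """Block a client once it has violated an SLA `max_violations` times."""

    def __init__(self, max_violations: int):
        self.max_violations = max_violations
        self.violations: dict[str, int] = {}

    def record_violation(self, client_id: str) -> None:
        self.violations[client_id] = self.violations.get(client_id, 0) + 1

    def is_open(self, client_id: str) -> bool:
        # An open circuit means the client is blocked from the channel.
        return self.violations.get(client_id, 0) >= self.max_violations
```

Either policy could run inside a gateway or inside a client-side agent. The policy logic is the same in both cases, only the enforcement location differs.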

The Importance and Advantages of a Virtual Security Layer

A major challenge with policy enforcement for event-driven APIs is that it needs to be applied over a wide variety of transport protocols/brokers that don’t have the same capabilities. For instance, most brokers offer some authentication and authorization mechanism via an access control list (ACL), but most don’t support validation, volumetric limits, or most of the other policy types outlined above, at least not to the same degree.

To apply consistent governance across all transports/brokers, you almost have to have a virtual security layer that can translate abstract policies into the native capabilities of the brokers. There are two ways to implement a “standardized” approach to governing event-driven APIs:

Only implement policies that a broker can natively sustain, e.g. a throttling policy for Solace, Kafka etc. This is easy, but means architects and administrators need to know the capabilities of most brokers.
Implement all policies as a virtual model: where there are no native broker capabilities, the policy agent provides the necessary enforcement. This takes more work up front, but will make the lives of your security architects easier as they’ll be able to think in terms of security constraints and patterns, without needing intimate knowledge of individual broker capabilities.
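The second option can be sketched as a small planning function that decides, per policy, whether the broker enforces it natively or the policy agent has to step in. The capability table is purely hypothetical: `broker-a` and `broker-b` are placeholders, not statements about what any real broker supports.

```python
# Hypothetical capability table; real values would come from broker documentation
# or from a discovery step, and are not asserted here.
NATIVE_CAPABILITIES: dict[str, set[str]] = {
    "broker-a": {"authentication", "authorization", "rate-limit"},
    "broker-b": {"authentication", "authorization"},
}


def plan_enforcement(broker: str, policies: list[str]) -> dict[str, str]:
    """Map each abstract policy to 'native' (broker) or 'agent' (virtual layer)."""
    native = NATIVE_CAPABILITIES.get(broker, set())
    return {policy: ("native" if policy in native else "agent") for policy in policies}
```

Security architects would then work only with the abstract policy names, while the table and the agent absorb the differences between brokers.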

Event-Driven API Management Capabilities

In the RESTful world, APIM is usually part of a suite of products, i.e. an API portal that contains:

Design Tool that enables users to create the API specification, e.g. OAS, RAML etc.
Catalog where users can publish and discover API specifications.
Manager that allows users to secure the APIs via a wide array of policies.

In terms of API lifecycle management, users design an API, publish it to the catalog, and then, once it’s deployed, use the API manager to secure the API endpoint.

Managing event-driven APIs is similar. The difference is in the type of assets that are built, published and managed. Borrowing from the AsyncAPI specification, the key assets to be managed in terms of security are channels, publishers, and subscribers. In addition, there will be the various policy types that the manager will allow to be associated with the various channels, publishers and subscribers. The manager may also deal with certificate management to ensure that the connections to the brokers are TLS compliant wherever possible.

Governance Ecosystem Integration Patterns

A blueprint for governance of event-driven APIs can be built from the following “Lego blocks”:

Catalog
Manager
Gateway
Governance agent
Auditor

Essentially the first aspect of any governance solution will be the need for a repository of events and APIs, with the APIs describing the flow of events over a given set of channels that are bound to a given set of brokers and protocols. The API manager would enable the design of a security scheme that gives control over the assets defined in the catalog. Lastly, enforcement of the security scheme comes into play, with the option of taking an intermediated or disintermediated approach. Once the infrastructure is in place, you need to continuously monitor the run-time artefacts against the design-time artefacts.

Most enterprises have a well-defined CI/CD process that lets them control asset development and provisioning/deployment. Sometimes errors do occur, however, or somebody sidesteps the CI/CD process, resulting in assets being deployed and used that are not under management. Examples include consumers/producers deployed without governance agents, or topics/queues created manually. These “un-managed” assets could pose a security risk. That’s why you need an “auditor” component that monitors traffic, makes sure the production system matches the design/architecture, and alerts administrators about anomalies.
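At its simplest, the auditor's comparison of run-time against design-time assets is a set difference. The sketch below assumes both sides have already been reduced to sets of channel names. The function name `audit_channels` and the result keys are hypothetical.

```python
def audit_channels(designed: set[str], deployed: set[str]) -> dict[str, set[str]]:
    """Compare channels defined in the catalog with channels found on the broker."""
    return {
        "unmanaged": deployed - designed,  # exist at run time but not in the catalog
        "missing": designed - deployed,    # in the catalog but never provisioned
    }
```

Any non-empty `unmanaged` set is the kind of anomaly an administrator would want to be alerted about, for example a topic created by hand outside the CI/CD process.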

This diagram shows a sample governance blueprint:

Today there are no technology providers that offer all the components indicated above, so to assemble a governance solution that meets all of your needs you need to use a combination of off-the-shelf products, custom code and/or open source components.

An open ecosystem for the governance of event-driven APIs would allow for a great degree of flexibility and interoperability. For instance, there are vendors that offer catalogs for event-driven APIs, and API managers, and gateways that support some degree of enforcement across event-driven APIs.

In an open ecosystem for the governance of event-driven APIs:

Catalogs should support the ability to import/ingest different kinds of document sets (e.g. AsyncAPI and CloudEvents) from other catalogs or repositories. They should also have discovery modules that can scan and import event and infrastructure definitions from various brokers.
Managers should allow the importation of externally defined policies, i.e. offer an SDK developers can use to create and design their own policies. This is especially valuable when paired with governance agents, where developers can define policies and build their own enforcement in the governance agent.
Gateways should have APIs that allow them to be managed and provisioned by any API manager. Depending on the gateway’s openness, users can follow the governance agent model, where external parties can provide enforcement code for new externally defined policies.
Auditors should be able to monitor various brokers for inconsistencies in configurations and event flows.

In short, governing event-driven APIs is most successful when it’s based on an open ecosystem where vendors and the developer community collaborate to expand the ecosystem’s capabilities.

Coexistence of Governance Solutions for RESTful and Event-Driven APIs

Throughout this article I have highlighted the differences between the management of RESTful and event-driven APIs. Despite the differences, you need to do both. Most enterprises would much prefer to have a unified governance solution that lets them effectively govern and model their RESTful and event-driven APIs with a “single pane of glass”.

Very few vendors offer the tools you need to manage both kinds of APIs, but over time that number will increase. The reality is that not many have the knowledge and the ability to fully support both paradigms, and it will be challenging to provide a high degree of unification. The most likely scenario (and this goes back to interoperability and integration), will be that APIM solutions for RESTful and event-driven APIs will offer ways to extend their capabilities and to include each other in their user experience. Alternatively, there can be a higher-order interface (i.e. a portal) that presents those APIM capabilities in one place.

It is difficult to predict how the market for unified/universal API management will evolve, but in the short term enterprises will most likely have to maintain two separate technology stacks. Eventually, certain mature vendors will emerge that natively support both paradigms. Even then, there will likely be complementary technologies that augment their capabilities.

Auditing

Auditing is often an afterthought in designing governance solutions, but it’s a critical feedback mechanism that ensures that the entire ecosystem is operating in a balanced and controlled way.

Auditing is not trivial to set up as it requires:

The ability to interact with multiple brokers to get access to the number of infrastructure components defined (queues, topics, bridges etc.)
The ability to act as a “sniffer” or an event sink to look at the event types on the wire and see if they are consistent with their respective event definitions
The ability to interact with EDA-centric API managers and event catalogs to compare the live configurations/event data to their intended architecture.

As such, an auditor needs to be able to “speak” multiple protocols, perform context-based matching, and potentially even look at event patterns. Although there are auditing solutions today, they are not designed to cover such a range of capabilities, so an auditing solution may involve multiple products/components.
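The “sniffer” requirement, checking whether an event on the wire is consistent with its definition, can be sketched with a minimal field-and-type comparison. This is an assumption for illustration only: a real auditor would more likely validate against a full schema language, and the function name `conforms` and the definition format (field name mapped to Python type) are hypothetical.

```python
def conforms(event: dict, definition: dict[str, type]) -> list[str]:
    """Return a list of discrepancies between an observed event and its definition."""
    problems = []
    for name, expected in definition.items():
        if name not in event:
            problems.append(f"missing field: {name}")
        elif not isinstance(event[name], expected):
            problems.append(f"wrong type for {name}: expected {expected.__name__}")
    # Fields seen on the wire that the definition does not declare.
    for name in sorted(event.keys() - definition.keys()):
        problems.append(f"undeclared field: {name}")
    return problems
```

An empty list means the observed event matches its definition. Anything else is a candidate anomaly to report back to the API manager or catalog.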

Summary

Event-driven architecture has been around for decades, and its architectural patterns are well known and implemented in many products, but governance has always taken a back seat. In the last few years, event-driven architecture has emerged as the de facto standard way of building distributed applications. We are at an inflection point, where we will see technologies that support it soar in demand. This proliferation of event-driven architecture solutions will introduce governance requirements that will need to be addressed very soon.

I’ve introduced here the requirements and pros and cons of various approaches. Achieving comprehensive governance across RESTful and event-driven APIs is not a trivial task, and requires careful consideration and architecting. Although API specifications in the event-driven space have begun to mature (e.g. AsyncAPI and CloudEvents), standards like OAuth and OIDC that are well defined and implemented in the RESTful arena are still coming together for event-driven APIs. Although the two paradigms are different, functional policies are in fact similar (albeit implementation and enforcement will vary by broker).

In short, we are at the beginning of the governance journey for event-driven architecture, but it is a journey that will accelerate over the next few years. It is important that enterprises consider how they will govern event-driven APIs soon and start exploring solutions now even though they will evolve over time. Ultimately, assembling a solution for the governance of event-driven APIs will be based on the requirements at hand and the needs of the enterprise. Lastly, specific vendors will emerge as leaders in the space, but it is important to remember that adoption will be driven by ease of use and embracing interoperability and openness.


Bruno Baloi

Bruno has over 25 years of experience in IT, in a wide variety of roles (developer, architect, product manager, director of IT/Architecture), and is always looking to find ways to stimulate the creative process.

Bruno often takes unorthodox routes in order to arrive at the optimal solution/design. By bringing together diverse domain knowledge and expertise from different disciplines he always tries to look at things from multiple angles and follow a philosophy of making design a way of life. He has managed geographically distributed development and field teams, and instituted collaboration and knowledge sharing as a core tenet. He's always fostered a culture founded firmly on principles of responsibility and creativity, thereby engendering a process of continuous growth and innovation.
