Cloud Native Services & Data Lakes Integration RDPs

Most modern enterprise IT systems use cloud-native services to add functionality, or improve capacity, performance, and flexibility, but the integration of such services can be challenging. RESTful HTTP is frequently used to establish communications between cloud-native services and other applications following a microservices architecture.

While RESTful HTTP is well suited for communication between microservices and cloud-native infrastructure, the extension headers, message formats, and even the use of the REST target vary greatly from one cloud service or microservice to the next.

In a typical datacenter deployment, it’s fairly easy to translate between REST messages, targets, and headers, as you can create a new microservice that translates these differing formats. But there is a cost: the microservice must be scaled to meet peak demand for this translation. Clearly, this is something serverless cloud solutions are well suited to handle, but injecting serverless cloud solutions requires the applications, middleware, and cloud service to integrate with the different serverless capabilities. This is unlikely to be an issue if the source or destination cloud service is from the same cloud vendor as the serverless service, but other products and services must be integrated for this solution to work well.

Solace PubSub+ Event Broker version 9.11 makes it easier to integrate almost all cloud-native services by reducing the need for translation! In the case of data lakes, it eliminates the need for translation code altogether when publishing from the event broker to the data lake! You only need to decide which events belong in the data lake and how you want to label them. After that, the integration with these data lakes is “set and forget”: you only need to do it once!

Aligning REST Delivery Points with AWS, Azure, and Google Cloud Native Services

PubSub+ Event Broker can ingest and deliver messages using REST. When delivering messages via REST, PubSub+ uses a REST Delivery Point (RDP) as a webhook to publish messages to webhook endpoints, which can be in any cloud service. As documented in the earlier blog post Integrating PubSub+ Event Broker: Cloud with Lambda via API Gateway by Himanshu Gupta, Solace has provided tight integration with serverless capabilities. By replacing Himanshu’s Lambda code with code that handles the Solace events and converts them to messages for another AWS service, you can see how serverless integration enables Solace event brokers to publish to any AWS service.

The key to this integration is Solace’s growing array of authentication and authorization techniques, including OAuth 2, OpenID Connect, JSON Web Token (JWT), and Amazon Signature Version 4 (with PubSub+ Event Broker release 9.13.0). Trust between Solace PubSub+ RDP and the serverless cloud services has enabled integration with both Azure and Google, in addition to AWS Lambda as described by Himanshu.

With Microsoft Azure, Solace PubSub+ RDP integrates directly with Azure Functions, so a function can act as the microservice that forwards events to any target Azure service. Similarly, on Google Cloud, Solace has integrated with the serverless capabilities of both Google Cloud Functions and Google Cloud Run. More detail on these and other broker integrated solutions can be found in Solace’s PubSub+ Integration Hub.

Unleashing the Power of Metadata to Simplify the Integration of Cloud Services

Earlier in this article, I mentioned the “fairly easy” solution of creating a microservice to translate the headers, targets, and messages between source and destination formats. That was before I added serverless and serverless integration requirements to accommodate scale and cost, which admittedly adds some complexity. When we looked at how headers and targets were mapped in the majority of these microservices, it was clear the answer was a mix of fixed text and metadata from messages. Often, the target was either fixed or built from metadata such as a portion of the topic or a message ID. Almost all HTTP extension header names are fixed text, and their values tend to be either fixed or built from components such as the user ID, timestamp, user properties, or portions of the PubSub+ topic, depending on the service.

Our latest release, PubSub+ 9.11, enables customers to map metadata into RDP header values and targets and to generate RDP extension headers that match what the end service requires using substitution expressions. Substitution expressions allow customers to configure REST targets and header values to include a mix of text, message metadata variables, and functions to modify their format and encoding. Substitution expressions include the following:

Topic (Full or a field)
Identifiers: UUID, message ID, sender ID, sequence number, correlation ID
Message Type
Random numbers
Select User Properties
Null value detection
Sender/Current Timestamp with UTC/Local/UNIX time formats
- Date/Year/Month/Day/Time/Hour/Minute/Millisecond parsing available for UTC/Local
- Options to change separation characters like “:” and “-”
Encoding of the above in Base16/32/64/64url or url
- UPPER/lowercase options for Base16/32 and url
- Padded/Not Padded for Base32/64/64url
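
To make the encoding options in the list above concrete, here is a minimal Python sketch of what each variant produces for a given metadata value. It is not broker configuration and not Solace's implementation: the function name `encode`, the scheme labels, and the `upper`/`padded` keyword arguments are all hypothetical names I chose for illustration.

```python
import base64
import re
from urllib.parse import quote


def encode(value: str, scheme: str, *, upper: bool = True, padded: bool = True) -> str:
    """Encode a metadata value. Scheme names and options are illustrative only."""
    raw = value.encode("utf-8")
    if scheme == "url":
        # Percent-encode everything outside the unreserved character set.
        out = quote(value, safe="")
        # quote() emits uppercase hex digits; lowercase only the escapes if asked.
        return out if upper else re.sub(r"%[0-9A-F]{2}", lambda m: m.group(0).lower(), out)
    if scheme == "base16":
        out = base64.b16encode(raw).decode("ascii")
        return out if upper else out.lower()
    if scheme == "base32":
        out = base64.b32encode(raw).decode("ascii")
        out = out if padded else out.rstrip("=")
        return out if upper else out.lower()
    if scheme == "base64":
        out = base64.b64encode(raw).decode("ascii")
    elif scheme == "base64url":
        # Same as Base64 but with "-" and "_" instead of "+" and "/".
        out = base64.urlsafe_b64encode(raw).decode("ascii")
    else:
        raise ValueError(f"unknown scheme: {scheme}")
    return out if padded else out.rstrip("=")
```

For instance, `encode("a/b", "url")` returns `a%2Fb`, which is why URL encoding matters when a topic containing “/” ends up inside a file or object name.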

Let’s take a deeper look in this next section.

Substitution Expressions – An Example

To add a custom header, go to the queue-binding of the RDP and create a request-header by specifying the name of the HTTP REST header. After specifying the header, you can add a value to the header by specifying the header-value. In this example, I am creating the header “x-ms-client-request-id” and specifying that the uuid from each message should be the header value for the message.

Similarly, to define the file or object name, identify that the target should be evaluated as a substitution string, using the “request-target-evaluation” command. The default value is none, which means the target is taken verbatim. So, this command is used to identify that “substitution-expression” evaluation of the target should be performed.

Then use “post-request-target” with the string you want evaluated to generate the target/file/object name. In this example, the file name for each message is made up of the URL-encoded topic, followed by the message ID and the timestamp (milliseconds since the epoch) of the message, all separated by a “-”.
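
To show what such a target evaluates to, the following Python sketch builds the same kind of name by hand. It only illustrates the resulting string, not how the broker computes it; the function name `object_name` and the sample topic and values are hypothetical.

```python
from urllib.parse import quote


def object_name(topic: str, msg_id: int, timestamp_ms: int) -> str:
    """Join the URL-encoded topic, message ID, and epoch-millisecond timestamp with '-'."""
    # safe="" also encodes "/", so the whole topic becomes one path segment
    # instead of a chain of nested folders.
    encoded_topic = quote(topic, safe="")
    return "-".join([encoded_topic, str(msg_id), str(timestamp_ms)])


if __name__ == "__main__":
    print(object_name("acme/orders/created", 42, 1700000000000))
```

Because the “/” separators in the topic are percent-encoded, each message lands in a single flat object name rather than a folder hierarchy derived from the topic.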

The possibilities for substitution expressions are broad and not always obvious. For example, different messages could invoke different functions in Google Cloud Run by adding a topic field to the target. If the third field of my topic varied between “order” and “review”, that field could be added to the target and used by Cloud Run, if configured to do so, to trigger different processes.
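
Here is a rough Python sketch of that routing idea, covering both sides: building a target from the third topic field, and a receiving service choosing a handler from the target path. Everything in it is an assumption made for illustration: the `/events` base path, the 1-based field numbering, the handler table, and the function names are hypothetical and not part of any Solace or Google API.

```python
from typing import Callable, Dict


def topic_field(topic: str, index: int) -> str:
    """Return one field of a '/'-delimited topic (1-based index, an assumption)."""
    fields = topic.split("/")
    if not 1 <= index <= len(fields):
        raise IndexError(f"topic has {len(fields)} fields, asked for field {index}")
    return fields[index - 1]


def build_target(topic: str, base_path: str = "/events") -> str:
    """Append the third topic field to a fixed base path."""
    return f"{base_path}/{topic_field(topic, 3)}"


# Receiving side: map the last path segment of the target to a process.
HANDLERS: Dict[str, Callable[[bytes], str]] = {
    "order": lambda body: f"order processed ({len(body)} bytes)",
    "review": lambda body: f"review processed ({len(body)} bytes)",
}


def dispatch(target: str, body: bytes) -> str:
    """Pick a handler based on the last segment of the request target."""
    kind = target.rstrip("/").rsplit("/", 1)[-1]
    handler = HANDLERS.get(kind)
    if handler is None:
        raise KeyError(f"no handler configured for target segment: {kind}")
    return handler(body)
```

With a topic such as `acme/store/order/123`, `build_target` yields `/events/order`, and the receiving service can branch on that last segment without ever parsing the message body.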

In addition to cloud services, these types of capabilities can more broadly be used in conjunction with services like Workato, ASAPIO, and Boomi when communicating via a REST interface.

These substitution expressions enable the translation microservices to focus on the translation of the message body. This is a huge simplification without the need to code the parsing of Solace metadata for headers and topics! Additionally, if the service doesn’t have a specified message body format, Solace can directly integrate with the cloud-native service – like data lakes!

Direct Data Lake Integration with PubSub+ RDP

Among these cloud services, data lakes have emerged as a key service to integrate with enterprise event meshes (a network of connected event brokers). They don’t have predefined message body requirements. They take the message body and store it with a file/object name derived from the target!

Data lakes enable the retention and analysis of events through analytics, data science, and machine learning. These events represent key data points that highlight changing conditions in business processes, customer experience, financial transactions, infrastructure status, application performance, and more. Retaining these events is important for regulatory compliance, troubleshooting, and auditing. Retention also invites further scrutiny, including trend analysis, anomaly detection, correlation, A/B testing, and root cause analysis, to drive insights that reduce service-impacting events and increase sales events.

Consider two example event flows. In the first, the RDP message is sent to Azure Data Lake (Gen 2), where the customer decides the file name. The target has been configured as the full topic, followed by the MsgId (“https://” and hostname not included for simplicity). The Azure extension header x-ms-version is defined with a fixed value of “2017-11-09”. The Azure extension header x-ms-blob-type is defined with a fixed value of “BlockBlob”. The Azure extension header x-ms-date is defined as the timestamp when PubSub+ RDP published the message. The Azure extension header x-ms-client-request-id is defined as the uuid.
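
As a sketch of what the resulting request headers look like, the Python function below assembles the four headers described above for one message. The fixed values come from this example. The function name `azure_blob_headers` is hypothetical, and rendering x-ms-date in the HTTP date style (for example “Tue, 14 Nov 2023 22:13:20 GMT”) is my assumption about the format the service expects.

```python
from datetime import datetime, timezone
from email.utils import format_datetime
from typing import Dict


def azure_blob_headers(published_at: datetime, message_uuid: str) -> Dict[str, str]:
    """Build the extension headers from the Azure example for a single message."""
    # format_datetime(usegmt=True) needs an aware datetime in UTC.
    # A naive datetime passed in here is interpreted as local time by astimezone().
    published_utc = published_at.astimezone(timezone.utc)
    return {
        "x-ms-version": "2017-11-09",        # fixed text
        "x-ms-blob-type": "BlockBlob",       # fixed text
        "x-ms-date": format_datetime(published_utc, usegmt=True),  # publish timestamp
        "x-ms-client-request-id": message_uuid,                    # per-message uuid
    }
```

Two of the four values never change and two come from message metadata, which is exactly the “mix of fixed text and metadata” pattern that substitution expressions are meant to cover.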

Alternatively, a simplified Google Cloud Storage implementation defines the object name in the REST target. Here the customer has chosen to use the last topic field, followed by the message ID, followed by the timestamp when PubSub+ RDP published it. In addition, the GCP extension header x-goog-custom-time is defined as the timestamp when PubSub+ RDP published the message.
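
The same idea for the Google example can be sketched in Python as follows. Several details are my assumptions rather than anything stated here: the “-” separator between the parts of the object name, the epoch-millisecond rendering of the timestamp in the name, the RFC 3339 format of the header value, and the header spelling `x-goog-custom-time`, which is how I understand Cloud Storage names its custom-time header. The function name `gcs_request` is hypothetical.

```python
from datetime import datetime, timezone
from typing import Dict, Tuple


def gcs_request(topic: str, msg_id: int, published_at: datetime) -> Tuple[str, Dict[str, str]]:
    """Return (object name, extension headers) for one message."""
    published_utc = published_at.astimezone(timezone.utc)
    last_field = topic.rsplit("/", 1)[-1]                 # last field of the topic
    ts_ms = int(published_utc.timestamp() * 1000)         # epoch milliseconds (truncated)
    name = f"{last_field}-{msg_id}-{ts_ms}"               # "-" separator is an assumption
    headers = {
        # RFC 3339 timestamp in UTC, e.g. 2023-11-14T22:13:20Z
        "x-goog-custom-time": published_utc.strftime("%Y-%m-%dT%H:%M:%SZ"),
    }
    return name, headers
```

Note that the same publish timestamp appears twice in different formats, once in the object name and once in the header, which is the kind of formatting choice the timestamp options of substitution expressions exist for.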

Leveraging Amazon Signature Version 4 authentication enables direct integration between Solace PubSub+ Event Broker and Amazon S3. While S3 does not require any special headers, it does require target substitutions to form a dynamic file name, as described above, in order to deliver events into files on Amazon S3. For example, the substitution expression ${topic(3)}-${ts()} combines the third field of the topic with the message timestamp to create the file name for the event to be stored on Amazon S3.
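
To illustrate how an expression like ${topic(3)}-${ts()} turns into a file name, here is a toy evaluator in Python. It is only a sketch of the concept and handles just the two functions used in this example. It is not the broker's expression grammar, and I am assuming that topic fields are numbered from 1 and that ts() renders as milliseconds since the epoch. The function name `evaluate` is hypothetical.

```python
import re

# Matches ${name()} or ${name(N)} where N is a non-negative integer.
_EXPR = re.compile(r"\$\{(\w+)\((\d*)\)\}")


def evaluate(expression: str, topic: str, timestamp_ms: int) -> str:
    """Replace each ${...} placeholder with message metadata; other text is kept as-is."""

    def resolve(match: "re.Match[str]") -> str:
        name, arg = match.group(1), match.group(2)
        if name == "topic":
            if not arg:
                return topic                       # ${topic()} -> full topic
            return topic.split("/")[int(arg) - 1]  # ${topic(N)} -> Nth field, 1-based
        if name == "ts":
            return str(timestamp_ms)               # assumed epoch-millisecond rendering
        raise ValueError(f"unsupported function in this sketch: {name}")

    return _EXPR.sub(resolve, expression)


if __name__ == "__main__":
    print(evaluate("${topic(3)}-${ts()}", "acme/store/order/123", 1700000000000))
```

The fixed “-” between the two placeholders passes through untouched, so an expression is simply literal text with metadata spliced in wherever a placeholder appears.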

Conclusion

Solace PubSub+ Event Broker versions 9.11 and later simplify the integration of cloud-native services by reducing the need for translation! In the case of data lakes, they eliminate the need for translation code and provide direct “set it and forget it” integration. You, the customer, need to decide what events belong in the data lake and how you want to label the events as files/objects in the data lake, and provide the definitions of the extension headers required by the cloud-native service.

More detail on broker integrated solutions can be found in Solace’s PubSub+ Integration Hub.

Rob Tomkins
