
    In today’s data-rich world, the effective use of data has become a cornerstone of organizational success. Data, when shared and harnessed properly, can provide valuable insights, drive strategic decisions, improve company agility, improve operational efficiency, and create competitive advantage.

    Data sharing is therefore desirable, and sharing changes to data at the moment an event or action makes them provides real, tangible value to the business. However, data sharing cannot be a ‘free-for-all’. Controlled data sharing is essential to ensure compliance with legal and ethical standards, protect sensitive information, and maintain trust among stakeholders. Organizations must navigate a complex landscape of regulations, standards, and best practices to share data responsibly.

    Effective data sharing, and control, is also complicated by the fact that organizations typically do not know the current state of their data landscape with any certainty. For example:

      1. Where is the data located?
      2. Who owns the data?
      3. How do you prove data integrity?
      4. How do you prove data provenance?
      5. Who can access it?
      6. How do users access the data?
      7. With what authority is access granted?

    This presents organizations with a complex challenge: How do you make your organization more responsive and agile by embracing EDA, when you do not know all the detail about who in the organization communicates today and what information is exchanged?

    In this piece I’ll explore how organizations can:

    1. Use Solace PubSub+ Platform to facilitate the real-time distribution of data across organizations in a way that’s both controlled and documented.
    2. Use Vision Group’s DataSense product to discover the data landscape and ensure that applied controls are effective both at the time of application and over time as requirements change.

    Solace: Event-Driven Architecture

    Event-driven architecture (EDA) is a design pattern that enables the effective sharing of data, in real-time, between participants that generate the data and those that consume it. This allows the timely and flexible sharing of information required by organizations that want agility and flexibility so they can react to events and trends, and drive change within their markets.

    Solace PubSub+ Platform is an industry-leading event-driven integration and streaming platform. The platform and its component products are shown in the following figure.

    chart showing the Solace PubSub+ platform

    Figure 1: PubSub+ Platform

     

    PubSub+ Event Brokers enable the efficient transport of events throughout organizations. PubSub+ Event Brokers can be connected to form an event mesh that dynamically routes information from sources to destinations by automatically learning and building a subscription routing table. Event Broker deployment options encompass traditional datacenters, private clouds, public clouds, or any combination of environments. This allows organizations to implement hybrid and multi-cloud deployments where this makes sense.

    It also allows for application portability where, for example, an application may currently exist on premises but has a roadmap to cloud deployment. In this example simply deploying the application in the new cloud location will result in the subscriptions being injected from the cloud location and corresponding routing table updates within the mesh. As a result, the information required by the application will reach it in its new location and any events and data that it produces can be injected into the mesh as usual with the certainty that it will reach the intended destination. No painful reconfiguration of the mesh or any other communicating application is required – less change, less time, less risk.
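
    As a rough illustration of subscription-driven routing, the sketch below models a mesh-wide routing table in plain Python. It is not the PubSub+ API: `ToyMesh`, `topic_matches`, the site names, and the topic strings are all hypothetical, and the wildcard handling is a simplified assumption (`*` for one level, a trailing `>` for one or more levels).

```python
from __future__ import annotations

from collections import defaultdict


def topic_matches(subscription: str, topic: str) -> bool:
    """Match a topic against a subscription using simplified wildcards.

    Assumption: '*' matches exactly one level and a trailing '>' matches
    one or more remaining levels. Real broker rules are richer than this.
    """
    sub_levels = subscription.split("/")
    topic_levels = topic.split("/")
    for i, sub in enumerate(sub_levels):
        if sub == ">" and i == len(sub_levels) - 1:
            return len(topic_levels) > i
        if i >= len(topic_levels):
            return False
        if sub != "*" and sub != topic_levels[i]:
            return False
    return len(sub_levels) == len(topic_levels)


class ToyMesh:
    """Hypothetical in-memory stand-in for a mesh-wide routing table."""

    def __init__(self) -> None:
        # subscription -> set of (site, application) pairs that asked for it
        self.routes: dict[str, set[tuple[str, str]]] = defaultdict(set)

    def subscribe(self, site: str, app: str, subscription: str) -> None:
        self.routes[subscription].add((site, app))

    def disconnect(self, site: str, app: str) -> None:
        for consumers in self.routes.values():
            consumers.discard((site, app))

    def route(self, topic: str) -> set[tuple[str, str]]:
        """Return every (site, app) whose subscription matches the topic."""
        return {
            consumer
            for subscription, consumers in self.routes.items()
            if topic_matches(subscription, topic)
            for consumer in consumers
        }


mesh = ToyMesh()
mesh.subscribe("on-prem", "warehouse", "orders/*/created")
before = mesh.route("orders/emea/created")

# The application moves: it disconnects on premises and re-subscribes from
# the cloud. Publishers and other consumers are left untouched.
mesh.disconnect("on-prem", "warehouse")
mesh.subscribe("cloud", "warehouse", "orders/*/created")
after = mesh.route("orders/emea/created")

print(before, after)
```

    The publishing side never changes in this sketch; only the origin of the subscription moves, which is the property the portability example relies on.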

    From an integration perspective, broad support for open APIs and protocols, as well as a comprehensive set of micro-integrations and iPaaS connectors, provide a flexible set of on-ramps and off-ramps for the event mesh. This allows diverse systems to be integrated into the EDA landscape, with the required documentation and control, in a simple, straightforward manner.

    While PubSub+ Event Broker is the foundation and the ‘run-time’ for an effective EDA deployment, it is not enough on its own. As outlined earlier, you need to know what events are flowing where, and what format they are in. To be effective, the flows and attributes need to be described in an unambiguous manner. EDA decouples the producing and consuming applications by design, but it should not inhibit effective communication between application teams within an organization.

    PubSub+ Platform also provides a rich set of tools to provide the required documentation in a manner that removes ambiguity for those needing to interact over the event mesh using EDA principles. PubSub+ Event Portal provides a catalogue and design-time view of the schemas, events, event APIs and applications/microservices that interact over the event mesh.

    Figure 2: EP Designer View – Microservice Interaction and Events.

    The figure above shows the example application components (product catalogue, order management system, warehouse, e-commerce, customer relationship management, marketing operations, and analytics) communicating via events. The description of an application component can be exported into an AsyncAPI definition that can then be used to generate code via one of the many open-source code generation tools that exist. This provides a low-code environment for EDA and event-driven integration.
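
    To give a feel for what such a definition contains, here is a minimal sketch that holds an AsyncAPI-shaped document as a Python dictionary and lists what the application sends and receives. The layout follows the AsyncAPI 3.0 structure (`channels` with an `address`, `operations` with an `action`); the application title, channel names, and topic addresses are invented for illustration, and the AsyncAPI version an actual export uses may differ.

```python
# Hypothetical, hand-written document in the shape of an AsyncAPI 3.0 file.
asyncapi_doc = {
    "asyncapi": "3.0.0",
    "info": {"title": "Order Management System", "version": "0.1.0"},
    "channels": {
        "orderCreated": {"address": "orders/created"},
        "stockReserved": {"address": "warehouse/stock/reserved"},
    },
    "operations": {
        "sendOrderCreated": {
            "action": "send",
            "channel": {"$ref": "#/channels/orderCreated"},
        },
        "receiveStockReserved": {
            "action": "receive",
            "channel": {"$ref": "#/channels/stockReserved"},
        },
    },
}


def addresses_by_action(doc: dict) -> dict:
    """Group channel addresses by whether the application sends or receives."""
    result = {"send": [], "receive": []}
    for operation in doc["operations"].values():
        # Resolve only local references of the form '#/channels/<id>'.
        channel_id = operation["channel"]["$ref"].rsplit("/", 1)[-1]
        address = doc["channels"][channel_id]["address"]
        result[operation["action"]].append(address)
    return result


summary = addresses_by_action(asyncapi_doc)
print(summary)
```

    A code generator works from the same information: the addresses tell it which topics to bind, and the actions tell it whether to emit a publisher or a handler.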

    A typical application lifecycle, using PubSub+ Platform, is shown in the following figure.

    PubSub+ Event Portal flowchart

    Figure 3: Event lifecycle management with PubSub+ Event Portal

    1. Use the event catalog to discover the events that you need to subscribe to or publish along with the schema of the events.
      1. For example, subscribe to event types A, B and C, publish events D and E.
    2. Design your application component by binding the events to the application.
    3. Export your design and use AsyncAPI code generation tools to provide low-code application implementations.
    4. Use workflows to integrate with your current change request (CR) process and have the designs approved for deployment on, or promotion through, environment(s).
    5. Deploy the application release and generate the required runtime configuration to support the application.
      1. ACL Restrictions – only subscribe to event types A, B and C, only publish events D and E
    6. Audit the runtime to detect runtime and design-time anomalies.
    7. Improve / iterate.
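
    Steps 5 and 6 can be sketched in a few lines of Python. This is an illustration of the idea only, not Event Portal's data model or configuration format: `AppDesign`, `is_allowed`, and `audit` are hypothetical names, and the event types are the placeholder letters used in the list above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AppDesign:
    """Design-time record of what an application may publish and subscribe to."""

    name: str
    subscribes: frozenset
    publishes: frozenset


def is_allowed(design: AppDesign, action: str, event_type: str) -> bool:
    """Runtime check derived from the design: anything not declared is denied."""
    if action == "publish":
        return event_type in design.publishes
    if action == "subscribe":
        return event_type in design.subscribes
    raise ValueError(f"unknown action: {action}")


def audit(design: AppDesign, seen_publishes: set, seen_subscribes: set) -> dict:
    """Compare observed runtime behaviour with the design-time declaration."""
    return {
        "undeclared_publishes": sorted(seen_publishes - design.publishes),
        "undeclared_subscribes": sorted(seen_subscribes - design.subscribes),
        "unused_publishes": sorted(design.publishes - seen_publishes),
        "unused_subscribes": sorted(design.subscribes - seen_subscribes),
    }


design = AppDesign(
    name="order-management",
    subscribes=frozenset({"A", "B", "C"}),
    publishes=frozenset({"D", "E"}),
)

may_subscribe_a = is_allowed(design, "subscribe", "A")
may_publish_f = is_allowed(design, "publish", "F")
report = audit(design, seen_publishes={"D", "F"}, seen_subscribes={"A", "B", "C"})
print(may_subscribe_a, may_publish_f, report)
```

    Deriving the access rules and the audit from the same design record is what keeps the two from drifting apart: an undeclared publish shows up both as a denied action and as an audit finding.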

    EDA and Data

    Solace PubSub+ Platform provides a full and comprehensive platform for EDA and the management of the lifecycles of the schemas, events, event APIs and application components/microservices it supports. PubSub+ Event Portal encourages team collaboration, event reuse and provides unambiguous documentation about events and their business context.

    PubSub+ Platform has an ‘Audit’ process that allows discovery of runtime components, flows, and attributes for import into the design time. PubSub+ Platform integrates with existing development tools, e.g., GitHub, so that developers can specify objects, e.g., schemas, using their normal development tool pipeline and have definitions automatically populate in the Event Portal catalog for discovery and use by other teams.

    Solace will eventually provide a schema registry and serializers/de-serializers to further the path toward true low-code/no-code deployment.

    What else is desirable?  What challenges remain?

    From an EDA perspective PubSub+ Event Portal’s design-time view, with audit of the runtime, gives you an accurate snapshot of the landscape in terms of application components, integration flows, and events. What about the underlying data?

    This is where combining PubSub+ Platform with DataSense from Vision Group can bring benefits.

     

    DataSense: Data Management, Lineage, and Security

    DataSense is a powerful data management platform that enables organizations to gain deep insights, control, and governance over their data assets.

    DataSense has a range of features that allow an organization to increase awareness of where its data is, how it moves and who is using it. The platform allows automated data tracking and can be used to actively control the data governance process. It provides a clear and intuitive view of how data moves from its source to various stages of processing and ultimately to its destination. This visualization is crucial for understanding, analyzing, and optimizing the flow of data, ensuring that it aligns with business objectives and requirements.

    high-level illustration of DataSense

    Figure 4: DataSense Data Management, Lineage and Discovery

     

    The figure above shows a high-level illustration of DataSense. It shows the data ingest portion, which includes applications and databases as well as the event mesh. These inputs feed a data catalog and intelligent processes that determine data flows and lineage, discover new data, and alert on data patterns that are out of the ordinary.

    There are a few key attributes of DataSense that set it apart from other lineage platforms and make it an ideal partner to PubSub+ Platform:

    1. DataSense works on real data rather than metadata catalogues.
    2. DataSense uses sophisticated hashing/indexing to build its data catalogue. It stores these ‘references’ rather than copying data into the DataSense catalogue, which would create ‘another copy’.
    3. DataSense intercepts data flows to be able to generate the hashes and keys for storage.
    4. DataSense stores data hashes/keys along with temporal information so that changes in behavior of data within the organization can be tracked.
    5. DataSense stores the data hashes/keys against each endpoint reporting data usage. As such the system can infer relationships based on what it learns as endpoints report. The process is automatic and requires minimal configuration.
    6. DataSense has a view of the data that is current. This means that changes to data use/detection of data anomalies (location/time) can be detected quickly and flagged to system operators in near real-time.
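
    The reference-based approach described in points 2 to 5 can be pictured with a toy catalogue. DataSense's internals are not described in this post, so everything here is an assumption made for illustration: the `ReferenceCatalogue` class, the use of plain SHA-256 as the fingerprint, the endpoint names, and the rule that a value seen at one endpoint and later at another implies a flow between them.

```python
import hashlib
from collections import defaultdict
from datetime import datetime, timedelta, timezone


def fingerprint(value: str) -> str:
    """Return a reference to a value without keeping the value itself."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


class ReferenceCatalogue:
    """Hypothetical catalogue of fingerprints, endpoints, and timestamps."""

    def __init__(self) -> None:
        # fingerprint -> list of (endpoint, timestamp) sightings
        self.sightings = defaultdict(list)

    def report(self, endpoint: str, value: str, seen_at: datetime) -> None:
        self.sightings[fingerprint(value)].append((endpoint, seen_at))

    def inferred_flows(self) -> set:
        """Infer (earlier endpoint, later endpoint) pairs from shared fingerprints."""
        flows = set()
        for events in self.sightings.values():
            ordered = sorted(events, key=lambda event: event[1])
            for (source, _), (destination, _) in zip(ordered, ordered[1:]):
                if source != destination:
                    flows.add((source, destination))
        return flows


catalogue = ReferenceCatalogue()
t0 = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
catalogue.report("crm", "alice@example.com", t0)
catalogue.report("order-mgmt", "alice@example.com", t0 + timedelta(seconds=2))
catalogue.report("data-lake", "alice@example.com", t0 + timedelta(seconds=9))

flows = catalogue.inferred_flows()
print(sorted(flows))
```

    The catalogue holds only fingerprints, endpoints, and times, so no further copy of the data is created. A production system would likely use a keyed or salted hash, since plain hashes of low-entropy values can be guessed.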

    In terms of working with PubSub+ Platform, DataSense has a real-time view of the ‘data landscape’. It is able to ‘monitor’ flows in real time and provide a complementary view of the data that flows as events across the event mesh, identifying anomalies and data governance issues as they arise.

    Use Cases

    Unexpected Schema

    An application or group of applications makes changes to a schema without recording the change in Event Portal’s catalog. Here’s what that looks like:

    • DataSense learns the type of data (hashes) passing on a flow.
    • DataSense learns the location of the endpoint (where available).
    • DataSense infers the temporal relationship between the arrival of hash/key items.
    • DataSense detects a difference in the data passing on a flow when compared to the historical hashes/indexes stored for the flow.
    • DataSense generates an alert based on the data hash/key change.
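
    One way to picture this detection is to fingerprint the shape of each payload and compare it with what has been seen on the flow before. The sketch below assumes flat JSON-like payloads and hashes field names and value types only; `shape_fingerprint`, `check_flow`, and the flow name are hypothetical and do not describe how DataSense itself computes its hashes.

```python
import hashlib
import json


def shape_fingerprint(payload: dict) -> str:
    """Hash the field names and value types of a payload, not its values."""
    shape = sorted((key, type(value).__name__) for key, value in payload.items())
    return hashlib.sha256(json.dumps(shape).encode("utf-8")).hexdigest()


def check_flow(history: dict, flow: str, payload: dict):
    """Return an alert string if the payload shape is new for a known flow."""
    current = shape_fingerprint(payload)
    known = history.setdefault(flow, set())
    if known and current not in known:
        return f"ALERT: unexpected payload shape on flow '{flow}'"
    known.add(current)  # first sighting, or a shape already seen: learn it
    return None


history = {}
first = check_flow(history, "orders/created", {"order_id": 1, "total": 9.5})
same_shape = check_flow(history, "orders/created", {"order_id": 2, "total": 3.0})
changed = check_flow(
    history, "orders/created", {"order_id": 3, "total": 4.0, "coupon": "X1"}
)
print(first, same_shape, changed)
```

    Different values with the same shape pass silently, while the added `coupon` field raises an alert because that shape was never recorded for the flow.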

    Data Paths

    An application sends data to a data lake for onward processing for regulatory compliance. There are two issues with this:

    1. System latency is typically around 10 s, but today it exceeds 180 s.
    2. Data is not showing up in the data lake today.

    Here’s what that looks like:

    • DataSense learns the type of data (hashes) passing on a flow.
    • DataSense learns the location of the endpoint (where available).
    • DataSense infers the temporal relationship between the arrival of hash/key items.
    • DataSense detects a difference in the data passing on a flow when compared to the historical hashes/indexes stored for the flow.
    • DataSense generates an alert based on change in temporal behavior.
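
    Both issues reduce to comparing observed timing against a learned baseline. The following sketch uses the 10-second baseline from the example; the `check_delivery` function, the tolerance factor of 3, and the timestamps are assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone


def check_delivery(sent_at, arrived_at, now,
                   baseline=timedelta(seconds=10), tolerance=3.0):
    """Return an alert string for late or missing data, or None if normal.

    `arrived_at` is None when the item has not yet been seen at the destination.
    """
    limit = baseline * tolerance
    if arrived_at is None:
        if now - sent_at > limit:
            return "ALERT: data has not arrived at the destination"
        return None  # still within the expected delivery window
    latency = arrived_at - sent_at
    if latency > limit:
        return f"ALERT: latency {latency.total_seconds():.0f}s exceeds baseline"
    return None


sent = datetime(2024, 1, 1, 9, 0, 0, tzinfo=timezone.utc)
now = sent + timedelta(minutes=5)

normal = check_delivery(sent, sent + timedelta(seconds=9), now)
slow = check_delivery(sent, sent + timedelta(seconds=180), now)
missing = check_delivery(sent, None, now)
print(normal, slow, missing)
```

    Treating "not arrived yet" separately from "arrived late" matters here, because the second issue in the example is an absence of data rather than a slow arrival.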

    Geofencing

    An organization migrates an application to the cloud. Some of the flows include PII that is subject to data residency rules and must not be sent to the cloud. Here’s what that looks like:

    • DataSense learns the type of data (hashes) passing on a flow.
    • DataSense learns the location of the endpoint (where available).
    • DataSense infers the temporal relationship between the arrival of hash/key items.
    • DataSense detects a difference in the data passing on a flow when compared to the historical hashes/indexes stored for the flow.
    • DataSense generates an alert based on location of data.
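
    A location check of this kind can be sketched as a lookup of known PII fingerprints against the regions where they are allowed to appear. The region names, the sample value, and the `check_location` function are hypothetical; how DataSense classifies data as PII is not covered in this post.

```python
import hashlib


def fingerprint(value: str) -> str:
    """Reference a value by its hash so the check never handles raw PII."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


# Assumed inputs: fingerprints already classified as PII, and the regions
# in which PII may legitimately be seen.
PII_FINGERPRINTS = {fingerprint("alice@example.com")}
ALLOWED_PII_REGIONS = {"eu-on-prem"}


def check_location(seen_fingerprint: str, endpoint_region: str):
    """Return an alert string if PII is reported from a disallowed region."""
    if seen_fingerprint in PII_FINGERPRINTS and endpoint_region not in ALLOWED_PII_REGIONS:
        return f"ALERT: PII observed in region '{endpoint_region}'"
    return None


on_prem = check_location(fingerprint("alice@example.com"), "eu-on-prem")
in_cloud = check_location(fingerprint("alice@example.com"), "us-cloud")
non_pii = check_location(fingerprint("order-42"), "us-cloud")
print(on_prem, in_cloud, non_pii)
```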

    Conclusion

    Solace PubSub+ is a complete platform that enables organizations to deploy effective EDA at scale across the whole enterprise, including hybrid and multi-cloud environments.

    PubSub+ Event Portal adds the ability to provide unambiguous documentation of schemas, events, microservices, and micro-integrations, which increases event reuse as well as collaboration and understanding between teams.

    PubSub+ Platform allows the design to be pushed to the run-time so that the flow of events across the event mesh is governed and compliant.

    Adding DataSense to PubSub+ provides real-time validation of the PubSub+ design-time goals while the runtime is operating. Data flows are learned by DataSense and can be compared to the design intent, with errors flagged.

    During system operation, DataSense can continue flagging issues as they occur, e.g., unknown or wrong data sets, abnormal system performance, or data outside organizational or geographic boundaries. The alerts allow corrective action to take place and help minimize exposure to potential regulatory non-compliance issues.

    Solace PubSub+ and DataSense enable organizations to effectively and efficiently unlock the true value of enterprise data by facilitating the control, governance and monitoring of data being shared across the enterprise.

    Mathew Hobbis

    Mat joined Solace in 2005 and currently runs the technical operations within EMEA. He has extensive expertise within the messaging space and has worked to evangelize the appliance form factor for messaging within the region. Following successful sales, Mat has worked closely with all of the major accounts and partners within the region, providing guidance on messaging and application architecture. Prior to joining Solace, Mat worked for Alcatel through the acquisition of Newbridge Networks. At Alcatel/Newbridge, Mat held a Systems Architect role and was responsible for the design and successful implementation of many large networks encompassing diverse technologies such as optical transmission, ATM, IP, Ethernet, NGN and Mobile for many large clients. Before making the shift into the vendor space, Mat held numerous positions at NTL and BT (UK carriers), building and operating large networks and services for business and residential customers.