    As enterprises embrace generative AI (GenAI) to drive innovation and efficiency, it’s becoming clear that these workloads have unique characteristics that demand special attention. The asynchronous nature of GenAI tasks presents both challenges and opportunities for IT leaders looking to harness this transformative technology.

    Those challenges exist even with GenAI chatbots leveraging large language models (LLMs), but as the complexity and sophistication of workloads increase with retrieval augmented generation (RAG) and collaboration between AI agents and people, they become even more pronounced. This makes choosing the right baseline architecture for GenAI initiatives crucial.  In this post, I’ll explore why event-driven architecture (EDA) is the ideal solution for effectively managing GenAI workloads.


    The Asynchronous Nature of GenAI Workloads

    LLM Integrations: Navigating Latency Challenges

LLMs are at the heart of many GenAI applications, but their inference response times can vary significantly. This latency poses a critical challenge for systems that rely on immediate responses. As new models are released, different models may be selected for different tasks; semantic routing, for example, can choose the most cost-effective model for the task at hand. You might use OpenAI o1 for deep reasoning or math, Anthropic Sonnet for text summarization, and Google Gemini Flash for casual conversation, each with its own latency characteristics.
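
To make this concrete, here is a minimal routing sketch in Python. The task categories, model names, and the route() helper are illustrative assumptions rather than any particular product’s API; a real semantic router would classify the prompt itself (for example with an embedding model) before picking a target LLM.

```python
# Hypothetical mapping from task category to the most cost-effective model.
MODEL_BY_TASK = {
    "deep_reasoning": "openai-o1",
    "summarization": "anthropic-sonnet",
    "casual_chat": "gemini-flash",
}

def route(task_category: str) -> str:
    """Return the model to use for a task category; each model has its own latency profile."""
    return MODEL_BY_TASK.get(task_category, "gemini-flash")

print(route("summarization"))  # -> anthropic-sonnet
```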

For example, in an enterprise documentation system, an LLM generates summaries for uploaded documents. The analysis time varies with document complexity and model type, but it is never instantaneous. An asynchronous approach allows the system to acknowledge uploads immediately and process LLM requests in the background, maintaining responsiveness regardless of analysis duration. Once the summary is complete, the system updates metadata and notifies relevant components by publishing an event, all without blocking other operations.
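
As a rough illustration, the Python sketch below shows the shape of that flow. An in-memory asyncio queue stands in for an event broker topic and a sleep stands in for the LLM call; the function and topic names are hypothetical.

```python
import asyncio
import uuid

async def handle_upload(doc_uploaded: asyncio.Queue, filename: str) -> str:
    """Acknowledge the upload immediately and publish an event for background processing."""
    doc_id = str(uuid.uuid4())
    await doc_uploaded.put({"doc_id": doc_id, "filename": filename})
    return doc_id  # the caller gets an acknowledgement right away

async def summarizer_worker(doc_uploaded: asyncio.Queue) -> None:
    """Consume upload events and run the slow, variable-latency LLM summarization."""
    while True:
        event = await doc_uploaded.get()
        await asyncio.sleep(2)  # placeholder for an LLM call of unpredictable duration
        print(f"summary ready for {event['filename']}; publishing a 'summary.created' event")
        doc_uploaded.task_done()

async def main() -> None:
    doc_uploaded: asyncio.Queue = asyncio.Queue()  # stand-in for a "document.uploaded" topic
    worker = asyncio.create_task(summarizer_worker(doc_uploaded))
    doc_id = await handle_upload(doc_uploaded, "q3-report.pdf")
    print(f"upload {doc_id} acknowledged; the system stays responsive while the summary runs")
    await doc_uploaded.join()  # only so the demo exits cleanly
    worker.cancel()

asyncio.run(main())
```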

    Agent Tasks: Managing Varied Task Execution Times

    GenAI agents often perform a wide range of tasks, from data analysis to content generation. These tasks can have vastly different execution times, making it difficult to predict when results will be available.

    Picture an agentic AI-powered content creation tool that employs several specialized agents to generate a blog post: One agent researches topics, another creates an outline, a third writes sections, and a fourth proofreads. These agents work asynchronously, with execution times varying from seconds to minutes based on task complexity and agent availability, so the system must coordinate their activities without blocking or excessive wait times.
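
One way such a pipeline could be wired is sketched below, again with an in-process queue standing in for broker topics: each hypothetical agent reacts to one event type and publishes the next, so no agent blocks the others while it works.

```python
import asyncio

async def researcher(events: asyncio.Queue, payload: str) -> None:
    await asyncio.sleep(1.0)                     # research time varies
    await events.put(("topic.researched", payload + " | notes"))

async def outliner(events: asyncio.Queue, payload: str) -> None:
    await asyncio.sleep(0.3)
    await events.put(("outline.created", payload + " | outline"))

async def writer(events: asyncio.Queue, payload: str) -> None:
    await asyncio.sleep(2.0)                     # the slowest agent
    await events.put(("draft.written", payload + " | draft"))

async def proofreader(events: asyncio.Queue, payload: str) -> None:
    print("post ready:", payload + " | proofread")

HANDLERS = {
    "post.requested": researcher,
    "topic.researched": outliner,
    "outline.created": writer,
    "draft.written": proofreader,
}

async def main() -> None:
    events: asyncio.Queue = asyncio.Queue()      # stand-in for broker topics
    await events.put(("post.requested", "EDA for GenAI"))
    while True:
        event_type, payload = await events.get()
        task = asyncio.create_task(HANDLERS[event_type](events, payload))  # dispatch, don't wait
        if event_type == "draft.written":
            await task                           # final step; let the demo finish cleanly
            break

asyncio.run(main())
```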

    Human Interaction: Accommodating Real-World Timelines

    Many GenAI applications involve human interaction at various stages. Users may input requests, review AI-generated content, or provide feedback. These interactions are inherently asynchronous, as humans operate on their own timelines.

For example, imagine a GenAI-powered contract review system where the AI quickly analyzes a new contract and generates recommendations, but a human needs to approve those changes before the process can continue. This approval might not happen immediately: the reviewer could be in meetings, traveling, or focused on other priorities. The system needs to accommodate this unpredictable availability, holding the AI-suggested changes in a pending state until the reviewer can assess and approve them, which could be hours or even days later.
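
A minimal sketch of that pending-state pattern, with hypothetical event handlers and an in-memory dictionary standing in for a durable store, could look like this; the approval event may arrive at any time and simply resumes the workflow.

```python
import asyncio

# In-memory stand-in for a durable store; the pending state outlives any single request.
pending_reviews: dict[str, list[str]] = {}

async def on_contract_analyzed(event: dict) -> None:
    """Park the AI's suggestions until a human approval event arrives: no polling, no open connection."""
    pending_reviews[event["contract_id"]] = event["suggestions"]
    print(f"contract {event['contract_id']}: suggestions held for human review")

async def on_review_approved(event: dict) -> None:
    """The approval may come hours or days later; the workflow simply resumes here."""
    suggestions = pending_reviews.pop(event["contract_id"])
    print(f"contract {event['contract_id']}: applying approved changes -> {suggestions}")

async def main() -> None:
    await on_contract_analyzed({"contract_id": "C-42",
                                "suggestions": ["extend term", "cap liability"]})
    await asyncio.sleep(0.1)  # stands in for the unpredictable human delay
    await on_review_approved({"contract_id": "C-42"})

asyncio.run(main())
```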


    Why EDA is the Ideal Solution for GenAI

    Event-driven architecture (EDA) is uniquely suited to address the challenges posed by asynchronous GenAI workloads. At its core, EDA is designed to handle asynchronous processes, making it a natural fit for the unpredictable nature of GenAI tasks.

    Asynchronous Processing

    EDA’s fundamental design is built around asynchronous event processing. This aligns perfectly with the inconsistent and unpredictable latency inherent in LLM responses. Instead of waiting for an LLM to generate a response, the system can emit an event and continue processing other tasks, enhancing overall system responsiveness.

    Non-Blocking Operations

    In an event-driven system, operations don’t block while waiting for responses. This is crucial for managing both LLM latency and the varied execution times of agent tasks. The system remains responsive, processing other events and workflow steps without tying up resources or stalling progress while AI-generated results are pending.

    Event-Based State Management

    EDA, coupled with an orchestrator agent, helps manage the state of long-running, asynchronous processes. For GenAI workflows that may involve multiple steps and human interactions, EDA and the orchestrator agent can track progress through a series of events, allowing tasks to resume seamlessly regardless of how much time passes between steps.
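
For illustration, a bare-bones orchestrator could advance each workflow’s state as events arrive, however long the gaps between them; the step names and event types below are assumptions made for the example.

```python
from enum import Enum, auto

class Step(Enum):
    ANALYZED = auto()
    SUMMARIZED = auto()
    APPROVED = auto()

class Orchestrator:
    """Track each workflow's progress purely from the events it receives."""

    TRANSITIONS = {
        ("document.analyzed", None): Step.ANALYZED,
        ("summary.created", Step.ANALYZED): Step.SUMMARIZED,
        ("review.approved", Step.SUMMARIZED): Step.APPROVED,
    }

    def __init__(self) -> None:
        self.state: dict[str, Step] = {}

    def on_event(self, workflow_id: str, event_type: str) -> None:
        current = self.state.get(workflow_id)
        next_step = self.TRANSITIONS.get((event_type, current))
        if next_step is None:
            print(f"{workflow_id}: ignoring unexpected event {event_type}")
            return
        self.state[workflow_id] = next_step
        print(f"{workflow_id}: advanced to {next_step.name}")

orc = Orchestrator()
orc.on_event("wf-1", "document.analyzed")
orc.on_event("wf-1", "summary.created")   # may arrive seconds or hours later
orc.on_event("wf-1", "review.approved")   # may arrive days later
```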

    Decoupled Communication

    EDA facilitates loose coupling between all components of the system, not just those involving human interaction. This decoupling is crucial for managing the asynchronous nature of GenAI workloads. Any component—be it an LLM, an AI agent, a data store, or a user interface—can submit requests as events and retrieve results when they’re available, without maintaining long-lived connections, holding on to resources or resorting to constant polling. This approach ensures that slow or variable-latency operations in one part of the system don’t block or degrade performance elsewhere.
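
The decoupling itself boils down to publish/subscribe. The toy in-process broker below (a real deployment would use an event broker) shows how independent components can react to the same event without referencing each other; the topic and component names are illustrative.

```python
from collections import defaultdict
from typing import Callable

# Toy in-process pub/sub registry.
subscribers: defaultdict[str, list[Callable[[dict], None]]] = defaultdict(list)

def subscribe(topic: str, handler: Callable[[dict], None]) -> None:
    subscribers[topic].append(handler)

def publish(topic: str, event: dict) -> None:
    # The publisher knows nothing about who consumes the event, or when.
    for handler in subscribers[topic]:
        handler(event)

# Independent components subscribe to the same event without knowing about each other.
subscribe("summary.created", lambda e: print("UI: show summary for", e["doc_id"]))
subscribe("summary.created", lambda e: print("search index: reindex", e["doc_id"]))

publish("summary.created", {"doc_id": "doc-7", "summary": "..."})
```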

    Integration Strategies for EDA-based GenAI Systems

    Integrating EDA-based GenAI systems into existing enterprise architectures requires careful planning. Here are two key strategies that are particularly effective:

    Event Mesh

    Implement an event mesh to facilitate seamless event distribution across diverse environments (on-premises, cloud, hybrid). This allows GenAI components to interact with existing systems regardless of their location or underlying technology. An event mesh can efficiently route events between GenAI services, traditional applications, and data stores, enabling a truly interconnected ecosystem. It provides the infrastructure needed to handle the asynchronous nature of GenAI workloads, ensures events are delivered reliably and in real time across the entire enterprise landscape, and supplies LLMs with the enterprise context they need to produce relevant results.

    API Gateway with Event-Driven Capabilities

    Use an API gateway that supports both synchronous (typically REST) and event-driven paradigms with a bridge to an EDA platform. This allows for a flexible mix of synchronous and asynchronous communication patterns, enabling systems to use the most appropriate approach for each interaction. The gateway handles the translation between synchronous API calls and asynchronous events, providing a versatile interface for both styles of communication. This approach is particularly useful when integrating GenAI workloads with existing systems, allowing each component to communicate in its native paradigm while the gateway manages the interoperability.
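
As a sketch of the bridging pattern, the hypothetical gateway below accepts a synchronous request, publishes a genai.request event with a correlation ID, and immediately returns 202 Accepted; the client later fetches the result produced by the asynchronous side. The publish() stub and topic names are assumptions, not any particular gateway’s API.

```python
import uuid

results: dict[str, str] = {}  # correlation id -> result, filled in when the async answer arrives

def publish(topic: str, event: dict) -> None:
    """Stand-in for handing the event to an EDA platform; here the 'backend' answers instantly."""
    if topic == "genai.request":
        on_genai_response({"correlation_id": event["correlation_id"],
                           "answer": f"summary of {event['document']}"})

def handle_rest_request(payload: dict) -> dict:
    """Gateway: translate a synchronous POST into an event and reply right away with 202 Accepted."""
    correlation_id = str(uuid.uuid4())
    publish("genai.request", {"correlation_id": correlation_id, **payload})
    return {"status": 202, "correlation_id": correlation_id,
            "result_url": f"/results/{correlation_id}"}

def on_genai_response(event: dict) -> None:
    """Event side: store the asynchronous result so the REST client can fetch it later."""
    results[event["correlation_id"]] = event["answer"]

def handle_result_poll(correlation_id: str) -> dict:
    """Gateway: return the result once it is ready, 404 until then."""
    if correlation_id in results:
        return {"status": 200, "answer": results[correlation_id]}
    return {"status": 404}

ack = handle_rest_request({"document": "q3-report.pdf"})
print(ack["status"], handle_result_poll(ack["correlation_id"]))
```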

     

    In addition to these key components, governance and visibility are important aspects to consider. The power of event-driven AI makes it essential to restrict access to resources to authorized users and systems. Likewise, if other systems aren’t aware of the capabilities available, you won’t get the maximum return on your AI investment.

    Final Thoughts

    As we continue to push the boundaries of what’s possible with GenAI, it’s clear that our architectural choices must evolve to meet the unique challenges posed by these innovative technologies. I hope this post has helped you better understand why EDA is the best way to address the challenges caused by the inherently asynchronous nature of GenAI workloads. To recap:

    • LLM integrations, agent tasks, and human interactions in GenAI systems all introduce varying degrees of latency and unpredictability, making traditional synchronous architectures less suitable.
    • EDA’s core principles—asynchronous processing, non-blocking operations, event-based state management, and decoupled communication—align perfectly with the needs of GenAI workloads.
    • An event mesh or an API gateway with event-driven capabilities can help you integrate GenAI and implement EDA in enterprise environments with the necessary governance and visibility.

    For IT leaders looking to stay ahead in the rapidly evolving landscape of AI and GenAI, embracing EDA is becoming a necessity. By adopting EDA principles and technologies, organizations can create robust, scalable, and adaptable AI systems that can keep pace with exciting advancements in this field. As GenAI capabilities increasingly become key differentiators for businesses, those who successfully leverage EDA to handle the asynchronous nature of these workloads will be well-positioned to unlock the full potential of these transformative technologies.

    Ali Pourshahid
    Chief Engineering Officer

    Ali Pourshahid is Solace's Chief Engineering Officer, leading the engineering teams at Solace. Ali is responsible for the delivery and operation of Software and Cloud services at Solace. He leads a team of incredibly talented engineers, architects, and User Experience designers in this endeavor. Since joining, he's been a significant force behind the PS+ Cloud Platform, Event Portal, and Insights products. He also played an essential role in evolving Solace's engineering methods, processes, and technology direction.
     
    Before Solace, Ali worked at IBM and Klipfolio, building engineering teams and bringing several enterprise and Cloud-native SaaS products to the market. He enjoys system design, building teams, refining processes, and focusing on great developer and product experiences. He has extensive experience in building agile product-led teams.
     
    Ali earned his Ph.D. in Computer Science from the University of Ottawa, where he researched and developed ways to improve processes automatically. He has several cited publications and patents and was recognized as a Master Inventor at IBM.