The previous blog post in this series explained how many capital markets firms have ended up with a multitude of messaging technologies to handle different data movement requirements, the problems associated with that, and how Solace technology helps simplify such environments. This post drills into how big data is leading companies in other industries down this same path.
According to Gartner, as of June 2014, 73 percent of organizations have invested or plan to invest in big data in the next two years. While big data is intended to support both structured and unstructured data, many organizations start by analyzing their structured transactional data because it’s easier for them to understand and can be more readily applied to optimizing operational efficiency.
During this phase, enterprises focus on how the big data technology will extract value from their data. They load all kinds of information into their big data lake using whatever tools are at hand or easiest to add to the mix. This allows them to start running analytics ASAP, with the problem of loading data in a demanding and unpredictable production environment left for later.
This can result in getting data into the Hadoop infrastructure in a variety of different ways: FTP here, Flume there, maybe some Kafka somewhere else.
Some enterprises start with the messaging technology (or technologies) they use to link legacy applications, but envision a completely separate, additional infrastructure based on Flume or Kafka. One such architecture is a cascading Flume deployment that grooms data from many different sources and funnels it through two distinct writers into two HDFS files.
This means multiple data movement infrastructures go into production, all of which are point solutions: they solve only the problem of getting data into Hadoop and can’t be used for other functions. Even for big data alone you could end up with multiple technologies. You may start with cascading Flume instances or FTP because they’re simple and batch loading is sufficient, but when you then transition to streaming analytics, you need yet another technology to deal with the velocity of big data.
If you already use messaging for existing applications, why not use that same technology to stream this new data into Hadoop? The answer depends on the messaging you are using today:
What about Solace users? How do they move to a big data world? Here are a few real-world customer examples:
These are critical systems for their business, such as regulatory reporting, settlement and risk. Solace technology provides high-rate, real-time pub/sub with the best slow consumer control in the industry, and integrates with Flume, Spark, Storm and DataTorrent, to name a few. This means a big data lake and stream processing can be added to the same bus as new consumers without any impact on other applications, without creating bridges to new transports, and without significant initial and ongoing investment in an alternate transport system. These clients were able to focus on the big data technology and analytics – data acquisition was trivial.
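The idea of adding a big data lake as just another consumer on an existing bus can be illustrated with a toy, in-memory sketch. This is not the Solace API: `ToyBus`, its methods, and the consumer names are hypothetical, and the drop-oldest overflow policy is only an assumption made to keep the example small.

```python
from collections import deque
from typing import Deque, Dict, List


class ToyBus:
    """Hypothetical in-memory stand-in for a pub/sub bus (not a real broker API)."""

    def __init__(self) -> None:
        # One independent backlog per consumer.
        self._queues: Dict[str, Deque[dict]] = {}

    def subscribe(self, name: str, max_pending: int = 1000) -> None:
        # Each consumer gets its own bounded queue. When it is full, the
        # oldest pending message is discarded (an assumed policy), so a slow
        # consumer can only ever overflow its own backlog.
        self._queues[name] = deque(maxlen=max_pending)

    def publish(self, message: dict) -> None:
        # The publisher has no knowledge of who is subscribed.
        for queue in self._queues.values():
            queue.append(message)

    def drain(self, name: str) -> List[dict]:
        # Return and clear everything pending for one consumer.
        queue = self._queues[name]
        items = list(queue)
        queue.clear()
        return items


bus = ToyBus()
bus.subscribe("settlement")                  # existing application
bus.publish({"trade_id": 1})

bus.subscribe("data_lake", max_pending=2)    # new consumer, publisher unchanged
for trade_id in range(2, 6):
    bus.publish({"trade_id": trade_id})

settlement = bus.drain("settlement")         # sees every message
data_lake = bus.drain("data_lake")           # slow: only its own backlog overflowed
```

In this sketch the publisher code never changes when the `data_lake` consumer appears, and the small `max_pending` on that consumer affects only its own queue. A production broker would offer far richer delivery guarantees and overflow handling than this toy does.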
In all cases, our clients did not need to deploy yet another data movement architecture to feed their big data initiative. This “single data movement” fabric is possible for Solace users because of the best-of-breed pub/sub performance, slow consumer control, robustness and manageability of Solace technology, as well as its ability to integrate with both existing and new application technologies.
This is how Solace users are able to avoid the problem of ending up with many messaging products. Solace messaging is not built for a single use case, but rather it is built leveraging networking concepts and technologies to create an enterprise-wide application messaging fabric capable of supporting a wide variety of application demands. This allows our customers to focus on their applications – not the plumbing.
But what if I already have a few middleware technologies, and adopting Solace would make it “yet another messaging product”? How can Solace help then? Stay tuned for part 3 of this series…
Shawn McAllister is responsible for the strategy and delivery of the Solace PubSub+ Event Streaming and Management Platform. He leads a team of incredibly talented engineers and architects in this endeavor.
McAllister has worked with many of our clients to help them adopt an event-driven architecture and to learn first-hand their needs as input to the innovation built into the PubSub+ Platform. He has participated in the definition of various OASIS messaging protocol standards, including MQTT 3.1.1, MQTT 5.0, and AMQP 1.0.
Before joining Solace, McAllister led software, hardware, and test engineering teams at Newbridge Networks (later Alcatel Canada), where he was responsible for developing features on ATM and Ethernet switches as well as the 7750 Multiservice IP Router.
McAllister holds a Bachelor of Mathematics from the University of Waterloo, with majors in both Computer Science and Combinatorics/Optimization. He is Chief Technology Officer & Chief Product Officer at Solace.