Big data gives companies unprecedented insight into their business by capturing information about all kinds of events and transactions. Using tools like Flume, Spark, and Storm to ingest data into distributed file systems like Hadoop HDFS or NoSQL datastores like HBase, Cassandra, and MongoDB, they can perform both real-time and after-the-fact analytics on all types of data.
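To make that concrete, here is a minimal sketch of one such ingestion path: landing a stream of events in HDFS with Spark Structured Streaming. It uses Spark's built-in synthetic "rate" source as a stand-in for a real feed (a production pipeline would read from Kafka or Flume instead), and the HDFS paths are hypothetical.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("event-ingest").getOrCreate()

# Synthetic event stream (Spark's built-in "rate" source); a real
# pipeline would typically read from Kafka or Flume instead.
events = (spark.readStream
          .format("rate")
          .option("rowsPerSecond", 100)
          .load())

# Land raw events in HDFS as Parquet for after-the-fact analytics.
# Both paths below are illustrative placeholders.
query = (events.writeStream
         .format("parquet")
         .option("path", "hdfs:///data/events/raw")
         .option("checkpointLocation", "hdfs:///data/events/_checkpoint")
         .start())

query.awaitTermination()
```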
At the same time, their businesses run on a complex IT infrastructure composed of custom and commercial applications written in many languages, linked via a variety of ESBs, API gateways and web servers, running in cloud, legacy and hybrid environments.
Many of the interactions between these applications, the logs they generate, and the data they store are events worth capturing and analyzing. But do you really want to deploy yet another one-off data movement technology to do this? How would that make your environment any less complex or more robust?