Reason 6: Appliances remove the operational blindfold

This is part 6 of a 10-part series: 10 reasons for the growth in middleware appliances. The series summarizes what we’ve learned from our clients about what they value in appliances and why they selected Solace.

One of the least appreciated jobs in IT is that of the operations team that keeps vital systems running. They inherit applications, figure out how to keep them running, scale them when they’re successful, prepare for any contingency and try to quickly correct unanticipated problems before anybody notices. And nobody knows their names until there’s a problem.

But the vendors are on their side, right? With those sexy dashboards that let them do crossword puzzles until a buzzer goes off or a red light flashes. Well, not exactly. If you talk to anyone who has run a high-performance messaging infrastructure, they’ll tell you those dashboards only help if you can afford the performance hit of turning them on.

Monitoring the Production Messaging Environment

Software messaging middleware can generally gather statistics and write log files, and you can set the level of detail you want to see. But that tracking means interacting with the operating system, which adds overhead to the work of sending and receiving messages. Each turn of the dial slows your infrastructure down or takes away a little capacity. Our customers tell us that for popular JMS products, turning on statistics monitoring can degrade latency and throughput by as much as 50%, so customers with serious performance requirements leave it off.
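To put rough numbers on that trade-off, here is a back-of-the-envelope sketch. The baseline throughput and the function name are hypothetical, and the 50% figure is only the upper bound our customers reported, not a measurement of any particular product.

```python
def capacity_with_monitoring(baseline_msgs_per_sec: float,
                             overhead_fraction: float) -> float:
    """Return the throughput left once monitoring overhead is subtracted.

    overhead_fraction is the share of capacity lost to monitoring,
    e.g. 0.5 for the "up to 50%" case described above.
    """
    if not 0.0 <= overhead_fraction <= 1.0:
        raise ValueError("overhead_fraction must be between 0 and 1")
    return baseline_msgs_per_sec * (1.0 - overhead_fraction)


# Hypothetical broker that handles 100,000 messages per second unmonitored.
baseline = 100_000

for overhead in (0.0, 0.1, 0.25, 0.5):
    remaining = capacity_with_monitoring(baseline, overhead)
    print(f"monitoring overhead {overhead:.0%}: {remaining:,.0f} msgs/sec left")
```

At the worst case, a system sized for its peak load with monitoring off would need roughly twice the capacity to carry the same load with monitoring on, which is why so many teams leave it off.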

That means you can take advantage of monitoring during development and in preparation for deployment, but when it’s time to go live you have to turn it off and hope your testing was sufficient. I feel like software vendors should accompany descriptions and demos of their dashboards with the disclaimer car companies use in their commercials: “Do not attempt. Professional driver on a closed course.” Trying to figure out what is happening when problems arise in a production environment, when you have no operational visibility, is like flying an airplane into the clouds without instruments or air traffic control. It’s not a lot of fun.

Monitoring with Hardware: “What’s your vector, Victor?”

Solace messaging, simply put, doesn’t have this problem. Solace deployments let operators crank monitoring configurations to the max with the confidence that it won’t affect latency or throughput. That’s because special-purpose hardware like FPGAs and network processors excels at doing work in parallel, and with hardware there’s no operating system to get in the way. What’s more, the appliance also controls all network operations, so the visibility isn’t limited to messaging-level statistics: it extends to lower-level details like network congestion or disconnects.

One of the most fun stages of our customer trials or deployments is when we show the operations team that they can now fly their plane with the instrument panel turned on. You can imagine what better (and happier) pilots that makes them!

Monitoring Peer-to-Peer Environments

So far, we’ve talked about how software monitoring and Solace’s equivalent are handled for broker-based messaging. The problem is even worse if you’re trying to see what’s happening in a peer-to-peer configuration, as would commonly be used for market data distribution. Not only do peer-to-peer environments suffer from the same lack of visibility described above, but even if you could capture this data, you’d have to capture it everywhere (at every peer), and that would slow down execution everywhere. On top of that, in peer-to-peer the endpoints already have to filter the multicast firehose down to just the data needed at that node, which introduces further work (and delays) for that application.

If there are hundreds of nodes generating and receiving information, and none of them are giving you any operations detail, where do you start to look for the source of the problem when something goes wrong?

With Solace, market data delivery is done using unicast from the appliance to each subscriber. Each subscriber gets a filtered feed of just the data they need, so no client-side filtering is required. From a monitoring perspective, there is a central monitoring point, and there is no penalty for full monitoring and visibility. If you’re on the operations team responsible for keeping these systems working, that’s a huge improvement in terms of seeing the health of the system.
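The difference between the two delivery models can be illustrated with a toy simulation. This is a minimal sketch with hypothetical topics, subscriber names and function names. It is not how any real multicast stack or the Solace appliance is implemented; it only counts who has to look at which messages.

```python
from collections import Counter


def multicast_with_client_filtering(messages, subscriptions):
    """Peer-to-peer model: every peer sees every message and filters locally."""
    inspected = Counter()  # messages each peer had to examine
    delivered = Counter()  # messages each peer actually wanted
    for peer, topics in subscriptions.items():
        for topic, _payload in messages:
            inspected[peer] += 1
            if topic in topics:
                delivered[peer] += 1
    return inspected, delivered


def unicast_with_central_filtering(messages, subscriptions):
    """Central model: one filtering point sends each peer only its matches."""
    delivered = Counter()  # also serves as the single place to read statistics
    for topic, _payload in messages:
        for peer, topics in subscriptions.items():
            if topic in topics:
                delivered[peer] += 1
    return delivered


# Hypothetical feed and subscriptions.
messages = [("AAPL", 1), ("MSFT", 2), ("AAPL", 3), ("GOOG", 4)]
subscriptions = {"desk_a": {"AAPL"}, "desk_b": {"MSFT", "GOOG"}}

inspected, wanted = multicast_with_client_filtering(messages, subscriptions)
central = unicast_with_central_filtering(messages, subscriptions)

print("peer-to-peer, examined per peer:", dict(inspected))
print("peer-to-peer, wanted per peer:  ", dict(wanted))
print("central, delivered per peer:    ", dict(central))
```

Both models deliver the same data to each subscriber, but in the first every endpoint examines the whole feed and the counters are scattered across the peers, while in the second the filtering and the counters live in one place.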

No More Flying Blind

For many of our customers, this difference in operational visibility was a key reason they chose Solace. On more than one occasion, customers have (initially) refused to believe what they were seeing when we showed them their applications delivering as many as 10 million messages per second with real-time monitoring of users, connections, queue depths and more. Realizing Solace delivers that kind of visibility in a production environment is a game changer for them, and a real competitive edge for us.

On to Reason #7…