In the first post of this series I introduced the structure of Solace messages, and in part two I talked about how application header and user-defined properties can be used to decorate messages to aid processing the message data. Now it’s time to talk about how the data itself is structured – how to put the payload in the message.
Don’t I Just Add the Data?
At first glance this question seems trivial: don’t I just put the data in the message? That’s the point of a data agnostic messaging system, surely? Well, yes… and no. I hope you weren’t expecting a different answer!
Solace PubSub+ Event Brokers will transport whatever data you choose to put in your message (provided it fits within the maximum message sizes, 64MB for direct and 30MB for guaranteed). But there are other considerations:
- Is your network heterogeneous? For instance, are mobile devices part of your messaging infrastructure? If so, you need to be aware of differences between architectures such as endianness and type sizes.
- How will you control the payload structure? What if the data format needs to change?
- What performance requirements are there? A low latency message exchange, say for publishing financial market data, should impose a minimal parsing overhead. At the other extreme, an IoT system may need the simplest possible encoding choice to reduce remote endpoint cost and power consumption.
Options for Encoding Payload Data
There are plenty of formats you could use to encode data. If you are using a text format, XML gives you the flexibility of a schema and data format, at the expense of considerable verbosity and the need for XML encoding and decoding. JSON is native to JavaScript and easily written to and read from other languages, but again it is a text format – easy for humans to read but relatively inefficient in terms of message size and processing requirements. Gzip compression can address the message size issues, but the processing requirements still remain. I provide more details on using text-formatted messages in the post What a Load of JSON: XML and other Text Data in The Solace APIs.
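To make the size versus processing trade-off concrete, here is a minimal sketch that gzips a JSON text payload using only the Java standard library. The class name GzipPayload and the sample JSON are my own illustration, not part of any Solace API; the resulting bytes would simply be carried as a binary payload.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper: not part of the Solace API.
class GzipPayload {

    // Compress a text payload (for example a JSON document) into gzip bytes.
    public static byte[] compress(String text) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
            gzip.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // The consumer pays this decompression cost before it can even start parsing the text.
    public static String decompress(byte[] compressed) {
        try (GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            return new String(gzip.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Repetitive JSON, typical of a batch of similar readings.
        String json = "{\"sensor\":\"temperature\",\"reading\":21}".repeat(100);
        byte[] packed = compress(json);
        System.out.println("text bytes: " + json.getBytes(StandardCharsets.UTF_8).length);
        System.out.println("gzip bytes: " + packed.length);
        System.out.println("round trip ok: " + decompress(packed).equals(json));
    }
}
```

The message gets smaller, but both sides now do extra work on every message, which is exactly the processing cost that compression does not remove.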
You could, of course, just create a binary blob – in Java, write a byte buffer to a message. This has the advantage of high-speed read/write and very little overhead, so it’s the choice for performance (for low latency this would probably be the fastest option, as the messages will be small and quickly written and read). However, there’s a dependency between your message format and the messaging code which makes this option difficult to manage and maintain: changes to the message format cannot be managed or versioned separately from your code. If you have a heterogeneous computing environment you’ll have to be careful to make sure your binary representation is read correctly on different machine architectures.
If you need to use an external data format you’ll also have this problem – if the format changes, you’ll have to change your code even if you don’t use the part that changed.
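As a rough illustration of the architecture problem, the sketch below packs a reading into a byte array with an explicit byte order and then reads it back with the right and the wrong order. The field layout (sensor id, timestamp, value) and the class name BinaryReading are assumptions made up for this example.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical binary layout: 4-byte sensor id, 8-byte timestamp, 8-byte double value.
class BinaryReading {
    static final int SIZE = Integer.BYTES + Long.BYTES + Double.BYTES;

    // The publisher fixes the byte order explicitly rather than relying on the platform.
    public static byte[] encode(int sensorId, long timestampMillis, double value) {
        return ByteBuffer.allocate(SIZE)
                .order(ByteOrder.BIG_ENDIAN)
                .putInt(sensorId)
                .putLong(timestampMillis)
                .putDouble(value)
                .array();
    }

    // The consumer must know both the field offsets and the byte order.
    public static int decodeSensorId(byte[] payload, ByteOrder order) {
        return ByteBuffer.wrap(payload).order(order).getInt(0);
    }

    public static double decodeValue(byte[] payload, ByteOrder order) {
        return ByteBuffer.wrap(payload).order(order).getDouble(Integer.BYTES + Long.BYTES);
    }

    public static void main(String[] args) {
        byte[] payload = encode(1, 1700000000000L, 21.5);
        System.out.println(decodeSensorId(payload, ByteOrder.BIG_ENDIAN));    // prints 1
        System.out.println(decodeSensorId(payload, ByteOrder.LITTLE_ENDIAN)); // prints 16777216
    }
}
```

Nothing in the payload tells the consumer about the layout or the byte order: both are baked into the code on each side, which is the maintenance cost described above.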
Solace APIs offer Structured Data Types (SDTs). We’ve covered the basics of SDTs in the previous post where we added SDTs to a user-defined header property. However, the payload data can also be structured, which takes care of any architectural differences for you, and provides a way to structure the data which may save you time. I explain that in a post where I talk about 3rd party data serialization systems.
Data serialization packages, such as Google Protocol Buffers and Avro, offer an alternative. The message format is separate from the serialization API, so changes to the message format need not break messaging code. The serialization API deals with architectural differences. Data encoding is binary, and so efficient. The intermediate API means it will not be as fast as a binary blob, but this has to be traded off against ease of use, maintainability and manageability. However, both the Solace API and serialization API must now be part of your deployment.
Mixing Encoding Options
Suppose your diverse requirements mean you need to use several different payload encoding options. How do you ensure that, for instance, a text encoded payload doesn’t break an application that’s expecting a binary blob?
There are two ways Solace helps you here:
- Use message decoration, for example a User Defined Message Header Field such as Encoding type=text-json-ourformat-1. This adds to the message size, and requires a pre-processing step at the consumer;
- Separate the topic hierarchy via data format. Include the payload encoding type in the topic string, for example:
/myApplication/directMessage/json/object1/v1.3/…
It may be necessary to mix encoding styles within the same topic space, which makes the second option difficult to implement. If you can implement both you will have an extra safeguard – it will be more difficult to publish the wrong message data encoding and break consumers.
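Here is a minimal sketch of the pre-processing step a consumer might perform when the encoding is carried as message decoration. A plain Map stands in for the user-defined header properties so the example runs on its own; the class name EncodingGuard and the returned strings are hypothetical, and only the header key and value come from the example above.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;

// Hypothetical consumer-side guard; a Map stands in for the user-defined header properties.
class EncodingGuard {
    static final String ENCODING_KEY = "Encoding type";
    static final String JSON_V1 = "text-json-ourformat-1";

    // Check the hint before touching the payload, and refuse anything unexpected.
    public static String describe(Map<String, String> headers, byte[] payload) {
        String encoding = headers.get(ENCODING_KEY);
        if (encoding == null) {
            return "rejected: no encoding hint";
        }
        switch (encoding) {
            case JSON_V1:
                // Only now is it safe to treat the bytes as UTF-8 text.
                return "json: " + new String(payload, StandardCharsets.UTF_8);
            default:
                return "rejected: unsupported encoding " + encoding;
        }
    }

    public static void main(String[] args) {
        byte[] body = "{\"a\":1}".getBytes(StandardCharsets.UTF_8);
        System.out.println(describe(Map.of(ENCODING_KEY, JSON_V1), body));
        System.out.println(describe(Map.of(ENCODING_KEY, "binary-blob-2"), body));
    }
}
```

The check costs one lookup per message, but it means a binary-only consumer never tries to parse text it was not written for.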
Solace API Message Types
In the previous posts we’ve glossed over the fact that the Solace API can provide different message types, so it is time to consider them. Message types vary across protocols and API languages, but there are four basic Solace message types:
- Binary – unstructured byte blob. You are responsible for ensuring the consumer interprets it correctly;
- Text – structured, a string;
- Map – an SDT map;
- Stream – an SDT stream.
Message Types in the APIs
The Java and JavaScript APIs provide different message types for use by the developer: BytesMessage (Binary), MapMessage, StreamMessage and TextMessage. JMS supports these with an additional ObjectMessage for serializable objects.
The C and C# APIs are more restrictive in that they provide only binary payload (attachment) types; however, helpers are provided to populate the binary payload with typed data, for instance solClient_msg_createBinaryAttachmentMap() and solClient_msg_getBinaryAttachmentMap() in C. The Java RTO API does not have these helpers so you will need to use text or binary.
Sending Typed Messages
Sending a message of a particular type is fairly simple: we create a message of the correct type and send it:
    MapMessage msg = JCSMPFactory.onlyInstance().createMessage(MapMessage.class);
    producer.send(msg, topic);
Receiving Typed Messages
On the reception side, though, we have to be more careful. It’s tempting to assume all incoming messages will be a specific type, but of course that’s not suitable for production code. At the very least we need to behave gracefully in the case we get an unexpected message:
    public void onReceive(BytesXMLMessage message) {
        if (message instanceof StreamMessage) {
            … // do something
        } else {
            log("Unexpected message ignored, type: " + message.getClass());
        }
        …
Of course it’s possible you are expecting messages of different types and need to handle each type.
A quick word on instanceof: it is generally considered bad practice to use instanceof, as polymorphism is the preferred way to act upon objects of different types. However, in this case, the type of the message fundamentally changes how we deal with its contents.
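When several message types are expected, one way to keep the type checks in a single place is a small handler registry keyed by class. This is only a sketch of that idea: TextMsg, BytesMsg and StreamMsg are hypothetical stand-ins for the API’s message classes so that the example compiles without the Solace library.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Sketch of a per-type handler registry; the message classes are stand-ins, not Solace types.
class TypeDispatch {
    interface Message {}

    static final class TextMsg implements Message {
        final String text;
        TextMsg(String text) { this.text = text; }
    }

    static final class BytesMsg implements Message {
        final byte[] data;
        BytesMsg(byte[] data) { this.data = data; }
    }

    static final class StreamMsg implements Message {}

    private final Map<Class<? extends Message>, Function<Message, String>> handlers =
            new LinkedHashMap<>();

    // Register a handler for one message type; the cast happens once, here.
    <T extends Message> void register(Class<T> type, Function<T, String> handler) {
        handlers.put(type, m -> handler.apply(type.cast(m)));
    }

    // Find the first registered type the message is an instance of, else ignore it gracefully.
    String handle(Message message) {
        for (Map.Entry<Class<? extends Message>, Function<Message, String>> e : handlers.entrySet()) {
            if (e.getKey().isInstance(message)) {
                return e.getValue().apply(message);
            }
        }
        return "ignored " + message.getClass().getSimpleName();
    }

    static TypeDispatch withDefaults() {
        TypeDispatch dispatch = new TypeDispatch();
        dispatch.register(TextMsg.class, m -> "text: " + m.text);
        dispatch.register(BytesMsg.class, m -> "bytes: " + m.data.length);
        return dispatch;
    }

    public static void main(String[] args) {
        TypeDispatch dispatch = withDefaults();
        System.out.println(dispatch.handle(new TextMsg("hi")));
        System.out.println(dispatch.handle(new StreamMsg()));
    }
}
```

The type test is still there (isInstance is instanceof by another name), but unexpected types fall through to one well-defined default instead of being scattered across the callback.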
Providing Hints to the Receiver
As I mentioned in the previous post, User Defined Header Fields provide a way to prompt the receiver about the contents of a message without forcing the receiver to examine the entire message contents. Let’s take an example. Imagine we have a sensor network of thousands of sensors periodically sending readings. There are different sensor types, which send binary readings on different topics. To allow for different versions of the same sensor type, we have a topic hierarchy level for the version number:
sensors/temperature/v2.3.004/readings
However, sensors go wrong. We’ll probably have a separate topic for error messages:
sensors/temperature/v2.3.004/status/errors
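A consumer that receives from both topics can recover the sensor type and version from the topic string itself. The sketch below assumes exactly the two layouts shown above; the class name SensorTopic and its fields are hypothetical.

```java
// Hypothetical parser for the two topic layouts used in this example:
//   sensors/<type>/<version>/readings
//   sensors/<type>/<version>/status/errors
class SensorTopic {
    final String sensorType;
    final String version;
    final boolean error;

    private SensorTopic(String sensorType, String version, boolean error) {
        this.sensorType = sensorType;
        this.version = version;
        this.error = error;
    }

    public static SensorTopic parse(String topic) {
        String[] levels = topic.split("/");
        boolean isReading = levels.length == 4 && levels[3].equals("readings");
        boolean isError = levels.length == 5
                && levels[3].equals("status") && levels[4].equals("errors");
        if (!levels[0].equals("sensors") || !(isReading || isError)) {
            throw new IllegalArgumentException("Unrecognised topic: " + topic);
        }
        return new SensorTopic(levels[1], levels[2], isError);
    }

    public static void main(String[] args) {
        SensorTopic t = parse("sensors/temperature/v2.3.004/status/errors");
        System.out.println(t.sensorType + " " + t.version + " error=" + t.error);
    }
}
```

Because the version is its own topic level, a consumer can choose the right decoder for the binary reading before it looks at the payload.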
The sensors send the error as a string, so we’ll need a TextMessage, but at the same time it would be helpful to log the error in our readings database. Here’s what this might look like on the sender side:
    BytesMessage readingMsg = JCSMPFactory.onlyInstance().createMessage(BytesMessage.class);
    TextMessage statusMsg = JCSMPFactory.onlyInstance().createMessage(TextMessage.class);
    SDTMap statusMap = JCSMPFactory.onlyInstance().createMap();
    if (reading.isError()) {
        statusMap.putBoolean("ReadingError", true);
        statusMsg.setText(reading.ErrorString());
        producer.send(statusMsg, errorTopic);
        readingMsg.setData(… // You’d probably put nothing in here
    } else {
        statusMap.putBoolean("ReadingError", false);
        readingMsg.setData(reading.getReading());
    }
    readingMsg.setProperties(statusMap);
    producer.send(readingMsg, readingTopic);
On the receive side, I can use this hint to deal with the readings with errors:
    public void onReceive(BytesXMLMessage message) {
        if (message instanceof BytesMessage) {
            // Get the hint from the User Defined Header Property
            try {
                if (message.getProperties().getBoolean("ReadingError")) {
                    reading.putData(0); // Probably – whatever suits here
                    reading.error = true;
                } else {
                    reading.putData(((BytesMessage) message).getData());
                    reading.error = false;
                    …
                }
            } catch (SDTException e) {
                // Probably no SDT in the message, so this message wasn’t intended
                // for this receiver.
            } // catch
        } else if (message instanceof TextMessage) {
            // This is probably a status update – in this example, this should
            // not be received because we use a separate topic.
        } else {
            log("Unexpected message type, ignoring");
        } // if message instanceof …
Summary
There are four main Solace message types: Binary, Text, Map and Stream. If performance is your only concern, you’ll probably use Binary. If you’re concerned about data traversing heterogeneous networks, map and stream messages abstract this complexity away. However, the lack of references in SDTs means more complex data will probably need to be in either native format or in a data serialization package such as Google Protocol Buffers. Payload data formats should be separated in the topic hierarchy. Sometimes it isn’t possible to have only one message type in a topic, in which case message decoration using User Defined Header Properties can be very helpful.
To learn more about text data in the Solace APIs, check out this new post about XML/JSON and Solace.