Thursday 5 November 2015

C++ Lightweight template based encoders

I recently had to implement an encoder that could translate an ActiveMQ MapMessage into a small number of C++ objects. The traditional approach is to create a class that knows how to decode the various field types, derive all the message classes from it, and have them decode themselves from a MapMessage. I wanted to avoid dragging all the baggage associated with encoding/decoding into each message class, so I did it another way.

A while ago I experimented with creating classes that act as a sort of template that defines the encoding format for a message. The template object is static and really just directs an encoder/decoder so it knows how to populate or encode the message object. I used a simple version of this technique to solve this problem and I thought it was interesting enough to describe here in case I need it another time.

The solution I came up with looks a lot like how you would do this in Java using reflection but uses C++ and templates.

Edit: Since writing this article I went through the process of porting the code to g++ (it was originally developed using Microsoft Visual Studio 2008). Unfortunately this highlighted quite a lot of deficiencies in Visual Studio's template support so I have amended the article to cover this.

Function Pointer

The problem is how the template object can modify or read the message object's contents without exposing everything (that is, without breaking encapsulation). The answer is to use pointers to member functions to call the accessor functions. Say we have a class like this:

class TestMessage
{
public:
    const std::string& getName() const;
    void setName( const std::string& name );
};

Then we can provide access to the name field of any instance of this class if we know which functions to call. Pointers to the getName() and setName() member functions give us exactly that: they act like a key that unlocks this field in any TestMessage. So we define two function pointer types like this:

typedef const std::string& (TestMessage::*GetterFunc)() const;
typedef void (TestMessage::*SetterFunc)(const std::string&);

Now we can get the function pointer values by doing this:

GetterFunc getter = &TestMessage::getName;
SetterFunc setter = &TestMessage::setName;

Then say we have an instance of a TestMessage, we can get or set the name field like this:

TestMessage messageInstance;

std::string nameValue = (messageInstance.*getter)();
(messageInstance.*setter)(nameValue);

Say we had a message template class that stores the setter and getter function pointers for each field. We could then use these to set the message content when decoding, or read the content when encoding. The template class just has to store the function pointers and can apply them to each message instance it decodes or encodes.
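
As a minimal sketch of that idea, the snippet below bundles one getter/setter pair into a small struct and applies it to two different instances. NameBinding and copyName are hypothetical names used only for illustration, and TestMessage is given inline accessors here so the example stands on its own.

```cpp
#include <cassert>
#include <string>

// Stand-in for the article's TestMessage, with inline accessors so the
// sketch compiles and links by itself.
class TestMessage
{
public:
    const std::string& getName() const { return m_name; }
    void setName(const std::string& name) { m_name = name; }

private:
    std::string m_name;
};

// Hypothetical helper (not from the article): one getter/setter pair
// that can be applied to any TestMessage instance.
struct NameBinding
{
    typedef const std::string& (TestMessage::*GetterFunc)() const;
    typedef void (TestMessage::*SetterFunc)(const std::string&);

    GetterFunc getter;
    SetterFunc setter;

    // Read the bound field from a message.
    std::string read(const TestMessage& message) const
    {
        return (message.*getter)();
    }

    // Write the bound field on a message.
    void write(TestMessage& message, const std::string& value) const
    {
        (message.*setter)(value);
    }
};

// Copies the name from one message to another using only the binding,
// without naming getName()/setName() at the point of use.
std::string copyName(const TestMessage& from, TestMessage& to)
{
    NameBinding binding = { &TestMessage::getName, &TestMessage::setName };
    assert(binding.getter && binding.setter);  // both pointers are bound
    binding.write(to, binding.read(from));
    return to.getName();
}
```

The binding itself holds no message state, so a single instance can be reused for every message it encodes or decodes.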

The tricky part is that the function pointer type varies with the message type and with the field type. We need some way of unifying this all and being able to configure a list of fields to get/set per message.

MessageTemplate template

The answer is to create a MessageTemplate base class that is a C++ template with the message type as a parameter. Something like this:

template<class MessageType>
class MessageTemplate
{
public:
    virtual void toFields( 
        const cms::MapMessage& mapMessage,
        MessageType&           message ) const;

    virtual void fromFields(
        cms::MapMessage&       mapMessage,
        const MessageType&     message) const;

    ....
};

Now to perform these two operations, this class needs to know how to translate a MapMessage field into a message class field (a field adapter). These will vary by field type, but we need a base class for all of them so we can stuff them into a list and then iterate over them all to perform the conversion. The base FieldAdapter looks like this:

template<class MessageType>
class FieldAdapterBase
{
public:
    FieldAdapterBase(const char *fieldName,const bool optional) 
        : m_fieldName(fieldName),m_optional(optional)
    {
    }

    // Adapters are owned and destroyed through base-class pointers.
    virtual ~FieldAdapterBase() {}

    virtual void toField(
        const cms::MapMessage& mapMessage, 
        MessageType& message) const=0;

    virtual void fromField(
        cms::MapMessage&       mapMessage, 
        const MessageType&     message) const=0;

protected:

     const char *m_fieldName;
     const bool m_optional;
};


In addition to the field name, the FieldAdapterBase stores whether the field is optional. This allows for fields that don't always have to be present. The class provides a common interface for converting a single field regardless of field type.

The next step is to extend this to provide the function pointer bindings and to allow this to vary with field type. Hence we define this class:

template<class MessageType,typename FieldType>
class FieldAdapter : public FieldAdapterBase<MessageType>
{
public:
    typedef const FieldType& (MessageType::*GetterFunc)() const;
    typedef void (MessageType::*SetterFunc)(const FieldType&);

    FieldAdapter( 
        const char *fieldName,
        GetterFunc  getter,
        SetterFunc  setter,
        const bool  optional ) 
            :   FieldAdapterBase<MessageType>(fieldName,optional),
                m_getterFunc(getter),
                m_setterFunc(setter)
        {}
    
protected:

    GetterFunc  m_getterFunc;
    SetterFunc  m_setterFunc;
};

Here we define the function pointer types and we store function pointers to allow conversion. There is still no conversion going on yet. This class just provides a convenient common base for all the converters so we don't have to define the function pointer type over and over.

One problem with this class is that returning or passing const references doesn't make sense for POD types (integers etc). To make this easier we also define a POD version of the FieldAdapter:

template<class MessageType,typename FieldType>
class PODFieldAdapter : public FieldAdapterBase<MessageType>
{
public:
    typedef const FieldType (MessageType::*GetterFunc)() const;
    typedef void (MessageType::*SetterFunc)(const FieldType);

    PODFieldAdapter( 
        const char *fieldName,
        GetterFunc  getter,
        SetterFunc  setter,
        const bool  optional ) 
            :   FieldAdapterBase<MessageType>(fieldName,optional),
                m_getterFunc(getter),
                m_setterFunc(setter)
        {}
    
protected:

    GetterFunc  m_getterFunc;
    SetterFunc  m_setterFunc;
};

A FieldAdapter for encoding string types into the ActiveMQ message would then look like this:

template<class MessageType>
class StringFieldAdapter 
    : public FieldAdapter<MessageType,std::string>
{
public:
    StringFieldAdapter( 
        const char *fieldName,
        typename FieldAdapter<MessageType,std::string>::GetterFunc  getter,
        typename FieldAdapter<MessageType,std::string>::SetterFunc  setter,
        const bool  optional=false,
        const std::string& defaultValue = std::string() ) 
        : FieldAdapter<MessageType,std::string>(
              fieldName,getter,setter,optional),
          m_defaultValue(defaultValue)
    {
    }

    virtual void toField(
        const cms::MapMessage& mapMessage, 
        MessageType& message) const
    {
        if ( this->skipOptional(mapMessage) )
        {
            return;
        }

        try
        {
            std::string value =  
                  mapMessage.getString(this->m_fieldName);
            (message.*this->m_setterFunc)(value);
        }
        catch( const cms::CMSException& e )
        {
            this->handleMissingFieldException(e);
        }
    }

    virtual void fromField(
        cms::MapMessage& mapMessage, 
        const MessageType& message) const
    {
        std::string value((message.*this->m_getterFunc)());

        if ( !this->m_optional || (value != m_defaultValue))
        {
            mapMessage.setString(
                 this->m_fieldName,
                 value);
        }
    }

protected:

    std::string m_defaultValue;
};

As you can see, I skipped a few details such as exception handling. The exception handling function (like skipOptional) is actually part of the base class; all it does is translate the exception into something meaningful to my application.
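
The article does not show these two base-class helpers, so the following is only a guess at what they might look like. It assumes a stand-in FakeMapMessage with an itemExists() query (cms::MapMessage offers a method of that name, as far as I know), a hypothetical DecodeError exception type, and a cut-down, non-template FieldAdapterHelpers class in place of FieldAdapterBase.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>

// Stand-in for cms::MapMessage so the sketch compiles without ActiveMQ-CPP.
class FakeMapMessage
{
public:
    bool itemExists(const std::string& name) const
    {
        return m_items.find(name) != m_items.end();
    }

    void setString(const std::string& name, const std::string& value)
    {
        m_items[name] = value;
    }

private:
    std::map<std::string, std::string> m_items;
};

// Hypothetical application-level exception (an assumption, not from the article).
class DecodeError : public std::runtime_error
{
public:
    explicit DecodeError(const std::string& what) : std::runtime_error(what) {}
};

// Cut-down, non-template version of the adapter base holding only the helpers.
class FieldAdapterHelpers
{
public:
    FieldAdapterHelpers(const char* fieldName, bool optional)
        : m_fieldName(fieldName), m_optional(optional) {}

    // True when the field is optional and absent, so decoding can skip it.
    bool skipOptional(const FakeMapMessage& mapMessage) const
    {
        return m_optional && !mapMessage.itemExists(m_fieldName);
    }

    // Re-throws a low-level error as a DecodeError that names the field.
    void handleMissingFieldException(const std::exception& e) const
    {
        throw DecodeError(
            std::string("Field '") + m_fieldName + "': " + e.what());
    }

protected:
    const char* m_fieldName;
    const bool  m_optional;
};
```

In the real code the second helper would receive a cms::CMSException; std::exception is used here only to keep the sketch free of the CMS headers.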

A few other notes on this code:

  • Initially the type of the setter/getter function pointer parameters was just GetterFunc and SetterFunc. When I compiled the code with g++ I found that I had to use the full template name AND the typename keyword so the compiler knew I was talking about a type. The typedefs live in a base class that depends on the template parameter, and g++ (correctly) does not look inside such a base when it first parses the template, only at instantiation.
  • For the same reason, whenever the code refers to an inherited data member or function I have to use the 'this->' construct, which defers the lookup until instantiation.

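
Both points can be reproduced in a few lines. The sketch below uses made-up Base and Derived templates (they are not part of the encoder) to show where the qualified name with typename and the 'this->' prefix are needed.

```cpp
#include <cassert>

template<class T>
class Base
{
public:
    typedef T ValueType;

    explicit Base(const T& value) : m_value(value) {}

protected:
    T m_value;
};

template<class T>
class Derived : public Base<T>
{
public:
    // "ValueType" on its own is not found here: it lives in the dependent
    // base Base<T>, so it has to be qualified and marked with typename.
    explicit Derived(typename Base<T>::ValueType value) : Base<T>(value) {}

    typename Base<T>::ValueType doubled() const
    {
        // A plain "m_value" would not be looked up in Base<T> either;
        // "this->" makes the name dependent so lookup waits for instantiation.
        return this->m_value + this->m_value;
    }
};
```

Removing either the typename qualification or the 'this->' prefix should make g++ reject the template, while Visual Studio 2008 accepted such code.
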
So now we need to go back to the MessageTemplate<> class to see how to make these field adapters work. The application can sub-class a MessageTemplate and register a field adapter for each field. We need to provide a means of registering each type of field and for stuffing all the adapters in a list and processing them all. So now the MessageTemplate<> class looks like this:

template<class MessageType>
class MessageTemplate
{
public:
    virtual void toFields( 
        const cms::MapMessage& mapMessage,
        MessageType&           message ) const
    {

        typename FieldAdapters::const_iterator nextFieldAdapter;

        for(    nextFieldAdapter=m_fieldAdapters.begin();
                nextFieldAdapter!=m_fieldAdapters.end();
                nextFieldAdapter++ )
        {
            (*nextFieldAdapter)->toField(mapMessage,message);
        }
   }

    virtual void fromFields(
        cms::MapMessage&       mapMessage,
        const MessageType&     message) const

    {
        typename FieldAdapters::const_iterator nextFieldAdapter;
        for(    nextFieldAdapter=m_fieldAdapters.begin();
                nextFieldAdapter!=m_fieldAdapters.end();
                nextFieldAdapter++ )
        {
            (*nextFieldAdapter)->fromField(mapMessage,message);
        }
    }

protected:


    void registerStringField(
        const char     *fieldName,
        typename StringFieldAdapter<MessageType>::GetterFunc getter,
        typename StringFieldAdapter<MessageType>::SetterFunc setter,
        const bool     optional=false,
        const std::string& defaultValue=std::string() )
    {
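        // shared_ptr's raw-pointer constructor is explicit, so wrap the adapter.
        m_fieldAdapters.push_back( 
            boost::shared_ptr< FieldAdapterBase<MessageType> >(
                new StringFieldAdapter<MessageType>(
                    fieldName,
                    getter,
                    setter,
                    optional,
                    defaultValue)));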
    }

    ....


    typedef std::vector< boost::shared_ptr< FieldAdapterBase<MessageType> > > FieldAdapters;

    FieldAdapters m_fieldAdapters;
};

To apply this to our TestMessage class we would define a MessageTemplate class as follows:

class TestMessageTemplate : public MessageTemplate<TestMessage>
{
public:
    TestMessageTemplate()
    {
       registerStringField(
           "Name",
            &TestMessage::getName,
            &TestMessage::setName );
    }
};

Then to use this we do something like this:

TestMessageTemplate testMessageTemplate;

TestMessage testMessage;
// Init the testMessage contents

cms::MapMessage *mapMessage = session->createMapMessage();

testMessageTemplate.fromFields(*mapMessage,testMessage);
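
For anyone who wants to try the mechanism without an ActiveMQ broker, here is a condensed, self-contained sketch of the pieces so far. It is not the original code: FakeMapMessage is a hypothetical stand-in for cms::MapMessage, std::shared_ptr replaces boost::shared_ptr, the FieldAdapter layer is folded into StringFieldAdapter, and optional/default handling is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>
#include <vector>

// Stand-in for cms::MapMessage (string fields only).
class FakeMapMessage
{
public:
    std::string getString(const std::string& name) const
    {
        std::map<std::string, std::string>::const_iterator it = m_items.find(name);
        if (it == m_items.end())
            throw std::runtime_error("missing field: " + name);
        return it->second;
    }
    void setString(const std::string& name, const std::string& value)
    { m_items[name] = value; }
private:
    std::map<std::string, std::string> m_items;
};

template<class MessageType>
class FieldAdapterBase
{
public:
    virtual ~FieldAdapterBase() {}
    virtual void toField(const FakeMapMessage& map, MessageType& message) const = 0;
    virtual void fromField(FakeMapMessage& map, const MessageType& message) const = 0;
};

// Condensed string adapter: stores the two member function pointers.
template<class MessageType>
class StringFieldAdapter : public FieldAdapterBase<MessageType>
{
public:
    typedef const std::string& (MessageType::*GetterFunc)() const;
    typedef void (MessageType::*SetterFunc)(const std::string&);
    StringFieldAdapter(const char* fieldName, GetterFunc getter, SetterFunc setter)
        : m_fieldName(fieldName), m_getter(getter), m_setter(setter) {}
    virtual void toField(const FakeMapMessage& map, MessageType& message) const
    { (message.*m_setter)(map.getString(m_fieldName)); }
    virtual void fromField(FakeMapMessage& map, const MessageType& message) const
    { map.setString(m_fieldName, (message.*m_getter)()); }
private:
    const char* m_fieldName;
    GetterFunc  m_getter;
    SetterFunc  m_setter;
};

template<class MessageType>
class MessageTemplate
{
public:
    // Decode: map fields -> message object.
    void toFields(const FakeMapMessage& map, MessageType& message) const
    {
        for (std::size_t i = 0; i < m_fieldAdapters.size(); ++i)
            m_fieldAdapters[i]->toField(map, message);
    }
    // Encode: message object -> map fields.
    void fromFields(FakeMapMessage& map, const MessageType& message) const
    {
        for (std::size_t i = 0; i < m_fieldAdapters.size(); ++i)
            m_fieldAdapters[i]->fromField(map, message);
    }
protected:
    void registerStringField(
        const char* fieldName,
        typename StringFieldAdapter<MessageType>::GetterFunc getter,
        typename StringFieldAdapter<MessageType>::SetterFunc setter)
    {
        m_fieldAdapters.push_back(std::shared_ptr< FieldAdapterBase<MessageType> >(
            new StringFieldAdapter<MessageType>(fieldName, getter, setter)));
    }
private:
    std::vector< std::shared_ptr< FieldAdapterBase<MessageType> > > m_fieldAdapters;
};

class TestMessage
{
public:
    const std::string& getName() const { return m_name; }
    void setName(const std::string& name) { m_name = name; }
private:
    std::string m_name;
};

class TestMessageTemplate : public MessageTemplate<TestMessage>
{
public:
    TestMessageTemplate()
    { registerStringField("Name", &TestMessage::getName, &TestMessage::setName); }
};
```

With these definitions, calling fromFields() on a TestMessageTemplate and then toFields() into a fresh TestMessage should reproduce the Name value, which makes for a convenient round-trip unit test.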

Advanced Topics

Enumerated Types

A bunch of the fields were enumerated types so I created a converter to specifically deal with these. The code is common but the enum type will vary by field.

template<class MessageType,typename EnumType>
class EnumFieldAdapter : public PODFieldAdapter<MessageType,EnumType>
{
public:
    EnumFieldAdapter( 
        const char      *fieldName,
        typename PODFieldAdapter<MessageType,EnumType>::GetterFunc      getter,
        typename PODFieldAdapter<MessageType,EnumType>::SetterFunc      setter,
        const bool      optional,
        const EnumType  defaultValue ) 
        :   PODFieldAdapter<MessageType,EnumType>(
                fieldName,getter,setter,optional),
            m_defaultValue(defaultValue)
    {
    }

...

};

Note that this is derived from the PODFieldAdapter as enums wouldn't normally be passed by reference. Then in MessageTemplate we define a register function like this:

    template<typename EnumType>
    void registerEnumField(
        const char *fieldName,
        typename EnumFieldAdapter<MessageType,EnumType>::GetterFunc getter,
        typename EnumFieldAdapter<MessageType,EnumType>::SetterFunc setter,
        const bool optional=false,
        const EnumType defaultValue=(EnumType)0)
    {
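        // shared_ptr's raw-pointer constructor is explicit, so wrap the adapter.
        m_fieldAdapters.push_back( 
            boost::shared_ptr< FieldAdapterBase<MessageType> >(
                new EnumFieldAdapter<MessageType,EnumType>(
                   fieldName,
                   getter,
                   setter,
                   optional,
                   defaultValue)));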
    }

Then say we had methods like this in our TestMessage class:

enum EnumType
{
   ENUM_VALUE1,
   ENUM_VALUE2
};

const EnumType getEnumValue() const;
void setEnumValue(const EnumType value);

In our MessageTemplate class we could register this field as follows:

registerEnumField<TestMessage::EnumType>(
    "Enum_Field",
    &TestMessage::getEnumValue,
    &TestMessage::setEnumValue);
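
The conversion functions of the enum adapter are elided above, so the following is only one plausible way to write them, not the original implementation. It assumes the enum travels as an int field (the article does not say how enums are encoded), uses a hypothetical FakeMapMessage in place of cms::MapMessage, and flattens the class hierarchy into a single EnumFieldAdapterSketch class.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>

// Stand-in for cms::MapMessage (int fields only).
class FakeMapMessage
{
public:
    bool itemExists(const std::string& name) const { return m_ints.count(name) != 0; }
    int getInt(const std::string& name) const
    {
        std::map<std::string, int>::const_iterator it = m_ints.find(name);
        if (it == m_ints.end())
            throw std::runtime_error("missing field: " + name);
        return it->second;
    }
    void setInt(const std::string& name, int value) { m_ints[name] = value; }
private:
    std::map<std::string, int> m_ints;
};

// Hypothetical flattened adapter: by-value accessors, as in PODFieldAdapter.
template<class MessageType, typename EnumType>
class EnumFieldAdapterSketch
{
public:
    typedef const EnumType (MessageType::*GetterFunc)() const;
    typedef void (MessageType::*SetterFunc)(const EnumType);

    EnumFieldAdapterSketch(const char* fieldName, GetterFunc getter,
                           SetterFunc setter, bool optional, EnumType defaultValue)
        : m_fieldName(fieldName), m_getter(getter), m_setter(setter),
          m_optional(optional), m_defaultValue(defaultValue) {}

    // Decode: an absent optional field leaves the message untouched.
    void toField(const FakeMapMessage& map, MessageType& message) const
    {
        if (m_optional && !map.itemExists(m_fieldName))
            return;
        (message.*m_setter)(static_cast<EnumType>(map.getInt(m_fieldName)));
    }

    // Encode: an optional field holding its default value is not written.
    void fromField(FakeMapMessage& map, const MessageType& message) const
    {
        const EnumType value = (message.*m_getter)();
        if (!m_optional || value != m_defaultValue)
            map.setInt(m_fieldName, static_cast<int>(value));
    }

private:
    const char* m_fieldName;
    GetterFunc  m_getter;
    SetterFunc  m_setter;
    bool        m_optional;
    EnumType    m_defaultValue;
};

class TestMessage
{
public:
    enum EnumType { ENUM_VALUE1, ENUM_VALUE2 };
    TestMessage() : m_enumValue(ENUM_VALUE1) {}
    const EnumType getEnumValue() const { return m_enumValue; }
    void setEnumValue(const EnumType value) { m_enumValue = value; }
private:
    EnumType m_enumValue;
};
```

The static_cast on decode does no range checking; a real decoder would probably want to validate the integer before trusting it.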

Subclasses

In my case the message types had a lot of commonality and it made sense to define base message classes and sub-classes. For example all of the response messages had common fields for the status of the operation and for returning error codes if the operation failed.

Say we had a ResponseMessage class and another class, SpecialResponse, that derives from it. We want the SpecialResponse template to convert both the ResponseMessage fields and the SpecialResponse fields.

To make this work I created a class for sub-class templates as follows:

template<class ParentMessageType,class MessageType>
class MessageSubClassTemplate : public MessageTemplate<MessageType>
{
public:
    MessageSubClassTemplate( 
        const MessageTemplate<ParentMessageType>& parentTemplate ) 
            :   MessageTemplate<MessageType>(),
                m_parentTemplate(parentTemplate)
    {
    }

    void toFields(
        const cms::MapMessage& mapMessage, 
        MessageType& message) const
    {
        m_parentTemplate.toFields(mapMessage,message);
        MessageTemplate<MessageType>::toFields(mapMessage,message);
    }

    void fromFields(
        cms::MapMessage& mapMessage, 
        const MessageType& message) const
    {
        m_parentTemplate.fromFields(mapMessage,message);
        MessageTemplate<MessageType>::fromFields(
            mapMessage,
            message);
    }

protected:

    const MessageTemplate<ParentMessageType>& m_parentTemplate;
};

The basics of it are:

  • The MessageSubClassTemplate knows the MessageType has a base type and that base type has a MessageTemplate of its own.
  • MessageSubClassTemplate derives from MessageTemplate<MessageType> to implement the sub-class field conversion.
  • MessageSubClassTemplate takes as a constructor parameter a reference to the parent message class template.
  • When converting the fields, the MessageSubClassTemplate first converts the base class fields and then converts the sub-class fields (as normal).

To use this you create the base class template as before and then you create the sub-class template like this:

class SpecialResponseTemplate 
    : public MessageSubClassTemplate<ResponseMessage,SpecialResponse>
{
public:
    SpecialResponseTemplate(
        const MessageTemplate<ResponseMessage>& parentTemplate )
    : MessageSubClassTemplate<ResponseMessage,SpecialResponse>(
          parentTemplate)
     {
         // register the SpecialResponse fields
     }
};

To use the SpecialResponseTemplate you construct it passing in the parent template and then use it like normal.

ResponseMessageTemplate responseMessageTemplate;
SpecialResponseTemplate specialResponseTemplate(
                           responseMessageTemplate);

SpecialResponse message;
specialResponseTemplate.toFields(*mapMessage,message);
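
The chaining can be exercised in isolation with a condensed sketch like the one below. It makes several simplifying assumptions: FieldMap (a std::map) stands in for cms::MapMessage, only string fields are supported, the adapters are folded into a small struct, and the "Status" and "Detail" fields with their accessors are invented for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

typedef std::map<std::string, std::string> FieldMap;  // stand-in for cms::MapMessage

// Condensed MessageTemplate: string fields only, adapters folded into a struct.
template<class MessageType>
class MessageTemplate
{
public:
    typedef const std::string& (MessageType::*GetterFunc)() const;
    typedef void (MessageType::*SetterFunc)(const std::string&);
    virtual ~MessageTemplate() {}
    virtual void toFields(const FieldMap& map, MessageType& message) const
    {
        for (std::size_t i = 0; i < m_fields.size(); ++i)
        {
            FieldMap::const_iterator it = map.find(m_fields[i].name);
            if (it == map.end())
                throw std::runtime_error("missing field: " + m_fields[i].name);
            (message.*m_fields[i].setter)(it->second);
        }
    }
    virtual void fromFields(FieldMap& map, const MessageType& message) const
    {
        for (std::size_t i = 0; i < m_fields.size(); ++i)
            map[m_fields[i].name] = (message.*m_fields[i].getter)();
    }
protected:
    void registerStringField(const char* name, GetterFunc getter, SetterFunc setter)
    {
        Field field = { name, getter, setter };
        m_fields.push_back(field);
    }
private:
    struct Field { std::string name; GetterFunc getter; SetterFunc setter; };
    std::vector<Field> m_fields;
};

template<class ParentMessageType, class MessageType>
class MessageSubClassTemplate : public MessageTemplate<MessageType>
{
public:
    explicit MessageSubClassTemplate(const MessageTemplate<ParentMessageType>& parent)
        : m_parentTemplate(parent) {}
    virtual void toFields(const FieldMap& map, MessageType& message) const
    {
        m_parentTemplate.toFields(map, message);               // base-class fields first
        MessageTemplate<MessageType>::toFields(map, message);  // then our own
    }
    virtual void fromFields(FieldMap& map, const MessageType& message) const
    {
        m_parentTemplate.fromFields(map, message);
        MessageTemplate<MessageType>::fromFields(map, message);
    }
private:
    const MessageTemplate<ParentMessageType>& m_parentTemplate;
};

class ResponseMessage
{
public:
    const std::string& getStatus() const { return m_status; }
    void setStatus(const std::string& status) { m_status = status; }
private:
    std::string m_status;
};

class SpecialResponse : public ResponseMessage
{
public:
    const std::string& getDetail() const { return m_detail; }
    void setDetail(const std::string& detail) { m_detail = detail; }
private:
    std::string m_detail;
};

class ResponseMessageTemplate : public MessageTemplate<ResponseMessage>
{
public:
    ResponseMessageTemplate()
    { registerStringField("Status", &ResponseMessage::getStatus, &ResponseMessage::setStatus); }
};

class SpecialResponseTemplate
    : public MessageSubClassTemplate<ResponseMessage, SpecialResponse>
{
public:
    explicit SpecialResponseTemplate(const MessageTemplate<ResponseMessage>& parent)
        : MessageSubClassTemplate<ResponseMessage, SpecialResponse>(parent)
    { registerStringField("Detail", &SpecialResponse::getDetail, &SpecialResponse::setDetail); }
};
```

One design point to keep in mind: the sub-class template only holds a reference to the parent template, so the parent has to outlive it. Static or long-lived template objects, as described at the start of the article, satisfy that naturally.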

More

Other things worth doing might be to implement aggregate messages, where some fields live within a contained object.

At the moment you have to know what type of message to expect but it would be nice to define some factory mechanism where the system figures out what to instantiate for you and then runs the conversion. Sort of like a container for templates that does the encoding.

Anyway that's it for now but I'm sure this technique could be applied to all sorts of encoding and conversion problems.
