Strengths of the C++ Middleware Writer

The C++ Middleware Writer has advantages over the Boost Serialization library in the following areas:
  Usage
  Performance
  Support for Messages
  Boost Intrusive Support

Usage:

  1. We provide full generation of marshalling functions. Users of the Boost Serialization library must maintain serialize functions for their types by hand. Here's an example from the Boost Serialization library tutorial:
    class bus_stop
    {
        friend class boost::serialization::access;
        template<class Archive>
        void serialize(Archive & ar, const unsigned int version)
        {
            ar & latitude;
            ar & longitude;
        }
        gps_position latitude;
        gps_position longitude;
    protected:
        bus_stop(const gps_position & lat_, const gps_position & long_) :
        latitude(lat_), longitude(long_)
        {}
    public:
        bus_stop(){}
        virtual ~bus_stop(){}
    };
    
    
    Derived classes should include serializations of their base classes.
    
    class bus_stop_corner : public bus_stop
    {
        friend class boost::serialization::access;
        template<class Archive>
        void serialize(Archive & ar, const unsigned int version)
        {
            // serialize base class information
            ar & boost::serialization::base_object<bus_stop>(*this);
            ar & street1;
            ar & street2;
        }
        ...
    };
    

    With the C++ Middleware Writer, marshalling functions are written and maintained automatically. There's no need to make parallel changes to functions by hand when data members are added or removed from a type. Here's how that example from the Boost tutorial is written with our approach:

    class bus_stop
    {
        gps_position latitude;
        gps_position longitude;
    protected:
        bus_stop(const gps_position & lat_, const gps_position & long_) :
        latitude(lat_), longitude(long_)
        {}
    public:
        bus_stop(){}
        virtual ~bus_stop(){}
    
        bool Send(Buffer*, bool = false) const;     
        bool Receive(Buffer*);                   
    };
    
    
    class bus_stop_corner : public bus_stop
    {
        ...
    public:
        bool Send(Buffer*, bool = false) const;     
        bool Receive(Buffer*);                   
    };
    

    The Boost Serialization version is 12 lines longer than our version, even though these are small classes.

  2. There's no need for friend declarations. Did you notice the friend declarations in the Boost Serialization version? Our approach writes the marshalling functions based on the content of the types involved. So there's no need to add friend declarations.

  3. Online access to multiple versions of the software. Last but not least, our online approach allows us to efficiently push out new releases and fixes without people having to go through the download/build cycle "one more time."

Performance:

Testing notes

For testing purposes we always build the Boost Serialization library with variant=release and link=static. The tests are run three times in a row using a semicolon on Linux to separate the executions. The fastest time of the three is the only one used in the calculations.
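The "fastest of three runs" rule above can be sketched inside a single program. This is only an illustration under stated assumptions: the tests described here run the whole executable three times from the shell, whereas this sketch times a callable in-process. The names fastest_of and build_list are hypothetical helpers and are not part of Boost or Ebenezer. The sketch uses std::chrono, so it assumes a C++11 or later compiler.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <list>

// Hypothetical helper (not part of Boost or Ebenezer): calls `work`
// `runs` times and returns the fastest wall-clock time in seconds.
template <class F>
double fastest_of(int runs, F work)
{
    assert(runs > 0);
    double best = 0.0;
    for (int i = 0; i < runs; ++i) {
        const auto start = std::chrono::steady_clock::now();
        work();
        const std::chrono::duration<double> elapsed =
            std::chrono::steady_clock::now() - start;
        // Keep the first measurement, then only faster ones.
        if (i == 0 || elapsed.count() < best)
            best = elapsed.count();
    }
    return best;
}

// Stand-in workload: builds a list of ints, as the test programs do
// before serializing it. Returns the number of elements built.
std::size_t build_list(std::size_t count)
{
    std::list<int> values;
    for (std::size_t i = 0; i < count; ++i)
        values.push_back(static_cast<int>(i));
    return values.size();
}
```

A call such as fastest_of(3, [] { build_list(500000); }) mirrors the three-run rule. A "times slower" figure is then the fastest time of one program divided by the fastest time of the other.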

Save/Send tests built with gcc version 4.4.2 and Linux 2.6.31.5

This set of programs serializes/sends a list of ints.

  Boost version
  Ebenezer version
  Generated code included by Ebenezer version

With -O3 optimization and 500,000 elements as input to both programs, the Boost version (using Boost 1.41) is generally between 1.8 and 1.9 times slower than the Ebenezer version. The stripped Boost Serialization executable is more than four times larger than the stripped Ebenezer executable.

The following programs send a list of ints and a deque of ints.

  Boost version
  Ebenezer version

With -O3 and 500,000 elements, the Boost Serialization version is between 2.2 and 2.3 times slower than the Ebenezer version. The stripped Boost Serialization executable is more than four times larger than the stripped Ebenezer executable.

Save/Send tests built with MSVC++ 10 and Windows Vista

The Boost Serialization versions (using Boost 1.41) of both of the above tests are between 3.3 and 4.2 times slower than the corresponding Ebenezer version when -O2 optimization is used. These tests were run with five million and fifty million as input. The Boost Serialization executables are both more than two times larger than the Ebenezer versions.

  Boost/Windows version
  Ebenezer/Windows version

Load/Receive tests built with gcc version 4.4.2, -O3 optimization, Boost 1.41, and Linux 2.6.31.5

The following programs load/receive a list of ints.

  Boost version
  Ebenezer version
  Generated code included by Ebenezer version

With a command line argument of 500000 for both programs, the Boost Serialization version is between 1.4 and 1.5 times slower than the Ebenezer version. The stripped Boost Serialization executable is more than five times larger than the stripped Ebenezer executable.

The following programs load/receive a set of ints.

  Boost version
  Ebenezer version

With a command line argument of 500000 for both programs, the Boost Serialization version is generally between 2.7 and 2.9 times slower than the Ebenezer version. The stripped Boost Serialization executable is more than five times larger than the stripped Ebenezer executable. The Ebenezer version of this test uses a hinted insert function.
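For readers unfamiliar with hinted insertion, here is a minimal sketch of the technique using only the standard library. It is not the generated Ebenezer code; receive_sorted is a hypothetical name, and the vector stands in for values read from a buffer. The idea is that a std::set is iterated in ascending order when sent, so on the receiving side each new element belongs at the end, and passing end() as the hint lets the container skip the usual tree search.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Rebuilds a set from values that arrive in ascending order, which is the
// order a std::set produces when it is iterated for sending.
std::set<int> receive_sorted(const std::vector<int>& incoming)
{
    std::set<int> result;
    for (int value : incoming) {
        // Hint: the new element belongs just before end(). When the hint is
        // right, insertion is amortized constant time (C++11 and later)
        // instead of logarithmic.
        result.insert(result.end(), value);
    }
    return result;
}
```

If the values do not arrive in order, the result is still correct; the hint is simply ignored and the insert falls back to the normal logarithmic cost.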

Performance Summary

A few people have asked for some interpretation of the results. To our knowledge, no C++ compiler has a switch to output the code it generates from a template library like Boost Serialization. Without this transparency, it is difficult to know whether the relatively poor performance of the Boost library is due to the library or to the compilers.

We're interested in hearing how the approaches stack up when using other compilers or operating systems. Do we have any strong competition?

Support for Messages:

  1. The Middle language we use offers support for messages that is not available with the Boost Serialization library. For example, the following Middle code has two messages.

    msg_manager
      (unsigned char, vector<char*>, string, unsigned char, bool)  @msg_id_middleware_request
    
      (short, string)  @msg_id_result
    }
    
    Here are the first few lines the C++ Middleware Writer writes given that input:
    // Generated by the C++ Middleware Writer version 1.8
    
    #include <string>
    #include <string.h>
    #include <vector>
    #include <Buffer.hh>
    #include <MarshallingFunctions.hh>
    
    unsigned int const msg_id_middleware_request = 1201;
    unsigned int const msg_id_result = 1202;
    unsigned int const max_msg_id = 1203;
    

    Notice the msg_id... constants that are part of the output. Naming the messages with @msg_id... makes each message's purpose clear, and the constants are useful when developing applications.

    Additionally, when @msg_id... is used, the generated Send functions post the value associated with their message id before posting anything else.

  2. We extend the message support by having an option that automates the calculation of message lengths. The Boost Serialization library has no message length calculation functionality, since it has no message concept in the first place.
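The two points above (the message id is posted first, and message lengths can be calculated automatically) can be illustrated with a hand-written sketch. Everything here other than the msg_id_result constant is an assumption made for illustration: ByteBuffer, put_raw, put_string, get_raw and send_result are hypothetical names, not Ebenezer's Buffer class or its generated functions, and the wire layout (id, then body length, then body, in host byte order) is a guess rather than the documented format.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical stand-in for a send buffer; not Ebenezer's Buffer class.
struct ByteBuffer {
    std::vector<unsigned char> bytes;

    // Appends the raw bytes of a fixed-size value (host byte order).
    template <class T>
    void put_raw(const T& value)
    {
        const unsigned char* p = reinterpret_cast<const unsigned char*>(&value);
        bytes.insert(bytes.end(), p, p + sizeof(T));
    }

    // Appends a string as a 32-bit length followed by its characters.
    void put_string(const std::string& text)
    {
        put_raw(static_cast<std::uint32_t>(text.size()));
        bytes.insert(bytes.end(), text.begin(), text.end());
    }
};

// Reads a fixed-size value back from `offset` (used to inspect the buffer).
template <class T>
T get_raw(const ByteBuffer& buf, std::size_t offset)
{
    assert(offset + sizeof(T) <= buf.bytes.size());
    T value;
    std::memcpy(&value, buf.bytes.data() + offset, sizeof(T));
    return value;
}

unsigned int const msg_id_result = 1202;  // value from the generated output above

// Sketch of a Send for the (short, string) message.
// Assumed layout: message id, then body length, then the body.
void send_result(ByteBuffer& buf, short code, const std::string& text)
{
    buf.put_raw(msg_id_result);  // the id is posted before anything else

    // Length of everything that follows the length field itself.
    const std::uint32_t body_length = static_cast<std::uint32_t>(
        sizeof(code) + sizeof(std::uint32_t) + text.size());
    buf.put_raw(body_length);

    buf.put_raw(code);
    buf.put_string(text);
}
```

With a layout like this, a receiver can read the id to decide which message it is handling and read the length to know how many more bytes to wait for. Writing the length calculation by hand, as done here, is the kind of bookkeeping the option described above is meant to automate.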

Support for Boost Intrusive Containers:

Only the C++ Middleware Writer has support for Boost Intrusive containers. Neither Boost Intrusive nor Boost Serialization offers serialization support for these most useful containers.

Weaknesses of the C++ Middleware Writer

The primary weakness we have is that we don't support some C++ features. We have limited support for user-defined templates, and we don't yet support multiple inheritance, nested classes, and some other features of the language. We are working on shortening this list, and G-d willing we'll have support for multiple inheritance by March 2010.

