
Here be dragons

XMPP has a long history at AeroFS. It has been a key component of our peer-to-peer overlay network for the past 5 years. We’ve used it for peer discovery, multicast across LAN boundaries, and as a signaling channel to establish peer-to-peer connections through our relay service.

On the one hand, it got the job done. On the other hand, working with XMPP has often been unpleasant: it is very complex, rather verbose, and suffers from severe extensionitis.

ejabberd is the leading implementation and comes with excellent support for XMPP and its myriad extensions. It is stable but memory-hungry, with a resident set reaching about 500 kB per user, quickly dwarfing our other services as the number of users grows.

Its CPU usage is also considerably higher than we would expect for such a low-traffic service. To make matters worse, for a while it was also leaking memory like a sieve, and we had to rely on a cron job to restart it daily!

On the client side, the very thought of touching the XMPP code filled most of our engineers with dread. Hence, the idea of moving away from it altogether was a strong undertow: eclipsed by activity on the surface but latching onto anyone unfortunate enough to dive into the depths of the transport layer.

Suddenly a wild intern appears

Over the summer, one of our interns was working on a new feature in the transport layer. It quickly became apparent that his work would require passing somewhat sensitive information through XMPP. This was a significant departure from our use of XMPP until then and, crucially, it meant we could no longer get away with using an unauthenticated connection.

AeroFS devices use [client certificates](/security/spec/) to authenticate against most of our internal services. We were hoping to do the same for XMPP and, unsurprisingly, among the tangle of XEPs, one appeared to us as a beacon of hope.

We spent a few days of work updating to a much newer version of smack, even submitting a patch upstream to improve support for client certificate authentication, only to learn that ejabberd did not support the required extension. More precisely, the open source edition only supports certificate authentication for server-to-server connections. We considered switching to an alternate XMPP server but ultimately decided against it as the transition cost would have been significant.

Going postal

The frustration was strong and all our misgivings about XMPP surged to the surface. This happened just as our foray into Go was starting to bear fruit. The week was almost over, and I had no plans for the coming weekend.

The temptation of scratching two itches at the same time was overpowering. I set out to find or draft a protocol specification that would cover all our use cases. The overarching goal was simplicity and it had to be taken to a minimalist extreme to fit in a single weekend.

Protocols like MQTT and STOMP are interesting but not quite simple enough. The EventSource specification is closer to what I was aiming for and a coworker had experimented with it in the past with some success. Unfortunately, it does not allow requests and events to be interleaved on the same connection.

The final specification borrows heavily from popular text-based protocols like HTTP, SMTP, and STOMP but it is significantly smaller and simpler, as evidenced by its short ABNF grammar:

message     = ( request | response | event ) LF

request     = "LOGIN" SP id SP id [ SP payload ]
            | "CLOSE"
            | "PING"
            | "PONG"
            | forwardable

response    = code [ SP payload ]

event       = "000" SP id SP ( forwardable | "PING" | "PONG" )

forwardable = "SUBSCRIBE" SP id [ SP "PRESENCE" ]
            | "UNSUBSCRIBE" SP id
            | "UCAST" SP id SP payload
            | "MCAST" SP id SP payload
            | "BCAST" SP payload
            | compat

compat      = verb [ SP id ] [ SP payload ]

code        = 3DIGIT
verb        = 1*UPALPHA
id          = 1*ID
payload     = 1*PAYLOAD

ID          = UPALPHA | LOALPHA | DIGIT
            | "." | ":" | "@" | "/" | "_" | "-" | "+" | "=" | "~"
PAYLOAD     = <any 8-bit value, except US-ASCII LF>
UPALPHA     = <any US-ASCII uppercase letter "A".."Z">
LOALPHA     = <any US-ASCII lowercase letter "a".."z">
DIGIT       = <any US-ASCII digit "0".."9">
SP          = <US-ASCII SP, space (32)>
LF          = <US-ASCII LF, linefeed (10)>
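To give a feel for how little code the grammar above demands, here is a minimal Go sketch of a request parser. It is not the reference implementation: the `Request` type, the `rules` table, and the `ParseRequest` function are names made up for this illustration, and the sketch deliberately rejects the open-ended `compat` rule instead of trying to handle it.

```go
package main

import (
	"fmt"
	"strings"
)

// Request is a hypothetical in-memory form of one request line.
type Request struct {
	Verb    string
	IDs     []string // two ids for LOGIN, at most one for other verbs
	Payload string   // empty when absent
}

// rule describes what may follow a verb: how many ids, and whether a
// payload is forbidden ('-'), optional ('?'), or required ('!').
type rule struct {
	ids     int
	payload byte
}

var rules = map[string]rule{
	"LOGIN": {2, '?'}, "CLOSE": {0, '-'}, "PING": {0, '-'}, "PONG": {0, '-'},
	"SUBSCRIBE": {1, '?'}, "UNSUBSCRIBE": {1, '-'},
	"UCAST": {1, '!'}, "MCAST": {1, '!'}, "BCAST": {0, '!'},
}

// isID reports whether s matches the grammar's `id` rule (1*ID).
func isID(s string) bool {
	if s == "" {
		return false
	}
	for i := 0; i < len(s); i++ {
		c := s[i]
		alnum := (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') || (c >= '0' && c <= '9')
		if !alnum && strings.IndexByte(".:@/_-+=~", c) < 0 {
			return false
		}
	}
	return true
}

// ParseRequest parses a single request line, with or without its final LF.
// Verbs outside the table (the grammar's `compat` rule) are rejected here.
func ParseRequest(line string) (Request, error) {
	line = strings.TrimSuffix(line, "\n")
	if strings.Contains(line, "\n") {
		return Request{}, fmt.Errorf("embedded LF in message")
	}
	verb, rest, _ := strings.Cut(line, " ")
	r, ok := rules[verb]
	if !ok {
		return Request{}, fmt.Errorf("unsupported verb %q", verb)
	}
	req := Request{Verb: verb}
	for i := 0; i < r.ids; i++ {
		var id string
		id, rest, _ = strings.Cut(rest, " ")
		if !isID(id) {
			return Request{}, fmt.Errorf("%s: bad or missing id %q", verb, id)
		}
		req.IDs = append(req.IDs, id)
	}
	req.Payload = rest
	switch {
	case rest == "" && strings.HasSuffix(line, " "):
		return Request{}, fmt.Errorf("%s: dangling separator", verb)
	case r.payload == '-' && rest != "":
		return Request{}, fmt.Errorf("%s: unexpected trailing data", verb)
	case r.payload == '!' && rest == "":
		return Request{}, fmt.Errorf("%s: missing payload", verb)
	case verb == "SUBSCRIBE" && rest != "" && rest != "PRESENCE":
		return Request{}, fmt.Errorf("SUBSCRIBE: unknown option %q", rest)
	}
	return req, nil
}

func main() {
	// The ids and payloads below are made-up sample values.
	for _, line := range []string{"LOGIN alice scheme\n", "UCAST bob hello world\n", "UCAST bob\n"} {
		req, err := ParseRequest(line)
		fmt.Printf("%q -> %+v, err=%v\n", line, req, err)
	}
}
```

Because the payload is simply "everything after the last expected id", it may contain spaces and arbitrary bytes other than LF, which keeps the tokenizer down to a handful of `strings.Cut` calls.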

We christened it the “Stupid-Simple Messaging Protocol” or SSMP.

As its name suggests, SSMP is meant to be incredibly simple and deliberately forgoes some advanced features offered by other open messaging protocols, such as message acknowledgements and wildcard subscriptions. Our key design goals were:

  • Text-based, for easy debugging
  • Interleave requests/responses and server events on a single connection
  • Simple enough that a complete and efficient client or server can be written
    in pretty much any programming language within a few hours

The reference implementation was written in Go in a matter of hours and quickly integrated into our build system thanks to our gockerize tool.

An alternate implementation was written in Java, for use in the AeroFS desktop client, in roughly the same amount of time.

Results

Replacing all uses of XMPP in the desktop client took longer than writing the client and server code, as it required some careful refactoring. These changes were shipped in the appliance starting with version 1.1.

Unsurprisingly, this new server follows the trend we’ve noticed in our previous uses of Go:

  • smaller, more maintainable code
  • much reduced memory footprint
  • much reduced disk footprint
  • reduced CPU usage

Obviously, a fair share of these improvements stems not from language choice but from a drastically simpler design. However, implementing SSMP in both Go and Java makes a convincing case that Go is the more readable and maintainable of the two, which would still be extremely valuable even without the stark difference in resource usage.

This experiment was a resounding success. We eventually transitioned the rest of our internal Pub/Sub uses to this new protocol, and the consolidation further reduced the appliance’s resource requirements.

Going forward

Today we’re happy to release SSMP to the community:

We hope that they will prove useful to others and we look forward to hearing your feedback and suggestions. Don’t hesitate to reach out to us at oss@aerofs.com or on GitHub.
— Hugues & the AeroFS Team.

PS. We’re hiring!