At SecondMarket, we’re moving towards a development model where deployments are not only continuous but also incur no downtime. Users should not notice if we take a portion of our servers out for maintenance, even if they’re logged into the site and have an active session. We decided to tackle this problem by persisting Java sessions to external storage. This allows another Jetty to take over serving existing sessions if we decide to take down a Jetty for maintenance.
Evaluating Options: JDBC, Mongo or Memcache?
There are a number of options for persisting sessions in Jetty to an external engine. The oldest and best-known approach is to use a SQL database via the JDBC session manager. We already use PostgreSQL as our main relational database, but it’s a critical part of our infrastructure and we weren’t sure we wanted to put session data on the same system; session data definitely doesn’t have the same criticality. We were also concerned about the performance implications of a relational database in this use case.
Instead, we looked at two NoSQL options via the Jetty NoSQL session plugin available in Jetty 8: MongoDB and memcached. We ultimately settled on MongoDB, not only because it’s the reference implementation for NoSQL sessions but because we have operational experience with MongoDB. (We store some non-critical information in it, like news feeds about companies on the platform.)
Configuring the SessionManager and SessionIDManager
There are two managers to configure in Jetty: the session manager and the ID manager:
- The session ID manager ensures that session IDs are unique across all webapps hosted on a Jetty instance, and thus there should only be one session ID manager per Jetty instance.
- The session manager handles the session lifecycle (create/update/invalidate/expire) on behalf of a web application, so there is one session manager per web application instance.
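The division of labor between the two managers can be sketched with a toy model. This is only an illustration under simplifying assumptions: `IdManager` and `WebappSessionManager` are hypothetical stand-ins written for this post, not Jetty's classes, and the real managers do considerably more (scavenging, persistence, cross-context handling).

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.UUID;

/** Toy model of the two roles; illustrative classes only, not Jetty's API. */
class SessionRolesDemo {

    /** One per server: hands out IDs that are unique across every webapp. */
    static final class IdManager {
        private final Set<String> inUse = new HashSet<>();

        synchronized String newSessionId() {
            String id;
            do {
                id = UUID.randomUUID().toString();
            } while (!inUse.add(id)); // retry on the (unlikely) collision
            return id;
        }

        synchronized boolean isInUse(String id) {
            return inUse.contains(id);
        }

        synchronized void release(String id) {
            inUse.remove(id);
        }
    }

    /** One per webapp: owns the lifecycle of that webapp's sessions. */
    static final class WebappSessionManager {
        private final IdManager ids;
        private final Map<String, Map<String, Object>> sessions = new HashMap<>();

        WebappSessionManager(IdManager ids) {
            this.ids = ids;
        }

        String create() {
            String id = ids.newSessionId();
            sessions.put(id, new HashMap<>());
            return id;
        }

        void invalidate(String id) {
            if (sessions.remove(id) != null) {
                ids.release(id);
            }
        }

        boolean owns(String id) {
            return sessions.containsKey(id);
        }
    }

    public static void main(String[] args) {
        IdManager ids = new IdManager();                 // shared by the whole server
        WebappSessionManager shop = new WebappSessionManager(ids);
        WebappSessionManager admin = new WebappSessionManager(ids);

        String a = shop.create();
        String b = admin.create();
        System.out.println("distinct ids: " + !a.equals(b));
        System.out.println("admin owns shop's session: " + admin.owns(a));
    }
}
```

Both webapps share the single ID manager, but each keeps its own session table, which mirrors the one-per-server and one-per-webapp split described above.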
I decided to configure the session ID manager for each Jetty instance using a separate XML file external to Jetty, calling it jetty-mongo-sessions.xml. This way I could either include or not include it in Jetty’s start.ini as circumstances required. Here’s what I used:
(I’m not a Jetty wizard so I realize I probably could have done this with more <Set> clauses rather than <Call> clauses. Feel free to edit my Gist if you can improve the syntax.)
Naturally, this file is written by Chef. In my Jetty sessions recipe, I do a search on all nodes that are Mongo replicaset members, and dynamically build the ArrayList. I then configure the session manager for each webapp in its context file with something like this:
Odds and Ends
A couple of other odds and ends and we’re ready to go:
- The NoSQL extensions don’t come with the core Jetty distribution, so you have to download them from either Codehaus or Eclipse’s website (depending on which variant you use).
- The MongoDB Java driver has to appear somewhere in Jetty’s class path. I built an RPM for it that drops it in /usr/share/java, and then I just symlink it into Jetty’s lib directory.
- start.ini has to include the nosql extensions in OPTIONS= in addition to specifying the jetty-mongo-sessions.xml as another config file to read.
Firing It Up
If you start up Jetty now, it should connect to MongoDB and automatically create both the database and collection to hold the session data.
Pitfalls and Warnings
All this worked fine when we were running in a development environment, but apps started to break once deployed to a clustered Mongo environment. A couple of things we discovered:
- One of our apps is written in Lift and we are using some features of Lift that are incompatible with clustered session storage. For speed, the Lift developers have made these features work only with memory-based sessions, so we have had to turn off Mongo sessions for this app; it can’t be clustered.
- Reads from Mongo slaves are only eventually consistent, so if you write data to a replicaset’s master and then read from a slave, you may or may not get the data you just wrote. That’s because the write has to be recorded in the master’s oplog and then replicated across the wire before the slave can see it. So I’d strongly recommend not turning slaveOk on.
- More seriously, we discovered that developers were using in-memory sessions to store long-lived objects, rather than using a distributed object cache like Ehcache. Session storage is supposed to be a short-lived stack: one pushes on the things needed for the next page view, and that page view pops them off. When using in-memory sessions, direct manipulation of the HttpSession object (via setAttribute()) leads to correct behavior: session data is magically updated. But with NoSQL or JDBC sessions, the session isn’t persisted to the backing store immediately, only when it is dirty and the active request count has dropped to zero. In a distributed cluster without session affinity, this can cause inconsistency: node 1 writes data to the session and the user is sent on their way, only to hit node 2, which tries to read the session from the backing store. A race condition occurs because node 1 hasn’t written the session there yet. The long-term solution, of course, will be for us to implement the aforementioned distributed object cache. At the last minute we were forced to call setSavePeriod(-2) in our Jetty config to force session data to be persisted every time setAttribute() is called. (Thanks to my colleague Cal Watson for finding this. He was nominated for, and won, the monthly SecondMarket Peer Bonus award for his -2.)
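The save-timing race in the last point can be reproduced with a small simulation. This is a sketch under stated assumptions, not Jetty code: `BackingStore` and `NodeSession` are hypothetical classes, and the `saveOnSet` flag only mimics the effect the post attributes to setSavePeriod(-2) (persist on every setAttribute()) versus the default of saving a dirty session once the request completes.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of two nodes sharing one backing store; not Jetty code. */
class SavePolicyDemo {

    /** Stand-in for the MongoDB sessions collection. */
    static final class BackingStore {
        private final Map<String, Map<String, Object>> docs = new HashMap<>();

        void save(String id, Map<String, Object> attrs) {
            docs.put(id, new HashMap<>(attrs)); // store a copy, like a serialized document
        }

        Map<String, Object> load(String id) {
            Map<String, Object> doc = docs.get(id);
            return doc == null ? new HashMap<>() : new HashMap<>(doc);
        }
    }

    /** One node's in-memory view of a session, loaded from the store on creation. */
    static final class NodeSession {
        private final String id;
        private final BackingStore store;
        private final boolean saveOnSet;
        private final Map<String, Object> attrs;
        private boolean dirty;

        NodeSession(String id, BackingStore store, boolean saveOnSet) {
            this.id = id;
            this.store = store;
            this.saveOnSet = saveOnSet;
            this.attrs = store.load(id);
        }

        void setAttribute(String name, Object value) {
            attrs.put(name, value);
            dirty = true;
            if (saveOnSet) {
                flush(); // persist immediately on every write
            }
        }

        Object getAttribute(String name) {
            return attrs.get(name);
        }

        /** Called when the request finishes; deferred saves happen here. */
        void requestComplete() {
            if (dirty) {
                flush();
            }
        }

        private void flush() {
            store.save(id, attrs);
            dirty = false;
        }
    }

    /** Node 1 writes; node 2 loads the session before node 1's request finishes. */
    static Object readFromSecondNode(boolean saveOnSet) {
        BackingStore store = new BackingStore();
        NodeSession node1 = new NodeSession("abc", store, saveOnSet);
        node1.setAttribute("cart", "3 items");

        NodeSession node2 = new NodeSession("abc", store, saveOnSet); // loads from the store now
        Object seen = node2.getAttribute("cart");

        node1.requestComplete(); // too late for node 2 in the deferred case
        return seen;
    }

    public static void main(String[] args) {
        System.out.println("deferred save: " + readFromSecondNode(false));        // prints "deferred save: null"
        System.out.println("save on setAttribute: " + readFromSecondNode(true)); // prints "save on setAttribute: 3 items"
    }
}
```

Saving on every write closes this particular window at the cost of a store write per setAttribute() call, which is why it works as a stopgap rather than a replacement for a proper distributed cache.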
Using MongoDB as a backing store for session data is absolutely feasible; the NoSQL extensions provided by Jetty are high-quality. In retrospect, with the misuse of session data above, there was no other solution that would have worked for us, so we accidentally happened across the one that was ideal. Had we implemented JDBC (or even Memcache) sessions, we would have been in serious trouble; the sessions collection is about 5.6 GB. This sort of raw, random data turns out to be an excellent fit for Mongo.
Implementing disk-based sessions also moves us one step closer to zero-downtime continuous deploys, and we’re looking forward to cleaning up the rest of our architecture to make that a reality.