Saturday, 3 March 2012

Mule ESB - Resilient JMS

We use the Community Edition of Mule ESB for several tasks, including processing earthquake messages.  A critical function is to read earthquake information messages from SeisComP3.  These are output to a spool directory as SeisComPML, converted to a simple XML format, and put onto the messaging layer (in this case JMS provided by ActiveMQ).  The seiscomp-producer Mule project that does this is available on GitHub.

I've been working on converting the seiscomp-producer project from Mule 2 to Mule 3 and revisiting resilience in the ESB.  In this post we'll look at the ActiveMQ JMS connectors: getting the application to start while ActiveMQ is down, and to survive ActiveMQ restarts, without having to restart Mule.  Retries for failed connectors are available in the Enterprise Edition of Mule, or by using the common retry policies with the Community Edition.  As it turns out, with ActiveMQ we don't need either.

Testing with Mule 3

I'm testing with Mule 3.2.0 and ActiveMQ 5.5.1.  Mule 3 makes it very easy to hot deploy multiple applications into the Mule server.  To help with testing seiscomp-producer I've created a very simple Mule application (chatter) that uses a Quartz timer to send a message to the logs every 5 seconds.
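For reference, here's a minimal sketch of what such a chatter app might look like (the flow name, job name, and payload are illustrative):

  <mule xmlns="http://www.mulesoft.org/schema/mule/core"
        xmlns:quartz="http://www.mulesoft.org/schema/mule/quartz"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:schemaLocation="
          http://www.mulesoft.org/schema/mule/core http://www.mulesoft.org/schema/mule/core/current/mule.xsd
          http://www.mulesoft.org/schema/mule/quartz http://www.mulesoft.org/schema/mule/quartz/current/mule-quartz.xsd">

    <flow name="chatter">
      <!-- Generate an event every 5 seconds -->
      <quartz:inbound-endpoint jobName="chatterJob" repeatInterval="5000" startDelay="0">
        <quartz:event-generator-job>
          <quartz:payload>chatter</quartz:payload>
        </quartz:event-generator-job>
      </quartz:inbound-endpoint>
      <!-- Write the message to the Mule log -->
      <logger message="chatter is alive" level="INFO" />
    </flow>
  </mule>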

The seiscomp-producer uses two ActiveMQ JMS endpoints to send messages: one durable, for the quake messages that must be delivered, and one non-durable for its own heartbeat messages.
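The connector configuration looks something like the sketch below (connector names and the clientId are illustrative; jms is the http://www.mulesoft.org/schema/mule/jms namespace):

  <!-- Durable connector for the quake messages that must be delivered -->
  <jms:activemq-connector name="jmsDurable"
      brokerURL="${seiscomp.producer.amq.url}"
      specification="1.1"
      durable="true"
      clientId="seiscomp-producer" />

  <!-- Non-durable connector for the heartbeat messages -->
  <jms:activemq-connector name="jmsNonDurable"
      brokerURL="${seiscomp.producer.amq.url}"
      specification="1.1"
      durable="false" />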


In this case seiscomp.producer.amq.url=tcp://localhost:61616

With ActiveMQ running, when we start Mule the seiscomp-producer starts fine, connects to JMS, and begins sending its heartbeat messages to the soh JMS topic.


However, there are problems with this configuration:

  • If ActiveMQ is down when Mule starts, the seiscomp-producer is not started.
  • If ActiveMQ is restarted while Mule is running, the seiscomp-producer loses its connection to JMS and never reconnects.

The common retry policies address these issues by adding retries to Mule Community Edition connectors.  To get the same behaviour with Mule 3.2.0 and ActiveMQ 5.5.1, all we have to do is use ActiveMQ's failover transport in the connectors' broker URL:

... brokerURL="failover:(${seiscomp.producer.amq.url})" ...

Now if we start Mule with ActiveMQ down, Mule starts fine (the chatter app starts logging), and when ActiveMQ comes up the seiscomp-producer connects to JMS and starts sending messages.  Similarly, if we restart ActiveMQ while Mule is running, the seiscomp-producer reconnects to JMS.

Restarting ActiveMQ closes the socket cleanly, giving the client a chance to notice the stopped connection.  What if ActiveMQ just goes away (e.g., the network connection is physically lost without a clean shutdown)?  We can test this by dropping packets using iptables.  Thanks to Richard Guest (a GNS Science coworker on the GeoNet Project) for the suggestion and the iptables rule.  This time we run ActiveMQ on a remote server.  Once the connection is up we drop all packets coming from the remote host.  From the client's perspective ActiveMQ is gone, but there has been no clean shutdown to close the socket.  Here 'server' is the remote host that ActiveMQ is running on.

  iptables -I INPUT -s 'server' -j DROP

With packets being dropped, the client times out after the channel has been inactive for too long and attempts to reconnect using the failover URL.  This continues until we stop dropping packets, simulating the ActiveMQ server becoming available again, at which point the client reconnects.

  iptables -D INPUT -s 'server' -j DROP
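How quickly the client notices a dead connection is governed by ActiveMQ's inactivity monitor rather than by Mule.  If the default is too slow for your needs, it can be tuned on the transport URL nested inside failover (a sketch; the 10 second value is arbitrary and worth testing against your own broker):

  brokerURL="failover:(tcp://server:61616?wireFormat.maxInactivityDuration=10000)"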

Using the failover broker URL you can also define a list of brokers to try to connect to.  By default a broker is selected randomly from the list; append randomize=false to always try the first broker in the list:

 brokerURL="failover:(tcp://amq1.com:61616,tcp://amq2.com:61616)?randomize=false"

If the first broker fails the client connects to the second broker.  With ActiveMQ 5.5.1 the client does not then reconnect to the first broker when it becomes available again.  It looks like keeping the first broker as the preferred one should be possible in ActiveMQ 5.6 using priorityBackup.
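If you do move to 5.6, the combined options would look something like this (an untested sketch; note that & must be written as &amp; inside an XML attribute):

  brokerURL="failover:(tcp://amq1.com:61616,tcp://amq2.com:61616)?randomize=false&amp;priorityBackup=true"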

The final connectors look something like:
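Here is a sketch reusing the connector definitions from above, with the broker URL wrapped in failover (names remain illustrative):

  <jms:activemq-connector name="jmsDurable"
      brokerURL="failover:(${seiscomp.producer.amq.url})"
      specification="1.1"
      durable="true"
      clientId="seiscomp-producer" />

  <jms:activemq-connector name="jmsNonDurable"
      brokerURL="failover:(${seiscomp.producer.amq.url})"
      specification="1.1"
      durable="false" />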


Using ActiveMQ and Mule 3 we have easily achieved fault-tolerant messaging, with a range of options for controlling failover.

Finally, we're dealing with a very low message throughput.  We rely on message storage in Mule and ActiveMQ to retain important messages when the connection is lost.  If you try this approach with a high message throughput, do plenty of testing to ensure that messages are not lost.
