Lately I've been configuring a production server that basically consists of a mongrel cluster and a few other daemons we use (backgroundrb, for example). To keep both the mongrel instances and our daemons alive, we needed a monitoring tool that tells us when something goes wrong and restarts those processes if needed.
To that end, I investigated the following tools: Monit and God.
Monit
Monit (http://mmonit.com/monit/) is a great utility for managing and monitoring processes, files, directories and devices on a Unix system.
Installation
It's very straightforward:
- Download the latest version from http://mmonit.com/monit/download/
- tar zxvf monit-x.y.z.tar.gz
- cd monit-x.y.z
- ./configure
- make && sudo make install (sudo if necessary)
After installing it, we need to create a configuration file. Monit looks for this file in the following locations:
- ~/.monitrc
- /etc/monitrc
- ./monitrc
#.monitrc
set daemon 30
set logfile /home/myuser/monit/monit.log
set httpd port 9111
    allow remotehost
    allow admin:admin # Allow Basic Auth
check process mongrel_cluster_3010 with pidfile "/var/www/myapp/current/tmp/pids/mongrel.3010.pid"
    start program = "/usr/local/bin/ruby /usr/local/bin/mongrel_rails start -d -e production -a 127.0.0.1 -c /var/www/myapp/current --user myuser --group deploy -p 3010 -P tmp/pids/mongrel.3010.pid -l log/mongrel.3010.log"
    stop program = "/usr/local/bin/ruby /usr/local/bin/mongrel_rails stop -p 3010 -P /var/www/myapp/current/tmp/pids/mongrel.3010.pid"
    if failed port 3010 protocol http # check for response
        with timeout 10 seconds
        then restart
    group mongrel
check process mongrel_cluster_3011 with pidfile "/var/www/myapp/current/tmp/pids/mongrel.3011.pid"
    start program = "/usr/local/bin/ruby /usr/local/bin/mongrel_rails start -d -e production -a 127.0.0.1 -c /var/www/myapp/current --user myuser --group deploy -p 3011 -P tmp/pids/mongrel.3011.pid -l log/mongrel.3011.log"
    stop program = "/usr/local/bin/ruby /usr/local/bin/mongrel_rails stop -p 3011 -P /var/www/myapp/current/tmp/pids/mongrel.3011.pid"
    if failed port 3011 protocol http # check for response
        with timeout 10 seconds
        then restart
    group mongrel
You can find the complete documentation of the statements Monit allows at http://mmonit.com/monit/documentation/monit.html.
Here I just wanted to show a simple example that checks the availability of the mongrel instances (in this case I configured only two of them). As you may notice, I have to configure each instance separately, although both belong to the same group, so I can start or stop all of them at once.
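Since the two stanzas above differ only in the port number, one way to avoid maintaining them by hand would be to generate them with a small script. What follows is a minimal Ruby sketch of that idea, not a Monit feature: the method name monit_stanza and the constants APP_ROOT and MONGREL are names I made up for illustration, and the paths, user and group are simply copied from the example configuration.

```ruby
# Sketch: generate the per-port Monit stanzas from the example above.
# monit_stanza, APP_ROOT and MONGREL are illustrative names, not part of Monit.
APP_ROOT = "/var/www/myapp/current"
MONGREL  = "/usr/local/bin/ruby /usr/local/bin/mongrel_rails"

# Returns the "check process" stanza for one mongrel port as a string.
def monit_stanza(port, user: "myuser", group: "deploy")
  pid = "tmp/pids/mongrel.#{port}.pid"
  <<~STANZA
    check process mongrel_cluster_#{port} with pidfile "#{APP_ROOT}/#{pid}"
      start program = "#{MONGREL} start -d -e production -a 127.0.0.1 -c #{APP_ROOT} --user #{user} --group #{group} -p #{port} -P #{pid} -l log/mongrel.#{port}.log"
      stop program = "#{MONGREL} stop -p #{port} -P #{APP_ROOT}/#{pid}"
      if failed port #{port} protocol http
        with timeout 10 seconds
        then restart
      group mongrel
  STANZA
end

# Print one stanza per port; redirect the output into the monitrc as needed.
puts [3010, 3011].map { |port| monit_stanza(port) }.join("\n")
```

The global settings (set daemon, set logfile, set httpd) would still be written by hand; only the repeated part is generated.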
Another thing to notice is which events I want to monitor. In this case I've configured Monit to restart a given mongrel instance when it does not respond (the "if failed port ... then restart" fragment of each check takes care of this). But many other events, such as memory usage or CPU time, can be measured to alert the administrator (via email).
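To make the "no response" check more concrete, here is a rough Ruby sketch of what such a test amounts to: try to get an HTTP response from the port within a timeout. This is not Monit's code; http_responding? is a hypothetical helper, and treating any status below 400 as healthy is an assumption of this sketch.

```ruby
require "net/http"

# Sketch: does something answer HTTP on host:port within `timeout` seconds?
# http_responding? is a hypothetical helper, not part of Monit.
def http_responding?(host, port, timeout: 10)
  http = Net::HTTP.new(host, port, nil) # nil: never go through a proxy
  http.open_timeout = timeout           # limit for establishing the connection
  http.read_timeout = timeout           # limit for waiting on the response
  response = http.start { |conn| conn.head("/") }
  response.code.to_i < 400              # assumption: 4xx/5xx counts as failed
rescue StandardError
  # Connection refused, timeouts, malformed responses and similar failures
  false
end

# Example: report the state of the two mongrel ports from the configuration.
[3010, 3011].each do |port|
  state = http_responding?("127.0.0.1", port, timeout: 2) ? "up" : "down"
  puts "mongrel on port #{port}: #{state}"
end
```

A monitoring tool runs a check like this on every cycle (every 30 seconds in the configuration above) and triggers the restart action when it fails.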
Run it!
To start it, we just execute: monit. We can then see how the log starts to populate:
[UYT Jun 5 19:12:15] info : monit: generated unique Monit id 15fdb0bdb830b0a114a3831d995ec32e and stored to '/home/myuser/.monit.id'
[UYT Jun 5 19:12:15] info : Starting monit daemon with http interface at [*:9111]
[UYT Jun 5 19:12:15] info : Starting monit HTTP server at [*:9111]
[UYT Jun 5 19:12:15] info : monit HTTP server started
[UYT Jun 5 19:12:15] info : 'willy' Monit started
[UYT Jun 5 19:12:34] info : Shutting down monit HTTP server
[UYT Jun 5 19:12:35] info : monit HTTP server stopped
[UYT Jun 5 19:12:35] info : monit daemon with pid [21560] killed
[UYT Jun 5 19:12:35] info : 'willy' Monit stopped
[UYT Jun 5 19:13:42] info : Starting monit daemon with http interface at [*:9111]
[UYT Jun 5 19:13:42] info : Starting monit HTTP server at [*:9111]
One of the greatest things about Monit is that it offers a web interface (as you may have noticed in the configuration file), which looks pretty nice!
God
God is a newer tool written in Ruby and available as a RubyGem, which also makes it easier to install ([sudo] gem install god).
Usage
We need to create a configuration file, which is written entirely in Ruby. This is one of the best things about God, because it helps us reduce duplication among configurations, as you can see in the following example (which is intended to monitor the same mongrel cluster mentioned above):
#myapp.god
RAILS_ROOT = "/var/www/myapp/current"
%w{3010 3011}.each do |port|
  God.watch do |w|
    w.name = "mongrel_cluster_#{port}"
    w.group = 'mongrels'
    w.interval = 30.seconds
    w.start = "mongrel_rails start -c #{RAILS_ROOT} -p #{port} -P #{RAILS_ROOT}/tmp/pids/mongrel.#{port}.pid -d -e production"
    w.stop = "mongrel_rails stop -P #{RAILS_ROOT}/tmp/pids/mongrel.#{port}.pid -e production"
    w.restart = "mongrel_rails restart -P #{RAILS_ROOT}/tmp/pids/mongrel.#{port}.pid -e production"
    w.start_grace = 10.seconds
    w.restart_grace = 10.seconds
    w.pid_file = File.join(RAILS_ROOT, "tmp/pids/mongrel.#{port}.pid")
    # Delete a stale pid file before starting
    w.behavior(:clean_pid_file)
    # Start the process whenever it is found not running
    w.start_if do |start|
      start.condition(:process_running) do |c|
        c.interval = 5.seconds
        c.running = false
      end
    end
  end
end
As you can see, I can refactor the mongrel configuration into a single block that is instantiated once per port. Besides that, I've also reduced the amount of configuration by using the RAILS_ROOT constant in many settings (which makes it less error-prone, too).
Run It!
To start our God monitoring tool, we just execute: god -c myapp.god
After that, we can view the log with the following command: god log mongrels (to show the logs for our group of mongrel instances), which should display something like this:
I [2009-06-08 12:59:16] INFO: mongrel_cluster_3010 move 'unmonitored' to 'up'
I [2009-06-08 12:59:16] INFO: mongrel_cluster_3010 moved 'unmonitored' to 'up'
I [2009-06-08 12:59:16] INFO: mongrel_cluster_3010 [trigger] process is not running (ProcessRunning)
I [2009-06-08 12:59:16] INFO: mongrel_cluster_3010 move 'up' to 'start'
I [2009-06-08 12:59:16] INFO: mongrel_cluster_3010 before_start: deleted pid file (CleanPidFile)
I [2009-06-08 12:59:16] INFO: mongrel_cluster_3010 start: mongrel_rails start -c /var/www/myapp/current -p 3010 -P /var/www/myapp/current/tmp/pids/mongrel.3010.pid -d -e production
I [2009-06-08 12:59:27] INFO: mongrel_cluster_3010 moved 'up' to 'up'
I [2009-06-08 12:59:27] INFO: mongrel_cluster_3010 [ok] process is running (ProcessRunning)
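The "process is not running" and "process is running" lines come from the process_running condition, which relies on the pid file we configured. Conceptually, such a check can be as simple as reading the pid and probing it with signal 0. The following is a minimal sketch of that idea and not God's actual source; process_running? is a hypothetical helper name.

```ruby
# Sketch: decide whether the process named in a pid file is alive.
# process_running? is a hypothetical helper; this is not God's source code.
def process_running?(pid_file)
  pid = Integer(File.read(pid_file).strip)
  return false unless pid.positive? # 0 and negatives address process groups
  Process.kill(0, pid)              # signal 0 only tests for existence
  true
rescue Errno::ENOENT, ArgumentError, Errno::ESRCH
  false # no pid file, no valid number in it, or no such process
rescue Errno::EPERM
  true  # the process exists but belongs to another user
end

# Example with the pid files used in this post.
%w{3010 3011}.each do |port|
  pid_file = "/var/www/myapp/current/tmp/pids/mongrel.#{port}.pid"
  puts "mongrel #{port} running? #{process_running?(pid_file)}"
end
```

A pid file left behind by a crashed process can point at a pid that no longer exists, or that has been reused by something else, which is presumably why deleting it before a start (the clean_pid_file behavior in the configuration) is useful.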
Conclusion - My Choice
Either of these tools gets the job done pretty well. But I decided to use Monit, because it is easier and faster for me to manage the processes through a web interface than by logging into the server via ssh.
That said, God seems to be a great tool, as it provides a way to reduce duplication (because its configuration is defined with Ruby code).