# DRb as a server for long-running web processes: creating a website in Ruby that is responsive and progressive.

posted 2012-Feb-2
— updated 2012-Feb-3

Let’s say that you have a Sinatra web application that lets users kick off a computationally intensive command. Perhaps it’s a build system. Perhaps it’s a simulated-annealing solver for sports scheduling. If you do something like this:

get '/start' do
  @result = do_long_running_thing()
  haml :results
end


…then your users will visit the URL and wait minutes before anything happens. Very likely their web browser will time out and close the connection. Even worse, other visitors to your website will be stalled, waiting for the single-threaded Ruby process that is your website to finish what it was working on. This is clearly not acceptable.

You could sort of fix the problem of locking out other users by running many website processes behind a reverse proxy, but that’s not a scalable solution: are you really going to have one Ruby process for each possible concurrent visitor? Further, it still doesn’t provide a good experience for the user of the page.

You could kick off a Thread in the web server process, but this feels dirty and fragile to me. Instead, here is how I solve this problem:

1. I want the visitor kicking off the long-running command to immediately see a page letting them know that it’s running. This provides good feedback, and frees up the server to handle other requests.

2. I run the long-running command in a completely different process, an entirely separate Ruby program. I run each new command in its own thread (on the DRb server) so that the DRb server itself remains responsive.

3. I provide a way for the Sinatra web application to poll the other process and get status updates on the command. The web page makes periodic AJAX requests to the server, the server asks the other process for an update and responds to the AJAX request with some JSON, and the web page updates the progress.
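
The three steps above can be sketched in a few lines before looking at the real files. This is only an illustration under stated assumptions: `TinyJob`, `TinyJobServer`, and `run_demo` are hypothetical names that do not appear in the application below, and both ends of the DRb connection live in one process purely so that the sketch runs standalone. In that situation DRb may dispatch the calls directly instead of over TCP, so treat it as a demonstration of the call shape (start, poll, fetch results) rather than of the network hop.

```ruby
require 'drb'
require 'json'

# Hypothetical stand-in for the long-running command: counts up to `steps`.
class TinyJob
  def initialize(steps)
    @steps  = steps
    @done   = 0
    @result = nil
    @lock   = Mutex.new
  end

  # Step 2: do the work on a background thread so the caller returns at once.
  def start
    Thread.new do
      @steps.times do
        sleep 0.01
        @lock.synchronize { @done += 1 }
      end
      @lock.synchronize { @result = "finished #{@steps} steps" }
    end
    self
  end

  def status
    @lock.synchronize { { percent_done: @done.fdiv(@steps), finished: !@result.nil? } }
  end

  def result
    @lock.synchronize { @result }
  end
end

# Hypothetical front object exposed over DRb; callers only ever see job ids.
class TinyJobServer
  def initialize
    @jobs = {}
    @lock = Mutex.new
  end

  def start_job(steps)
    @lock.synchronize do
      id = @jobs.length
      @jobs[id] = TinyJob.new(steps).start
      id
    end
  end

  def status_for(id)
    job = @lock.synchronize { @jobs[id] }
    job && job.status
  end

  def result_for(id)
    job = @lock.synchronize { @jobs[id] }
    job && job.result
  end
end

# Start a server on an OS-chosen port, kick off a job, and poll until done.
def run_demo(steps = 5)
  DRb.start_service('druby://127.0.0.1:0', TinyJobServer.new)
  remote = DRbObject.new_with_uri(DRb.uri)
  id     = remote.start_job(steps)   # step 1: returns immediately
  polls  = []
  loop do                            # step 3: poll for status
    status = remote.status_for(id)
    polls << status.to_json          # what an AJAX response body would carry
    break if status[:finished]
    sleep 0.02
  end
  { id: id, polls: polls, result: remote.result_for(id) }
ensure
  DRb.stop_service
end

puts run_demo(5)[:result]   # prints "finished 5 steps"
```

Note that the job objects themselves never cross the connection; only integer ids and plain hashes do, which keeps everything that is returned trivially serializable to JSON.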

webserver.rb

require 'sinatra'
require 'haml'
require 'drb'
require 'json'

DRBSERVER = 'druby://localhost:9001'
MCP = DRbObject.new_with_uri(DRBSERVER)

class MyServer < Sinatra::Application
  set :haml, :format => :html5

  get '/' do
    @title = "Welcome to the MCP"
    @finished,@running = MCP.processes.partition{ |o| o[:results] }
    haml :home
  end

  get '/start' do
    @process_id = MCP.start_new_long_running_thingy
    @title = "Process ##{@process_id} Running"
    haml :start
  end

  get '/status' do
    content_type :json
    MCP.status_for(params[:process_id].to_i).to_json
  end

  get '/results/:process_id' do
    @title = "Results for Process ##{params[:process_id]}"
    @results = MCP.results_for( params[:process_id].to_i )
    haml :results
  end
end


mcp.rb

require 'drb'

DRBSERVER = 'druby://localhost:9001'

module MasterControlProgram
  @scheduler     = Mutex.new
  @process_by_id = {}

  def self.start_new_long_running_thingy
    # DRb handles each call on its own thread, so serialize id assignment
    @scheduler.synchronize do
      @process_by_id.length.tap do |process_id|
        process = Process.new
        @process_by_id[process_id] = process
        process.go
      end
    end
  end

  def self.status_for( process_id )
    if process = @process_by_id[process_id]
      process.status
    end
  end

  def self.results_for( process_id )
    if process = @process_by_id[process_id]
      process.results
    end
  end

  def self.processes
    @process_by_id.map do |id,process|
      if r = process.results
        { id: id, results: r }
      else
        { id: id, status: process.status }
      end
    end
  end
end

class MasterControlProgram::Process
  # nil until the worker thread has finished
  attr_reader :results

  def initialize
    @percent_done  = 0.0
    @status        = :starting
    @results       = nil
    @data_accessor = Mutex.new
    @start         = Time.now
  end

  def status
    # Ensure that nobody is changing the status while we read it
    @data_accessor.synchronize do
      { percent_done: @percent_done, status: @status }
    end
  end

  def go
    # silly simulation of process
    # each step sleeps for up to a second and adds up to 3% progress
    states = %w[ globbing_dirs aggregating_data undermixing_signals
                 damping_transients detecting_resonances
                 emptying_buffers computing_final_result ].map(&:to_sym)
    # run in a background thread so that the DRb server stays responsive
    Thread.new do
      until @percent_done >= 1.0
        sleep rand * 1
        # Ensure that nobody is reading the status while we change it
        @data_accessor.synchronize do
          @status = states[ (states.length * @percent_done).floor ]
          @percent_done += rand * 0.03
        end
      end
      @data_accessor.synchronize do
        @percent_done = 1.0
        @status = :complete
      end
      @results = {
        signal_strength: [:excellent,:moderate,:poor].sample,
        score: rand * 100
      }
    end
  end
end

DRb.start_service( DRBSERVER, MasterControlProgram )
# start_service returns immediately; keep this process alive to serve requests
DRb.thread.join


views/home.haml

- one = @running.length == 1
%p There #{one ? :is : :are} #{@running.length} process#{:es unless one} running right now.

%p Want to <a href="/start">start a new process</a>?

- unless @finished.empty?
  %table
    %caption Finished Processes
    %tr
      %th ID
      %th Signal Strength
      %th Final Score
    %tbody
      - @finished.each do |data|
        %tr
          %th <a href="/results/#{data[:id]}">#{data[:id]}</a>
          %td= data[:results][:signal_strength]
          %td= "%0.3f" % data[:results][:score]

%p TODO: We should list all the running processes, and their statuses. We should have JavaScript polling for all the running processes and giving live status updates here.

:javascript


views/start.haml

%p
  %span#pct 0.0%
  done
%p
  Status:
  %span#status
%p <a href="/">Return Home</a>

:javascript
  var $pct    = $('#pct'),
      $status = $('#status');

  // poll the server every second
  setInterval(function(){
    $.getJSON('/status', {process_id:#{@process_id.to_json}}, function(data){
      $pct.html((data.percent_done * 100).toFixed(1) + "%");
      $status.html(data.status.replace(/_/g,' '));
      if (data.percent_done >= 1){
        location.href = '/results/#{@process_id}';
      }
    });
  }, 1000);


views/results.haml

%p Signal Strength: <b>#{@results[:signal_strength]}</b>
%p Final Scoring: <b>#{"%.3f" % @results[:score]}</b>
%p <a href="/">Return Home</a>


views/layout.haml

!!! 5
%html
  %meta(charset='utf-8')
  %title= @title
  :css
    table { border-collapse: collapse }
    caption { background:#eee; border-bottom:1px solid #ccc; font-weight:bold }
    th, td { padding:0.1em 0.5em; border-bottom:1px solid #ccc }
  %body
    %h1= @title
    #content= yield


config.ru

# This Rackup file lets you start the web server with `thin start`
# if you are using Thin (or another Rack-based web server)
require ::File.join( ::File.dirname(__FILE__), 'webserver' )
run MyServer.new


You can start the DRb server and web server with:

ruby mcp.rb &
ruby webserver.rb


You don’t have to start the MCP before you start the webserver. The line of code:

MCP = DRbObject.new_with_uri(DRBSERVER)


tells the web server how to connect to the DRb server when it needs to; it does not attempt to connect immediately. You can even restart the DRb server if necessary, and your web server will reconnect at the right time.
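
The lazy connection described above can be checked with a small experiment. This is a sketch under stated assumptions: `EmptyRegistry`, `unused_port`, `probe_remote`, and `lazy_connect_demo` are hypothetical helpers, not part of the application. It creates a proxy while nothing is listening, shows that the failure only surfaces on the first method call (as `DRb::DRbConnError`), then starts a server on the same URI and calls through the very same proxy. Because both ends share one process here, DRb may dispatch the second call directly rather than over TCP.

```ruby
require 'drb'
require 'socket'

# Hypothetical front object standing in for the article's MasterControlProgram.
class EmptyRegistry
  def processes
    []
  end
end

# Ask the OS for a currently unused TCP port, then release it so that
# nothing is listening there when the proxy is created.
def unused_port
  probe = TCPServer.new('127.0.0.1', 0)
  probe.addr[1]
ensure
  probe.close if probe
end

# Returns :connected or :unreachable instead of letting the error escape.
def probe_remote(proxy)
  proxy.processes
  :connected
rescue DRb::DRbConnError
  :unreachable
end

def lazy_connect_demo
  uri    = "druby://127.0.0.1:#{unused_port}"
  proxy  = DRbObject.new_with_uri(uri)   # nothing is listening, yet no error
  before = probe_remote(proxy)           # the failure surfaces on the first call
  DRb.start_service(uri, EmptyRegistry.new)
  after  = probe_remote(proxy)           # same proxy object, server now up
  [before, after]
ensure
  DRb.stop_service
end

p lazy_connect_demo   # prints [:unreachable, :connected]
```

In the web application, a rescue like the one in `probe_remote` would be a natural place to render a "backend unavailable" page instead of a stack trace.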

I used the short name MCP in the web server not only as a geeky reference to Tron, but also to illustrate that although this object acts just like the MasterControlProgram module running on the other end of the DRb connection, it is not the same object. I did not require the mcp.rb file in webserver.rb; the DRb connection sends messages over the wire, and the actual MasterControlProgram on the other end handles them and passes the response back over the wire.

I used a Thread to do the bulk of the work in the Process so that the DRb server remains responsive. I used a Mutex to ensure that when the MasterControlProgram is reading the status information, it is not being changed at the same time by the process. I did not wrap reads of the results in a Mutex because I assumed that the thread will have completed its work (and written the results) by the time anyone gets around to asking for them.
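
To see what the Mutex buys, here is a small standalone sketch; the `Progress` class and `torn_reads` helper are hypothetical and not part of mcp.rb. A writer thread updates two fields that must always agree, deliberately yielding in the middle of each update, while the main thread takes snapshots. Since both the update and the snapshot hold the same lock, a reader should never observe a half-finished update.

```ruby
# Hypothetical tracker with two fields that must always change together.
class Progress
  def initialize
    @lock  = Mutex.new
    @count = 0
    @label = 'step 0'
  end

  def advance
    @lock.synchronize do
      @count += 1
      Thread.pass                 # invite a context switch mid-update
      @label = "step #{@count}"
    end
  end

  # Read both fields under the same lock the writer uses.
  def snapshot
    @lock.synchronize { { count: @count, label: @label } }
  end
end

# Counts snapshots in which the two fields disagree.
def torn_reads(iterations = 2_000)
  progress = Progress.new
  writer   = Thread.new { iterations.times { progress.advance } }
  torn     = 0
  while writer.alive?
    snap  = progress.snapshot
    torn += 1 unless snap[:label] == "step #{snap[:count]}"
  end
  writer.join
  torn
end

puts torn_reads   # prints 0
```

Removing the `synchronize` call from `snapshot` makes mismatched pairs possible, which is the situation the `status` method in mcp.rb is guarding against.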
