DRb as a server for long-running web processes: creating a website in Ruby that is responsive and progressive.
Let’s say that you have a Sinatra web application allowing users to kick off a computationally-intensive command. Perhaps it’s a build system. Perhaps it’s a simulated-annealing solver for sports scheduling. If you do something like this:
get '/start' do
  @result = do_long_running_thing()
  haml :results
end
…then your users will visit the URL and wait minutes before anything happens. Very likely their web browser will time out and close the connection. Even worse, other visitors to your website will be stalled, waiting for the single-threaded Ruby process that is your website to finish what it was working on. This is clearly not acceptable.
You could sort of fix the problem of locking out other users by running many website processes behind a reverse proxy, but that’s not a scalable solution: are you really going to have one Ruby process for each possible concurrent visitor? Further, it still doesn’t provide a good experience for the user of the page.
You could kick off a Thread in the web server process, but this feels dirty and fragile to me. Instead, here is how I solve this problem:
- I want the visitor kicking off the long-running command to immediately see a page letting them know that it’s running. This provides good feedback, and frees up the server to handle other requests.
- I run the long-running command in a completely different process, an entirely separate Ruby program. I run each new command in its own thread (on the DRb server) so that the DRb server itself remains responsive.
- I provide a way for the Sinatra web application to poll the other process and get status updates on the command. The web page makes periodic AJAX requests to the server, the server asks the other process for an update and responds to the AJAX request with some JSON, and the web page updates the progress.
Without further ado, the code:
webserver.rb
require 'sinatra'
require 'haml'
require 'drb'
require 'json'

DRBSERVER = 'druby://localhost:9001'
MCP = DRbObject.new_with_uri(DRBSERVER)

class MyServer < Sinatra::Application
  set :haml, :format => :html5

  get '/' do
    @title = "Welcome to the MCP"
    @finished,@running = MCP.processes.partition{ |o| o[:results] }
    haml :home
  end

  get '/start' do
    @process_id = MCP.start_new_long_running_thingy
    @title = "Process ##{@process_id} Running"
    haml :start
  end

  get '/status' do
    content_type :json
    MCP.status_for(params[:process_id].to_i).to_json
  end

  get '/results/:process_id' do
    @title = "Results for Process ##{params[:process_id]}"
    @results = MCP.results_for( params[:process_id].to_i )
    haml :results
  end
end
mcp.rb
require 'drb'
require 'thread'

DRBSERVER = 'druby://localhost:9001'

module MasterControlProgram
  @scheduler = Mutex.new
  @process_by_id = {}

  def self.start_new_long_running_thingy
    @process_by_id.length.tap do |process_id|
      process = Process.new
      @process_by_id[process_id] = process
      process.go
    end
  end

  def self.status_for( process_id )
    if process = @process_by_id[process_id]
      process.status
    end
  end

  def self.results_for( process_id )
    if process = @process_by_id[process_id]
      process.results
    end
  end

  def self.processes
    @process_by_id.map do |id,process|
      if r = process.results
        { id: id, results: r }
      else
        { id: id, status: process.status }
      end
    end
  end
end

class MasterControlProgram::Process
  attr_reader :results

  def initialize
    @percent_done = 0.0
    @status = :starting
    @results = nil
    @data_accessor = Mutex.new
    @start = Time.now
  end

  def status
    # Ensure that nobody is changing the status while we read it
    @data_accessor.synchronize do
      { percent_done: @percent_done, status: @status }
    end
  end

  def go
    # silly simulation of process
    # takes roughly half a minute on average
    # (about 67 steps, each sleeping 0.5 seconds on average)
    states = %w[ globbing_dirs aggregating_data
                 undermixing_signals damping_transients
                 detecting_resonances emptying_buffers
                 computing_final_result ].map(&:to_sym)
    Thread.new do
      until @percent_done >= 1.0
        sleep rand * 1
        # Ensure that nobody is reading the status while we change it
        @data_accessor.synchronize do
          @status = states[ (states.length * @percent_done).floor ]
          @percent_done += rand * 0.03
        end
      end
      @data_accessor.synchronize do
        @percent_done = 1.0
        @status = :complete
      end
      @results = {
        signal_strength: [:excellent,:moderate,:poor].sample,
        score: rand * 100
      }
    end
  end
end

DRb.start_service( DRBSERVER, MasterControlProgram )
DRb.thread.join
views/home.haml
- one = @running.length == 1
%p There #{one ? :is : :are} #{@running.length} process#{:es unless one} running right now.
%p Want to <a href="/start">start a new process</a>?
- unless @finished.empty?
  %table
    %caption Finished Processes
    %thead
      %tr
        %th ID
        %th Signal Strength
        %th Final Score
    %tbody
      - @finished.each do |data|
        %tr
          %th <a href="/results/#{data[:id]}">#{data[:id]}</a>
          %td= data[:results][:signal_strength]
          %td= "%0.3f" % data[:results][:score]
%p#notice This page auto-refreshes every few seconds.
%p TODO: We should list all the running processes, and their statuses. We should have JavaScript polling for all the running processes and giving live status updates here.
:javascript
  setTimeout(function(){location.reload()},2500);
views/start.haml
%p
  %span#pct 0.0%
  done
%p
  Status:
  %span#status
%p <a href="/">Return Home</a>
:javascript
  var $pct = $('#pct'), $status = $('#status');
  // poll the server every second
  setInterval(function(){
    $.getJSON('/status',{process_id:#{@process_id.to_json}},function(data){
      $pct.html((data.percent_done * 100).toFixed(1)+"%");
      $status.html(data.status.replace(/_/g,' '));
      if (data.percent_done >= 1){
        location.href = '/results/#{@process_id}';
      }
    });
  },1000);
views/results.haml
%p Signal Strength: <b>#{@results[:signal_strength]}</b>
%p Final Scoring: <b>#{"%.3f" % @results[:score]}</b>
%p <a href="/">Return Home</a>
views/layout.haml
!!! 5
%html
  %head
    %meta(charset='utf-8')
    %title= @title
    %script(type='text/javascript' src='http://ajax.googleapis.com/ajax/libs/jquery/1.7.1/jquery.min.js')
    :css
      table { border-collapse: collapse }
      caption { background:#eee; border-bottom:1px solid #ccc; font-weight:bold }
      th, td { padding:0.1em 0.5em; border-bottom:1px solid #ccc }
  %body
    %h1= @title
    #content= yield
config.ru
# This Rackup file helps start the web server from `thin start`
# if you are using Thin (or another Rack-based web server)
require ::File.join( ::File.dirname(__FILE__), 'webserver' )
run MyServer.new
You can download the above files here.
You can start the DRb server and web server with:
ruby mcp.rb &
ruby webserver.rb
You don’t have to start the MCP before you start the webserver. The line of code:
MCP = DRbObject.new_with_uri(DRBSERVER)
tells the web server how to connect to the DRb server when it needs to; it does not attempt to connect immediately. You can even restart the DRb server if necessary, and your web server will reconnect at the right time.
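To see that lazy connection in isolation, here is a minimal sketch. The `Greeter` class and port 9123 are made up for illustration and are not part of the application above, and the server is started in the same process only to keep the example to one file.

```ruby
require 'drb'

# Hypothetical front object and port, invented for this sketch only.
class Greeter
  def hello
    "hello from the server"
  end
end

DEMO_URI = 'druby://localhost:9123'

# Building the proxy only records the URI; nothing is listening yet,
# and no connection is attempted.
proxy = DRbObject.new_with_uri(DEMO_URI)

# The first real method call is what tries to connect, so it fails here.
first_attempt =
  begin
    proxy.hello
  rescue DRb::DRbConnError
    :server_not_running
  end

# Start a server on that URI (in this same process, purely for brevity)
# and reuse the very same proxy object.
DRb.start_service(DEMO_URI, Greeter.new)
second_attempt = proxy.hello

puts first_attempt.inspect
puts second_attempt.inspect

DRb.stop_service
```

Because the server lives in the same process here, DRb can dispatch the second call locally rather than over a socket. With `mcp.rb` running as a separate program, the same call travels over the wire. In the web server you would probably want to rescue `DRb::DRbConnError` around calls to `MCP` so that a stopped DRb server produces a friendly error page.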
I used the short name MCP in the web server not only as a geeky reference to Tron, but also to illustrate the fact that although this object acts just like the MasterControlProgram module running on the other end of the DRb connection, it is not the same object. I did not include the mcp.rb file in webserver.rb; the DRb connection sends messages over the wire, and the actual MasterControlProgram on the other end handles them and passes the response back over the wire.
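The difference between the proxy and the real object can be shown with a small experiment. This is only a sketch: the `Tally` module and port 9124 are hypothetical stand-ins, and `fork` is used to get a second process from a single file (on Windows, two separate scripts would be needed instead).

```ruby
require 'drb'

DEMO_URI = 'druby://localhost:9124'   # hypothetical port for this sketch

# Hypothetical stand-in for MasterControlProgram: a module with some state.
module Tally
  @counts = { hits: 0 }

  def self.hit
    @counts[:hits] += 1
    @counts
  end

  def self.hits
    @counts[:hits]
  end
end

# Run the DRb server in a child process so it has its own copy of Tally.
server_pid = fork do
  DRb.start_service(DEMO_URI, Tally)
  DRb.thread.join
end

proxy = DRbObject.new_with_uri(DEMO_URI)

# Wait until the child is accepting connections, giving up after ~5 seconds.
attempts = 0
begin
  snapshot = proxy.hit
rescue DRb::DRbConnError
  attempts += 1
  raise if attempts > 50
  sleep 0.1
  retry
end

proxy_class = proxy.class    # the proxy is a DRb::DRbObject, not Tally
snapshot[:hits] = 99         # edits a marshaled copy, not the server's hash
remote_hits = proxy.hits     # state held by the server process
local_hits  = Tally.hits     # this process's own Tally was never called

puts proxy_class, remote_hits, local_hits

Process.kill('TERM', server_pid)
Process.wait(server_pid)
```

The hash returned by `hit` arrives as a copy, so changing it locally has no effect on the server. This is the same reason the status and results hashes in the web server are safe to pass straight to the views.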
I used a Thread to do the bulk of the work in the Process so that the DRb server remains responsive. I used a Mutex to ensure that when the MasterControlProgram is reading the information it is not being changed at the same time by the process. I did not wrap reads of the results in a Mutex because I assumed that the thread will have completed its work, and written the results, by the time anyone gets around to asking for them.
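If you would rather not rely on that assumption, one option is to publish the results inside the same critical section that marks the work complete, so that anyone who sees `:complete` also sees the results. Here is a minimal sketch of that variation. The `Job` class and its method names are hypothetical, and it is not the code from mcp.rb.

```ruby
# Hypothetical, stripped-down worker; names are illustrative only.
class Job
  def initialize
    @lock         = Mutex.new
    @percent_done = 0.0
    @status       = :starting
    @results      = nil
  end

  # Readers take the same lock as the writer, so they never observe a
  # half-updated set of values.
  def snapshot
    @lock.synchronize do
      { percent_done: @percent_done, status: @status, results: @results }
    end
  end

  # Do the work on a background thread and return that thread immediately.
  def go(steps)
    Thread.new do
      steps.times do |i|
        sleep 0.01                      # stand-in for a slice of real work
        @lock.synchronize do
          @percent_done = (i + 1).fdiv(steps + 1)
          @status       = :working
        end
      end
      answer = steps * 2                # stand-in for the real result
      # Publish the result and the :complete flag together.
      @lock.synchronize do
        @results      = answer
        @percent_done = 1.0
        @status       = :complete
      end
    end
  end
end

job    = Job.new
worker = job.go(5)
early  = job.snapshot   # returns right away, typically while work is underway
worker.join
final  = job.snapshot
puts final.inspect
```

The caller of `go` gets control back immediately, which is what keeps the DRb server free to answer status requests while the work continues in the background.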
Harold
06:32PM ET 2012-Feb-03
Wow, the syntax highlighting on this page looks awesome.
Gavin Kistner
02:58PM ET 2012-Feb-05
@Harold Why thank you! It’s inspired by (colors copied directly via screenshot) the Cobalt theme for TextMate by Jacob Rus.