November 14, 2018
The GitHub Readme describes Falcon as "... *a multi-process, multi-fiber rack-compatible HTTP server*."
The gist: Falcon aims to increase throughput by serving each request in its own Fiber, so that one request waiting on IO doesn't hold up the others.
Most of us are familiar with Threads. A Ruby process can have multiple Threads, which are coordinated and executed by the Ruby VM. A Fiber can be thought of as a more lightweight Thread, but the Ruby VM doesn't handle the Fiber scheduling - the program itself decides when a Fiber pauses and when it resumes.
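That hand-off of control can be sketched in a few lines of plain Ruby (nothing Falcon-specific here):

```ruby
# A Fiber only runs when resumed, and pauses wherever it calls
# Fiber.yield - nothing preempts it in between.
log = []

fiber = Fiber.new do
  log << "fiber: started"
  Fiber.yield              # hand control back to the caller
  log << "fiber: resumed"
end

fiber.resume               # runs the fiber up to the Fiber.yield
log << "caller: in control"
fiber.resume               # runs the rest of the fiber

# log is ["fiber: started", "caller: in control", "fiber: resumed"]
```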
Time for Falcon to spread its wings and show us what it's got! We'll test Falcon with a simple Rails 5 app on Ruby 2.5, running in production mode.
We need some way to simulate ActiveRecord queries, network IO, and calls to native extension C code - all typical things average Rails applications do.
For all endpoints, we accept a sleep_time parameter in the URL, designating how long to sleep.
We’ll use PostgreSQL’s pg_sleep function to simulate slow SQL queries:
class AverageController < ApplicationController
  def slow_sql
    ActiveRecord::Base.connection.execute("select * from pg_sleep(#{sleep_time})")
  end
end
We'll use Net::HTTP to simulate network IO against a remote server:
class AverageController < ApplicationController
  def remote_io
    uri = URI.parse("http://localhost:8080/#{sleep_time}")
    Net::HTTP.get_response(uri)
  end
end
The HTTP server listening on the remote end is written in Go; it sleeps for sleep_time milliseconds, then returns a 200 status with a minimal body stating how long it slept:
// sleepy_http.go
package main

import (
	"fmt"
	"log"
	"net/http"
	"strconv"
	"time"
)

// handler sleeps for the number of milliseconds given in the URL path.
func handler(w http.ResponseWriter, r *http.Request) {
	t, _ := strconv.Atoi(r.URL.Path[1:])
	time.Sleep(time.Duration(t) * time.Millisecond)
	fmt.Fprintf(w, "Slept for %d milliseconds", t)
}

func main() {
	fmt.Println("Listening on port 8080")
	http.HandleFunc("/", handler)
	log.Fatal(http.ListenAndServe(":8080", nil))
}
I wrote a small C native extension module that simply sleeps for the specified time in C before returning to Ruby. There's one method that sleeps while holding the GVL, and another that releases the GVL before sleeping.
class AverageController < ApplicationController
  def cworkwith_gvl
    CFoo::MyClass.do_work(sleep_time)
  end

  def cworkwithout_gvl
    CFoo::MyClass.do_work_without_gvl(sleep_time)
  end
end
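What releasing the GVL buys can be sketched in plain Ruby threads - the CFoo extension above is C, so this is only an illustration of the distinction the two endpoints probe, not the extension itself:

```ruby
# Ruby-level sketch of the GVL's effect. CPU-bound Ruby code holds the
# GVL, so threads running it make no parallel progress. Calls that
# release the GVL (sleep, most blocking IO, C code wrapped with
# rb_thread_call_without_gvl) overlap freely.
def elapsed
  t0 = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  yield
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - t0
end

# Four threads that sleep: sleep releases the GVL, so they overlap.
gvl_released = elapsed do
  4.times.map { Thread.new { sleep(0.2) } }.each(&:join)
end

# gvl_released is ~0.2s, not ~0.8s - the sleeps ran concurrently.
# Four threads spinning on CPU-bound Ruby would instead serialize.
```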
Falcon can be used in either forking or threaded mode. In forking mode, a single thread per forked worker is created. In both modes, many Fibers run inside each worker - one per in-flight request.
We'll use siege to make concurrent requests against our endpoints. The options we'll use: -c sets the number of concurrent users, and -r sets the number of repetitions each user makes.
First up, slow_sql with each SQL request taking 1 second:
$ siege -c 50 -r1 'http://localhost/slow_sql/1'
Transactions: 50 hits
Availability: 100.00 %
Elapsed time: 10.08 secs
Transaction rate: 4.96 trans/sec
Wait a second - if Falcon is able to serve requests while we're waiting for the SQL to return, we should be seeing about 1 second of elapsed time, not 10.
Ok, we’ll come back to the SQL test. What about network IO?
$ siege -c 50 -r1 'http://localhost/remote_io/1000'
Transactions: 50 hits
Availability: 100.00 %
Elapsed time: 11.09 secs
Transaction rate: 4.51 trans/sec
Same results as our SQL test. Again, we should be seeing about 1 second of elapsed time.
It turns out I forgot to mention a critical characteristic of Falcon's concurrency: it is cooperative. A Fiber only hands control back to the reactor when it performs IO through async-aware code; ordinary blocking calls never yield.
In short: in order for Falcon to achieve its concurrency, you need to use libraries that are made to be ‘async aware’ of Falcon’s async reactor.
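To make "async aware" concrete, here's a toy reactor in plain Ruby. ToyReactor is a made-up name - Falcon's real reactor lives in the async gem and is far more capable - but the key move is the same: a cooperative sleep parks the Fiber and yields, instead of blocking the thread:

```ruby
# ToyReactor: a hypothetical, minimal stand-in for an async reactor.
# Each "request" runs in a Fiber; reactor.sleep yields the fiber back
# to the scheduler, which runs other fibers in the meantime.
class ToyReactor
  def initialize
    @runnable = []  # fibers ready to resume
    @sleeping = {}  # fiber => monotonic wake-up time
  end

  def async(&block)
    @runnable << Fiber.new(&block)
  end

  # Cooperative sleep: yield control to the reactor until the deadline.
  def sleep(seconds)
    Fiber.yield(now + seconds)
  end

  def run
    until @runnable.empty? && @sleeping.empty?
      # Move fibers whose deadline has passed back onto the run queue.
      @sleeping.select { |_, at| at <= now }.each_key do |fiber|
        @sleeping.delete(fiber)
        @runnable << fiber
      end
      if (fiber = @runnable.shift)
        wake_at = fiber.resume
        @sleeping[fiber] = wake_at if fiber.alive?
      else
        Kernel.sleep(0.005) # nothing runnable yet; wait a tick
      end
    end
  end

  private

  def now
    Process.clock_gettime(Process::CLOCK_MONOTONIC)
  end
end

reactor = ToyReactor.new
t0 = Process.clock_gettime(Process::CLOCK_MONOTONIC)
50.times { reactor.async { reactor.sleep(0.2) } }
reactor.run
total = Process.clock_gettime(Process::CLOCK_MONOTONIC) - t0
# 50 cooperative 0.2s sleeps overlap: total is ~0.2s, not ~10s.
```

In Falcon, of course, the yield points are buried inside async-aware IO libraries rather than called explicitly.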
Fortunately, the author of Falcon has also created some async libraries for common things like Postgres and HTTP. Let’s use those to see how that improves concurrency!
All we need to do is add the async-aware Postgres gem to our Gemfile:
gem 'async-postgres'
And the results?
$ siege -c 50 -r1 'http://localhost/slow_sql/1'
Transactions: 50 hits
Availability: 100.00 %
Elapsed time: 1.07 secs
Transaction rate: 46.73 trans/sec
That’s more like it! All 50 requests were being served concurrently.
After adding async-http, our remote_io endpoint now looks like:
class AverageController < ApplicationController
  def remote_io
    endpoint = Async::HTTP::URLEndpoint.parse("http://localhost:8080")
    client = Async::HTTP::Client.new(endpoint)
    client.get("/#{sleep_time}")
  end
end
The results:
$ siege -c 50 -r1 'http://localhost/remote_io/1000'
Transactions: 50 hits
Availability: 100.00 %
Elapsed time: 1.08 secs
Transaction rate: 46.30 trans/sec
So if we just replace some libraries with async aware libraries, we should get at least the same, if not better, concurrency than with Puma using the same number of threads, right?
So far we’ve tested the endpoints that have async aware libraries that play nice with Falcon. What happens when we throw in an endpoint that does work that is not Falcon async-friendly?
For this test, we'll make 5 concurrent requests that each spend 10 seconds working in C (with the GVL released):
$ siege -c 5 -r1 'http://localhost/cworkwithout_gvl/10000'
Transactions: 5 hits
Availability: 100.00 %
Elapsed time: 10.02 secs
Transaction rate: 0.50 trans/sec
Ok, no surprise there. What about the async endpoints that should only take 1 second?
$ siege -c 50 -r1 'http://localhost/remote_io/1000'
Transactions: 50 hits
Availability: 100.00 %
Elapsed time: 10.35 secs
Transaction rate: 4.83 trans/sec
Uh oh. Our 5 requests that triggered non-async work ended up blocking all of our async endpoints for 10 seconds!
Puma is the default web server for Rails 5. Puma is a threaded web server: each worker process maintains a pool of threads, and each incoming request is handed to a thread.
One big difference of threads vs Fibers: the Ruby VM schedules threads preemptively, while Fibers are scheduled cooperatively - a Fiber runs until it explicitly yields control.
The biggest lesson here is that when a request is accepted by Falcon, it is immediately handled in a new Fiber. Because Fibers are cooperatively scheduled, any work that blocks without yielding to the reactor stalls every other request in that worker.
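That failure mode is easy to reproduce with bare Fibers (a sketch, not Falcon's actual code): one fiber that blocks instead of yielding holds up everything queued behind it:

```ruby
# A cooperative scheduler cannot preempt a fiber. Here a plain
# Kernel#sleep stands in for non-async work: it blocks the whole
# thread, so the fiber queued behind it simply waits.
log = []

blocking = Fiber.new do
  sleep(0.2)            # blocks; never yields to the scheduler
  log << "blocking request done"
end

quick = Fiber.new do
  log << "quick request done"
end

t0 = Process.clock_gettime(Process::CLOCK_MONOTONIC)
blocking.resume         # the "scheduler" is stuck here for 0.2s
quick.resume
waited = Process.clock_gettime(Process::CLOCK_MONOTONIC) - t0

# log is ["blocking request done", "quick request done"], and the
# quick request had to wait out the full 0.2s block.
```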
Want to read about how Ruby might improve Fiber-based concurrency at the language level? Some questions to ponder in the meantime:
How does Falcon limit the number of Fibers it serves at one time?
Would Puma with five threads and one worker also block in the ‘Well, Not Quite’ scenario?
If Puma was configured with enough threads to handle all concurrent connections in these same scenarios, would it perform better/worse/the same as Falcon?
Does Falcon’s async reactor remind you of something you’ve seen before?
How do Thread local variables behave in Fibers? Are they also Fiber local?
Do the chances of having deadlocks or race conditions increase when using Fibers vs Threads?