Photograph is a really simple gem that provides to take screenshots of webpages as they are rendered in webkit. Give Photograph the url you want and that’s all.
The interesting part is how easy this was to code !
{% end_excerpt %}
Throwing gems at it
It’s about using Capybara with Polteirgeist (which wraps PhantomJS) to take the screenshot itself, then use MiniMagick to crop if needed. Adding some Ruby around it makes the Ruby API :
{% highlight ruby %} artist = Artist.new :url => “http://jhchabran.com”, :wait => 2
artist.shoot! do |image| send_file image.path, :type => “png” end {% endhighlight %}
Quite easy isn’t it ? Cropping can be done through the optionals
parameters :x :y :w :h
.
Deciding when to take the screenshot is probably the only tricky part.
You can either specify a timer through :wait
or wait for some dive
to appear with :selector => ".page"
for example.
As a webservice
As photograph after all the layers ends by running webkit, it can be used to produce screenshots reflecting exactly the rendering got an a platform. The use case we had that led to coding photograph some months ago was requiring OSX rendering. As we had an iPad client rendering rich content fetched from the backend, we had to rely on screenshots when listing the different pages to avoid fully rendering them which would have been very costly, especially since listing don’t need any interaction at all.
Well, as we were obviously not hosting our webservice on an OSX machine but on Heroku, a thin Sinatra layer was added to make calls from Heroku to Photograph, which was hosted on a Mac.
GET
http://photograph.somewhere.com/shoot?url=http://jhchabran.com&selector=.page
And it answers with a png within 1 to 3 seconds. Wrap that into a Delayed or Resque Job and problem solved.
Need more photographs ? Spawn more Sinatra instances !
Good and bad parts
What is really interesting there is the fact the code is so simple that
there’s almost no room for bugs besides those that may be carried by the
libraries photograph’s relies on. Obviously, Sinatra and Capybara are
robusts, the only small issue we had was on Capybara-Webkit which had
its webkit-server
dying after being online for some hours. Switching
to PhantomJS thanks to Polteirgeist solved the problem.
But such simple code comes with its limitations. Having GET requests that takes more than 2 seconds can be irritating. Plus scaling can be achieved in a much better way than having one webkit instance running per Sinatra.
As our use case require tons of screenshots, we finally switched to Url2Png which worked great so far. As we were already working with SAAS everywhere it really made sense to add some cash there and let people focused on that problem solving it for us.
Nevertheless if taking a few screenshots is all you need, firing some photograph instance is probably the simplest way to achieve it.
Upcoming
So far, I’m really surprised to see how simple all of this was to write. Photograph had been successfully used in production on two apps. Experience shows that adding some features to scale it would improve the whole usability and decrease the amount of code required to use it.
I’m currently thinking of adding Resque to photograph, having one
phantomjs instance per worker thus making scaling easy as COUNT=5 QUEUE=* rake resque:workers
. The screenshot would be provided
afterward with a POST uploading it to the web app that needs it, with
the url specified in a callback
parameter.
There also might be some work to be done to detect window.onload
instead of the crappy wait
timer, to speed up the whole process.
Getting it
Source are available here
and as a gem gem install photograph
then photograph -h 192.168.0.1 -p 8080
for example.