Entries Tagged 'programming'

ActsAsRenderer Brings Output to Models

Judging by the fact that there are several posts about this topic out in the wild, and that I have come across a need for it more than once, I thought it would be helpful to wrap up this functionality into a plugin and put it out into the world. Give a warm welcome to ActsAsRenderer!

Before you go off on a tirade about the evils of violating MVC, let me first say I know the arguments and I agree with you. However, in a world of complex systems where not everything is done via full-stack HTTP, there are legitimate reasons to output data directly from models, and ActsAsRenderer helps you do it.

With ActsAsRenderer, you get four cool new functions.

For your model class, you get render_file and render_string. For your instances, you get render_to_file and render_to_string.

Probably the most common (and legitimate) use of this kind of functionality is for rendering data out of a Rails script (say with script/runner). Since that environment is not a full-stack HTTP view of the world, it’s a real pain to render any kind of structured output. Not anymore! With acts_as_renderer in your model, you can render your views and give your model the voice it’s been lacking!
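For example, a nightly cron entry could regenerate every server's config files using script/runner and the Server model shown below (the one-liner is just an illustration):

# from cron: script/runner boots the full Rails environment, no HTTP involved
script/runner 'Server.find(:all).each { |s| s.build_configuration }'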

I’ve had this need come up several times. Most recently, I built a server configuration management system using Rails. While it is nice to preview the rendered configuration files using Rails-over-HTTP, it is also essential to be able to write those same configuration files out to the filesystem. In another case, I had a background DRb process that needed to be able to render templated output to the filesystem. I had to go build a mock-controller and do some pretty unsavory things; all of that would have been obviated with acts_as_renderer.

Now, I can simply say:

class Server < ActiveRecord::Base
  acts_as_renderer

  def build_configuration
    CLIENT_CONFIG_FILES.each do |f|
      render_to_file("configs/#{f}", "#{config_dir}/#{f}.conf")
    end
  end
end

The render_to_file function renders the templates located in configs (under app/views by default) and writes them to the files specified in the config_dir; it’s also smart enough to know that render_to_file is being called from a ‘server’ instance and sets @server accordingly. So my templates in configs are simply:

; Configuration Snippet for Server <%=@server.description%>

<%= render :partial => 'configs/queue', :collection => @server.queues %>

Please do think before using this plugin. It can be used for some seriously evil violations of good MVC design practice, and you are responsible for your own actions. However, this can also be used to make your existing designs *much* more robust and elegant, and I encourage you to use it where that is true.

It’s ready to drop in. Everything is there, including tests. Enjoy!

NOTE: Version 1.0 only supported Rails 2.0; I just added version 1.01 which will work with either Rails 1.2.x or 2.0.x. Please feel free to ping me with any questions.

acts_as_renderer at RubyForge

Automatic Asset Minimization and Packaging with Rails 2.0.x

With the recent release of Rails 2.0, many of us are reviewing our approaches to common problems. Many new features have been added to Rails, and some old tricks are either no longer necessary or no longer work.

I am developing a project with Rails 2.0 and am getting close to putting it into production. A recurring issue for today’s web developers is that of asset packaging, or the combination of multiple site assets into a single file. Specifically, we’re talking about Javascript and CSS.

A given “Web 2.0” (a term I wish had recently been found dead in a cramped apartment in Brooklyn) site might have a half dozen Javascript or CSS files to deliver to a user, and web browsers are not all that efficient at retrieving them. Each one requires a separate request to the server, and most browsers will only download two files from the same host at a time. This means delays for your users.

In Rails 2.0 (and previously in Edge Rails), it’s possible to combine multiple Javascript and CSS files using the javascript_include_tag and stylesheet_link_tag functions in your html.erb files; simply add :cache => true to the parameters like this:

<%= javascript_include_tag 'prototype', 'effects', :cache => true %>
<%= stylesheet_link_tag 'main', 'shop', 'form', :cache => true %>

With :cache => true and when running in your production environment, Rails will automatically combine your Javascript and CSS assets into single files (all.js and all.css, respectively) and significantly reduce your site’s load time.

However, this really only solves part of the problem. A common technique used to further improve site performance is to compress Javascript and CSS by removing unnecessary whitespace and comments. I am not sure why this wasn’t included as part of Rails’ built-in caching features, but it seemed to me it should be easy to add.

Turns out I was mostly right. Google “javascript minimization” (or minification) and you’ll see it’s a pretty hot topic. The Asset Packager plugin from Scott Becker does this, as well as CSS compression, but is targeted at Rails 1.x and doesn’t really make sense in the face of Rails 2.0.

So I set out to solve this problem in an elegant way for Rails 2.0. Asset Packager uses a Ruby script called jsmin.rb by Uladzislau Latynski, which is based on jsmin.c by Douglas Crockford. The thing is, jsmin.rb is not a class or library, but rather a standalone executable that operates on stdin and stdout. Asset Packager actually forks a separate ruby process via the shell to do its Javascript minimization, which seemed like folly when it could be done inside Rails.

Accordingly, I modified jsmin.rb to operate as a singleton class and with a class method you could pass Javascript data to. Then it was simply a matter of monkey patching this function into ActionView::Helpers::AssetTagHelper, home of javascript_include_tag and stylesheet_link_tag.
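The only interface the patch below relies on is a single class method that takes Javascript source and returns minimized source. As a toy illustration of that shape (my stand-in here merely strips comments and collapses whitespace, which is not safe for all Javascript; the real jsminlib.rb ports Crockford's algorithm faithfully):

module JSMin
  # Toy stand-in: the real jsminlib.rb implements jsmin.c's full algorithm.
  def self.minimize(js)
    js = js.gsub(%r{/\*.*?\*/}m, "") # drop /* ... */ block comments
    js = js.gsub(%r{^\s*//.*$}, "")  # drop whole-line // comments
    js.gsub(/[ \t]+/, " ")           # collapse runs of spaces and tabs
  end
end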

I also wanted to add in CSS compression, which turned out to be easy. The javascript_include_tag and stylesheet_link_tag functions both use the same underlying functions to package their assets, so it was a simple case of replacing them with equivalents that do compression appropriately, based on whether we are dealing with CSS or JS.

config/initializers/javascript_minimization.rb:

module ActionView
  module Helpers
    module AssetTagHelper
      require 'jsminlib'

      def compress_css(source)
        source.gsub!(/\s+/, " ")           # collapse space
        source.gsub!(/\/\*(.*?)\*\/ /, "") # remove comments
        source.gsub!(/\} /, "}\n")         # add line breaks
        source.gsub!(/\n$/, "")            # remove last break
        source.gsub!(/ \{ /, " {")         # trim inside brackets
        source.gsub!(/; \}/, "}")          # trim inside brackets
        source                             # gsub! returns nil when nothing matches, so return explicitly
      end

      def get_file_contents(filename)
        contents = File.read(filename)
        if filename =~ /\.js$/
          JSMin.minimize(contents)
        elsif filename =~ /\.css$/
          compress_css(contents)
        else
          contents                         # pass anything else through untouched
        end
      end

      def join_asset_file_contents(paths)
        paths.collect { |path|
          get_file_contents(File.join(ASSETS_DIR, path.split("?").first))
        }.join("\n\n")
      end

    end
  end
end

By simply modifying join_asset_file_contents to use our new function get_file_contents instead of File.read, we quickly get to the heart of the matter. CSS files get compress_css run on them, while Javascript files get JSMin.minimize run on them. Your :cache => true Javascript and CSS assets will now be gloriously combined and compressed!

Note that the above monkey patch requires jsminlib.rb, which you can download here. It is just a modified version of the original jsmin.rb, and you will want to put it into your Rails lib directory.

A good next step would be to further enhance get_file_contents to do Javascript obfuscation, which allows for the replacement of variable names and thus even further compression; it also tends to make Javascript code nearly incomprehensible and thus harder to steal, which may be desirable for some developers. I haven’t found any native Ruby ways to do this yet, but it seems to me that this would be a good place for a C extension (or similar), and that this should all be put into a tiny and lightweight plugin.

I’m always amazed at how easy it is to bend Rails (and Ruby) to one’s will, and in this case it’s really quite elegant and straightforward. I’d love to hear your ideas about how to take this idea forward, potentially even including it in Rails itself.

Download the files here:

Hacking Freerice.com: A Program to Feed the World

While I was working on some changes to Twittervision yesterday, I saw someone mention freerice.com, a site where you can go quiz yourself on vocabulary words and help feed the world. How? Each word you get right gives 10 grains of rice to, one hopes, someone who needs it.

The idea is that you will sit there for hours and look at the advertising from the do-gooder multinationals who sponsor it. Which I did for a while. I got up to level 44 or so and got to feeling pretty good about Toshiba and Macy’s.

It occurred to me though that my computer could also play this game, and has a much better memory for words than I do. In fact, once it learns something, it always chooses the right answer.

So I wrote a program to play the freerice.com vocabulary game. In parallel. 50 browsers at a time. Sharing what they learn with each other. Cumulatively.

It’s a multithreaded Ruby program using WWW::Mechanize and Hpricot. Nothing terribly fancy, but it does learn from each right and wrong answer, and after just a few minutes seems to hit a stride of about 75-80% accuracy. And a rate of about 200,000 grains of rice per hour (depending on the speed of your connection).
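The learning part is nothing fancier than a shared answer table guarded by a mutex. Here is a stripped-down sketch of the idea (not the released script; fetch_question and submit_answer are hypothetical placeholders for the WWW::Mechanize/Hpricot scraping):

require 'thread'

answers = {}                            # word => choice known to be correct
wrong   = Hash.new { |h, k| h[k] = [] } # word => choices already ruled out
lock    = Mutex.new

threads = (1..50).map do
  Thread.new do
    loop do
      word, choices = fetch_question    # hypothetical: scrape word + choices
      guess = lock.synchronize { answers[word] || (choices - wrong[word]).first }
      if submit_answer(word, guess)     # hypothetical: true when we guessed right
        lock.synchronize { answers[word] = guess }
      else
        lock.synchronize { wrong[word] << guess }
      end
    end
  end
end
threads.each { |t| t.join }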

UPDATE: With some tuning, the script is now able to push out about 600,000 grains of rice per hour. At the cited figure of 20,000 grains per person per day, that works out to 600,000 × 24 ÷ 20,000 = 720 people fed per day. If one thousand people run this script, it will (allegedly) generate enough to feed 720,000 people per day.

Before you go off on me, disclaimer: Yes, I realize this program subverts the intent of the freerice.com site. I’ve released this not to “game” freerice.com but simply to show a flaw in their design and have a little fun at the same time. If what they are after is human interaction, this design doesn’t mandate it. That’s all I’m saying.

Run it for a while and see how many people you can feed!

Prerequisites:

  • Ruby (Linux, OS X, Other)
  • Rubygems
  • gem install mechanize --include-dependencies

Download the code

Concurrent Erlang: Watch out, API developers!

Continuing my theme of lifting ideas from Dave Thomas’ blog posts: in our last episode, we built a somewhat broken program to sequentially fetch feed information from YouTube’s XML API.

I had some trouble understanding why I wasn’t getting #xmlText records back from xmerl_xpath. Thanks to a comment by Ulf Wiger, I now understand what was going wrong.

As many of us do, I was using the shell to play around with ideas before committing them to my program, and in the shell the record format is not defined because xmerl.hrl is never referenced there. My program *was* including xmerl.hrl, but during my shell testing I wasn’t running the program at all, so the record definitions were missing.

I took his advice, used the force, and got my patterns to match #xmlText records.

I also copied Dave Thomas’ design pattern for parallel spawning of the fetch process to produce this program, which 1) grabs a feed of the most_viewed videos on YouTube, and then 2) grabs, in parallel, the user profile of each video’s author.

While I still only have a rudimentary understanding of the language, I at least understand everything that’s going on in this program. It’s amazing how quick concurrent programs in Erlang can be. The fetch_parallel function in this program runs in about 3 seconds, while the fetch_sequential version takes about 20 seconds.

If you think about what this means for API developers, it has scary implications. In short, they will need a lot more bandwidth and processing capacity to deal with concurrent clients than are presently needed to deal with a sequential universe. Most API developers are accustomed to interacting with programs that make a single request, do some processing, and then make another related request.

A world of Erlang-derived, concurrent API clients likely calls for Erlang-derived concurrent API servers. Today’s API interactions are timid, one-off requests compared to what’s possible in a concurrent API interaction.

Imagine a recursive program designed to spider through API data until it finds the results it’s looking for. You could easily write a program that grabs a set of N search results, which in turn generates N concurrent API queries, which in turn generates N^2 concurrent API requests, which in turn generates N^3 requests.

You get the idea. Rather than being simple request & response mechanisms, APIs in fact expose all of the data they contain — in parallel and all at once. A single concurrent Erlang client can easily create as much simultaneous load as 10,000 individual sequential API consumers do now.

API developers should start pondering questions like these. Right now, there are no standards for enforcing best practices on most APIs. There’s nothing to stop a developer from requesting the same data over and over again from an API, other than hard caps like the Google maps geolocation API limit of 50,000 requests per day. But what about caching and expiring data, refusing repetitive requests, enforcing bandwidth limits, or other strategies?

Many people do all of these things in different ways, but we’re at the tip of the iceberg in terms of addressing these kinds of issues. A poorly designed sequential API client is one thing; a badly designed concurrent API client is another thing altogether and could constitute a kind of DoS (denial of service) attack.
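As one small illustration of the server side (my own sketch, not tied to any particular framework), a per-client token bucket is enough to turn that kind of flood back into a trickle:

class TokenBucket
  def initialize(capacity, refill_per_second)
    @capacity, @refill = capacity, refill_per_second
    @tokens, @last_seen = capacity.to_f, Time.now
  end

  # True if the request may proceed; false means reject it
  # (say, with a 503 and a Retry-After header).
  def allow?
    now = Time.now
    @tokens = [@capacity, @tokens + (now - @last_seen) * @refill].min
    @last_seen = now
    return false if @tokens < 1
    @tokens -= 1
    true
  end
end

# One bucket per client: bursts of up to 100 requests, refilling 5 per second.
buckets = Hash.new { |h, client_ip| h[client_ip] = TokenBucket.new(100, 5) }
# buckets['1.2.3.4'].allow? => true until that client outruns the refill rate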

Start thinking now about how you’re going to deal with the guy who polls you every 10 seconds for the latest status of all 142,000 of his users — in parallel, 15,000 at a time.

And for you would-be API terrorists out there, here’s some code:

-module(youtube).
-export([fetch_sequential/0, fetch_parallel/0]).
-include_lib("xmerl/include/xmerl.hrl").

get_feed() ->
  {ok, {_Status, _Headers, Body}} = http:request("http://gdata.youtube.com/feeds/standardfeeds/most_viewed"),
  {Xml, _Rest} = xmerl_scan:string(Body),
  xmerl_xpath:string("//author/name/text()", Xml).

get_user_profile(User) ->
  #xmlText{value = Name} = User,
  URL = "http://gdata.youtube.com/feeds/users/" ++ Name,
  {ok, {_Status, _Headers, Body}} = http:request(URL),
  {Xml, _Rest} = xmerl_scan:string(Body),
  [#xmlText{value = Id}] = xmerl_xpath:string("//id/text()", Xml),
  [#xmlText{value = Published}] = xmerl_xpath:string("//published/text()", Xml),
  {Name, Id, Published}.

fetch_sequential() ->
  inets:start(),                 %% http:request needs inets running here too
  lists:map(fun get_user_profile/1, get_feed()).

fetch_parallel() ->
  inets:start(),
  Users = get_feed(),
  lists:foreach(fun background_fetch/1, Users),
  gather_results(Users).

%% Spawn a process that fetches one profile and mails it back to the parent.
background_fetch(User) ->
  ParentPID = self(),
  spawn(fun() ->
            ParentPID ! {ok, get_user_profile(User)}
        end).

%% Collect one message per spawned fetch; order follows message arrival.
gather_results(Users) ->
  lists:map(fun(_) ->
              receive
                {ok, Anything} -> Anything
              end
            end, Users).