Controlling database usage in RSpec and factory_girl, part 2: association strategy and unstubbed queries

An important part of testing a Rails application is establishing the correct relationship between the specs of each layer and the database. Some types of specs (model and feature specs) must run against the database to test their layer correctly; other types should not touch it. As well as ensuring correctness, policing database usage improves performance: tests that use the database are slower.

This post, part 2 in a two-part series, describes two measures that you can take to minimize incorrect and unnecessary database usage in RSpec specs that use factory_girl.

In part 1, “Choosing and allowing appropriate strategies”, we concluded that model specs should create test instances with factory_girl’s build and create methods; controller, helper and view specs should use only build_stubbed; feature specs should use only create; and specs of other layers don’t need test instances at all. But if you follow this guideline in all of an application’s specs, run your model specs, and look closely at your test log, you’ll see INSERTs even in specs that only build test instances. And if you run your controller, helper and view specs, your test log will probably still contain many SELECT queries. Where’s all this database activity coming from?

The first hole through which database usage is escaping is an unintuitive nuance of how factory_girl creates a model instance’s associated model instances. Suppose your Photo factory looks like this:

FactoryGirl.define do
  factory :photo do
    flickrid { "1#{rand 10000000000}" }
    views 0
    # more attributes ...
    association :user # we could have just said 'user', but wait for it ...
  end
end

When you build :photo you might expect the Photo’s User to be created using the build strategy as well. It isn’t: even though the Photo exists only in memory, the User is created in the database. In general, even if an instance is created with build, its associations are still created with create.

Fortunately, factory_girl allows you to configure the association to use a different strategy. Just add strategy: :build to the association,

FactoryGirl.define do
  factory :photo do
    flickrid { "1#{rand 10000000000}" }
    views 0
    # more attributes ...
    association :user, strategy: :build
  end
end

and when you build :photo the User will be built as well, not created.

But what happens when I want to create :photo, you ask? Although you wouldn’t expect it from the name of the :strategy option, factory_girl does the right thing. :build should have been named :current: whatever strategy you use to create the root object, the same strategy is used to create associated objects. With strategy: :build on the association, create and build_stubbed propagate from the root to its associations just as build does. So just configure all of your associations to use strategy: :build and you’ll eliminate some unnecessary database usage without compromising the correctness of your specs.

A side note: Helpfully, if surprisingly, the associations of an instance built with build_stubbed are built with build_stubbed as well, even without specifying strategy: :build_stubbed or strategy: :build on the association. It’s only instances built with build that drop the ball and use a different strategy for their associations. That’s why, if you were already using build_stubbed in your controller, helper and view specs, you didn’t see any INSERTs in test.log — build_stubbed was already on the job. But strategy: :build propagates build_stubbed to associations, so if you make all of your associations use strategy: :build you’ll reduce database usage in your model specs and your controller/helper/view specs will work just like they did before.
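The rules described so far can be summarized as a small lookup. This is only a toy model of the behavior this post describes, not factory_girl's own code; the method name association_strategy is hypothetical, and the default of :create for an association without a strategy: option is taken from the discussion above.

```ruby
# Toy summary of the rules described in this post; not factory_girl's code.
# root_strategy: the strategy used for the root object
#   (:build, :create or :build_stubbed).
# declared: the association's strategy: option (assumed to default to :create).
def association_strategy(root_strategy, declared = :create)
  # build_stubbed stubs associations whatever the association declares.
  return :build_stubbed if root_strategy == :build_stubbed
  # strategy: :build makes the association follow the root's strategy.
  return root_strategy if declared == :build
  # Otherwise the declared strategy (create by default) is used.
  declared
end

# Print the strategy each association ends up with, per root strategy.
%i(build create build_stubbed).each do |root|
  puts "#{root}: default -> #{association_strategy(root)}, " \
       "strategy: :build -> #{association_strategy(root, :build)}"
end
```

The only row that changes when you add strategy: :build is the one for build, which is exactly the case that was leaking INSERTs.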

Now for the SELECTs. If you use build_stubbed in your controller, helper and view specs, and you’re much more diligent than I, your test log might already be free of database activity when you run those specs. But if you’re like me, you might have written just enough test code to test what you wanted to test and looked no further. For example, suppose we’re testing a controller action that displays a list of Photos. Since this is a controller spec, we stub the model method that the controller uses to get the list:

describe PhotosController do
  render_views

  describe '#index' do
    it "displays interesting photos" do
      photo = build_stubbed(:photo)
      Photo.stub(:interesting) { [photo] } # the controller gets our stubbed Photo
      get :index
      response.should be_success
      Capybara.string(response.body).should have_css(
        %Q(img[src="#{url_for_flickr_image(photo)}"])) # test that the photo appears on the page
    end
  end

end

But suppose that a Photo has_many Tags, and the page we’re testing displays each Photo’s Tags. That wasn’t the point of this particular example, so we didn’t give our test Photo any Tags or think about them at all. But the controller or a helper or the view will still try to display them, probably by calling each Photo’s association method tags. And even mighty build_stubbed doesn’t stub association methods, or any other query methods! When tags is called, a SELECT query is run. It’s so easy to forget subsidiary queries like this that the first time you look at a test suite in this way, you’re sure to see many that have not been stubbed.

That this can happen is a performance concern, but usually a minor one — it only happens with queries that don’t cause an example to fail. But it’s a bad practice to leave test data partly unspecified, and when I find something like this I’d much rather make it evident in the test. There’s no shortcut; you just need to stub the method:

    it "displays interesting photos" do
      photo = build_stubbed(:photo)
      Photo.stub(:interesting) { [photo] }
      photo.stub(:tags) { [] }
      get :index
      response.should be_success
      Capybara.string(response.body).should have_css(
        %Q(img[src="#{url_for_flickr_image(photo)}"])) # test that the photo appears on the page
    end

Now it’s completely clear what’s going on, and as a side benefit a database query is avoided.

You might be thinking that this is just going to keep happening, because no-one is going to remember to check the test log every time they write or edit an example. But you can bring unstubbed queries to your attention immediately by forbidding database usage in specs of layers that don’t need it.

It is easy to prevent database usage altogether using the nulldb gem. NullDB is the Null Object design pattern applied to an ActiveRecord ConnectionAdapter. It quietly eats all database accesses, returning empty lists when appropriate. But it can be asked what database statements you tried to run, and you can fail your example if there were any.
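To make the Null Object idea concrete, here is a minimal pure-Ruby sketch of a connection that swallows statements, returns empty results, and remembers what it was asked to run. The class and method names are hypothetical and chosen only for illustration; this is not NullDB's implementation or API.

```ruby
# Toy illustration of the Null Object pattern for a database connection.
# All names here are hypothetical; this is not NullDB's implementation.
class NullConnection
  attr_reader :log

  def initialize
    @log = []
  end

  # Pretend to run a query: record the SQL and return no rows.
  def select_all(sql)
    @log << sql
    []
  end

  # Pretend to run a write: record the SQL and report no rows affected.
  def execute(sql)
    @log << sql
    0
  end

  # True if anything tried to reach the database.
  def used?
    !@log.empty?
  end
end

connection = NullConnection.new
rows = connection.select_all('SELECT * FROM tags WHERE photo_id = 1')
puts rows.inspect     # the code under test sees an empty result
puts connection.used? # the test can still tell that a query was attempted
```

The code under test keeps working because it gets a harmless empty result, while the recorded log gives the spec something to assert against afterwards.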

Add activerecord-nulldb-adapter to your Gemfile.
NullDB reads db/schema.rb, so generate it if you haven’t already (e.g. because, like me, you store your schema as SQL, not Ruby):

    rake db:schema:dump

Make sure that either rspec-rails is configured to tag specs with their type with this line in spec_helper.rb

    config.infer_spec_type_from_file_location!

or you’ve manually tagged all of your specs with their type (shudder). If you have lib specs, you’ll have to manually tag them with type: :lib in any case, since rspec-rails doesn’t automatically tag that spec type.

Finally, add something like this to your spec_helper.rb:

    %i(lib controller helper view routing).each do |type|
      config.include NullDB::RSpec::NullifiedDatabase, type: type

      config.after :each, type: type do
        begin
          ActiveRecord::Base.connection.should_not have_executed(:anything)
        rescue RSpec::Expectations::ExpectationNotMetError
          raise RSpec::Expectations::ExpectationNotMetError,
            "Database usage is forbidden in #{type} specs, but these SQL statements were executed: " +
              %Q("#{ActiveRecord::Base.connection.execution_log_since_checkpoint.map(&:content).join '", "'}")
        end
      end

    end

Any example that mistakenly accesses the database will now fail at its end, and tell you what database statement(s) it ran. That’s not quite as nice as failing at the line that issues the database statement, which NullDB doesn’t support, but in practice it’s good enough that I haven’t been motivated to make it fail faster.
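To see what the failure message looks like, the message-building expression from the hook can be tried on its own. In this sketch, Statement is a hypothetical stand-in for a logged statement (assumed only to respond to content, as in the hook), and forbidden_usage_message is a hypothetical helper name.

```ruby
# Hypothetical stand-in for a logged statement; assumed only to respond
# to #content, as the entries in the hook above do.
Statement = Struct.new(:content)

# Builds the failure message with the same expression as the after hook.
def forbidden_usage_message(type, statements)
  "Database usage is forbidden in #{type} specs, but these SQL statements were executed: " +
    %Q("#{statements.map(&:content).join '", "'}")
end

log = [Statement.new('SELECT * FROM tags'), Statement.new('SELECT * FROM users')]
puts forbidden_usage_message(:controller, log)
```

Each statement ends up wrapped in double quotes and separated by commas, which keeps multi-statement failures readable in the RSpec output.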

With these two measures in place, and by using appropriate test instance creation strategies as described in Part 1, you’ll completely eliminate unnecessary, slow database queries from your specs. More importantly, your specs will construct all of their test data exactly as they should (as far as database usage is concerned), which means fewer unpleasant surprises as your application and test suite evolve.

Written by dschweisguth

June 21, 2014 at 15:45

Dave Schweisguth in a Bottle