Avoiding extra queries in ActiveRecord 3
ActiveRecord 3 introduces a new query interface based on Arel. The new query methods are more succinct than the old :conditions
etc. syntax, and they’re easier to combine into complex queries. Scopes (formerly known as named scopes) are still present and useful, but there’s much less difference between a scope and a simple method that returns a query than there was before. Along with these carrots encouraging you to rewrite all your queries there is the stick of deprecation of the old syntax in Rails 3.1 and removal in Rails 3.2, so most people migrate to the new interface when they migrate to Rails 3.
Migrating GWW to the new interface went smoothly, but when I was reviewing some pages for performance recently I noticed ActiveRecord issuing some extra queries that I didn’t expect, and that weren’t necessary. Fortunately, once noticed, they were easy to eliminate.
The extra queries came about because the new query interface issues SQL lazily. If you step through this method in a debugger while watching your test log,
class Photo def self.print_unfound_titles unfound_photos = where :game_status => 'unfound' puts (unfound_photos.map &:title).join ", " end end
you’ll see that unfound_photos
is an ActiveRecord::Relation
rather than an Array, and it issues an SQL query only at the last possible moment, when map
is called.
Lazy evaluation allows you to pass around queries and add to them without running them until you’re ready. But, as in real life, sometimes laziness results in doing more work in the long run. GWW had several situations like this
class PhotosController def map @photos = Photos.order :dateadded first_dateadded = @photos.first.dateadded @photos.each { |photo| photo[:color] = color_gradient first_dateadded, photo.dateadded } end end end
in which it got a list of something from the database and made some use of the first or last item before iterating over the list. In Rails 3, this issued two queries:
Photo Load (0.5ms) SELECT `photos`.* FROM `photos` ORDER BY dateadded LIMIT 1 Photo Load (0.3ms) SELECT `photos`.* FROM `photos` ORDER BY dateadded
(Note that ActiveRecord finders don’t belong in controllers. I wrote that here to illustrate my point succinctly.)
The old Rails 2 query, Photos.all :order => :dateadded
, had returned an Array, and .first
simply returned the first element. In Rails 3, order
returned a Relation
and .first
returned a second, refined Relation
. Calling .dateadded
caused the second Relation
to issue a SQL query, and then iterating over @photos
caused the first Relation
to issue a second SQL query.
Having noticed the problem, the solution was simple: force the first Relation
to run its SQL right away
class PhotosController
def map
@photos = order(:dateadded).to_a
first_dateadded = @photos.first.dateadded
@photos.each { |photo| photo[:color] = color_gradient first_dateadded, photo.dateadded }
end
end
end
thus turning .first
into an array reference again and eliminating the LIMIT 1
query.
You can also fully load a Relation
by calling .reload
on it, and .reload
returns self
so you can chain method calls, but I find that .to_a
better expresses that loading (not reloading) is taking place.
There are quite a few situations in which ActiveRecord loads a Relation
only partially, and in which you might want to tell it to load fully. (I say “fully” rather than “eagerly” because the latter refers to the separate decision of whether to load associations.) Looking at ActiveRecord::FinderMethods
, we see that some forms of .first
load fully and some do not. .last
generally works like .first
, except of course for its lastness.
A not-yet-loaded Relation
will also issue a reduced query when you call .size
, .empty?
or the blockless forms of .any?
or .many?
. These methods run a COUNT(*)
query. (Oddly, .length
always runs the full query; this seems like a bug.) As before, if you’re going to need the full query results anyway, you can eliminate the reduced query by calling .to_a
on the Relation right away.
Although it’s good to know the kinds of methods that result in reduced queries, so that you have an intuition about when it might help to eliminate them, it’s probably counterproductive to try to learn all the details of the cases in which ActiveRecord might issue such a query. Furthermore, the implementation of any of those methods might change in future versions. One tactic would be to call .to_a
in any case where an application first looks at some property of a query method’s return value and then uses the full query results, but that would be cargo cult programming. Instead, do what you were probably already doing: if the performance of a web page or other ActiveRecord-using process matters, periodically review the SQL that it issues and take whatever steps are necessary to eliminate redundant queries and address inefficient ones. There are plenty of reasons to do that besides the extra LIMIT 1
and COUNT(*)
queries discussed here.
Leave a Reply