Rediscovering fixtures

Over the last couple of months, I’ve been investing a lot of time in improving my testing skills. I spent most of this time going back to the basics, trying to start from scratch with every little concept one takes for granted, and trying to see everything through a child’s eyes. And today I’m rediscovering fixtures.

Development Ruby

May 27th, 2024

By Patricio Mac Adden

There are 3 things that I’m giving importance to in this quest: speed, simplicity, and maintainability. And today I’m putting factories under scrutiny.

I started using factories (particularly factory_bot, when it still was factory_girl) more than a decade ago. In my very first Ruby on Rails project I tried to do everything The Rails Way, but soon I felt like fixtures were bad because, well, we didn’t use them properly (as always happens when you use something for the first time) and the Rails Guides didn’t cover fixture organization. It is only over time that you get to understand how everything works and start mastering it, but I didn’t give fixtures a second chance. I was sold on using factory_girl because with fixtures:

you are not creating objects, you’re creating db rows (database structure)
yaml is a bit foreign and uncomfortable to write compared to Ruby
model associations are named, meaning you need to scan several yaml files to find associated objects

To be honest, factories don’t solve all these issues. You can feel more comfortable writing your test data with a Ruby DSL (Domain-Specific Language) than yaml files, but you will still need to scan several files to find the associated objects.

Factories add other features such as callbacks, which are handy. But the bigger and more complex your domain is, the bigger and more complex your factories will be.

However, I can’t recall a single project where using factories didn’t bring frustration to the table. Most of the time I found mystery guests in our test suite: data that a test depends on but that is set up out of sight, somewhere outside the test. On top of that, the factories got bigger and more complex over time, making them harder and harder to follow. This was the motivation for writing our guidelines for writing better specs.

So, if we consider speed, simplicity, and maintainability, factories are not the best choice.

Before jumping into these 3 things, let’s answer…

What is a test fixture?

Seems like a dumb question that everyone takes for granted, but we need to answer this to move forward and decide what’s a better approach for us.

A test fixture is all the things we need to have in place to run a test and expect a particular outcome.

This means that, before running a test, we must know what objects we will be dealing with. Not only the SUT (Subject Under Test) but all its associated objects.

Both fixtures and factories can achieve this, but not in quite the same way. With factories, the test fixture doesn’t exist until we build it before or during each test.

Speed

Fixtures are faster than factories. The yaml files become database inserts instead of ActiveRecord objects that get saved. This saves a ridiculous amount of time as validations and callbacks are not run. This comes with a tradeoff: the data in your yaml files are not validated, which could result in runtime errors. But this forces you to have the right database constraints in place, which is always good. Factories, on the other hand, become ActiveRecord models, which means that they are validated and all callbacks run.
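The tradeoff described above can be sketched in plain Ruby. This is a toy model, not ActiveRecord: ToyTable and ToyUser are hypothetical names, the email check stands in for a model validation, and the not_null list stands in for a database constraint.

```ruby
# Toy stand-in for a database table. Not ActiveRecord.
class ToyTable
  attr_reader :rows

  def initialize(not_null: [])
    @rows = []
    @not_null = not_null
  end

  # The fixture path: rows go straight in. Only the "database constraint"
  # (here, a list of NOT NULL columns) can reject them.
  def insert(row)
    missing = @not_null.select { |column| row[column].nil? }
    raise ArgumentError, "NOT NULL violated: #{missing.join(', ')}" unless missing.empty?

    @rows << row
  end
end

# Toy stand-in for a model with one validation and one callback.
class ToyUser
  attr_reader :attributes, :callbacks_run

  def initialize(attributes)
    @attributes = attributes
    @callbacks_run = 0
  end

  def valid?
    attributes[:email].to_s.include?("@")
  end

  # The factory path: validate, run callbacks, and only then insert.
  def save(table)
    return false unless valid?

    @callbacks_run += 1
    table.insert(attributes)
    true
  end
end

table = ToyTable.new(not_null: [:email])

# Inserted as a raw row, the invalid email is accepted: nothing validates it.
table.insert({ email: "not-an-email" })

# Saved through the model, the same data is rejected by the validation.
saved = ToyUser.new({ email: "not-an-email" }).save(table)

puts "rows in table: #{table.rows.size}, model save succeeded: #{saved}"
```

In this sketch the raw insert skips the extra work and also skips the safety net, which is why the constraint on the table is the only thing left to catch bad data.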

Here’s a quick benchmark: A newly created Rails app that has only one model. We will run 100 tests that assert that 5 users are valid. With factories, we’d have this user factory:

FactoryBot.define do
  factory :user do
    email { Faker::Internet.email }
    password_digest { BCrypt::Password.create("password") }
    first_name { Faker::Name.first_name }
    last_name { Faker::Name.last_name }
  end
end

and this is the test:

require "test_helper"

class UserTest < ActiveSupport::TestCase
  100.times do |i|
    test "test #{i}" do
      users = create_list :user, 5

      assert users.all?(&:valid?)
    end
  end
end

With fixtures:

<% 5.times do |i| %>
user_<%= i %>:
  email: <%= Faker::Internet.email %>
  password_digest: <%= BCrypt::Password.create("password") %>
  first_name: <%= Faker::Name.first_name %>
  last_name: <%= Faker::Name.last_name %>
<% end %>

and this is the test:

require "test_helper"

class UserTest < ActiveSupport::TestCase
  100.times do |i|
    test "test #{i}" do
      assert users.all?(&:valid?)
    end
  end
end
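To make the fixture file less magical, here is a rough sketch of what happens to a file like that before it reaches the database: the ERB is rendered first, then the YAML is parsed into one hash per labelled row. It uses only Ruby's standard library and is not Rails's actual loader; render_fixture_rows is a hypothetical helper, and static values replace the Faker and BCrypt calls so the sketch runs on its own.

```ruby
require "erb"
require "yaml"

# A fixture template shaped like the users fixture, with static values
# in place of the Faker and BCrypt calls.
FIXTURE_TEMPLATE = <<~YAML_ERB
  <% 5.times do |i| %>
  user_<%= i %>:
    email: user<%= i %>@example.com
    first_name: First<%= i %>
    last_name: Last<%= i %>
  <% end %>
YAML_ERB

# Hypothetical helper: renders the ERB, then parses the resulting YAML.
# Each top-level key is a fixture label and each value is a hash of
# column names to values, that is, one database row.
def render_fixture_rows(template)
  rendered = ERB.new(template).result
  YAML.safe_load(rendered)
end

render_fixture_rows(FIXTURE_TEMPLATE).each do |label, columns|
  puts "#{label}: #{columns.inspect}"
end
```

Because the loop is expanded when the file is rendered, the five users exist as plain rows before any test runs, and no model code is involved in producing them.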

The results speak for themselves. With factories:

Running 100 tests in parallel using 11 processes
Run options: --seed 64117

# Running:

....................................................................................................

Finished in 14.061858s, 7.1114 runs/s, 7.1114 assertions/s.
100 runs, 100 assertions, 0 failures, 0 errors, 0 skips

And with fixtures:

Running 100 tests in parallel using 11 processes
Run options: --seed 35467

# Running:

....................................................................................................

Finished in 2.008258s, 49.7944 runs/s, 49.7944 assertions/s.
100 runs, 100 assertions, 0 failures, 0 errors, 0 skips

We can see that fixtures are 7 times faster than factories for this simple benchmark. However, that figure is not something to take for granted: hello-world benchmarks like this one aren’t real benchmarks, but they do give you a notion of the best-case scenario.
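For reference, the ratio comes straight from the two "Finished in" lines. A minimal sketch of the arithmetic, where speedup and time_saved_percent are hypothetical helper names and the two constants are the wall-clock times reported by the runs:

```ruby
# Wall-clock times reported by the two test runs, in seconds.
FACTORIES_SECONDS = 14.061858
FIXTURES_SECONDS = 2.008258

# How many times faster the fast run is than the slow one.
def speedup(slow, fast)
  slow / fast
end

# Share of the slow run's time that the fast run avoids, as a percentage.
def time_saved_percent(slow, fast)
  (1 - fast / slow) * 100
end

puts format("speedup: %.1fx", speedup(FACTORIES_SECONDS, FIXTURES_SECONDS))
puts format("time saved: %.0f%%", time_saved_percent(FACTORIES_SECONDS, FIXTURES_SECONDS))
```

These totals come from a single run of each suite, so they are better read as an indication than as a measurement.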

Simplicity

From the implementation perspective, fixtures are simpler than factories. They’re just yaml files that get inserted into the database. Then in your tests, you can use ActiveRecord as you normally would. Factories, on the other hand, are defined with a Ruby DSL whose features can make them more or less complex depending on how you use them. It’s a simple Ruby DSL, but a DSL at the end of the day.

With fixtures, you model your test data specifically for the tests you will make. Everything will be in place, and available for you when you write your tests. You can also use the same data in development (or even build the data in development and then dump it into your fixtures), which is useful.

With factories, test data is disposable. They give you a disposable object with meaningless data that you can’t use (at least out of the box) in development. You don’t know exactly what your test fixture looks like; you need to build it before each test, which makes the test suite larger and harder to read and follow.

Maintainability

Maintainability is often overlooked. But when you get a legacy Rails application that has been in production for a couple of years (we have several cases like this!), you wish the test suite were still running and, if it is, that the test data were easy to understand.

With fixtures, you can simply load your test fixture in your development environment, play around with it, and make sense of it. Factories, again, give you a way of building generic and meaningless test data. You need to read factories and tests, scan them up and down, back and forth to understand what’s going on. Usually, a 5-minute fix becomes a 5-minute fix plus 5 hours of fixing a test.

Conclusion

Fixtures are more powerful than we think they are. In the long run, they bring more benefits than factories:

they’re faster
they allow us to better understand our test data
they’re simpler and easier to maintain than factories, so it’s harder to make a callback mess out of them
they provide a unified testing and development dataset

It’s time to give them a second chance.