Don't over-engineer your tests

I’ve used RSpec for years, and until recently I had to work with a colleague that never really tested before. This process made me think if I should be critical about how I test and if I should (or shouldn’t) do something about it.

Mar 6th, 2024
By Patricio Mac Adden

For various reasons, we ended up using RSpec by default in all our Ruby on Rails projects. Sometimes we inherit a project using it, and sometimes our clients prefer it over Minitest. We would start all new projects using it over Minitest for this reason.

These are the things we like about RSpec:

  • tests read naturally
  • tests are easy to write once you understand the mechanics
  • very succinct syntax
  • one of the most popular testing frameworks
  • good documentation and online resources

We have an internal tool, a time tracker, that we built a couple of years ago and use every day. Naturally, when we started working on it, we decided to go for RSpec as the testing framework, like we’re used to.

All of our test suites are normally comprised of RSpec, shoulda-matchers, and factory_bot. And if the project needs it, we could use more gem dependencies to help out our testing.

When my colleague, a junior developer, started working on the time tracker, he needed to add a couple of features, and we wanted him to add tests for them. Nothing out of the ordinary, though: model validations, controller tests, etc. Having not tested before, it was a good opportunity for him to learn and practice.

The tasks he needed to complete were very simple and he did great, but he struggled with the test suite. I asked myself why, because to me, it was pretty straightforward and he would only need to copy certain structures present in other tests, make sense of what he needed to test and that would be it. Simple.

But afterward, when shared with me the tests he was doing, I was surprised. The tests were correctly written (they were based on other similar tests) and the code made sense. But even then, some of them wouldn’t pass.

This is an extract of one of them: it’s a class that has a start_date, an end_date, and a duration, which is calculated before_validation. Something like this:

class Event < ApplicationRecord
  validates :duration, :start_date, :end_date, presence: true

  before_validation :set_duration, if: :dates_present?

  private

  def dates_present?
    start_date.present? && end_date.present?
  end

  def set_duration
    self.duration = end_date - start_date
  end
end

It is a simplified example, the actual class should also validate that the start_date can’t be greater or equal to the end_date and the actual model also has additional fields and validations, but it should be enough for this article.

The spec would look something like:

RSpec.describe Event do
  describe "validations" do
    subject { build :event }

    it { should validate_presence_of :duration }
    it { should validate_presence_of :start_date }
    it { should validate_presence_of :end_date }
  end
end

Simple tests, right? We have 3 validations in the class, we test those 3 validations. They also read beautifully: “It should validate the presence of duration”. Pretty close to English. That’s RSpec’s beauty. But do you see the problem yet? Well, it’s hidden at first sight: the first test will always fail. You can take a moment to re-read it and think about why.

The duration is set before the validation occurs, and the subject under test is a valid Event (ie. with start_date and end_date).

This is a big problem. We’re sacrificing a direct interaction with our subject under test in favor of readability. This is great if we’re used to it, but let’s not forget the original problem: a junior developer cloned the project, added a feature, and then struggled to test it. The problem wasn’t obvious to him until he spent more than twice as much time working on the test suite than the actual feature and he decided to put the PR under review to get some help from me.

This is a huge problem. Spending more time trying to test something than programming it is a symptom that something is wrong with our test suite.

What’s causing this?

Over-engineering is the act of providing a solution to a problem in an elaborate or complicated manner, where a simpler solution can be demonstrated to exist with the same efficiency and effectiveness as that of the original design.

In other words, we like to be clever when writing tests, and RSpec allows us to be clever when writing tests. Nesting many contexts, defining objects in different contexts that will only be created when called (or when defined), is a way of over-engineering our tests. We’re building layers and layers of tests that, in favor of aesthetics, are impacting the ease of understanding.

Using a library such as shoulda-matchers adds another layer of complexity, even though using it sounds harmless. When testing using should validate_presence_of, we’re essentially not exercising our subject under test (these terms come from the Four-phase test pattern), shoulda-matchers is. And shoulda-matchers is a library and as such, must provide a generic implementation. To achieve this, the matchers are much more complex than explicitly interacting with the subject under test and performing a simple assert.

This over-engineering looks great and we feel great when we write tests like this, but then you have to maintain those tests. And more importantly, someone else should understand those tests easily. And when coming back to a test suite like this, and having to jump up and down tracking down a variable, believe me, you don’t feel great anymore.

Model specs like this one, usually have no more than 10 tests. Controller specs tend to get worse in terms of over-engineering. Especially if we’re handling roles, permissions, etc.

How did we solve this?

In this particular case, we decided to ditch the test as a valid Event, ie an Event with a start_date and an end_date, will always set the duration set before_validation.

An alternative approach would’ve been to change the subject to a new Event (though shoulda-matchers advise providing a valid object) or add a context when the Event is new.

To be continued…

This process made me think if I should be critical about how I test and if I should do something about it. My verdict is in the next post.