I come from a career of no automated tests — I’ve dropped DLLs onto production servers to patch bugs, at companies where scheduled releases were once-a-month all-hands-on-deck events preceded by weeks of manual regression testing. So there are definitely some reasons to write automated tests. If we are going to do it, we might as well do it well. Our spec code is code that we and others read, and it should be resilient to failing, especially if we expect it to help us make the rest of our code resilient to failure.
Why should you test:
- You’re more productive
- You can make changes confidently
My reasons to format tests well:
- Other people will read it
- Test code is code
- Can prevent bugs and flakes
Disclaimer: I’m NOT going to talk about what to test, when to test, or even how to test. The answers to those are different depending on who you ask, both out there on the blogosphere, and even here at VTS.
In this post, I want to specifically focus on some heuristics that I think make the tests we have chosen to write better. How to name variables, what to expect in our test cases, whether to evaluate things eagerly or lazily, and what to do when order matters. Without further ado, let’s dive in.
❌ Name Things thing_1 and thing_2
Sometimes we need multiple of the same kind of thing in our tests. If we need two spaces, maybe we call them space one and space two — we probably started that way in our heads anyway, so why not get it down in code.
My take: I am not a fan. When I write this and come back to it four years, three months, or two days later, I spend way too much time figuring out which things are which.
Martin Fowler has a page commemorating the “two hard things” quote and its variants. The original form, attributed to Phil Karlton, listed Cache Invalidation and Naming Things as the two hard problems in computer science. The software community agrees with us that naming things is hard, which makes it important to put in the effort.
✅ Name Things Consistent With Outcome
How do we come up with good names? Well, if our test case is looking for authorized spaces, then there is probably the concept of an authorized vs unauthorized space. Why not name them exactly that?
One additional style choice I made here is to hide those other objects — the floors, properties… that aren’t actually used anywhere else. They don’t need to be in their own let blocks, and can just be local variables that we forget about immediately.
My take: this, I think, is easier to understand whether I or someone else wrote it.
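Here is a minimal sketch of what that naming looks like, in plain Ruby rather than a full RSpec file so it runs on its own. The `Space` struct, its `authorized` flag, and the `authorized_spaces` method are all hypothetical stand-ins for whatever your models and code under test look like.

```ruby
# Hypothetical stand-in for a persisted model; a real spec would use factories.
Space = Struct.new(:suite, :authorized, keyword_init: true)

# Hypothetical method under test: keeps only the spaces a user may see.
def authorized_spaces(spaces)
  spaces.select(&:authorized)
end

def build_spaces
  # The names mirror the outcome each object should have in the test case,
  # instead of space_1 and space_2.
  authorized_space   = Space.new(suite: "100", authorized: true)
  unauthorized_space = Space.new(suite: "200", authorized: false)
  [authorized_space, unauthorized_space]
end

authorized_space, unauthorized_space = build_spaces
result = authorized_spaces([authorized_space, unauthorized_space])

puts result.include?(authorized_space)    # true
puts result.include?(unauthorized_space)  # false
```

Reading the last two lines, there is no need to scroll up to find out which space was supposed to come back.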
Okay. That one was pretty straightforward. Let’s talk about expecting things.
❌ Expect To Match An Array
Our app is full of situations where we want several objects back, so our tests end up using the array matcher pretty often. It is order agnostic, and lets you know if you have missing or extra objects when the test fails.
My take: this is bad. Why, you ask?
This test may read well, but when it fails…
Our tests spend much of their life in green, and expecting objects in green is perfectly fine. But when we or another developer change a related piece of code, we hope that our test will fail to let them know that they erred. If we look at this failing test that expected objects, it isn’t obvious what is happening at first glance. There is an id of an extra element, if you do a lot of reading, but much of the rest of the information is useless to us.
❌ Expect IDs Instead
So, we can just map the ids of the objects on their way in and out, to remove the useless information. Our code is still running and getting objects, but our failure case is easier to read, right?
My take: this is still bad.
When there are missing elements — we know what we expected, so we could figure out that 20 is the id of the first object in our expected array. But what about those extra elements? Are we sure that 19, 22, and 23 were created explicitly by us, and not by a factory as a side effect? How much time and energy will it take to mentally map those ids to the objects, to understand the root cause of the failure?
My ideal failing spec tells me exactly what is wrong, with no ambiguity. I might not know yet WHY this failure has occurred, but before even looking at the code I know exactly WHAT the symptoms are.
Wouldn’t It Be Nicer If…
In Chapter 5 of “Growing Object-Oriented Software, Guided by Tests”, authors Steve Freeman and Nat Pryce recommend adding a fourth step to the canonical “Red-Green-Refactor” TDD cycle, specifically between Red and Green: Improve Diagnostics. If the Red test doesn’t make it obvious what to do, make a change in the test code that clarifies how to fix it. Only then should you fix it, confident that if this test does fail in the future, the next developer to see it won’t feel debugging fatigue.
✅ Expect Object Names
Almost all of our objects have some sort of name field — spaces have suite, users have an email address, properties have a street_address — the list goes on.
We already went through the difficult exercise of naming our variables — now we can copy those well-crafted variable names into the objects themselves, and map on that name field for our expectations.
My take: low effort, high reward for when this spec fails.
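To make the difference concrete, here is a rough sketch in plain Ruby. The `mismatch` helper is something I made up to mimic the “missing / extra” summary an order-agnostic matcher gives you; it is not RSpec’s own output. The `Space` struct and the ids are hypothetical too.

```ruby
# Hypothetical stand-in for a model with an id and a name-like field.
Space = Struct.new(:id, :suite, keyword_init: true)

# Mimics the "missing / extra" summary of an order-agnostic comparison.
def mismatch(expected, actual)
  { missing: expected - actual, extra: actual - expected }
end

# Pretend this is what the code under test returned.
def sample_result
  [
    Space.new(id: 19, suite: "unauthorized space"),
    Space.new(id: 22, suite: "space on another floor"),
  ]
end

expected = [Space.new(id: 20, suite: "authorized space")]

# Comparing ids leaves the reader to work out what 19, 20 and 22 are.
p mismatch(expected.map(&:id), sample_result.map(&:id))

# Comparing names states the symptom directly.
p mismatch(expected.map(&:suite), sample_result.map(&:suite))
```

The second report reads as “the authorized space is missing, and an unauthorized space showed up”, which is the symptom in plain words.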
In RSpec we have the let block to create globally available variables. Let’s talk about it, pun intended!
✅ Use Let! Everywhere
let! (let bang) differs from let in that it, quite literally, adds a hidden before block, at the spot where the variable is defined, that calls the variable. That makes it eager, whereas a plain let is lazy and won’t create anything until you ask for your variable.
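If it helps, here is a tiny model of that difference in plain Ruby. To be clear, `ExampleGroupSketch` is a toy I wrote for illustration and is not RSpec’s implementation; the `created` array stands in for rows written to the database.

```ruby
# A toy model of lazy vs. eager helpers. This is NOT RSpec's implementation.
class ExampleGroupSketch
  attr_reader :created

  def initialize
    @created = []      # stands in for rows written to the database
    @memo = {}         # memoized values, one per helper name
    @definitions = {}  # the blocks passed to let / let!
  end

  # Lazy: remember the block and run it on first access only.
  def let(name, &block)
    @definitions[name] = block
    define_singleton_method(name) do
      @memo.fetch(name) { @memo[name] = instance_exec(&@definitions[name]) }
    end
  end

  # Eager: same definition, plus an immediate call
  # (RSpec makes that call from a before hook).
  def let!(name, &block)
    let(name, &block)
    public_send(name)
  end
end

def run_sketch
  group = ExampleGroupSketch.new
  group.let(:lazy_space)   { created << "lazy space";  "lazy space" }
  group.let!(:eager_space) { created << "eager space"; "eager space" }
  before_access = group.created.dup  # only the eager one exists so far
  group.lazy_space                   # first access creates the lazy one
  [before_access, group.created]
end

p run_sketch
```

Before anything touches `lazy_space`, only the eager space has been created. The lazy one appears the moment something asks for it, and not a line sooner.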
Sometimes we come up with a large number of variables, and we eager load all of them for every case.
Hot take: this is good — much better than the alternative. Let me illustrate.
❌ Terrifying False Negatives
Take a look at this again — I made it lazy. If we set up a lazy variable, and then write a spec that expects nothing to appear in the result set, then we have a false negative. The spec passes, but for the wrong reason. Maybe we should try to avoid negative test cases too — but that isn’t always possible or easy. As it stands, using let here is actually a bug in our spec code, not just a formatting problem.
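A stripped-down sketch of that trap, in plain Ruby so it runs without a database. Everything here is hypothetical: `SPACES_TABLE` stands in for a table, the lambda stands in for a lazy let, and `visible_spaces` is a deliberately buggy query that returns every space.

```ruby
# In-memory stand-in for a table; hypothetical, for illustration only.
SPACES_TABLE = []

# Hypothetical query under test. It has a bug: it returns every space,
# authorized or not.
def visible_spaces
  SPACES_TABLE.dup
end

def negative_case_passes?(eager:)
  SPACES_TABLE.clear
  # Lazy definition: nothing is written until the lambda is called.
  unauthorized_space = -> { SPACES_TABLE << "unauthorized space" }
  # Calling it up front is what let! would do for us.
  unauthorized_space.call if eager
  # The negative expectation: nothing should come back.
  visible_spaces.empty?
end

puts negative_case_passes?(eager: false)  # true: passes, but only because no row exists
puts negative_case_passes?(eager: true)   # false: the bug is caught
```

The lazy version never references the variable, so the row never exists and the expectation passes against an empty table.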
Hot take: that’s scarily dangerous.
❌ But There Are So Many Objects?
So I just suggested that we eagerly instantiate all of our variables using let! — but when we have shared objects, like property, floor, and space, where spaces aren’t useful for several of the cases, we might want to be conservative, and want to be lazy. In feature specs, there might be enough objects that being eager will double the runtime. Lazy seems reasonable here, right?
My take: don’t be lazy. Find another way to not pollute. With lazy variables it gets confusing to reason about which objects are being instantiated, and you can quickly end up relying on objects you don’t expect.
✅ Eager Load Everything, Just Need Fewer Things
If you find yourself using certain global variables for some specs but not others, reconsider having those variables in the outermost scopes. Put them into smaller contexts where every test case will make use of them. If you get to the point where eagerly initializing your variables makes the spec run the same duration as lazily initializing them, well, now there isn’t a good reason left to be lazy.
My take: be eager, but put the effort into scoping your variables. You can do this by starting your specs with local variables only for every test case, and only moving things into let! blocks when you want to refactor the spec to save some lines of code. TBH tho, if you forget to do that refactoring, that’s A-Okay.
❌ Mutually Ordered Creation And Expectation
So, this one doesn’t seem like a problem, until you notice what is being sorted by. Sort order in this case is the same for all of the objects. I specified zero to point it out, but if I hadn’t, that’s actually what our Spaces factory would have chosen anyway. Oops. Yet, this spec could pass, especially when you run it locally in isolation.
Hot take: this is a bug in spec code, and will lead to flakes in our test suite.
Postgres guarantees deterministic sort order for unique sort keys. Five is always greater than four; your database won’t let you down. But zero sometimes appears greater than zero, and sometimes less. Through decisions Postgres makes when reading data and joining it together, the final order of objects in memory won’t always magically match the order that those objects were created in. So if we started out being nondeterministic, we will get hit with a flake when we go to CI.
“If sorting is not chosen, the rows will be returned in an unspecified order. The actual order in that case will depend on the scan and join plan types and the order on disk, but it must not be relied on.” (PostgreSQL documentation, “Sorting Rows”)
✅ Avoid Order Flakes From The Start
How do we avoid making a mistake like that, you might ask? To avoid mutually ordering things by accident, you can choose to create your objects in a different order from the start. If we did that, we would realize that the sort was broken even on our local machines, and would have been forced to specify real sort_order values.
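Here is a small sketch of that idea in plain Ruby. The `Row` struct and `sorted_suites` are hypothetical stand-ins for a model and a query sorted by sort_order. I am assuming ties fall back to insertion order, which is only one of the orders a real database might happen to pick.

```ruby
# Hypothetical stand-in for a row with a sort key.
Row = Struct.new(:suite, :sort_order, keyword_init: true)

# Stand-in for a query sorted by sort_order. Ties fall back to insertion
# order here; a real database makes no such promise.
def sorted_suites(rows)
  rows.each_with_index
      .sort_by { |row, index| [row.sort_order, index] }
      .map { |row, _index| row.suite }
end

def order_spec_passes?(sort_orders)
  # Created in the REVERSE of the expected order, on purpose.
  third  = Row.new(suite: "third",  sort_order: sort_orders[2])
  second = Row.new(suite: "second", sort_order: sort_orders[1])
  first  = Row.new(suite: "first",  sort_order: sort_orders[0])
  sorted_suites([third, second, first]) == %w[first second third]
end

puts order_spec_passes?([0, 0, 0])  # false: identical keys are exposed right away
puts order_spec_passes?([1, 2, 3])  # true: real sort_order values decide the order
```

Because creation order no longer lines up with the expectation, identical sort keys fail on the first local run instead of waiting to flake in CI.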
Wrapping up, here is a summary of my thoughts on all of these problems:
Naming is a hard problem — one potential solution is to name our variables to agree or disagree with our test cases. This makes the spec easier to read later.
Having named our variables, we can reuse those names for a name-like attribute on the object. This makes reading errors easier, in case the test fails for a good reason down the line.
On the other hand, to avoid tests passing when maybe they shouldn’t, I recommend using eager variables only. To avoid polluting your specs with eager variables that aren’t being used, move those variables into narrower scopes. Drying up specs isn’t necessarily a virtue anyway — it is honestly okay to copy the same variables for every test case.
We talked about true positive failures, and false negative failures. There are also false positive failures, which we often refer to as flakes. To avoid this specific expression of those, avoid mutual ordering of your inputs and expectations from the start.
Thanks for reading!
About the Authors
This post was written by Yuriy Zubovski and Kyle Holzinger, full stack Senior Software Engineers who have both been with VTS for more than 4 years — we hope you have enjoyed reading our post. Let us know your RSpec tips and tricks as well, we’re always on the lookout for more.