Steve Freeman

Some mocks

“I don’t use a knife when I eat; if I need one, it means the food hasn’t been cut up enough.”
“Forks are unnecessary, I can do everything I want with a pointed knife.” [1]

One of the things we realised when writing “Growing Object-Oriented Software”:http://www.growing-object-oriented-software.com is that the arguments about mocks in principle are often meaningless. Some of us think about objects in terms of Alan Kay’s emphasis on message passing, others don’t. In my world, I’m interested in the protocols of how objects communicate, not what’s inside them, so testing based on interactions is a natural fit. If that’s not the kind of structure you’re working with right now, then testing with mocks is probably not the right technique.
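As a minimal sketch of what testing based on interactions can look like, here is a Python example using the standard library's unittest.mock. The OrderProcessor class and its listener protocol (order_accepted, order_rejected) are hypothetical names invented for illustration; they do not come from the book or from Arlo's post.

```python
from unittest.mock import Mock


class OrderProcessor:
    """Hypothetical object whose job is defined by the messages it sends."""

    def __init__(self, listener):
        self._listener = listener

    def process(self, order_id, quantity):
        # No value is returned and no state is exposed; the only
        # observable behaviour is which message goes to the listener.
        if quantity <= 0:
            self._listener.order_rejected(order_id, "quantity must be positive")
        else:
            self._listener.order_accepted(order_id)


# The mock stands in for the collaborator and records the conversation.
# spec limits the mock to the (assumed) listener protocol.
listener = Mock(spec=["order_accepted", "order_rejected"])

OrderProcessor(listener).process("A-1", 3)

# The test asserts on the protocol, not on anything inside OrderProcessor.
listener.order_accepted.assert_called_once_with("A-1")
listener.order_rejected.assert_not_called()
```

The test never looks inside OrderProcessor. If the code under test were mostly about computing return values, a plain input-output assertion would probably be the simpler fit.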

This post was triggered by Arlo Belshee’s post on “The No Mocks Book”:http://arlobelshee.com/post/the-no-mocks-book. I think he has a really good point, buried in some weaker arguments (the developers I work with don’t use mocks just to minimise design churn, they’re as ruthless as anyone when it comes to fixing structure). His valuable point is that it’s a really good idea to try setting yourself style constraints to re-examine old habits, as in this “object-oriented calisthenics exercise”:http://binstock.blogspot.co.uk/2008/04/perfecting-oos-small-classes-and-short.html. As I “once wrote”:http://www.higherorderlogic.com/2008/06/test-driven-development-a-cognitive-justification/, Test-Driven Development itself can have this property of forcing a developer to stop and think about what’s really needed rather than pushing ahead with an implementation.

As for Arlo’s example, I don’t think I can provide a “better” solution without knowing more detail. As he points out in the comments, this is legacy code, so it’s harder to use mocks for interface discovery. I think Arlo’s partner is right that ProjectFile.LoadFrom is problematic. For me, the confusion is likely to come from combining the reading of bytes from a disk with the construction of a domain structure; I’d expect better structure if I separated them. In practice, what I’d probably do is some “annealing” by inlining code and looking for a better partition. Finally, it would be great if Arlo could finish up the reworking. I can believe that he has a much better solution in mind, but I’m struggling with the value of this step.
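To make the idea of separating the two responsibilities concrete, here is a rough Python sketch. It is not a reworking of Arlo's code, which I haven't seen in full: the Project type, the function names and the one-name-per-line file format are all assumptions made up for the example.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class Project:
    """Hypothetical domain structure: a name plus its source files."""
    name: str
    sources: tuple


def parse_project(text: str) -> Project:
    """Build the domain structure from text already in memory (no I/O).

    Assumed format: first non-blank line is the project name,
    each following non-blank line is a source file.
    """
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if not lines:
        raise ValueError("empty project description")
    return Project(name=lines[0], sources=tuple(lines[1:]))


def load_project(path: Path) -> Project:
    """Thin I/O wrapper: read the file, then delegate to the parser."""
    return parse_project(path.read_text(encoding="utf-8"))


# The construction logic can now be checked without touching a disk.
project = parse_project("demo\nmain.py\nutil.py\n")
assert project == Project("demo", ("main.py", "util.py"))
```

With this split, the parsing can be tested with plain inputs and outputs, and the only code that needs a real file (or a double for one) is the short wrapper.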

There is one more thing we agree on: the idea of building a top-level language that makes sense in the domain. Calling methods on the same object, like a Smalltalk cascade, is one way, although there’s nothing in the type to reveal the protocol, that is, how the calls relate to each other. We addressed this in “jMock1”:http://jmock.org/jmock1.html, where we used interface chaining to guide the programmer and the IDE as to what to do next. Arlo’s example is simple enough that the top-level can be inspected to see if it makes sense. I’ve worked in other domains where things are complicated enough that we really did need some checked examples to make us feel confident that we’d got it right.
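The interface-chaining idea can be sketched roughly as follows. This is not jMock1's actual API: the class and method names are hypothetical, and Python classes with return-type annotations stand in for the Java interfaces. The point is only that each call returns a type offering just the steps that are valid next.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class Expectation:
    """The finished description produced by the chain."""
    method_name: str
    arguments: Optional[tuple] = None
    result: Any = None


class _ResultStage:
    """Final stage: the only valid next step is to state the result."""

    def __init__(self, expectation: Expectation):
        self._expectation = expectation

    def will_return(self, value: Any) -> Expectation:
        self._expectation.result = value
        return self._expectation


class _ArgumentsStage(_ResultStage):
    """After naming the method: constrain arguments, or skip to the result."""

    def with_args(self, *args: Any) -> _ResultStage:
        self._expectation.arguments = args
        return _ResultStage(self._expectation)


class _MethodStage:
    """First stage: the only valid next step is to name the method."""

    def method(self, name: str) -> _ArgumentsStage:
        return _ArgumentsStage(Expectation(method_name=name))


def expects_once() -> _MethodStage:
    """Entry point of the (hypothetical) chain."""
    return _MethodStage()


expectation = expects_once().method("price_for").with_args("apple").will_return(42)
```

Because with_args hands back a _ResultStage, an IDE's completion list after that call offers only will_return, which is how the types carry the protocol that a plain cascade on a single object would hide.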

[1] Of course, most of the world eats with their hands or chopsticks, so this is a culturally oppressive metaphor.

7 Comments

  1. David says:

    Is it the mocks or the way of thinking that’s important?

    Are mocks like ruled pencil lines on an artist’s drawing, i.e. possibly useful to give guidance but largely unnecessary / tedious once you understand perspective and have mapped out the general perspective of the picture you’re drawing?

  2. Arlo Belshee says:

    I agree that the optimal testing design will have a number of test doubles involved. It may even, depending on the code, have some full behavioral mocking. However, most usages of mocking that I see in typical code end up being duplications of the implementation expectations of the depended-on code. Thus, a change in the dependency requires remembering to update all the tests for dependents.

    This is time consuming at best, and error prone at worst (if I miss a dependency).

    I also find that any general purpose tool, applied liberally, reduces thinking. This decreases thinking about alternate designs. I’ve had to learn dozens of design techniques in order to change from code easily tested with stubs to code easily tested with just inputs and outputs.

    I really like the tell, don’t ask design style. I use it often. And when I use it, I test it with mocks. I just also use a bunch of other styles in other places. Sometimes I need to encapsulate activity and expose data (compiler passes, for example). Sometimes I need to encapsulate both data and communications, and expose behavior. Sometimes I just want to expose who is listening to what when, and don’t want to execute behavior or examine state.

    Overall, I use test doubles now and then and full mocks rarely. They are critically useful in the right places. I just always make sure that I’m not using them where a better design would do.

  3. @arlo Thanks for responding. I too have a line about going too far and then pulling back, “if you haven’t overdone it you don’t know where the boundaries are.”

    Yes, there’s a lot of dreadful use of mocks out there–and of procedures, data structures, objects, functions, variables, etc. So it’s important to distinguish between disagreement with a technique and an instance of its use (after all, the title is “No Mocks”).

    I’d really like to see you follow through with more refactoring of the example to see where it takes you. I understand your intent, but I wonder how it would pan out.

  4. For me, the way of thinking. Done properly, I don’t believe that mocks are any more (or less) constraining than any other unit test approach–sometimes the right thing to do is to delete tests. I think I’d argue that mocks that got in the way are still a symptom of design problems and that just deleting them is not really addressing the issue.

  5. Grzes says:

    @arlo, I also think that input-output based tests are a lot better than just checking that some arbitrary mocked method has been called. These days, when mostly developing isolated numerical code, I don’t use mocks at all. However, when I was dealing with legacy code in big organizations, the picture was totally different. There your code needs to work with the user (UI), other systems and a few different databases at the same time. I just can’t see how you want to test it without mocks, and your example certainly doesn’t answer this question!

  6. Arlo Belshee says:

    I actually spend most of my time in large legacy systems. I work at MS with several different teams. Each of them has a product of about 200 MLoc. Some are as small as 60 MLoc.

    And in those situations, I recommend all sorts of solutions as temporary patches. But that is different than the No Mocks or Do Mocks suggestions Steve and I are arguing about.

    In legacy, the first thing you need to admit is that you have a problem. And that it will take a couple years to fix (and you can fix it). Since team size is usually determined by code size (not the other way around), this 2 years is seemingly constant. So there are all sorts of things that I recommend a team do in the first 3 months of the legacy recovery in order to enable the next 3 months. That is not to say that I recommend the same things in the 6th quarter in order to enable the 7th.

    Mocks for dependencies are in this camp. Go ahead and use them if your design is f*cked. (Note: does not apply to other uses of mocks.) If you are not willing to state that your design is a piece of sh*t that just has to be fixed, then don’t use mocks for dependency substitution. There is a better answer. If your design is not sh*t, then you can eliminate dependencies. Or you can provide simpler alternatives for them (simulators / multiple implementations). Or better abstractions. Or many other good design alternatives – if your design is good, then you can use whichever of the many options works best here and now.

    Mocks the way Steve uses them are different. He is not substituting for dependencies. He sometimes suggests that he is, but it is different. He doesn’t care about whether a dependency is slow to execute or hard to set to a state. Instead, he defines the job of an object to be the messages that it sends. No more; no less.

    This is a great way to do design. It’s the TDA style; the dual of the Pure FP style. Those are the 2 design techniques that I practice (well, when I don’t screw up: I’m not perfect either).

    In pure FP, all I care about is the return value. There can be no other effects of a function call, so I don’t worry about spooky action. Traditional state-based testing verifies this perfectly.

    In TDA, all I care about is the message sends. There can be no other effects of a method call (including internal state transforms or returns), so I don’t worry about spooky action. Interceptors (usually mocks) verify this perfectly.

    Anything between these two extremes is lazy (not in the FP sense). When a team is dealing with large code debt, it does not have the budget to solve all problems at all times. Fair dinkum. But it should get better. And as it does, then all the middle-ground uses of mocks tend to fade away.

    Personally, most of my systems tend towards the pure FP style. I don’t do a lot of TDA. And so I end up with large systems with no mocks (or fakes of any kind). But that is a design choice. The real measure is the code, not the test strategy. Does the code allow emergent behavior (aka spooky action)? Or does it ensure that every result is determined by one piece of concrete code in one place? Emergent behavior is fun, but provable behavior in a specific location is the secret to scalable systems.

  7. Techniques and style are a rich vein of conversation and are very meaningful, but presume a level of proficiency in some more fundamental issues that explain why we see so many atrocious designs and tests. Two of these are intent and coupling.

    Know what you want your code to do. In TDA and OO that should tend toward the behavioral, but face it, things have inherent attributes no matter how functional their purpose, suggesting some state-based testing is essential. That sets the context for your testing and doubling.

    Then choose your test doubles based on the level of coupling they introduce beyond the essential. Reduction of coupling is a (the?) major driving force in striving for TDA, FP, OO, side-effect free, etc. styles.

    You’re always coupled to the interface you’re testing. Minimize the degree of coupling to the essential. That’s what will let your tests be resilient to system changes. Generally, prefer dummies over stubs over spies over mocks. Choose your doubles as intentionally as you write the rest of your code.

    The discussion you’re having about styles and techniques has this as a subtext, but that subtext is not obvious to many less experienced craftsmen.
