Testing and Tacos
What do testing and tacos have to do with one another? Quite a bit, as it turns out. After my interview with Gregory Schmidt I discovered many parallels between tacos and testing and what they can teach us about test coverage.
The taco analogy focuses on a common denominator, which is that most people who work in IT have attended college. Everyone who attended college had a dining hall that closed at, what, 6:30 or 7:00? Say you’re working on a paper at ten or eleven at night: you’re going to get hungry, because it’s been four or five hours since you last ate.
When I was attending college in the mid-90’s, the only place near my dorm that was open that late was Taco Bell, so we’d go over and get our ten soft tacos. The first taco would basically get inhaled. It would be delivered directly to our olfactory senses, and it was fantastic.
It was proof that there was a God, and he loved us.
By the time we got to the second and third tacos they were still delicious, but were they as satisfying as that first one was? The answer is no, because of something called decreasing marginal return to variable inputs.
When you look at each additional unit of something you’re going to consume, each one still gives you some additional benefit, just a little less than the one before. Your total benefit may keep rising, but more slowly.
For the sake of this article, let’s call them “yum” numbers.
Say your yum number for the first taco is 10, and your yum number for the second taco is 8. Your total yum factor is now 18, but you gained 2 fewer yums from the second one, and this continues on. You eat your third taco, and you’re thinking to yourself, “This might not be the freshest lettuce on earth.” The yum factor from your third taco is something like 6 units. Your total yum factor is now 24, but you’ve lost two more additional units.
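If it helps to see the yum arithmetic spelled out, here is a minimal Ruby sketch. It uses only the three yum numbers from the paragraph above; the constant and method names are made up for illustration.

```ruby
# Marginal "yum" for the first three tacos, taken from the article's numbers.
MARGINAL_YUMS = [10, 8, 6].freeze

# Returns the running total of yums after each taco.
def running_totals(marginal_yums)
  total = 0
  marginal_yums.map { |yum| total += yum }
end

# Returns how much each taco's yum changed compared with the taco before it.
def marginal_changes(marginal_yums)
  marginal_yums.each_cons(2).map { |previous, current| current - previous }
end

puts running_totals(MARGINAL_YUMS).inspect    # [10, 18, 24]
puts marginal_changes(MARGINAL_YUMS).inspect  # [-2, -2]
```

The total keeps climbing (10, 18, 24) while each new taco adds 2 yums fewer than the last, which is the diminishing return in a nutshell.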
The way the analogy works is that for any given test suite, you have tests that absolutely have to work.
A Finite Amount of Time to Complete Testing
Another example of this is to imagine that you work for Facebook.
If you worked at Facebook, one of the things you’d have to ensure works all the time, every time, is Login. If that isn’t working, nothing else really matters.
Let’s consider that a “first taco” test. You’re going to get the biggest bang for your buck, and you should probably do this every time.
Does tagging yourself in a photo that your third cousin, twice removed, posted from their trip to Uruguay warrant its own test? Maybe not. Maybe that’s your third, fourth, or fifth taco. The reason we look at this is because we absolutely have a finite amount of time to complete our testing.
Before your next release, you have to get a certain amount of testing done on a certain number of browsers and devices. Are you getting the best bang for your buck?
Negative Marginal Return
The reason we talk about as many as ten tacos is because you can eventually start getting a negative marginal return.
Sticking with the taco analogy, by the time you get to your eighth, ninth, or tenth taco, you might actually start to feel a little sick.
If you are Wells Fargo, and you want to be sure that your Canadian business customers can log in from an OFAC (Office of Foreign Assets Control) country (someone you’re not allowed to do banking business with), it might take quite a bit of work to get that one test completed, to the point that it may be better to just let them contact the call center.
If you have three testers working on this for $50 an hour, and it takes them three hours to prepare and run the test, you now have a $450 test. A phone call to the call center for this one user might cost you $4, so for a single user the test isn’t worth it. If you have a million of these users, that’s $4 million. At that point, it makes a lot of sense to perform a $450 test.
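Here is a rough Ruby sketch of that cost comparison. The dollar figures come from the example above; the method names are hypothetical, and it assumes every affected user calls the call center exactly once. At $4 a call, a million calls comes to $4,000,000.

```ruby
# Cost of preparing and running the test once, in dollars.
def test_cost(testers:, hourly_rate:, hours:)
  testers * hourly_rate * hours
end

# Cost of letting affected users phone the call center instead, in dollars.
# Assumes one call per affected user.
def call_center_cost(affected_users:, cost_per_call:)
  affected_users * cost_per_call
end

# True when running the test is cheaper than absorbing the support calls.
def worth_testing?(test_dollars, support_dollars)
  test_dollars < support_dollars
end

test_dollars = test_cost(testers: 3, hourly_rate: 50, hours: 3)
one_user     = call_center_cost(affected_users: 1, cost_per_call: 4)
many_users   = call_center_cost(affected_users: 1_000_000, cost_per_call: 4)

puts "test costs $#{test_dollars}"
puts "one user:         worth testing? #{worth_testing?(test_dollars, one_user)}"
puts "a million users:  worth testing? #{worth_testing?(test_dollars, many_users)}"
```

The same $450 test is a tenth taco when one user is affected and a first taco when a million are, so the answer depends on how many people hit the path.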
Deciding What to Test Based on Tacos
As testers we all know that you can’t test everything. So how do you decide what to test or automate? Where will you get the most benefit?
If you break it down into, say, how many tacos you prefer to eat in a sitting, you may determine that you like to eat three or four tacos.
If your team has taken up the Agile process and has started to assign points to its stories, that gives you an opportunity to point your tests as well.
Is it a first taco test? Is it a tenth taco test? Is the tenth taco more important for this release than for the previous release?
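One way to picture “pointing” your tests is to give each one a yum value and a run time, then fill a fixed time budget with the best yum per minute first. The Ruby sketch below does that. The test names, yum values, and durations are invented for illustration, and the greedy ordering is a simple heuristic I am assuming here, not a method from the interview.

```ruby
# Hypothetical test descriptor: a name, a "yum" value, and a run time in minutes.
TacoTest = Struct.new(:name, :yum, :minutes)

# Greedy sketch: walk the tests in order of yum per minute (best first)
# and keep each one that still fits in the remaining time budget.
def pick_tests(tests, budget_minutes)
  remaining = budget_minutes
  tests.sort_by { |t| -t.yum.to_f / t.minutes }.select do |t|
    fits = t.minutes <= remaining
    remaining -= t.minutes if fits
    fits
  end
end

# Made-up suite: a first-taco test, a second-taco test, and a tenth-taco test.
SUITE = [
  TacoTest.new("login", 10, 5),
  TacoTest.new("post status update", 8, 10),
  TacoTest.new("tag cousin's photo", 2, 30)
].freeze

puts pick_tests(SUITE, 20).map(&:name).inspect
```

With a 20-minute budget, the login and status-update tests fit and the photo-tagging test gets left on the tray. Re-pointing the tests each release changes the order, which is how a tenth taco can matter more this time than last time.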
Thinking about tacos when planning your tests should help to ensure that your test suite yum factor is always optimized for complete testing satisfaction.
More Taco Testing Awesomeness
If you’re still hungry for more taco testing awesomeness, check out my full interview with Gregory:
|Joe:||Hey, Gregory. Welcome to TestTalks.|
|Gregory:||Hey, thanks for having me.|
|Joe:||It’s awesome to have you on the show, but, before we get into it, could you just tell us a little bit more about yourself?|
|Gregory:||My name’s Gregory Schmidt. I am what we call a quality engineer, other people call QA or QS. It’s my second stint in IT. I worked on a small project called the Year 2000 Project and then went straight from that into professional cheerleading and subpoena serving, and then I worked on cars, and then I sold parts, and then I got back into IT, and I break software for money.|
|Joe:||Cool. I think that’s the reason why we all get into it, pretty much blessed with breaking things, so to get paid for it’s even better.|
|Joe:||Awesome. Today, I’d like to talk about a presentation that you’ll be giving at STP Con, I think, called ‘Optimizing Test Suite Sizing, Described By Taco Consumption.’|
|Joe:||I have to start off by saying, what’s the taco analogy all about?|
|Gregory:||The taco analogy focuses on a common denominator in that most people that are in IT went to college. Everybody that went to college had a dining hall that closed at what, 6:30, 7:00? If you’re a college student, you get hungry. You’re working on a paper 9:00, 10:00, 11:00, 12:00, when I was there in the mid-90’s, mid- to late-90’s, the only thing open was Taco Bell, and so you would take your little self out and go get your ten soft tacos, and you would be so excited about this, because it’s four, five, six hours since you’ve eaten, and that first taco would just get inhaled. It would be delivered directly to your olfactory senses, and everything that is good in this world, and it was fantastic. It was proof that there was a god, and he loves you. Then you have your second taco, and yes, it’s good. It’s still delicious, but is it as satisfying as that first one was? The answer is no, because there is a thing called decreasing marginal return to variable inputs. Now, if I titled something ‘Decreasing Marginal Return to Variable Inputs as Applied to Optimized Regression Test Suite Size,’ people would be asleep before they walked in the door.|
|That said, we look at each additional unit of something you’re going to consume, and each one gives you a little bit more additional benefit. Your total benefit may rise. Let’s say I’ve got, we’ll call them yum numbers. My yum number for the first taco is 10. My yum number for consuming the second taco is 8. My total yum factor is now 18, but I gained 2 fewer yums from the second one, and this continues on to the point- You have your third taco, and you’re thinking to yourself, “Maybe this lettuce wasn’t quite picked today. This might not be the freshest lettuce on Earth.” The yum factor from your third taco is something like 6 units. Your total yum factor, ten, eight, and six, is 24, but you’ve lost two more additional units. The way that that analogy works is, for any given test suite, you have tests that have to work, that absolutely have to work.|
|I can only go into so much detail with my own company, but let’s take something off that people work with pretty regularly, and that’s Facebook. If you work at Facebook, one of the things you have to make certain works all the time, every time, is log in. Does log in work? If it doesn’t, nothing else really matters. We consider that a first taco test. You’re going to get the biggest bang for your buck, and you should probably do this every time. Does tagging yourself in a photo that your third cousin, twice removed, posted from their trip to Uruguay? Maybe not. Maybe that’s your second, third, fourth, fifth taco, and the reason we look at this is because we do absolutely have a finite amount of time to complete testing. Before your next release, you have to get so much testing done on so many browsers, on so many devices, and are you getting the best bang for your buck?|
|The reason we talk about ten tacos is because you can actually get negative marginal return. I want you to picture, by the time you get to your eighth, ninth, or tenth taco, you could actually start to feel a little sick. If you are Wells Fargo, and you want to make sure that your Canadian business customer can log in from an OFAC country, or Office of Foreign Affairs [Office of Foreign Assets Control], someone we’re not allowed to do banking business with, it might take quite a bit of work to get that one test completed to the point that maybe you should just let them call the call center. If you have three testers working on this, here’s an easy number, $50 an hour, and it takes them three hours to get the data ready, prepare it, run the test. You now have a $450 test. A phone call to the call center for this one user might cost you $4. If you have a million of these users, that’s $40 million [sic: $4 million]. Then it pays to do a $450 test. Am I rambling yet? Does that clarify where we’re going with this?|
|Joe:||Yeah, this is awesome, and I think it’s a great analogy. I think it’s a great analogy because I think everyone can relate to that. I know I had a Taco Bell near me, and I definitely feel what you’re saying. What’s interesting now, though, is, when people still struggle with this, and I’m not sure why- I think I know why. A lot of people say, “It’s automated. Why does it matter?” I always have this conversation. “Look, we shouldn’t have to run all our test suite against every single OS-browser combination that’s possible. The maintenance, the things that go into that, it just doesn’t make sense.” Is this the type of thing that you’re talking about?|
|Gregory:||[inaudible 00:06:51] And I’ll be perfectly honest. Automated tests do tend to make your stomach better, to follow that analogy. You would be highly surprised at the companies that do not yet automate. I’m always surprised when I find out what people do and what struggles they find in trying to complete a test suite. That is the one part of it. The second part is I want to empower quality analysts, quality engineers, QS, quality services, whatever the word will be, I want to empower them to take not a hard line stance but the ability to look at the manager or someone who is maybe not fully aware of what goes into this and say, “This isn’t worth it.” I don’t know how many times you ever heard the expression, “Just test it all.”|
|Gregory:||“Oh, come on. I can’t just test it all. No one can just test it all. You want me to test it all? Give me a QE environment. Give me a testing environment that is exactly like production. Let me mirror transactions for a week, and I will record all those and rerun them.” “That’s pretty expensive.” “Yeah, it’s also unnecessary.” Now, that’s an extreme version of it. If you break it down into just, which tacos do you enjoy eating, you might find out that you only like three or four tacos. If people have taken up the Agile process and they have started to point out their stories, you get an opportunity here to point out your tests. Is this a first taco test? Is this a tenth taco test? Is this tenth taco more important for this release than for the previous [inaudible 00:08:39] some of these concepts and ideas in the talk. It’ll be in Dallas at STP Con.|
|Joe:||Awesome. Does this also go to the point where, like you mentioned before, you don’t necessarily have to or want to test out the net. There are certain, I think you hit on this, money paths, where you make your money, where this piece of your application has to be up, because, if it’s down, you’re going to lose money.|
|Joe:||Then there’s these other things that it’s nice to have, but it’s not your core business.|
|Joe:||I do get this resistance. “Oh, we’ll just test everything.” Like you said, it’s up to us to educate our managers and whoever’s telling us, to tell them why that’s a bad idea and why that’s not the best approach.|
|Gregory:||Correct. I feel some people’s pains. For instance, our company is going through, gosh, probably five or six now, the Agile transformation. For anyone that believes it is an overnight thing, that’s not true. Any spokesperson for any company can wave the Agile flag and say, “We’re 100% Agile.” What they’re left with is people from the Waterfall method who are project managers. They say, “Oh, great, we’re Agile now? I don’t like change, because change affects the way that I make my money,” so what you end up with in the beginning is three separate stories. You get the BSA story, and then you get the development story, and then you get the testing story.|
|On day nine of a ten-day sprint, if you’re that lucky, the quality analyst gets to start their story because they have something delivered to them. Then you have to fight that a little bit. How do you fight that? We make sure everything’s all in the same story. The story has tasks, and your PM, project manager, will say, “Okay, here are the three tasks in this story. We have a BSA task, a development task, and a QE task.” What you find is people that don’t necessarily want to change that, if they can’t get past the full team, sitting side by side, sitting next to your dev and having them be able to give you a small chunk of code inside of a day or two that you can then test, do a unit test on, if you’re still finding some of those things, you’re going to find it more difficult to get people to agree with you on what is the value of any given test that [inaudible 00:11:19].|
|Joe:||I don’t know if you’ve seen the company I work for, but this is exactly the issue we have. I know you have experience working with companies, moving them towards Agile, so any cultural tips you can give us on how to move people really towards what is considered Agile?|
|Gregory:||We’re still finding those things with my company. We’re still finding them day to day, but we’ve been fairly empowered, and the changes that have come have been great. Some pointers. One, colocation. Find a pod space. Sit next to them. Make sure your QE’s and your developers are on the same page. That’s not always easy, and believe me, I don’t like to paint with broad strokes, but sometimes you get, let’s call it a fighter pilot or a fighter jock mentality from the developers. “It can’t possibly be wrong. I wrote it. What are you talking about?” That doesn’t happen with all of them, but the culture has to come from within, and believe me, from all that I’ve come across, the least amount of resistance to Agile has been in the QE space. We’re practically salivating at the concept of getting a build. Are you kidding me? Once a week, once every three days, getting a build, and what my team is pushing for is a build once a day. Whatever you’ve got, push it in there. We’ll run our smoke test and automated tests. It’ll spit out what’s broken, and we’ll talk about it then.|
|The team has to come together. We have to find ways to make people not so much part of their own tribe. There’s not a QE tribe. There’s not a developer tribe. There’s not a test lead, dev lead, product owner, Scrum master tribe. I have had something like five different Scrum masters in the last three years, five or six, and that’s going to be a big key to it. If your Scrum master and your product owner are true champions of what you’re putting together, then I think that’s the big start. The second thing is a different concept that we’re trying out. It’s called ‘you build it, you own it.’ If we built something a year and a half ago, and somebody has a question about it, it’s still on us to answer it. Once that becomes clear that that’s what’s going to happen, you tend to get slightly better product, because you don’t [inaudible 00:13:41] something you haven’t looked at in a year and a half, trust me.|
|Joe:||Gregory, what are your thoughts on Agile and Velocity? A lot of teams always bring up Velocity, Velocity, Velocity. What are your thoughts on that?|
|Gregory:||We very quickly figured out that Velocity was not an Agile tool, at least from the Scrum perspective, that worked for us. We have people reporting into S2, and they’ll say, “Team X has completed 159 of their 172 points.” We’re like, “Holy wow,” because then the next team will come up and be like, “We’ve completed 18 of our 20 promised points.” You would inevitably end up with someone above us, too, that says, “Why is someone going seven or eight, nine times faster or has seven or eight, nine times more work?” You say, “No, one of them’s completed 92% of it …” We figured out that points don’t matter because the wrong people are looking at them. We only use the points inside our team for the purpose of figuring out, “How much can we get done in this two week sprint?” Velocity does not go past our own Scrum master, and it has made us all much happier.|
|Joe:||Awesome. It’s funny you bring that up, because I actually saw an article maybe yesterday that talked about how points isn’t really part of Agile. It’s just some part of Scrum and how they were saying that, when they got away from this point idea, that they end up doing better. I do see that people start gauging it like, “We need to meet our points,” because, if they don’t, they have someone in the Scrum soon, “How come you didn’t meet your deliverables? How come your do-say ratio wasn’t in line?” All this weird stuff. It’s almost like they’re obsessed with this point system, and I think you’re right. Maybe, if we shift our focus and just kept that in just for our sprint team to gauge themselves but not necessarily as a gauge for how well they’re doing, maybe it would help the culture. I don’t know. I’m just thinking out loud.|
|Gregory:||No, that’s a very promising way to think out loud, because I have often thought part of the Agile manifesto, that first thing is, “We value working software over whatever it is they value [inaudible 00:16:09].” Do we care what the points are? Yeah, I’ll put down 45 points or whatever you want me to put down for a day of testing and then make it look great, because you have that slippery slope, too, is, “You want to see us do more points? Oh, sure, no problem.” The next time they go into planning, what was previously a 3-point story is now a 15-point, or whatever the Fibonacci number is [inaudible 00:16:31]. I don’t know. “This is a 20-point story, 21-point story.” You can make them happy that way, but that doesn’t really help the culture, because that concept comes from before time out of mind.|
|If you’ve ever heard the expression, “If you can measure it, you can manage it.” Come on. What am I? Am I an automaton? Am I a robot? I’m a human with a wife and kids. Some days, I am going to get here at 9:25, and some days, I am going to stay until 6:00, but some days, I’m going to leave at 4:30 because I’ve got to pick up my kid. There’s real life, and I’m very fortunate to work at a great company that takes a look at your whole you. I can’t brag too much. I got to be careful what I say, too. Just for the right to have this, I had to go and talk, I had to meet my corporate communications person, but I can say that, yeah, I work for a good company. Man, this is a terrible thing to say, but if you find yourself in a company that is worried about the numbers and the Velocity and all this instead of the great product, maybe it’s time for a different company, man. I don’t know, Joe. That’s a big decision to make.|
|Joe:||Just for the record, I love my company, but I can’t legally say who it is, because the same thing with my legal department.|
|Joe:||They would be all over me. I think it’s just a culture thing. It’s hard to be a catalyst. Like you said, it’s not overnight. A lot of people just assume you go from Waterfall to Agile, and, even though we’ve been doing it for three years now, still, it’s a constant renewing, a constant of telling people and motivating them and coaching them.|
|Gregory:||Oh, yeah, absolutely. I think I can say this. We were pretty famous for finding a very expensive tool and buying it and then modifying it for our very specific use and modifying it so hard that we actually couldn’t accept any of their updates. It’s kind of like in the case of, we went and bought a 90-inch, 4K television because we wanted to play with a really cool box. Once we started getting away from that as a culture, we started looking at some of these tools that are faster or open source. They’re great.|
|The culture did actually give QE a chance. I want to say that we were probably 60/40, contractor to associate, and, when I started, all I needed was a grasp of English and two thumbs to read instructions and record what happened. They had an opportunity, when they decided that they were going to move into the automated space and to a stronger QE world, they had an opportunity to buy the talent or to train it, and they chose to train us up. They chose to take the money and get us our courses in Ruby, get us our courses in [gea 19:47], get us our courses in [oh-off 19:49], and let us go and cut us loose. If your culture is willing to put the money into the dev, I feel like your company should also be willing to put it into the QE.|
|Joe:||Absolutely. I like how you just brought up automation. I’m just curious to get your experience here. With Agile, a lot of teams may be helping to develop the automated test, but another thing I’ve seen teams struggle with is the other part of automation, not necessarily the test part, so the actual infrastructure, being able to build up an environment and tear it down, getting the data population you need for your tests, these really harder things to manage. A lot of people don’t think of that. They only test in their environment. You go into another one, and everything does. Are there any tips you have or any experience you have around how you could automate these things that aren’t necessarily your automated tests but your infrastructure, your data management, things like that?|
|Gregory:||Yes. Holy wow. Those have been pain points for QE since shortly after the Iron Age, is your environment and your data. Fortunately, we have a centralized team, and that is their job. They have done a fantastic bit of work getting us self-serving. I can go right now and get, within 10 minutes, 20 good pieces of data. I can [inaudible 00:21:28]. I can [inaudible 00:21:29]. Now, on the flip side, though, environments, we did partner with a very large outside company who rhymes with [Flamazon Sed Curvaces 21:40], and, when we need an environment, we spin it up. We create it. We test on it with what we call stacks. We know what we’re going to put into it. We spin it up. We test against it, and then, to save money, we turn it off, I don’t know, 7pm to 6am or something, but we test against it, and that’s been our saving grace, because, without that, you would have, like in your case, eight or ten sprint teams that all dump code into one region and expect it all to play nice at the end. We can’t. Part of that comes down to communication. I know that the Agile manifesto says, “We value something over good documentation,” but, as you scale up, trust me, documentation and communication are key.|
|Joe:||Absolutely. We have a wiki page, and we update the wiki page, but to me, that’s almost like, no one’s going to update that, so I’m trying to think, how can I automate that process of, when you make a change, it’s in one place, and it gets automated [inaudible 00:22:47].|
|Gregory:||One of the companies that we bought did a very good job, and I’ll be honest. It probably helped them keep a lot of their jobs, because every time one company buys the other one, your instant savings is in axing people. We told them what we do, and they looked at us like we had a third eyeball. They said, “This is what we do. We create mini services and microservices, and we toggle them off and on.”|
|Gregory:||“We test them in the middle of the week,” and I’ll be honest. This blew my mind. They test and release in the middle of the week. It takes two days. That’s it, but you can’t get an e-mail in there. They’re in a different location. They lock the doors, and I’m pretty sure their custodial engineer is running test cases. That’s what they do, and this is actually really cool. They’re like, “We release on a Wednesday.” We said, “Why do you release on a Wednesday?” They said, “So when people come in Thursday, if there’s a problem, it’s a regular work day.” Otherwise, you’re releasing on Friday night. Friday, midnight to 6am is your window. If something goes wrong, they’re asking people to come in on a weekend. The point being, we had this monolithic app. It still had snow on it in summertime. This thing was huge. Spielberg called it back, this thing was so huge. Every time we’d have to make these changes, we’d then be required to run this full regression sweep, and we would have to get it into IT or get it into our integrated region at a pace that just … Our rate of innovation was so slow. We had all these things that we wanted to do, but we were hampered by what we’d had up until that point.|
|Joe:||Awesome. Yeah, I love this concept. I try to tell everyone, “If you can avoid an end to end UI test, you should, and you’ll be better off for it.” Microservices seems to be the direction a lot of companies are going. To be actually successful at automation, the less UI you touch, I think the better off you’ll be.|
|Gregory:||Yeah, because it takes time. Even if you’re running Selenium, if you’re running Ruby, you’re running Selenium, we partner with a great company out of San Francisco called Sauce Labs. They emulate browsers, and they are good at what they do, plus they’re nice. I got to tour their facility.|
|Do a quick aside. It’s this great startup in San Francisco in what might have been at some point like a meat packing area. I don’t even know, and I’m taking a tour with Neil [Manvar 26:24], fantastic guy. He’s like, “Yeah, she’s our HR department,” and there’s one person sitting on a sofa with a laptop. A whole HR department? We have a floor of two buildings that’s our HR department. Man, are they great.|
|Still, you connect to them, and your log in test takes a minute or two, three minutes. You do your validations on each side with your Ruby, and, if you have to do that against … We schedule out our browsers. We have information based on who’s using what. We know that we no longer need to test Chrome 37. I think that through 37 is also TLS 1.1, which is not good anymore, anyway. We know which ones we have to do, and so I’ve got a suite of tests that, on one browser, even though they’re all automated, which some people think is magical, these UI tests for one browser take an hour. I’ve got to do this on Chrome, Firefox, IE, iPad, the browser, not the app, they say iPhone, but let’s be honest. If it works on one Apple device, there’s nothing sketchy about- That is one seriously homogeneous ecosystem there. Android 4.4 or later, maybe 5.0. I’ve got five or six browsers, and I’m just doing this UI validation, and that’s six hours. If I wanted to test everything, we’d be running non-stop. Then all you have to have is one back end system fail, and your test fails, and you have to run it again.|
|Joe:||Is there any sort of visual validation testing you’re using to minimize your taco consumption?|
|Gregory:||Not yet. I ran into Adam in San Francisco this Spring, Adam [Commay 28:28]. He’s [crosstalk 00:28:32]. I started looking into it probably around November or December of last year. I found out he was going to be there, and I caught up with him and Dave from the Selenium project and a couple of guys, and we just sat around and were talking, and it might not be as great for me, but we definitely have other teams that I want to show this to and explain to them who would benefit most from this. If people don’t know, so you’re doing visual validation. With Ruby validation, what I do is I try to find some element that’s unique to that page that my developer has also remembered to grant an ID to, and I say, “Did this show up?” Or expect browser.elementID(“Monkey_Cheese”).present() = true. If that’s the only thing that shows up, my test still passes. With visual validation and the Moshe with a company called Applitools, they’re out of Israel, you get to see what the page looks like when it shows up. “Did all these items render correctly, or even close to correctly?” When you go from Chrome, whatever we’re on now, 51 to 52 or 52 to 53, “Did it work the same? Is it all good?” My goal is to get our company to visual validation testing. It will be fast. It will make life easier.|
|Joe:||Okay, Gregory. Before we go, is there one piece of actual advice you can give someone to improve their test suite execution efforts and let us know the best way to find or contact you?|
|The best way to get in touch with me is probably through Twitter. I’m KrymDahg, K-R-Y-M-D-A-H-G. It’s a nickname I picked up in college as an RA, because I didn’t let anything fly. They called me Fred McGruff, so crime dog, but crime dog was taken, so now I’m KrymDahg, K-R-Y-M-D-A-H-G.|