4 Essential Test Data Strategies (It’s OK to Be Selfish)


If you’re a test automation engineer, you’ve probably faced test data dependency issues in your test automation suites that have caused all kinds of flaky test behavior.

Not only can this be frustrating, but it can also make your tests highly unreliable, which in turn can cause your team to lose confidence in your test suites.

To overcome this, I highly recommend Paul Merrill’s data strategies, which were specifically designed to help solve these test data issues.

What is a Test Data Strategy?

Okay, you might be asking yourself, “What is a test data strategy?” I like to think of them as design patterns for testing.

Paul’s working definition of a data strategy is the combination of code, procedure, and infrastructure that affects how tests interact with data to stimulate the system(s) under test.

These data strategies, or patterns, also have two main parts: a creational piece and a cleanup piece.

  • The Creation piece is all about how and when data is created.
  • The Cleanup piece is the method you use to set the data source back to a previous state.
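One way to picture the two pieces is as a small interface that every strategy fills in. This is a sketch of my own (the names are mine, not Paul’s), but it makes the creation/cleanup pairing concrete:

```python
from abc import ABC, abstractmethod


class DataStrategy(ABC):
    """Hypothetical interface: every data strategy pairs a creation piece
    with a cleanup piece, even when the cleanup piece is a no-op."""

    @abstractmethod
    def create(self) -> dict:
        """Produce the data a test needs and return a handle to it."""

    @abstractmethod
    def cleanup(self) -> None:
        """Return the data source to a previous state (may do nothing)."""
```

The four patterns below differ mainly in how each one fills in these two methods.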

1. The Elementary Pattern

You can identify this pattern by the behavior it exhibits: the first test execution works, but the second and third executions don’t, because you’re modifying data in the data source.

Typically, this is the first approach we all use. The very first time you sat down and started doing automation or testing your system, you probably just started creating and modifying data within the system. And if you’re like most people, you never put much thought into how the data you created affected other people or other systems.

Sure — this approach might work in rare cases, but how do you make it so that you can repeat that test case? Because it’s only through repeating a test over and over again that you get a clear idea of what the system is actually doing. A better approach might be the refresh data source strategy.

2. Refresh Data Source Pattern

The Refresh data source pattern focuses on refreshing the data source prior to or after you run a set of test cases. It’s not a particularly ingenious approach; it’s one that many of us have used before. You can run that refresh against the data source directly, or you can run it through the system under test.

This approach is great when it works, but oftentimes you might try to go back and use a test case to clean up data only to run into issues. One of those issues is that it doesn’t necessarily clean up all the data, either because of environmental issues you didn’t think through, or because of a misunderstanding of the data schema and which tables were affected as a result.

There may be times you don’t necessarily know how the application is changing the data.
One workaround for this problem is grabbing a snapshot of the production data, then pulling it down and working with that in the test environment. At the end of all the test suites, the last step is to go in and drop all the data from all the tables.
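As a rough sketch of that snapshot-and-drop idea, here it is with SQLite standing in for the data source (the “snapshot” is inlined for brevity; a real refresh would restore a sanitized production snapshot, which is far more involved):

```python
import sqlite3


def load_snapshot(conn):
    """Creation piece: seed the test database from a known-good snapshot
    (inlined here; in practice this restores a dump of production data)."""
    conn.executescript("""
        CREATE TABLE users (username TEXT PRIMARY KEY, password TEXT);
        INSERT INTO users VALUES ('joe', 'old-pass');
    """)


def drop_all_tables(conn):
    """Cleanup piece: drop every table so the next run starts from the
    snapshot again, regardless of what the tests did to the data."""
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'").fetchall()
    for (name,) in tables:
        conn.execute(f"DROP TABLE {name}")


conn = sqlite3.connect(":memory:")
load_snapshot(conn)
# ... run the entire test suite against conn ...
drop_all_tables(conn)
```

Because the refresh blasts away the whole data source rather than undoing individual changes, it sidesteps the “which tables did the app touch?” problem entirely.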

There are also some test data management tools that can help you, like CA Grid Tools or Delphix.

Because cleaning up data can be a headache and test data management can be a pain, many folks turn to the Selfish Data Generation pattern.

3. The Selfish Data Generation Pattern

This pattern takes a different approach: it creates unique data each time it runs, without doing any data cleanup whatsoever afterwards.

Paul pointed out that a data cleanup is not always needed. If the data is unique each time, and you can keep it in memory in order to go back and verify it, the only need for cleanup comes from the constraints of your environment. If, for instance, your file system is only so big, that might be a constraint you need to pay attention to.

The reason this is called selfish data generation is that there is no thinking about anyone else involved. There’s no thinking about other test cases or any constraints in terms of the size of the database. You selfishly create data and you don’t care about the cleanup. I’ve actually had some success with this approach myself.
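A minimal sketch of selfish generation in Python: each test stamps its data with a fresh UUID, so runs never collide with each other, and no cleanup piece is written at all (the field names here are illustrative, not from Paul’s talk):

```python
import uuid


def make_unique_user():
    """Selfish creation piece: unique data every call, no cleanup piece."""
    suffix = uuid.uuid4().hex[:8]
    return {
        "username": f"test_{suffix}",
        "email": f"test_{suffix}@example.com",
    }


def test_create_profile():
    user = make_unique_user()
    # Create the profile in the system under test, verify it, then walk
    # away: the data stays behind, and that's the point of the pattern.
    assert user["username"].startswith("test_")
```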

The last strategy is the data generation and batch cleanup approach.

4. The Data Generation and Batch Cleanup Pattern

With the data generation and batch cleanup strategy, you’re basically just doing a batch cleanup at the end: you know what data you’ve created, and you go in and clear it out after an entire test suite runs.

That means several things are going on: you have to somehow remember what was created, and you have to know how to delete it all after a test run.

Some of the fun really starts when you’re trying to figure out how to deal with this data later on. You’ve generated a whole bunch of data; how do you know what you’ve created?

You can certainly keep it in a list in memory while test cases are running, but what happens when the tests stop executing prematurely? In that case you no longer have the list, because that memory space is gone.

Paul did mention that there are ways to overcome this, but it gets very complicated very quickly. One thing he found that worked really well is a concept called signed data.

Just like each of us has a signature that is, for all intents and purposes, unique, you can make the data you create look unique.

Another way to do this is to mark the data inside the actual schema: maybe every table has a column called is_test_data. That’s a little more cumbersome. There are lots of ways around these things, but you quickly run into the problem of how to identify the data you’ve created and how to delete it later on.
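Putting signed data and batch cleanup together might look like the following sketch, again with SQLite standing in for the real data source. Because the signature lives in the data itself rather than in an in-memory list, a crashed run doesn’t orphan anything; the next batch cleanup still finds it.

```python
import sqlite3
import uuid

SIGNATURE = "test_"  # every row the suite creates carries this prefix

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (username TEXT PRIMARY KEY)")


def create_signed_user(conn):
    """Creation piece: the username itself is the signature, so there is
    no separate list of created data to persist anywhere."""
    username = f"{SIGNATURE}{uuid.uuid4().hex[:8]}"
    conn.execute("INSERT INTO users VALUES (?)", (username,))
    return username


def batch_cleanup(conn):
    """Cleanup piece: one batch delete at the end of the suite. It works
    even after a crashed run, because the signature is in the rows."""
    conn.execute("DELETE FROM users WHERE username LIKE ?", (SIGNATURE + "%",))


# ... the suite runs, selfishly creating signed rows as it goes ...
for _ in range(3):
    create_signed_user(conn)
batch_cleanup(conn)
```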

More Data Strategies

So those are the main data strategies Paul and I talked about in the TestTalks podcast episode 105: Data Strategies in Testing.

For more data testing strategy awesomeness, check out the full transcript:

Joe:Hey, Paul. Welcome to Test Talks.

 

Paul:Hey, Joe, how are you? Thanks for having me.

 

Joe:Awesome, it’s great to have you on the show. Today I’d like to talk about your data strategies and testing as well as test automation in general, but before we get into it could you just tell us a little bit more about yourself.

 

Paul:Sure, I’d be happy to. My background started in software engineering. I have a computer science degree from Florida State University and for the last 15 years I’ve been working in software from the perspective of a software engineer, a tester, and product management. I’ve done all different things within the world of software. I started Beaufort Fairmont, my company, back in 2009 in order to rid the world of bad code. We do that through automated testing and providing folks with training or project work and also consulting, things along those lines, anything for automated testing to help out with creating better software.

 

Joe:Awesome, yeah. I’d like to get into that a little bit later, but first I just want to focus on some strategies that you laid out for data testing management basically which I think is really critical for test automation engineers. It’s actually something I’m struggling with right now. I guess at a high level what is a data strategy?

 

Paul:Yeah, absolutely, so when I first started digging into this idea of data strategies I had to come up with a working definition. The working definition that I have is that a data strategy is the combination of code, procedure, and infrastructure that affect how tests interact with data to stimulate the system under test. That’s the beginning and really what I was trying to understand was why is that we always have to deal with these problems in every test situation we get into and what is it that we’re really trying to solve?

 

I use a lot of terminology with the presentation that I’m doing that I mentioned that I’m doing at STARWEST coming in October and a couple of other places. Really the key that I was trying to get down into is why is that we always have to deal with this and how do we deal with isolating test cases in such a way that they can run in parallel, that they can run on multiple different machines, but still interact with shared resources. The key to it really was that shared resource. The first one that I’m trying to overcome is the idea of data strategies and dealing with data sources.

 

Joe:Awesome. Yeah, it reminds me of patterns. You have a presentation, I’ll have it in the show notes, but I think you had 6 or 7 data strategies that you lay out. Before we get into those strategies I think you broke it out into two pieces, what makes up a data strategy. Can you tell us a little bit more about what those two pieces are of any data strategy?

 

Paul:Yeah, so you’re totally right. I love the book, the Gang of Four’s book, Design Patterns from the software engineering world. It’s one of the better books that’s out there, it’s a classic. I took some of what they’ve done and thought about how we interact with or how we look at experience and identify patterns within our experience. I did that in this role of data strategies. I break it down into two parts. In trying to explain anything you really have to know it a lot better than just doing a thing. The two different pieces are the creational part of the strategy, the creational strategy, and then the cleanup strategy.

 

Joe:I think this is critical because a lot of times I’ve seen data strategies where people populate the data, but they don’t clean it. Then they end up getting junk data over time and it really causes a lot of headaches down the line if they’re not aware of it.

 

Paul:Yeah, exactly, so this is the first approach we all use, right, Joe? The first time you sat down and started doing any automation or testing anything you’re interacting with this system under test. You’re creating data, you’re modifying data within the system and it really isn’t even something one would think about how doing that affects other people or affects other systems. Whenever most people go out to write their first automated test regardless of the tool that it’s in … We’ll use the example of creating a user profile or let’s use changing the password on a user profile.

 

Writing a test for that would be pretty easy. You already know a user that works in the system. You have a password that works in a system, so your test case would go in and log into the system using those and then modify the password. Then check to see if the information on the screen looks like that was a valid … We did what we expected and then we move on from there. That’s an easy test case, but it only works the first time because the second time you go in the password’s different. That’s what I’m talking about here and what you’re talking about is yes, we’ve got all these different ways to change the data in the system, but how do we make it so that we can repeat that test case because it’s only through repeating it that we get a very good idea of what the system actually does.

 

That’s what I’m calling the elementary approach. I thought about a lot of different names for this and this was the one that was the least negative is the elementary approach. It’s the one that we all do when we start out. The first execution works. The second, third execution don’t work and that’s because we’re modifying data in the data source. Some of the pros and cons we talked about, it’s really simple. It’s really easy to do this because we don’t have to write any code to associate with it. There’s really no support whatsoever for it because we’re not really doing anything. The test case has information that it’s going to work with and it creates the data that it needs and modifies it.

 

Joe:Awesome, so I don’t know if your data strategies go up in sophistication as you go along, but the next one I think was the refresh data source approach that you went over in your presentation. Can you just tell us a little bit more about that approach.

 

Paul:Yeah, absolutely. I guess I first ran into this one back in 2002 or 2003 and it’s one of the first ones that we run into as well as an approach to how to deal with changing data and being able to run test cases repeatedly and in isolation and in parallel and these types of things. Refresh data source, it basically means we’re going to go in and refresh the data source prior to or after we run a set of test cases. It’s not a particularly ingenious approach. It’s one that many of us have used before. You can run that refresh against the data source directly or you can run it through the system under test.

 

Many times we’ll try to go back and use a test case to clean up data. That can work in a lot of systems, but it doesn’t necessarily clean up all the data because you’ve got this application in the way. You don’t necessarily know how the application is changing the data. A lot of times going directly to the data source and interacting with it in order to refresh it is a good thing to do. Many times people will grab a snapshot of production data and they’ll pull that down and work with that in their test environment. That’s basically the same thing and then at the end of all the test suites they’ll go in and drop all the data from all the tables.

 

One of your episodes talked about data virtualization. I think it was episode 87. It was really great. It really cued me into the fact that you guys are very much in line with what I’ve been talking about with data strategies. It sounds like with data virtualization you can do a lot of this very, very quickly. Most of the time refresh is the one that I just mentioned, it could take a long time. There may also be a lot of complexity to it. With rules and regulations like HIPAA and others you may have to go in and actually massage the data coming from production if you’re going to do that in order to use it. Yeah, I want to say thank you for that episode. [Inaudible 00:06:59] someone’s really great by the way.

 

Joe:Awesome, yeah, thank you. They actually have a free … Once again I’ll have a link to it. They have a free version of their app that you can try out and it supposedly helps a lot with this. I haven’t actually tried it myself fully, but it is the direction I’m thinking that I probably should go in my data strategy. I guess the main reason why is … One of the main reasons we have or issues we have is we have a backend that’s [inaudible 00:07:21] in Sybase and there’s a schema on top of it that no one knows. It’s just literally two guys that know it. When we were trying to write tests for it a lot of our tests we said oh, we’re going to set up data and we’re going to clean up data. We have these dynamic SQL statements going on, but sometimes these statements aren’t really touching all the tables that they should be testing or touching, so when you clean up you have this orphan data that causes other issues. Then it just creates a ton of headaches, so I don’t know …

 

Paul:I hear you.

 

Joe:Is that what a refresh [inaudible 00:07:50] one of the cons would be is if you don’t know the database enough then maybe your cleanup routines may not be working as you think they should be?

 

Paul:That could be the case in complex situations like if you have more than one data source and you have to clean up multiple ones. Generally with the refresh data source you’re just going in and blasting away the entire data source and refreshing it completely. That takes care of those problems in most cases. If you have more than one data source and the data’s related you can have that problem. As you go through these and learn more and more of these patterns, yes, absolutely you run into these problems where as you’re developing these strategies or as you’re implementing them you can run into situations where you have to have orphan data. That’s one of the problems that I wanted to help people know how to fix and give them a guide in how to avoid in this presentation and in some writing that I’m doing, but don’t tell anybody that I’m writing about this, Joe.

 

Joe:I hope you do write it because I think this maybe causes 50% of my test issues, I think, and probably a lot of other people.

 

Paul:I think so, too, and each client that we walk into this is one of the questions that we ask when we start talking about automated testing is how do you manage your data? What is it that you do in your test environments to get good data to test on and what is the procedure to do it? If there’s a blank look on people’s face, which many times there is, then we have to do some education and talk through this. That’s another part of where this presentation and some writing about it helps. It’s very important when you’re doing this work to be able to formalize your thinking in order to help other people understand because that’s really all I’m trying to do with my company and whatever else is help other people do this more effectively.

 

Joe:I guess the next approach, it’s one of my favorite named ones is what is the selfish data generation approach?

 

Paul:Right, right, so I got into looking at this one. I actually did this one first. I guess I probably implemented it long before I realized I did. It was probably pretty early on that I did this. The first time that I did it really consciously was a few years ago and the situation was such that I had a client who I asked what’s our data strategy, what are we going to do with this and there were blank stares. We started doing some education and talking about it, but there wasn’t a whole lot of importance placed on it. It was difficult to get some decisions made and at the time I felt it would be a good approach to let’s just do what we need to do to move forward. That was the attitude of the team, we were an agile team. It was take on responsibility and get things done and when we run into problems we’ll fix them and move forward which is an attitude I love.

 

Basically, what I decided to do was each test case would create a data that it needed which is not novel. Plenty of people do that. In fact, it was creating unique data each time a test case ran and there was no cleanup whatsoever. If the data’s unique each time and you can keep that data in memory in order to go back and verify it the only need for cleanup is caused by whatever your constraints in your environment are. If, for instance, your constraints are we only have a file system so big that might be one that we need to pay attention to.

 

In this particular case we were using Mongo on the backend and I kept hearing these conversations, I was not familiar with Mongo at the time, but I kept hearing these conversations about shards and shards getting too big. I couldn’t connect two and two and understand that it equaled four. Shards apparently are pieces of the database and they were getting large because the automation was creating data constantly and there was no cleanup. That was people screaming and saying ouch, don’t do that anymore.

 

Since then we’ve been able to go back and create cleanup strategies to deal with that. The reason that this is called selfish data generation is that there is no thinking about anyone else involved. There’s no thinking about other test cases or any constraints in terms of the size of the database. We selfishly create data and we don’t care about the cleanup. I love the name, too, Joe.

 

Joe:Yeah, that’s awesome. I’ve actually seen some success with this approach. In theory sometimes I work with large enterprise companies and their database could be very huge. Maybe over time over a few years we may see the issue, but in the short-term just populating the data and not having to worry about cleanup may be … I don’t know if you agree, is it a good place to start or would you just recommend always have a cleanup or does it just depend on the situation?

 

Paul:I’m like you, I think it depends on the situation. There are places that you walk into and they already have a policy or procedure in place to do some type of cleanup and it’s effective for what they need. Certainly you can find a lot of bugs and defects when databases grow and when you have large amounts of data. You can find usability issues and whatever and that can be very helpful. I’m a big proponent of practicality and doing what is needed at the time for the particular client in that situation. That’s usually different from one client to the next.

 

Joe:Yeah, great advice. I definitely agree. You have two other ones, I believe. The next one is data generation and batch cleanup.

 

Paul:Yeah, so it’s the same idea. We’re generating data in one way or another. One tool that I love in C# and in some other languages, I think it’s available in Java and there’s something similar to it in JavaScript as well, is a tool called Faker and it’ll create things like names or phone numbers or addresses or whatever and make them look very real, company names, all that kind of stuff. It makes them look like real data and it’ll generate something randomly for you each time you use it. Even you can use it with numbers and things like that. Any way you generate data it’s up to you with each of these strategies, but data generation and batch cleanup basically you’re just doing a batch cleanup at the end. You know what data you’ve created and you go in and clear it out after an entire test suite runs.
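(For readers who haven’t used Faker: the idea can be sketched with nothing but the standard library. The word lists and field names below are my own stand-ins, not Faker’s actual API.)

```python
import random

# Tiny illustrative word lists; the real Faker library ships much larger ones.
FIRST = ["Ada", "Grace", "Alan", "Edsger"]
LAST = ["Lovelace", "Hopper", "Turing", "Dijkstra"]


def fake_profile(rng=random):
    """Toy generator in the spirit of Faker: realistic-looking values,
    randomly different on each call."""
    first, last = rng.choice(FIRST), rng.choice(LAST)
    return {
        "name": f"{first} {last}",
        "email": f"{first.lower()}.{last.lower()}@example.com",
        "phone": f"555-{rng.randint(0, 9999):04d}",
    }
```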

 

That means that several things are going on. We have to remember what we created somehow and we have to know how to delete it all after a test run. Some of the fun really starts when you’re trying to figure out how to deal with this data later on. You generated a whole bunch of data, how do you know what it is you’ve created? You can certainly keep it in a list in memory while test cases are running, but what happens when the test stop execution prematurely. You have no list anymore because we’re done with that memory space. Then you’ve got data that’s been orphaned just like you alluded to earlier, Joe. There are a lot of ways to do this and I’ve tried a whole lot of different things.

 

A couple of things that I mentioned in this presentation and as I talk through with clients are some ideas like certainly keeping a list is fine, but if you can’t keep it in memory where do you keep it? You have to start persisting outside of memory and what does that mean? Does that mean we serialize it in some form like a class that holds these values and somehow serializes it to a file system? Do we create a supporting test database in order to hold this information and what does it look like? If that’s the case we have to know which data source we’re interacting with, we need to know which table these things are in. If it’s a database, if it’s a file system or something we need to know locations of the files. Then we need to know which pieces of the data to delete, so it gets very complicated very quickly if we go that route.

 

Some things that work really well are a concept that I came up with called signed data. Just like each of us has a signature that is, for all intents and purposes, unique, one can make the data that they create look unique. For instance, in one particular row like a user profile in a table we might use the actual username as our field to be unique and then put the value in to make it unique like test underscore and then whatever the username is at Foo.com. That might be a way to do that, so you basically have a signature for that data.

 

Another way to do it would be to mark the data, so inside the actual schema maybe you have a column for each and every table that’s called is test data. That’s a little bit more cumbersome, but there are lots of ways around these things, but we do get into that problem very quickly of how do we identify the data that we’ve created and how do we delete it later on.

 

Joe:Yeah, awesome. I definitely agree with those approaches. I guess what I struggle with is I have 8 to 10 sprint teams and they have their own little worlds, their own little feature files. The feature file needs data and it may impact another feature file that another team has, but of course no one communicates with one another. They just assume the data will be in a certain [inaudible 00:16:35] and they can use it. Do you have any strategies on how you can isolate per feature or how do you … Is there an isolation approach you can use per test to know that I belong to … This data belongs to this test and don’t touch it?

 

Paul:I’m trying to think through how I approach this and basically the way that I approach these problems and the way that I try to break down my thinking on this was what are the things influencing the decision on how to use a particular data strategy and what are we trying to achieve. In this case it sounds like you’re trying to achieve the isolation of test cases in such a way that they don’t work with the same data.

 

There are probably some hard and fast rules that one can make around the teams provided all the teams buy in and they’re all implementing them. That would be policy and so the more I got into this the more I realized there’s several forces that act on us as we’re determining which data strategy to use. Each of those forces break down into about 5 different abstractions. Is there a constraint factor? Each particular constraint would break down into one of these 5 and these are the ones that I’ve determined so far and this is up for debate and change later on. I’d love to have feedback from your listeners about this.

 

The way that I see them is that basically you have policy, cost, expectations, infrastructure, and people. Those are the abstractions that we break constraints down into. The constraint that you’re talking about would be … It sounds like you have shared data that different test cases are using. I would say why do you have the shared data and go from there. Is it from laziness or is it from a lack of policy? Is it because we don’t know better or is it because we haven’t talked about it yet. Those would be the questions that I would start asking.

 

Joe:All of those. We get into communication, so that stinks because that’s not easily solved.

 

Paul:It’s not, it’s absolutely not. The people-to-people relationships with all this stuff are much more important than the technology and much harder to hack. That’s something that I think most of us become aware of over time after 15 years doing this.

 

Joe:It’s almost like culture. Culture is so difficult. For example, you talk about … This was funny. I was watching a video. Once again I’ll have this in the show notes on, what was it, principles of test automation. You mentioned this concept of the elite automation engineer. I started the project going … I want to treat this like open source. I set up the high-level framework, but the teams will contribute to it and make all the generic things and add it to the framework. Everyone could see, they could contribute, it’s an open book and everyone would love it. In this culture it’s almost like they want command and control. They want an automation person, so they could point to and go you’re in charge of that, we don’t have to worry about it. It’s almost like the opposite of what you recommended in that video. I just found that very insightful.

 

Paul:Yeah, and that was several years ago and I’d like to go back and look at that again. It’s been a long time since I’ve looked at it, but I remember the concept of the elite engineers. In a lot of places there might be one particular individual who is allowed to change the test framework. There’s one individual who gives the advice and says go or no go to any new idea related to the test framework or test frameworks. I think that there is a lot of value in expertise and that there could be a whole lot of power in it, but we have to make sure that with any amount of power that it’s used in such a way that it helps us all reach our goals.

 

My experience with the elite tester syndrome or the elite tester situation is that many times it leads to less getting done, it leads to less motivation and less learnings from those who could potentially be contributing or are contributing, but not as much as they’d like to be. It also becomes a bottleneck for the organization to have one or two people that are the gatekeeper for your test framework. That’s been my experience. Now there are other places where everybody is a novice and you have to have somebody who is experienced. I’m not sure that that’s the very same pattern. I’d say the elite tester is a little bit more of a … has more of a negative connotation whereas a true leader and mentor is something completely different.

 

I think culture tends to be built off of very few policies, but they’re policies based on fundamentals that the organization wants to see exercised. Yeah, when the fundamentals are enablement, when the fundamentals are empowerment and making sure that people are motivated and helping people reach toward goals that they want to see and to succeed, when those goals are in line with the organization you have a rocket ship of an organization. When you have lots of policy and lots of rules and lots of gatekeepers generally there’s a lack of motivation and your highest talent is not going to stick around for very long.

 

Joe:Yeah, once again another great point, I definitely agree. Could you tell us a little bit more about your consulting. Like I said I see a lot of companies, I speak with a lot of people. They all seem to be dealing with multiple challenges with test automation. Is this something you can help people with? Is this something … a service you provide to companies?

 

Paul:Absolutely, so there’s three main pieces of Beaufort Fairmont Automated Testing Services. The first is automated testing. If you have project work, if there’s a certain goal that you want to achieve with automated testing and you need us to come in and work on that and help you get moving we’re all about it. In fact, we’ll help your team to get up to speed, so that they can take over. I have absolutely no ego about it in terms of needing to stay there. We’ve got plenty of business and we’re happy to help teams get up and running and move out of there when you’re ready to go.

 

The second piece of this is training, so a lot of times what I find is you have a certain culture that you want to instill in your organization. We were talking about culture. A lot of times that can come from sitting together and working together in a training environment, so we do train teams for a number of different reasons, but mainly for automated testing products or getting into the principles of automated testing and making sure that we understand those. We do this for our open source products, we also do it for third-party products although that’s a little more rare.

 

Yeah, so testing, we do testing projects, we do training and we also do consulting. There are plenty of times when companies just want to have a conversation with us and I’ve got a new thing that we’re working on which is a live workshop which is pretty cool. We spend a couple of hours working with folks just on whatever need they have right now. It’s a great way for us to get to know each other. It’s a great way to solve whatever … or try to figure out whatever problems are going on at the time. Our team is very … We tend to be pragmatic and we tend to be down to earth. If we don’t know something we’re going to tell you we don’t know it. The last thing we want to do is give you an answer that doesn’t make any sense for you because we felt like we needed to give you an answer.

 

After this many years of doing this, we do have a whole lot of answers, Joe, so we can help a lot of teams get moving. Many times teams are just getting started with automation and want to know a direction to go. Plenty of times teams want help figuring out which tool to use; that's a big one that we get, and we're definitely willing to help folks out with that. A lot of times people don't know what they don't know. What is it that a particular client doesn't know, and how do we help them get to the point where they're aware of it and can move forward? That's one of the things that works very well for us too.

Joe: Awesome. That's a great point. I think we've been doing it for so long that sometimes I just assume people know some of this stuff. When a new team comes onboard, I just assume they know certain basic principles about automation, but constantly educating your team, I think, is really critical.

Paul: It is, and if you think about it, we're willing to go to the doctor a couple of times a year, and we're willing to go to the dentist two or three times a year depending on your situation. Why aren't we willing to go sit down with some experts a couple of times a year just to check in and see how we're doing? You're absolutely right, the concepts that you and I are rock solid on, continuous delivery, continuous integration, things of this sort, are brand new to other people. I sat down with a client just a few years ago, and I'm thinking, I don't know what year it was, but say it's 2013. It's 2013, everyone knows what continuous integration is, and they had no clue, just a blank look on their faces.

The idea of doing testing in a continuous integration environment was completely new to them, and so were the potential benefits. For me personally, why do I like automated testing? It came from a developer standpoint, because I loved having feedback from something that's giving me a source of truth, and having it almost immediately, whether it was from [inaudible 00:25:26] tests or integration tests or unit tests. Whatever it was, something was giving me feedback that I'd changed something and created a regression, for instance. That was awesome, and being able to get that feedback loop as small as possible was essential for me.

Now, providing that to developers is a whole other world, because I have developers who have never had unit tests or integration tests that would give them feedback right away. When they start seeing it, of course, at first it's really unnatural: I have all this red, and red is bad, and why are these tests failing, and I feel a certain way about my test cases failing or my code creating failures. We work through that together, but we get to the point where they're like, oh my goodness, this is awesome. The fact that I can do anything and know how I affected the stated requirements, that's awesome. So there's more test automation awesomeness for you.
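The red/green feedback loop Paul describes is easy to picture with a minimal unit test. This is a hypothetical sketch (the `apply_discount` function and its test are invented for illustration, not from the interview): the test passes today, and the moment a later change breaks the math, CI turns it red within seconds.

```python
def apply_discount(price, percent):
    """Return the price after applying a percentage discount."""
    return round(price * (1 - percent / 100), 2)

def test_apply_discount():
    # A regression in the discount math flips these assertions
    # from green to red on the very next test run.
    assert apply_discount(100.0, 20) == 80.0
    assert apply_discount(19.99, 0) == 19.99

if __name__ == "__main__":
    test_apply_discount()
    print("all tests passed")
```

Run under any test runner (or directly, as above), this is the "source of truth" feedback Paul is talking about: change the code, run the tests, and know immediately whether you broke a stated requirement.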

Joe: Awesome, yeah.

Paul: Yeah, it's a great thing that I've experienced, and getting to see that light bulb go on with other people is just terrific.

Joe: Yeah, I try to promote that all the time with developers and say it's a safety net. You should enjoy it. Developers should have the good feeling that there's a test covering anything they changed, so they know, hey, I didn't break anything. That should be helpful, not a hindrance. It shouldn't be holding you back; it should be encouraging you to make changes, so you're not scared to make them.

Paul: Right, and that comes down to our core beliefs about how we write software, too, Joe. Think about why one person writes software versus another. One person goes in because they've got to put food on the table, and this is how they do it. Another person goes in because they want to create something that affects the world around them in a positive way. Those two trains of thought are completely different. You have to sit down and figure out where people are coming from, where organizations are coming from, where a team wants to go, and what their goals and objectives are, and then work with them to get there. Not everybody wants fast feedback. Some people just want a really great audit trail for regulators, and automation can provide that.

Joe: I definitely agree, and I guess, once again, people are different. But the type of people who are only doing it to put food on the table really annoy me. Maybe that's just because I'm passionate about it. Not everyone's passionate about test automation; we're probably the freaks in this case.

Paul: I hear you. I'm passionate about it as well. I love this stuff, I love helping the light bulb come on for people, and I love helping build better products. Really, I want to make the world a better place, and I do that through automated testing. We all use software every day, all day long, and I am so tired of people telling me the computer says I can't do that. In any of those cases where that happens, or where my bank statement comes in the mail, yes, I still get it by mail, and it's got the wrong name on it, I really want those things to be fixed. Yeah, I'm passionate about this as well.

I've come to a different realization lately about people and their intentions: sometimes their current intentions aren't any indication of what their future intentions might be. If somebody's coming in just trying to put food on the table, it may be a really rough time in their life, and that may be all they're trying to do, but it doesn't mean that tomorrow they won't come in psyched about their job. I really feel there's an opportunity for folks like you and me to inspire people about their jobs and about what they do. I think you do that on Test Talks.

Joe: I appreciate that. Talking about podcasts, I love speaking with my fellow podcasters, and you have your own called Reflection as a Service. Great segue. What is Reflection as a Service? Do you talk about test automation topics, or is it more broad?

Paul: It's a little more broad. We do talk about test automation from time to time, but it's a podcast about software engineering and entrepreneurship. My buddy James Jeffers and I love to talk about all things software engineering. We're both entrepreneurs, we're both learning the entrepreneurship game, but we've been in software engineering for quite a while. We're constantly working together to try to move software engineering teams forward and to help people accomplish their goals with software and with test automation. We had all these really great conversations, and at some point it was like, hey, let's talk about this, let's record this.

I was doing a talk for a local group here in the Research Triangle, and one person came up to me afterwards and said that a lot of the things I was talking about we learn through reflection. I think this person said they felt that was being lost lately, that there's not as much time to sit down and reflect these days as maybe there was in the past. I think he was really right, because every time I'm standing in line, I look around and everybody else is on their phone. We're not sitting around thinking about things anymore, and a lot of what James and I were doing was reflecting on things. As for the name and the branding, obviously playing on software as a service, we went with Reflection as a Service. We just wanted it to be a place where we investigate entrepreneurship and talk about software engineering, and it's done really well.

We've been doing it since September of 2015. We're not as far along as [inaudible 00:30:44], and I'm open to any tips you can give us about it. We're just starting to get into production value and learn a little bit about how production should work. We probably should have done that before now. Once again, I'm a pragmatist: getting the information and the content out there was more important initially than the production value, but now we're starting to get back into that.

Joe: Okay, Paul, before we go, is there one piece of actionable advice you can give someone to improve their test data management efforts? And let us know the best way to find or contact you.

Paul: Determine the goals of your team with regard to test automation. Understand your surroundings, your environment, and the forces that are acting on you. Then make a determination about what to do and go forward from there. I think so often we spend time just looking at technologies and tools to solve problems, but there's so much more going on. Until you look at the entire scene and understand all the details and nuances of the people around you, the policies, and everything else, the decision you make is usually inaccurate.

Number one, I'm available on Twitter all the time @DPaulMerrill. My podcast is Reflection as a Service. You can find my company, Beaufort Fairmont, online at BeaufortFairmont.com. We're on Facebook, LinkedIn, and all those other places. I'm always open to getting together and getting to know folks. I'll be out looking to meet new people and hear their new ideas soon, in June of 2016 at [inaudible 00:32:11] in Raleigh, North Carolina. In October I'll be speaking about these data strategies that we've been talking about, Joe, at STARWEST out in Anaheim, California.

Joe: Nice.

Paul: Then in November I'll be at the Better Software Conference down in Orlando, speaking on iterative development and automated testing from the inside out. There's a lot of other stuff coming up as well, but those are the main ones. I'm always looking forward to meeting people and learning from them. You can learn by reading and you can learn by doing, but listening to other people's experiences and gaining insight from them is one of the greatest accelerators I've found.

2 comments
Testing Bits – 6/5/16 – 6/11/16 | Testing Curator Blog - June 12, 2016

[…] 4 Essential Test Data Strategies (It’s OK to Be Selfish) – Joe Colantonio – https://www.joecolantonio.com/2016/06/09/4-test-data-strategies-automation/ […]

Nikolay Advolodkin - June 14, 2016

I’m also a user of the selfish strategy but I would love to be able to do the data refresh. Capture a snapshot, use the data, delete the data. That seems ideal to me. But I struggle with getting solutions like Delphix implemented in my workplace because there isn’t enough importance placed on data management.
Any ideas on how to convince management on such a solution?
