The Biggest Mistakes We Made on Our Agile Journey (and Why We are Glad We Made Them) — Bug Fix Thursday

I am working with Tim Coonfield to develop a talk for the one-day Agile Shift conference scheduled for April 12, 2019 in Houston, TX titled “10 Biggest Mistakes We Made on Our Agile Journey (and Why We are Glad We Made Them)”. This is the second in a series of articles that Tim and I will use to explore some of those mistakes and what we learned from them. You can find other articles in this series here.

If you are interested in hearing this talk or some of the other awesome speakers and topics that will be covered at the event, you can learn more about the conference and purchase tickets here.

Everything I will share here happened at Global Custom Commerce (GCC), a Home Depot Company, as we developed and improved our 2nd generation e-commerce platform, called Autobahn, designed to make it easy for customers to shop for and buy more complex configurable products and services like custom window blinds, custom windows, flooring and decks.  

In 2011, when the Autobahn platform started development, GCC was already the #1 seller of custom window coverings online and owned several brands including blinds.com. We were a couple of years away from acquisition by The Home Depot and had about 80 employees. The existing e-commerce platform had been in production for a number of years and was still actively being improved by a large team following a traditional project management philosophy using Gantt charts and reporting on percent complete.

The Autobahn project marked a number of firsts for GCC including our first use of cloud hosting at AWS and our first use of agile methodologies. This article highlights one of our bigger mistakes and how we were able to improve as a result.

In the early days of the effort to deliver our 2nd generation e-commerce platform, Autobahn, we adopted the Scrum methodology and were following all the practices “by the book” including sprint planning, stand-ups, sprint review and sprint retrospectives.  However, the company continued to use more traditional QA techniques and processes. As a result, QA engineers were assigned to the team but continued to work semi-independently with their own manager. This obvious mistake was rectified fairly quickly and is not really at the heart of what I want to tell you about here. It’s just important to note since it could be tempting to attribute what came next to where we started with QA engineers sitting half inside and half outside the team.

We also made one large decision around architecture that impacted this story somewhat. After attending Udi Dahan’s distributed architecture course, many on the team wanted to focus on building a microservices architecture with small, loosely coupled components connected generally by asynchronous messaging. Remember, this was 2011 when these ideas were just gaining widespread attention. After consulting with experts, including Udi, we were advised to build a monolithic web application using a more traditional layered architecture to provide separation of concerns. For the first year or so, that is exactly what we did. By the time we started to introduce microservices, there was a pretty significant monolith sitting at the center of our platform. In retrospect, this was clearly a mistake. This certainly contributed to what follows, but was not the sole or even the most notable cause.

When the Autobahn effort started, the business and the development team all agreed that quality was one of the most important things required to make the new platform successful. In fact, the new effort was chartered, in part, based on the promise that quality would be baked into the new platform from the beginning. After all, the company had suffered for years from quality issues on the existing platform and was tired of spending too many cycles fixing problems and not enough time truly innovating.

The team invested in quality from the beginning. Early in the first sprint, we had automated continuous integration including a comprehensive unit testing suite that ran on every commit and failed the build if any tests failed. We also implemented code coverage reporting and focused on achieving as close to 100% test coverage as we could get. The team cared deeply about quality and was fully committed to writing and maintaining unit tests to make sure things worked as designed and continued to work as the code base evolved.
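As a rough illustration of the build gate described above, the sketch below fails a build when any unit test fails or when coverage drops below a threshold. The `BuildResult` structure, the `build_passes` function and the 90% threshold are assumptions made for this example; the article does not say which CI tooling or coverage target the team actually used.

```python
from dataclasses import dataclass


@dataclass
class BuildResult:
    """Summary of one CI run (hypothetical structure, for illustration only)."""
    tests_run: int
    tests_failed: int
    lines_total: int
    lines_covered: int


def build_passes(result: BuildResult, min_coverage: float = 0.90) -> bool:
    """Return True only if every test passed and coverage meets the threshold."""
    if result.tests_failed > 0:
        return False  # a single failing test fails the build
    if result.lines_total == 0:
        return True  # nothing to measure, so coverage cannot block the build
    coverage = result.lines_covered / result.lines_total
    return coverage >= min_coverage
```

A gate like this only checks what the unit tests exercise, which is one reason the system, integration and regression testing described next still mattered.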

Besides unit tests, our definition of done included an expectation for QA system, integration and regression testing. The QA engineers on the team were responsible for writing test cases based on the stories. Once stories were ready for testing, the QA engineers took responsibility for executing the test cases and recording issues on pink stickies that were added to the physical scrum board maintained in the team area. Software engineers took responsibility for fixing all the bugs the business deemed important in the sprint. Once stories were completed and integrated into the main branch, QA engineers focused on testing for regression using a rapidly growing set of test cases stored in a test case management system.

Within a few sprints, a clear cadence and separation of duties naturally developed within the team. QA engineers would start the sprint trying to automate key use cases from the sprint before. They would also work with the PO to produce a set of test cases for the stories committed in the sprint. Meanwhile, software engineers would start several stories in parallel. By the early part of the second week, software engineers would start finishing up stories with passing unit tests and QA engineers would start UI and integration testing. By mid-week, the team would have all the stories in good shape and would start merging everything into a release branch. Thursday morning we would lock down the release branch so our QA engineers could focus on regression testing and work with the software engineers to make everything ready for review on Friday.

After a few sprints of this, the team started referring to the key testing day in the sprint as “Bug Fix Thursday”. It was a neat way to describe the code freeze that would happen each sprint after the merge completed and regression testing started. Up until Bug Fix Thursday, the team was able to focus on developing new features for the sprint. Starting in the morning on Bug Fix Thursday, the software engineers would generally work ahead on stories lined up for the next sprint if they weren’t busy fixing bugs identified by QA engineers on the team.

Sometimes we had trouble getting stories ready in time for Bug Fix Thursday. Most of the time we simply relaxed the code freeze rule to allow ourselves to add to the release branch later on Thursday, or, in extreme cases, Friday morning. This put a lot of pressure on the QA engineers to either rush through regression testing or to perform multiple rounds of regression testing. It also led to some unhealthy behaviors like allowing regression testing to leak into the beginning of the next sprint. Since the team was small and the platform was not in production yet, we were able to live with some of these problems for quite a while.

As Autobahn gained momentum and the team grew, Bug Fix Thursday got a little uncomfortable. As one agile team grew to two and then to three, we started feeling the pinch of Bug Fix Thursday more and more often as teams struggled to merge and test all the sprint’s stories in time for the demo on Friday afternoon. Although we introduced more cleanly separated microservices that could be deployed independently, most sprints included functionality that touched the monolithic customer or associate web sites and required extensive regression testing to ensure everything worked as expected. QA engineers felt the pressure the most as regression testing routinely leaked into the following sprint even for stories the team was calling “done”.

Processes were improved to compensate. The one that seemed to help the most was focusing teams on getting more stories done and ready to deploy in the first week of the sprint. This forced teams to work on one or two stories at a time and to make sure they were merged and regression tested before moving on to another story. Although this did not eliminate Bug Fix Thursday, it gave the QA engineers enough confidence to time box regression testing by reducing the number of test cases checked on Bug Fix Thursday.

As we grew from three teams to six and started exploring new business opportunities, Bug Fix Thursday started to get very uncomfortable again. The team exploring new businesses started to release pilot components more frequently, mainly because these systems had very small impacts. However, when they touched critical system components, which was far too often due to the monolithic nature of the system core, their code had to be merged into what was becoming one very big and complex sprint release. The team was also surprised by how these “safe” releases managed to break things in unanticipated ways. We beefed up our unit testing. We added integration tests. We tried adding a QA engineer to float outside the teams and focus on writing more automated UI tests. We brought automated UI testing into the sprint. We challenged our software engineers to work more closely with the QA engineers on the team to finish regression testing at the end of the sprint. We even turned Bug Fix Thursday into Bug Fix Wednesday for a little while to allow more time for regression testing to complete. Some of these changes worked and stuck, some didn’t, but overall the various changes seemed to help us keep Bug Fix Thursday manageable. We got to the point where releases would happen the Tuesday after the sprint and the business was reasonably satisfied.

Behind the scenes, our QA engineers were barely holding things together. They worked long hours on Bug Fix Thursday often testing late into the night. They tested Fridays after sprint review to make sure the release was ready. Testing often continued through the weekend and into Monday. Occasionally, testing could not get done by Tuesday and releases would slip into Thursday and, in extreme cases, into the following Tuesday.

By the time we added our eighth development team, the unrelenting pressure had led us to make a number of quiet compromises on quality. The pressure to finish last sprint’s testing left QA engineers with little time to write and maintain automated UI tests. Because comprehensive regression testing was taking too long, manual regression testing focused on areas the team thought could be impacted by the changes in the sprint and very little time would be spent testing other areas. Because schedule pressure was almost always present, the team did not believe they had the time they needed to clean up the monolithic components so technical debt was growing and it was getting harder to accurately identify the parts of the system that really needed regression testing.

Once we grew to 12 teams, the symptoms were clearly visible to the team and our business. One sprint’s release took so long to test that we decided to combine it with the subsequent sprint’s work into one gigantic release. “Hot fixes”, intra-sprint releases made to fix critical bugs that were impacting our customers, became common. In fact, we were starting to see cases where one hot fix would introduce yet another critical issue requiring a second hot fix to repair.

Finally, the pace of change completely overwhelmed our teams and processes. Release after release either failed and required rollback or resulted in a flurry of hot fixes. In one particularly bad week, the sprint release spawned a furious hydra: each time we fixed one problem, two more would show up to replace it. By that time, I was leading the IT organization and, after consulting with team members and leaders, I mandated strict rules around regression testing, hot fixes and releases to stop the bleeding.

Simultaneously, we launched a small team of three people dedicated to improving quality and our ability to release reliable software frequently. We named it Yoda. We claimed it was an acronym, but I can’t find anyone who remembers what the letters were supposed to mean. Its biggest concrete deliverable would be an improved automated regression testing suite. We also asked the Yoda team to find ways to simplify the release process and improve the overall engineering culture.

Over the next several months, the Yoda team made progress. As expected, automated tests improved. However, the big improvement came from improvements in the release management process and the culture.

Although by this time the web sites were still pretty monolithic, they were surrounded by microservices that were independently deployable. The teams had also made progress on making aspects of the web sites independently deployable. The Yoda team spent some time documenting the various components and worked with various development teams across the company to determine which were truly independent enough to release on their own and which required more system-wide regression testing. Yoda improved the continuous delivery process and added a chatbot to make it easier for development team members to reliably deploy. They worked with the development teams to make releases easier to roll back too.
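One way to picture the component inventory described above is as a dependency map, where anything that reaches the monolithic core needs system-wide regression testing and everything else can be released on its own. The sketch below is only an illustration of that idea: the component names, the `DEPENDS_ON` map and the rule itself are assumptions, not the criteria the Yoda team actually used.

```python
from typing import Dict, Set

# Hypothetical dependency map: component -> components it directly depends on.
DEPENDS_ON: Dict[str, Set[str]] = {
    "customer-site": {"monolith-core"},
    "associate-site": {"monolith-core"},
    "monolith-core": set(),
    "pricing-service": set(),
    "notification-service": {"pricing-service"},
}


def reaches(component: str, target: str, graph: Dict[str, Set[str]]) -> bool:
    """Return True if `component` is, or transitively depends on, `target`."""
    seen: Set[str] = set()
    stack = [component]
    while stack:
        current = stack.pop()
        if current == target:
            return True
        if current in seen:
            continue  # already explored; also guards against cycles
        seen.add(current)
        stack.extend(graph.get(current, set()))
    return False


def classify(graph: Dict[str, Set[str]], monolith: str = "monolith-core") -> Dict[str, str]:
    """Label each component by the kind of release process it would need."""
    return {
        name: "system-wide regression" if reaches(name, monolith, graph) else "independent release"
        for name in graph
    }
```

With the made-up map above, `classify(DEPENDS_ON)` labels the two web sites and the core as needing system-wide regression, while the two services come out as independently releasable.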

Once the Yoda effort gained momentum and the development teams were ready, we relaxed the rules around regression testing and releases for the components that Yoda identified as reasonably separated and safe to release independently. Over the next couple of months, we went from 1 large release per 2-week sprint to over 50 per week. Because releases were smaller, they were easier to test and quality improved. Hot fixes became rare again. Rollbacks occurred from time to time but, because teams planned for the possibility, they did not create the kind of drama we had observed in the past.

Process changes were also required. As the number of releases per sprint increased, we realized visible functionality was making it to production before business stakeholders had a chance to formally review and approve it. As a result, teams started to demo stories to stakeholders as soon as they were done and ready to deploy. For some teams, that made the traditional end of sprint review exercise far less useful. Therefore, some teams stopped performing the end of sprint review though they continue to value and practice retrospectives based on the feedback received from the many stakeholder reviews and releases that happen during the sprint. As they work more story by story, teams are gradually starting to look at things like cumulative flow diagrams and cycle times and are starting to experiment with other agile methodologies, such as Kanban.
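For readers unfamiliar with cycle time, it is simply the elapsed time between starting a story and finishing it. The sketch below computes it for a few stories. The dates are invented sample data and the function name is hypothetical; the article does not describe the tooling the teams used.

```python
from datetime import date
from statistics import mean
from typing import List, Tuple


def cycle_times(stories: List[Tuple[date, date]]) -> List[int]:
    """Return the cycle time in days for each (started, done) pair."""
    return [(done - started).days for started, done in stories]


# Invented sample data: (date work started, date the story was done).
sample_stories = [
    (date(2019, 1, 7), date(2019, 1, 9)),
    (date(2019, 1, 8), date(2019, 1, 11)),
    (date(2019, 1, 10), date(2019, 1, 14)),
]

days = cycle_times(sample_stories)   # [2, 3, 4]
average_cycle_time = mean(days)      # 3
```

Tracking the average and the spread of these numbers over time is one way a team working story by story can see whether its flow is improving.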

And so Bug Fix Thursday lived and mostly died within our agile process. At times, it served us well. At times, it reflected problems in our process or our code. At times, it created additional problems and raised stress levels. The solution, though obvious in retrospect, was terribly counter-intuitive, especially in a world where the codebase includes some critical monolithic components: create and nurture a culture that values releasing more and more frequently. Smaller, more frequent releases make testing easier and the risks smaller. Independently testable and deployable components are an important part of the story, but don’t do much good without the commitment to release more frequently. Although we had always talked about it and even built much of the necessary infrastructure to support it, we never brought it into focus until we launched Yoda and truly changed our culture.

Unlike some of the other stories in this series, we’re still not quite done with Bug Fix Thursday. We just found a way to make it smaller and ensured that it can’t get any bigger by limiting its impact to the monolithic pieces of our system that are left over from the early days of the Autobahn platform. We’re also committed to shrinking it further over the coming months by focusing a small team, called Supercharge Autobahn, on breaking down the highly complex remaining pieces of the original monolith into truly independent components. We also continue to work on our engineering culture to make sure we don’t backslide.

The Biggest Mistakes We Made on Our Agile Journey (and Why We are Glad We Made Them) — Failure IS NOT an Option

I am working with Tim Coonfield to develop a talk for the one-day Agile Shift conference scheduled for April 12, 2019 in Houston, TX titled “10 Biggest Mistakes We Made on Our Agile Journey (and Why We are Glad We Made Them)”. This is the first in a series of articles that Tim and I will use to explore some of those mistakes and what we learned from them. You can find other articles in this series here.

If you are interested in hearing this talk or some of the other awesome speakers and topics that will be covered at the event, you can learn more about the conference and purchase tickets here.

Everything I will share here happened at Global Custom Commerce (GCC), a Home Depot Company, as we developed and improved our 2nd generation e-commerce platform, called Autobahn, designed to make it easy for customers to shop for and buy more complex configurable products and services like custom window blinds, custom windows, flooring and decks.  

In 2011, when the Autobahn platform started development, GCC was already the #1 seller of custom window coverings online and owned several brands including blinds.com. We were a couple of years away from acquisition by The Home Depot and had about 80 employees. The existing e-commerce platform had been in production for a number of years and was still actively being improved by a large team following a traditional project management philosophy using Gantt charts and reporting on percent complete.

The Autobahn project marked a number of firsts for GCC including our first use of cloud hosting at AWS and our first use of agile methodologies. This article highlights one of our bigger mistakes and how we were able to improve as a result.

In the early days of the effort to deliver our 2nd generation e-commerce platform, Autobahn, we adopted the Scrum methodology and were following all the practices “by the book” including sprint planning, stand-ups, sprint review and sprint retrospectives. However, the company continued to use more traditional project management techniques and reporting processes. As a result, the team was required to work with the PMO to keep a Gantt chart updated with progress by mapping completed stories to estimates of percent complete (a mistake I’ll cover in another article). The team was also concerned that, even though the CEO was shepherding the project himself, many leaders in the company were not happy about the company investing resources to build a new e-commerce platform instead of investing more in the existing one. There was also the issue of racing to replace the existing platform, which remained under active development since nobody in the company was interested in moving our e-commerce business to the new platform until it was more capable than the existing one. Despite these challenges, the team was optimistic that we would deliver on time and within budget.

For the first few months things appeared to be going well.  Every sprint we delivered exactly as promised. I know the Scrum guide talks about forecasts these days, but back then the book called for commitment — a very clear promise from the team to the business that they would deliver what they said they would deliver at planning. As the project started to gain momentum, we shifted focus from basic CRUD to the critical functionality required to sell any sort of configurable product or service.

As the required functionality got more complex, we started to have more difficulty delivering the way we intended. However, we could usually figure out a way to get enough done to demo by interpreting the story narrowly or, in some cases, by pushing the PO to pull out things that we convinced ourselves were not truly critical into new stories to be tackled in future sprints. At times, a couple of us also worked crazy hours over the weekend to make sure things got done as planned. The good news, we thought, was that velocity kept increasing so there was no doubt that we could tackle all those new stories and still deliver according to the original plan. We certainly thought that the combination of increasing velocity and the on-time trend shown in the project plan would make us look good to the business and help us keep our project alive.

Then came THAT story. It’s not really important which story exactly or what it was expected to deliver. What does matter is that after the first week of our two week sprint we knew we were in trouble and we knew THAT story was the problem. As per usual, a couple of us decided to work over the weekend to break through the hard bits so the team could finish wrapping it up in the second week of the sprint. Sunday night we still had not broken through. In fact, we had started to recognize that we had more work ahead of us than we had believed back on Friday evening.

The following Monday, the whole team got together after daily stand-up to talk about THAT story yet again. The team members who worked over the weekend shared details about the issues they had solved and the new issues they had uncovered. Nobody seemed overly worried. After all, we were the team that always delivered and failure just wasn’t an option. We spent an hour or so breaking down the remaining work into a set of tasks and put them up on the board. We updated our burndown and were pleased to note that we were still going to deliver THAT story, and everything else we had committed to, within the sprint even if it was going to require a little extra effort along the way.

Late in the day on Monday, I recognized I was in for a long night. As I stood up to stretch, I noticed other team members around me still hard at work and realized they were likely behind on their tasks too. A few hours later, I realized I was simply staring at the screen and was not really accomplishing much so I packed up to head home with a plan to start fresh early the next morning. The rest of the team appeared to be in about the same place as they headed out after a very, very long day.

It was pretty much the same story Tuesday and Wednesday. Despite lots of conversation, creative re-planning each morning and long days trying very hard to finish tasks, we just couldn’t seem to break through into the home stretch. Left unsaid was a rapidly growing sense of dread — THAT story wasn’t getting done.

Thursday morning we spent significant time after stand-up focusing on how to trim THAT story back, move bits out or otherwise bend the fabric of reality to allow us to call it done. Our newest team member then said what desperately needed saying, “THAT story is not getting done this sprint. If we keep wasting time on it, we won’t finish up testing the other stories in the sprint either.” He then advanced a heresy we had not even allowed ourselves to consider, “Perhaps,” he said, “we should fail with honor. We tried our best and we’ll get everything else done this sprint.”

We stood there for a moment and silently asked ourselves essentially the same question: how will the business react and how will we explain it? We started to discuss the idea and very quickly got past the fear. We would simply acknowledge the story wasn’t done and that it could go into the next sprint if the business still wanted it at the top of the priority list. We wouldn’t make excuses and we wouldn’t talk about percent complete. Our direct manager, who coded with the team, was very uncomfortable with the idea and did not want us to do it at first. He knew the political environment and worried about the backlash. In the end, he came around and the team decided to put THAT story aside and make sure the rest of the stories in the sprint were truly done before the demo.

I got the dubious honor of demoing the last completed story and discussing the failed one. Despite all the brave talk the morning before, I distinctly remember the little flutter in my gut as I finished up demoing the last completed story.  I remember staring at the ugly unfinished story on the projector screen as I said, “and THAT story did not get done”.

I took a deep breath and just a second passed before one of the business leaders asked the obvious question, at least the obvious question for someone used to reviewing projects using Gantt charts clearly illustrating percent complete. She asked, “So how close is it to done?”

I delivered the answer, carefully vetted with the team and rehearsed in advance, four simple words designed to explain our commitment to agile principles and to clear communications and transparency, “It’s simply not done”.

Of course, there was more to say after that. It started with a discussion about why percentage complete is an illusion and quickly moved into a group effort to figure out the best way to move forward. After a few minutes, I realized everyone there, the developers and the business, was engaged in looking for solutions and not spending any time at all assigning blame.

None of us should have been surprised, though we were. After all, one of our four company values is “experiment without fear of failure”. In essence, that’s just another way of saying that sometimes you’ll stumble and that’s OK as long as you learn and improve based on the experience. We saw it and lived it that day for sure. It was the beginning of a true partnership between the team and the business to figure out how to really deliver the benefit of a new e-commerce platform to our customers.

It’s also between the lines in the Agile Manifesto.  All of the unintentional spin and irrational optimism inherent in those pretty Gantt charts showing tasks 80% done gets pushed aside by a perfectly clear idea: What’s done is done.  Transparency from a development team is critical to actually making informed decisions about what to do next. “Responding to change”, one of the four values in the Agile Manifesto, is as much about what you learn on the technical side as it is about reacting to evolving business realities. The day our team and our business embraced that simple idea was one of the most important days in our agile journey.
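The gap between the two ways of reporting is easy to show with a little arithmetic. In the sketch below, the story points and percent-complete estimates are invented for illustration, and both function names are hypothetical. The first function reports progress the way a percent-complete chart would; the second counts only stories that are actually done.

```python
from typing import List, Tuple

# Invented sample sprint: (story points, estimated fraction complete).
sprint: List[Tuple[int, float]] = [
    (5, 1.0),   # done
    (8, 0.8),   # "80% done"
    (3, 1.0),   # done
]


def percent_complete_progress(stories: List[Tuple[int, float]]) -> float:
    """Progress as a percent-complete report would show it."""
    total = sum(points for points, _ in stories)
    earned = sum(points * fraction for points, fraction in stories)
    return earned / total


def done_progress(stories: List[Tuple[int, float]]) -> float:
    """Progress counting only stories that are fully done."""
    total = sum(points for points, _ in stories)
    done = sum(points for points, fraction in stories if fraction >= 1.0)
    return done / total
```

For this sample, the percent-complete view reports roughly 90% while the done-only view reports 50%, because the largest story contributes nothing until it is finished.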

When I Started Writing Code for a Living

When I started writing code for a living, C++ was just getting started in the lab. C#, Java, JavaScript and HTML, the technologies at the center of almost everything I do at work these days, had not been invented yet. My first job involved changing reel-to-reel computer tapes overnight and writing COBOL on punch cards. Holy hell I’m old.

And yet I am not. I get to build systems that help people improve their homes and their lives. I love Node.js, my applications live in the cloud and my eyes don’t glaze over when talk turns to a debate between Angular and React. I help figure out ways to make other software engineers happy too. Sure, I do budgets and I sit in plenty of meetings. However, I still love everything about the act of creating software and keep my hands (and my heart) in it. It’s what got me here and it’s what helps me remain a vital part of the team at GCC and The Home Depot. What more can an old coder (and entrepreneur) want?

What is so special about Global Custom Commerce (a Home Depot Company)?

Here’s a Q&A I did for the company website a couple years ago. I must admit that I feel even more strongly about this place now than I did then. If you are interested in joining the team, check out our jobs site.

Q: What was your “Aha” moment that made you choose to join Global Custom Commerce (GCC)?

After I sold my software company, my original plan was to do a little consulting while I took some time to figure out my next entrepreneurial venture. One of my clients was GCC. During that time, I got to meet lots of great people and saw how special it was. For me, it wasn’t one “aha” moment. It was more a process of figuring out that GCC was a place where I could be my entrepreneurial self without having to start another company. I liked GCC so much that I volunteered to defer my contracting invoices for a couple months around an investment round so I could stick around instead of moving on to other opportunities. Shortly after that, I became a full-time member of the team. It’s the best move I’ve ever made.

Q: Which of the core values influences you the most in your life and why?

Improve continuously, hands down. I’ve been a software developer for more than 30 years and have had to re-make myself more times than I can count. It’s great to work in a place that emphasizes what I think has been one of the key difference makers in my career.

Q: What’s some advice you’d give to someone who is considering being a part of the team?

Be passionate about what you do and can bring to the team. We’re passionate about what we do and we’re looking for the same thing in everyone that joins us.

Q: What experiences have shaped you the most as you enjoy the GCC ride?

Our CEO and Founder, Jay Steinfeld, stood up in front of the whole company and said the Autobahn team showed “true grit” as we fought through challenges to finish the Home Depot launch on time. That moment was almost indescribable for me and something I will never forget. It perfectly expressed what I thought made the team successful and also why I was so proud to be part of it. For me, enjoying the ride is not about ping pong, cake or dressing up, though I do enjoy those things. For me, enjoying the ride is about building things and sharing the joy and pain of creation with other great people. Those simple words from Jay convinced me that I was understood and valued.

Q: Describe your relationships with other GCC associates. What’s the environment like?

I’m a programmer at heart so I’ll speak to development culture here. First of all, I get to work side by side with some brilliant developers and I’m happy to say I learn something every day. Everyone, and I mean everyone, has a voice in architecture and design. Debates are brief, passionate and productive. We’re truly agile and we get stuff done. Our business is fun to work with too. They are super-creative and intimately involved in the development process. When things go wrong, they are supportive and trust that we will work our tails off to make things right. Finger pointing is never part of the equation. The focus is always on finding a solution that works.

Q: How would you describe GCC in three sentences or less?

I need just one word: Awesome

Interview on DevOps

From time to time, I get the opportunity to talk to industry reporters about agile and DevOps. Today, I was interviewed via email for the first time, which turned out pretty interesting. Here are the questions and answers from that interview.

Please briefly describe how the company is using DevOps, including when it began, which DevOps tools and for which types of projects.

We see DevOps as a culture that encompasses people, practices, tools and philosophy. In that sense, it has become central to everything we do to develop, maintain and operate our e-commerce sites for Blinds.com, JustBlinds.com, AmericanBlinds.com and, of course, Home Depot custom window coverings. Infrastructure is code that evolves in concert with our other software components. DevOps happens inside our agile development teams and often draws in specialized resources from our operations group. It also happens inside our infrastructure group and often draws in developers. It’s part of our DNA.

The tools aspect of it is pretty standard stuff. We use Git and GitHub for source control. All our application and infrastructure code is there. Puppet helps us with rolling out and managing servers. Our backends are mostly .NET so we use Octopus Deploy to help with rolling out our code. TeamCity is in the middle of our development process and code there is used to expose deployments and tie them together with builds. Logs are mostly managed by Splunk though we’ve played with an ELK stack for this as well. Nagios is used for infrastructure monitoring. New Relic is our app monitoring tool and we depend on it to alert us to problems with the user experience. All our alerts get fed into PagerDuty for escalation management. We’ve been experimenting with Consul for discovery and config. We’re also experimenting with Docker. What’s holding us back there is .NET on Windows. Of course, that story is changing with .NET Core and Windows Server 2016 on the horizon so we have high hopes for Docker as a next step.

What were the business drivers for deploying DevOps?

Agile drove our adoption of DevOps. Our adoption of agile was driven by our organization’s culture more than anything else. One of our key values is “experiment without fear of failure”. Another is “improve continuously”. Over the years, our whole IT process had gotten into that uncomfortable place where limited resources lead to a difficult relationship with the rest of the business. They saw us as standing in the way of all the cool experimentation and improvement they wanted to do. Agile helped us break down the walls that had developed and form a true partnership for innovation. DevOps is a necessary part of the agile process. How can you innovate constantly if deployment requires an over-the-wall handoff and lots of manual intervention to get done? If operations and infrastructure are not intimately involved in the process, how can you support and manage it once it gets into production?

What benefits has the company seen from DevOps? 

DevOps enables agile, which allows us to continuously improve. It’s a big part of how we were able to deliver on all the promises of our new e-commerce platform, which led directly to the acquisition by Home Depot. It has allowed us to continue to innovate and thrive inside a Fortune 50 corporation and take on new challenges to help drive innovation outside of the custom window coverings business. DevOps is like oxygen for the agile process. Without it, it’s very possible that we would have ended up with “agile in name only” where agile terminology is used but nothing really changes and the organization doesn’t see the kind of exponential increase in innovation that we’ve benefited from here.

Any challenges of deploying and using DevOps, and how were they addressed?

Our biggest challenges revolve around security and compliance, especially now that we are part of one of the largest retailers in the world. We’re still learning how to deal with all that when it comes to sharing responsibility for deployment and infrastructure among developers, infrastructure engineers and operations engineers. We’re constantly tempted to solve these problems with handoffs and work hard to avoid that. Now that we have trust across all the impacted groups it’s much easier to work through them and come up with ways to address compliance without undermining the velocity of innovation.

Quickie Review of “The Evolution of Everything: How New Ideas Emerge”

“The Evolution of Everything: How New Ideas Emerge” by Matt Ridley is a very thought-provoking book about how our world has been largely shaped by bottom-up trial and error evolution rather than top-down intelligent design. Along the way, the author challenges assumptions about everything from how religion created morality to the role and impact of government in education. In the end, it is a bracingly optimistic look at how our civilization got here and what the future may hold.

Think of the incredible web of economic activity that makes the apparently simplest of products, the humble pencil, available as explained in the famous story, “I, Pencil” (check it out on YouTube if you have not seen it before). Nobody designed the entire supply chain. No single person or small group of people directs it. No government requires it or builds it. Even so, tens of thousands of people somehow coordinate their activities to make it possible.

Think, too, of invention. Edison invented the light bulb, right? What do you think would have happened if he died the week before his discovery? Well, Humphry Davy started the march towards an efficient electric light many years before and Edison was just the best funded of those pursuing the task. It is certain that if Edison had failed, someone else would have stepped forward, likely mere months later. This pattern, the almost inevitability of invention, can be seen in just about every case. Steve Jobs did not direct the invention of the smartphone. Apple just got there a bit sooner and with a bit more innovation than countless others. If Steve had died a few years sooner, Android may have been first or someone else would have stepped up with an almost identical device. The Wrights flew one of the first powered aircraft but certainly not the best. If they had crashed on the beach before anyone noticed, Curtiss or one of the dozens of others chasing the flight dream would have gotten the credit. In the end, the world would have had all these advancements around the same time no matter who died, failed or gave up because they were actually the result of many improvements and ideas contributed by thousands across many years of constant evolution.

New ideas, the author suggests, are almost always the result of natural selection between competing ideas. The pace of innovation and improvement continues to grow because human beings are more empowered than ever before to collaborate and breed new ideas. For example, a developer in Houston, TX can work on an open source project with contributors from all over the world. The other important ingredient, competition, is also much easier to achieve today. An entrepreneur in Africa can compete on equal footing with another in New York thanks to the Internet and easy access to rapid travel.

The optimism in the book comes from the fact that the improvements in our lives, the countless gadgets that have made communication easier, food cheaper, clothes cleaner, people healthier and leisure more accessible to billions, are evolving more quickly today than they have ever been able to in the past. Today’s poor have more access to knowledge, education and communications than the richest that lived in America at the beginning of the 20th century. A man that cannot attend the best university can pay pennies at an Internet cafe and get access to more books and more research than a tenured professor at Harvard could have hoped for 40 years ago. We are living in an age of accelerating progress and opportunity for all that will breed more of each.

Check out the book on Amazon.

Quickie Reviews of Two Books to Help With Mindfulness at Work (and Elsewhere)

When you are mindful…You become keenly aware of yourself and your surroundings, but you simply observe these things as they are. You are aware of your own thoughts and feelings, but you do not react to them in the way that you would if you were on “autopilot”…By not labeling or judging the events and circumstances taking place around you, you are freed from your normal tendency to react to them.

— A Guide to Practicing and Understanding Mindfulness

One of the things I regret most about my younger self is my extreme fear of failure and the way it drove many unhealthy behaviors. Sure, it had its uses. For example, I often used that fear to drive me to work harder, read more, learn more and generally stand out from the average. However, it also made me a very difficult boss in the early days of my company as I drove everyone as hard as I drove myself and rarely, but still too often, yelled at my employees for simple mistakes. I’m ashamed to say that this made me a very poor leader at times. When the chips were down, when the going got rough, I worked harder than anyone and so did my team but they often did so in fear of the slightest stumble. My ugly behavior extended into my personal life too, where my fear and doubt sometimes made me quick to anger and cost me way too many personal relationships.

To this day, I can’t say for sure where my deep fears come from. Western therapy gave me some theories but never really helped much. Instead, I learned how to see my fears for what they were and took away their power to control me through lessons I learned from Buddhist practices and Eastern philosophies. The simple ability to recognize the blooming of an emotion, how it changes the way my body feels and being able to choose in that moment instead of reacting had a very powerful impact on my life. It helped me grow my business and build a far stronger team of happier employees. It led me into a great marriage with an extraordinary woman. After I sold my company, it helped me find a home at Blinds.com along with a role I love. Although I still have a long way to go, I certainly feel like I’m better off today than I was those many years ago mainly because I better understand what’s going on inside my head and my heart at least most of the time.

Lately, I’ve noticed that mindfulness is making its way into the business mainstream. Companies like Google are actively training employees in mindfulness practices because it improves the quality of communications, helps people get more done by enhancing their focus and generally improves the happiness and satisfaction of their employees. To that end, here are two books that can help you and your company get started.

“Fail, Fail Again, Fail Better: Wise Advice for Leaning Into the Unknown” by Pema Chodron is nothing more than a commencement speech and an interview and nothing less than a brilliant and gentle lesson about the importance of facing up to your failures with compassion and understanding. It only runs about an hour and a half so I was able to listen to the Audible version while walking one afternoon. Something about the combination of walking 7 miles and the lessons in the book left me with tears in my eyes. Often, it felt like the author was talking directly to me and my struggles. Yes, I often took refuge in blaming others when I was involved in failures. Yes, even more often I judged myself unforgivably responsible for every failure. The secret, as she says, is to learn to sit quietly with the feeling instead of pushing it off quickly by blaming others or yourself. Use failure to become stronger, more compassionate and more able to make your way through life. She tells a story of a man who walks through the surf to swim out to sea. A large wave comes and knocks him down. As he lies at the bottom with sand in his mouth and in his eyes he thinks of two choices: get up or, well, die. He gets up and walks further. He gets knocked down again, lies at the bottom and gets up again. Each time he gets knocked down, it becomes easier to get back up because of the habit. Easier said than done, of course, but there’s far more nuance to it and she does talk about it a bit more as the book goes along. The commencement address that opens the book is full of humor and good will even as it challenges the new graduates in the audience (and the reader) to go out into a world full of unknowns and many, many opportunities for failure. Check it out at Amazon.

“Search Inside Yourself: The Unexpected Path to Achieving Success, Happiness (and World Peace)” by Chade-Meng Tan is a straightforward guide to the mindfulness course the author helped to assemble and now brings to people throughout Google. Tan is one of Google’s earliest engineers and brings an engineer’s mindset to mindfulness. He provides plenty of scientific evidence for the benefits of mindfulness training, which makes the seemingly squishy ideas behind things like meditation much more approachable for highly analytical types. His enthusiasm for the subject and his infectious joy come through page after page. There’s no doubt that the title on his business card is accurate: “For he’s a jolly-good fellow, which nobody can deny”. I would highly recommend getting this one in print or Kindle form so you can easily go back to the practical exercises.