
Common DevOps Tool Chain Pitfalls

By Bart Driscoll, Director, DevOps Services | October 1, 2015

We have all read that DevOps transformation is cultural and organizational. At this point, I don’t think anyone would argue that point. The trouble is that if you start with tooling (as I and many others would recommend), you must be vigilant not to recreate the dysfunction you are trying to eliminate. Melvin Conway (of Conway’s Law fame) stated it best when he said:

“Organizations which design systems … are constrained to produce designs which are copies of the communication structures of these organizations.”1

So how do you protect yourself from codifying the old? What are some of the common tooling pitfalls that impede adoption or derail transformation?

Pitfall 1: Little/no collaboration (single department or group)

This is the most common anti-pattern I encounter in the field, and it perfectly matches Conway’s Law mentioned above. Rather than work with “them,” we avoid collaboration: infrastructure teams build infrastructure services and portals, developers implement build automation and dashboards, and QA teams build test automation. After the investments are made, many leaders look on in disbelief: despite significant investments in tooling, adoption is low and IT performance hasn’t significantly improved. (To better understand why, I recommend researching Goldratt’s Theory of Constraints.)

Many customers start this way because their span of control (or influence) is limited to their respective department. It is easier to start within the confines of a single department than to add complexity and possible conflict with other stakeholder groups. So these customers assemble a team and start building one piece of the tool chain, like build automation, server provisioning, or automated testing. While this approach is effective at optimizing a department or a single step in the SDLC workflow, it is rarely effective at optimizing the overall productivity of the system or accelerating deployments.

For example, I recently developed an MVP tool chain for a customer who was responsible for build and release management but wasn’t positioned to collaborate with the development, test, and infrastructure groups. In the end, the tool chain consisted of only a code repository, an artifact repository, a compiler, and a deployment engine. It did not include unit test integration or code analysis, and it wasn’t linked to an infrastructure provisioning system. While this tool chain did speed up his team’s ability to create deployable code, it couldn’t validate that the artifacts being created were any good, couldn’t provide real-time feedback to development teams on code quality and functional completeness, and couldn’t verify that the packaged application was compatible with the available infrastructure. (Later stages of the roadmap did expand on this early MVP, but the example is still relevant.) Without collaboration, the value of a tool chain is limited, its ability to impact IT performance is negligible, and the work completed is at risk of significant rework (even a full rewrite) once an enterprise vision is established.

The remediation is to collaborate more through cross-functional design and build teams. In other words, work across departments, practice influencing others, and, most importantly, learn from others so that teams gain greater awareness of the broader development and deployment pipeline. Reinforce collaboration through shared metrics focused on system throughput: how many changes are successfully deployed to production, rather than departmental metrics like systems provisioned, production outages, or code complete.
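As a concrete illustration of a shared throughput metric, the sketch below (Python, with a hypothetical deployment log) counts the changes successfully deployed to production over a rolling window; in practice the records would come from your deployment engine or CI/CD system of record.

```python
from datetime import datetime, timedelta

# Hypothetical deployment log; real records would come from the
# deployment engine or CI/CD system of record.
deployments = [
    {"env": "PROD", "status": "success", "time": datetime(2015, 9, 1)},
    {"env": "PROD", "status": "failed",  "time": datetime(2015, 9, 3)},
    {"env": "PROD", "status": "success", "time": datetime(2015, 9, 10)},
]

def throughput(events, now, window_days=30):
    """Count changes successfully deployed to production in the window."""
    cutoff = now - timedelta(days=window_days)
    return sum(
        1 for e in events
        if e["env"] == "PROD" and e["status"] == "success" and e["time"] >= cutoff
    )

print(throughput(deployments, now=datetime(2015, 9, 30)))  # -> 2
```

The point of the metric is that every department shares it: it only improves when the whole pipeline, not any single step, gets faster.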

Pitfall 2: Closed architecture (not easily extensible)

This is the ultimate paradox, in my opinion. DevOps tool chains are built to handle the rapid introduction of application and infrastructure changes into an environment, yet many of these tool chains are themselves inflexible and unable to easily add or remove tools. When designing a tool chain, I urge customers to think globally, past the immediate problem statement, pilot application, or step in the workflow. As your DevOps transformation gains steam, new tools are going to be added to support non-conforming legacy workloads as well as new, cloud-native technologies like Docker or Cloud Foundry. To account for this, DevOps tool chains need to embrace the concepts and best practices inherent in service-oriented architecture: encapsulation, abstraction, and decoupling. By building a set of loosely coupled services, such as Unit Test or Create VM, that are linked to an encoded workflow and abstracted by an API framework, your tool chain becomes flexible and adaptable, as long as new tools have compatible APIs. Using this pattern, your tool chain becomes a set of objects, parameters, and policies that define how and when to use a tool once a specific set of rules is met. In practice this approach does add complexity and requires a development contract for adding new tools, but in exchange it offers significant flexibility and resiliency for your tool chain and your investment.

The remediation is to design for openness from the start: wrap each tool in a loosely coupled service, abstract those services behind a common API framework, and encode the workflow as data so that tools can be added or retired without rewriting the pipeline.

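To make the pattern concrete, here is a minimal sketch in Python with invented service names: each tool hides behind a common adapter interface, and the workflow itself is just data, so swapping a tool means registering a new adapter rather than rewriting the pipeline.

```python
from typing import Callable, Dict

# Registry of tool adapters. Each adapter hides a vendor-specific API
# behind the same call signature, so tools can be swapped freely.
SERVICES: Dict[str, Callable[[dict], bool]] = {}

def service(name: str):
    """Register a function as a named tool-chain service."""
    def wrap(fn):
        SERVICES[name] = fn
        return fn
    return wrap

@service("unit_test")
def run_unit_tests(params: dict) -> bool:
    # Adapter for whatever test runner is in use today.
    print(f"running unit tests for {params['app']}")
    return True

@service("create_vm")
def create_vm(params: dict) -> bool:
    # Adapter for the current provisioning tool.
    print(f"provisioning VM sized {params.get('size', 'small')}")
    return True

# The workflow is data, not code: an ordered list of service names
# plus the policy parameters each step needs.
WORKFLOW = [
    ("unit_test", {"app": "billing"}),
    ("create_vm", {"size": "medium"}),
]

for step, params in WORKFLOW:
    if not SERVICES[step](params):
        raise SystemExit(f"step '{step}' failed; halting pipeline")
```

Replacing the provisioning tool, for example, only means registering a new `create_vm` adapter; the encoded workflow and every other service are untouched.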

Pitfall 3: Cannot replicate on-demand

I love introducing (or re-introducing) customers to Chaos Monkey™. For those unfamiliar, Chaos Monkey™ is a utility that destroys a live production server at random. It was developed to test the resiliency of the infrastructure, and the response time of the automation system, in the face of failures. Most customers I meet think this concept is crazy: intentionally destroying a system just to make sure you can recover with little to no impact to service. If you don’t have any confidence in your automation platform and MTTR metrics keep you up at night, I agree that running Chaos Monkey™ is bananas (sorry, couldn’t resist).
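For those who want to see how small the core idea is, here is a toy sketch against a hypothetical fleet; it illustrates the concept only and is not Netflix’s actual implementation.

```python
import random

# Hypothetical fleet inventory; a real Chaos Monkey would query the
# cloud provider's API for instances tagged as eligible targets.
fleet = ["web-01", "web-02", "api-01", "api-02"]

def terminate(instance: str) -> None:
    # Stand-in for a real terminate call to the infrastructure API.
    print(f"terminating {instance} -- automation should now recover it")

# Pick one victim at random; resilient automation should detect the
# loss and rebuild the instance with no impact to service.
terminate(random.choice(fleet))
```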

Being able to replicate on-demand has two prerequisites: a known good state and an automated build process. A known good state is a specific version of an application and its corresponding configurations, coupled with a specific version of an infrastructure and its configurations. All of these artifacts are created and tested in the development (DEV) stage of the SDLC and then reused in later stages for testing. Release candidates, or packaged artifacts, that successfully move through the SDLC are promoted to production (PROD). Each PROD release defines a new known good state, which becomes the standard against which all new changes are tested.
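To make this concrete, a known good state can be captured as a small, versioned manifest. Here is a minimal sketch in Python; the field names and version strings are invented for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class KnownGoodState:
    """Immutable record pairing the app and infra versions that were
    tested together and promoted to PROD."""
    app_version: str     # e.g. a git tag or build number
    app_config: str      # version of the application config set
    infra_version: str   # version of the infrastructure code
    infra_config: str    # version of the environment config set

# Each PROD release defines the new baseline all changes test against.
BASELINE = KnownGoodState(
    app_version="billing-2.4.1",
    app_config="cfg-2.4.1",
    infra_version="infra-1.9.0",
    infra_config="env-1.9.0",
)
```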

The second prerequisite is an automated build and configuration process. These automated processes are needed to take bare-metal servers, virtual machines, containers, and/or existing systems and convert them into working application environments. Integrated tooling enables each layer of the environment and its corresponding configurations to be systematically applied until the desired state (i.e., the known good state) is achieved. Tools like Puppet and Chef take this paradigm a step further: they treat the known good state as a ‘declared state’ and manage the environment to maintain that ‘declared state’. Cloud-native technologies, like Pivotal Cloud Foundry or Docker, shift this paradigm a bit and deliver a standardized container as a service in which an application can run; developers wanting to use these platforms write code that complies with the usage contract of a given container. Regardless of what type of tooling your application requires, automating the end-to-end build process is key to replicating environments on-demand.
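The ‘declared state’ paradigm that Puppet and Chef implement can be illustrated with a toy convergence loop: compare the actual state to the declared state and apply only the differences. The sketch below illustrates the concept only; it is not either tool’s actual engine.

```python
# Declared state: what the environment should look like.
declared = {"nginx": "installed", "app_user": "present", "port_8080": "open"}

# Actual state, as a hypothetical inspection of the server would report it.
actual = {"nginx": "absent", "app_user": "present", "port_8080": "closed"}

def converge(declared: dict, actual: dict) -> dict:
    """Apply only the changes needed to reach the declared state."""
    for resource, desired in declared.items():
        if actual.get(resource) != desired:
            print(f"fixing {resource}: {actual.get(resource)} -> {desired}")
            actual[resource] = desired  # stand-in for a real change action
    return actual

converge(declared, actual)  # idempotent: a second run changes nothing
```

Because the loop only acts on drift, running it repeatedly is safe, which is exactly what lets these tools continuously enforce the declared state.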

To remediate the inability to replicate on-demand, remember these two prerequisites (a minimal sketch of the promotion pattern follows the list):

1. Version your infrastructure like the application’s code base. By versioning infrastructure AND code, you can confidently link known good infrastructure configurations with known good, compatible application versions.

2. Build once and replicate deployments from known good packages. The infrastructure build and code packaging process should occur only once, during the DEV stage. All subsequent phase/quality gates (including PROD) should be assembled from known good artifacts and deployment scripts that have been tested repeatedly through the SDLC and deployment processes.
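Here is a minimal sketch of the build-once, promote-everywhere pattern (names invented for illustration): the package is built a single time in DEV, and the same immutable artifact, identified by its digest, moves through every later gate.

```python
import hashlib

def build_once(source):
    """Build in DEV only; return the artifact and its content digest."""
    artifact = f"package-built-from:{source}"  # stand-in for a real build
    digest = hashlib.sha256(artifact.encode()).hexdigest()[:12]
    return artifact, digest

def deploy(artifact, digest, stage):
    # Stand-in for the deployment engine; every stage receives the
    # identical, already-tested package -- never a rebuild.
    print(f"deploying package {digest} to {stage}")

artifact, digest = build_once("billing@2.4.1")
for stage in ("TEST", "STAGE", "PROD"):
    deploy(artifact, digest, stage)
```

The digest is what makes the guarantee auditable: if the identifier deployed to PROD matches the one tested in TEST, you know nothing was rebuilt along the way.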

Wrapping up

While there are many other factors that can contribute to the overall success or failure of your DevOps transformation, paying careful attention to how you select tools and build continuous delivery tool chains will give you a foundation that can mature and scale as you transform the enterprise.

1 Conway, Melvin E. (April 1968). “How Do Committees Invent?” Datamation 14 (5): 28–31. Retrieved 2015-04-10.


About Bart Driscoll


Director, DevOps Services

Bart Driscoll is the Director for the Application Modernization discipline within the Americas. Application Modernization services deliver a full spectrum of assessment, planning, implementation, and transformational services to enable customers to drive down cost and accelerate time to value in application delivery. This group specializes in Mainframe Transformation, Legacy Application Modernization, and DevOps.

Prior to his current role, Bart was the Application Architecture, Design, and Development Managing Principal for the Northeast/Canada division. Here Bart managed a portfolio of programs and projects focused on application development and software delivery optimization. While his primary responsibility was delivery oversight, Bart also played an active role in presales and helped grow bookings in region by over 100% in two years.

Bart has broad experience in IT, ranging from network engineering to help desk to application development. He has spent the last 15 years honing application development and delivery skills in roles such as Information Architect, Release Manager, Test Manager, Agile Coach, Business Architect, and Project/Program Manager. Bart holds certifications from PMI, Agile Alliance, Pegasystems, and Six Sigma.

Bart earned a bachelor’s degree from the College of the Holy Cross and a master’s degree from the University of Virginia.




7 thoughts on “Common DevOps Tool Chain Pitfalls”

  1. Love the perspective. On your second pitfall, the IT4IT reference architecture put together within The Open Group addresses this topic. If DevOps tooling adheres to that reference architecture, it can help address this problem. Granted, this is fairly new thinking and vendors are not yet aggressively adopting it, but I think you will start to see a shift here in the industry over the next couple of years.

    • Great question. There are multiple places to inject security into a DevOps tool chain. But first, we need to consider two different use cases. Use case 1 is internal: it targets the development lifecycle and is focused on ensuring that security best practices are followed and that approved frameworks and code patterns are employed by product teams. Use case 2 is external: it targets production and is focused on adding new or updated environments to security monitoring and management platforms like Archer.

      I will start with use case 2, as the answer is fairly short. As part of production deployments, agents and/or configurations are automatically deployed via declarative scripts that define the parameters and attributes of a system. The automation configures the environment, validates that it is built to spec, and adds it to the management platforms (security, cloud, etc.). This assumes that the needed CRUD APIs are available on the management platform. Using automation rather than manual configuration is both more secure and less error-prone.
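      As a rough sketch of this registration step (the endpoint, payload, and platform API below are hypothetical, not Archer’s actual interface), deployment automation can declare each new environment to the monitoring platform over a CRUD API:

```python
import json
import urllib.request

# Hypothetical management-platform endpoint; a real integration would
# use the platform's documented CRUD API and authentication.
REGISTRY_URL = "https://security-platform.example.com/api/environments"

def register_environment(name: str, agents: list) -> None:
    """Declare a newly deployed environment to the monitoring platform."""
    payload = json.dumps({"environment": name, "agents": agents}).encode()
    req = urllib.request.Request(
        REGISTRY_URL, data=payload,
        headers={"Content-Type": "application/json"}, method="POST")
    urllib.request.urlopen(req)  # would raise on failure in this sketch

register_environment("billing-prod", ["av-agent", "log-forwarder"])
```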

      As for use case 1, we want to start designing, developing, and testing for security as early in the development lifecycle as possible. This starts in DEV with proven, pre-configured services for environments and frameworks. It continues into TEST, where code is analyzed to ensure that best practices, development standards, frameworks, and security policies were used in development. Code that fails any of these tests “breaks the build” and requires the development team to correct the error before promoting the code. In later stages, we can introduce pen testing or other types of automated security tests as part of the standard. The goal of DevSecOps (the next evolution of DevOps) is to move testing left, closer to development, so that we start designing security into the architecture and regularly check for common security issues (like unencrypted passwords, reuse of passwords across layers, and open port requirements). The aim is to create higher-quality code early so that your attack surface in production is as hardened as possible.
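      A minimal sketch of the “breaks the build” gate described above (the scanner command and report format are invented for illustration): if the analysis reports any violations, the pipeline exits nonzero and the code cannot be promoted.

```python
import json
import subprocess
import sys

def security_gate(report_path: str) -> None:
    """Fail the build if the scan report contains any violations."""
    with open(report_path) as f:
        findings = json.load(f).get("violations", [])
    if findings:
        for v in findings:
            print(f"security violation: {v}")
        sys.exit(1)  # nonzero exit breaks the build
    print("security gate passed")

# Hypothetical scanner invocation writing a JSON report.
subprocess.run(["security-scanner", "--out", "report.json"], check=True)
security_gate("report.json")
```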