August 23, 2018
The following is an excerpt from a presentation by Anne Marie Fred, a Senior Engineering Manager at IBM, titled “Compliance and Audit Readiness: The DevOps Killer?”
You can watch the video of the presentation, which was originally delivered at the 2018 DevOps Enterprise Summit in London.
Think about the software development teams that you work with on a regular basis, how many would you say are deploying software at least once per year?
How about once a month? Once a week? Once per day?
Now imagine, and maybe it’s not so hard to imagine, that you’ve built a compliance and audit readiness culture and processes around deploying maybe a few times per year, and suddenly you’re deploying several times per day.
What kind of pain would you experience? That’s what I’m going to share about today.
I’ve worked at IBM for 17 years as a software engineer, the last three as a manager. It’s very important to note that I am not an attorney, not a compliance expert, and not a consultant; you get what you pay for. These are my personal memories and opinions. What I hope you will do is take back anything you find interesting, run it by your own people, and see if they’re interested in trying it.
IBM has about 350,000 employees, and several hundred of us are in the Digital Business Group.
What we do in our group is manage IBM’s digital presence worldwide. That includes websites, pricing information, checkout, provisioning software-as-a-service offerings when you order them, search engine optimization, analytics, developer outreach programs like developerWorks, and even conferences and events. As you can see, we do a great deal of customer-facing work, but we don’t sell any products ourselves.
If you look at my reporting chain, you’ll see IBM at the top and then the Digital Business Group.
There are about 75 squads in DBG. If you’re not familiar with a squad, it’s basically an autonomous team with a clear mission, and they have everybody they need on that team to deliver on that mission. So, they have their own business owners, designers, developers, project managers, and so on.
And myself, I am a manager for four squads, and between these four squads, we’re responsible for roughly 150,000 of the web pages on IBM.com.
But in large enterprises, we have to do compliance at scale. We have thousands of applications and services at IBM, and for each one of them we have to ensure compliance. We have to worry about all of these things, and it can get a little bit overwhelming.
If your CEO asked you today what applications and services you were running right now, would you be able to answer that question? How long would it take you to answer the question? How accurate would it be? How many things are deployed out there that you don’t even know about, necessarily?
What you need is an application and service registration system. We call ours the enterprise application library. It includes information like the system application name, the business and engineering owners, and other basic information.
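A minimal sketch of what such a registration record might look like. The field names here are illustrative, not the actual schema of IBM's enterprise application library:

```python
from dataclasses import dataclass

# Hypothetical sketch of an application registry record. Field names
# are illustrative; the talk only mentions the system/application name,
# the business and engineering owners, and "other basic information."
@dataclass
class AppRegistration:
    system_name: str
    business_owner: str      # accountable for the business function
    engineering_owner: str   # accountable for the running system
    description: str = ""
    processes_personal_data: bool = False

registry: dict[str, AppRegistration] = {}

def register(app: AppRegistration) -> None:
    """Register the app first; follow up on compliance afterwards."""
    registry[app.system_name] = app

register(AppRegistration(
    system_name="product-pages",
    business_owner="a.owner@example.com",
    engineering_owner="e.owner@example.com",
))
```

The point of keeping registration this lightweight is the one the talk makes next: make registration easy, then chase compliance afterwards.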
We had this library, and people saw this as a perfect opportunity to make sure that systems were compliant at their first release.
They said, “You have to register before you can release, and you have to be compliant before you can register.” The problem is that meant for some applications, like those that process personal data, it was taking six to eight weeks to register the application.
I think this is a mistake. You should make it very easy to register an application and then follow up on the compliance immediately after that.
Because what happens is developers are like water. If you make something difficult, they will find a way around it. People were very resistant to registering the applications.
Make it easy, and then make it very clear what your guidelines are for which applications need to be registered with your registration system. Finally, you want to have very clear people who are personally responsible for making sure that those applications are there.
For us, that’s the business owners and our HR managers.
‘What is your business continuity value?’
For this, we ask people to assess how critical the application or service is to IBM’s business.
We ask them to think about things like:
Depending on your answers to these questions, you’ll get a score, 1-4, of what your business continuity value is.
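The actual questions and weighting were not specified in the talk, so the sketch below simply maps a count of “yes” answers about criticality to a 1–4 score; the question names are hypothetical:

```python
# Hypothetical BCV scoring sketch: the real questionnaire and weighting
# are not described in the talk, so this just counts "yes" answers and
# caps the score at 4.
def business_continuity_value(answers: dict[str, bool]) -> int:
    """Return a BCV score from 1 (low) to 4 (high criticality)."""
    yes_count = sum(answers.values())
    return min(1 + yes_count, 4)

score = business_continuity_value({
    "revenue_impacting": True,   # hypothetical question
    "customer_facing": True,     # hypothetical question
    "regulatory_exposure": False,
})
```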
A high BCV score means you need more caution in how you deploy your service. For those applications, we need to have offsite data backups, a disaster recovery plan that you’ve actually practiced and tested, an IT support workforce continuity plan, and so on.
You don’t ever want to be in a situation where at any time your systems are exposed to hacking, right? But, again, with frequent deployments, we can’t rely on manual processes to enforce this.
Fortunately, there’s a whole field of study in this now. It’s called DevSecOps.
I also want to mention one thing that was particular to compliance, which is this GDPR secure by design requirement. Your applications need to be secure by design.
What does that mean? It’s an evolving area, and I think that we’re growing in our understanding of what that means to us, as well.
Here are a few things that we’re doing to make our applications secure by design.
The first is a security questionnaire, which we use for your first deployment into production.
This is a set of a couple dozen questions, and what we are doing is asking you to fill this out in order to educate the security architect on how your application works and how it’s secured.
Then you send that to the architect, and they review it with you. So that together, you come up with a set of remediation steps that you need to take. You get those completed, and then everybody signs off that it’s good at the first deployment.
One thing that we’re starting to ask people is, “Have you considered the security implications of this change before you make it?”
This is early on in the planning process before you write a line of code. Just asking people to spend 10 seconds thinking that through.
Secondly, in our code reviews, we’ve trained our developers to ask, “Does this code change that I’m about to put in have any security implications? Has somebody done something silly like check a private key into our source code repository?” (That never happens…) These are two good ways we maintain this on an ongoing basis.
We bring in outside consultants who try to hack into our systems on a regular basis. They write up bug reports for any vulnerabilities that they find or even potential vulnerabilities. Then they don’t just throw it over the wall. The nice thing is, they stay with us and help us fix them.
We have two classes of automation tools that are used pretty broadly. One is static code analysis tools. These are able to actually process any number of different programming languages and find common vulnerabilities or mistakes that people make in their code. These can run in every build, and they can fail your build and prevent a deployment that would make your application less secure.
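A build gate around a static analysis tool can be sketched like this. The scanner command is a placeholder, not a specific IBM tool; the shape is just “run the scan, and block the deployment on a nonzero exit code”:

```python
import subprocess
import sys

# Sketch of a CI build gate: run a static analysis scanner and fail
# the build if it reports findings. The scanner command passed in is a
# placeholder; any tool that signals findings via exit code fits here.
def security_gate(scanner_cmd: list[str]) -> bool:
    """Return True if the scan passed, False if the build should fail."""
    result = subprocess.run(scanner_cmd, capture_output=True, text=True)
    if result.returncode != 0:
        print("Security scan failed; blocking deployment.", file=sys.stderr)
        print(result.stdout, file=sys.stderr)
        return False
    return True
```

Wiring this into every build is what lets the check “fail your build and prevent a deployment” without any manual step.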
We also have web crawlers like IBM AppScan Web that run against our production servers on a frequent basis, maybe daily. They are checking for common exploits and hacks. Again, they can create a report and automatically open the defect against us, and we can fix that very quickly.
“Access control is the selective restriction of access, whereas permission to access a resource is called authorization.”
In a DevOps world, some of the access controls that we see frequently are API keys, IDs, and passwords.
Just a couple of rules of thumb that we find useful: we want to use individual credentials for any manual action so you can trace who made a change. This is great for audit or fraud detection. You never want to share a password, even amongst the team, because you lose that auditability and traceability.
In cases where it makes sense for something to last a long time, if people come and go, functional IDs are a really good answer there. We also have API keys that can be either long-lived or short-lived, depending on the account they come from. We actually have our manager set up the functional IDs and own those, and then they will just encrypt the secrets before they put them into our deployment pipeline.
A global privacy assessment is one of the few things we do require people to complete before delivering to production. It’s another questionnaire: What personal data do you collect? How do you process it? How do you store it? Who uses it and why (the purpose of the processing, which is very important to GDPR)? What access controls have you put in place? What countries are involved in storing or processing the data? And so on.
The answers to this questionnaire are then reviewed by our legal and privacy experts in various countries, to make sure that we’re not breaking any local laws. The output of this, just like our IT security testing, is a series of actions that are required to comply with the laws, including GDPR.
We do have two fast passes through the global privacy assessment. One is for applications that don’t store or process any personal data at all.
Another is for applications that only process a very limited type of personal data, which I’ve heard called pseudonymized data: data that are not personal in nature but are a reference to a person. This is something like an IP address, or an internal ID number we use to identify a person that does not equal their email address. If all you have is some IP addresses in your logs, there’s a fast pass through this assessment; you can get through it in a day or two. This is the process that was taking six to eight weeks if you do process personal data.
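The fast-pass triage described above can be sketched as a simple classifier. The three categories come from the talk; the function itself is illustrative:

```python
from enum import Enum

class DataProfile(Enum):
    NONE = "no personal data"
    PSEUDONYMIZED = "pseudonymized only"  # e.g. IP addresses, internal IDs
    PERSONAL = "personal data"

# Illustrative sketch of the global privacy assessment triage: apps
# with no personal data, or only pseudonymized data, take the fast
# pass; everything else goes through the full assessment.
def assessment_track(profile: DataProfile) -> str:
    if profile in (DataProfile.NONE, DataProfile.PSEUDONYMIZED):
        return "fast pass (a day or two)"
    return "full global privacy assessment (weeks)"
```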
The good news on GDPR compliance is that everything we talked about earlier sets you up very well for it.
You also need to ensure that the third party services you’re working with are themselves GDPR compliant. For example, anybody who’s a processor for us, we want to make sure that they’re GDPR compliant.
We also want to make sure they’re up to our IT security standards, and we do it through our procurement process. We are not allowed to pay for a third party service unless they have agreed to abide by our standards. This can make procurement take a longer time, but for us, it’s worth it. We’ve renegotiated so many contracts because of GDPR, and I think many people did as well.
Another thing that’s kind of interesting about GDPR is the data subject access requests (DSAR).
DevOps makes this a little bit more difficult, especially microservices. We have many services with many small databases. You can end up with a proliferation of personal data repositories.
How many of them are storing a copy of some data from the user’s profile, so they don’t have to look it up again later?
To address this, the first step was really to identify where our personal data repositories were. We started with our application registry and went from there.
We said, “Okay, here are a series of questions that will tell you, yes or no, whether you’re a personal data repository. If you are, you’re going to participate in the DSAR process when it comes out.” This turned out to be a pretty powerful motivator for people to get rid of extra personal data repositories.
We took the profile data and we centralized it in one place. We said, “If you can, please rewrite your applications so instead of storing a copy of somebody’s profile data, you look it up from the profile service every time.”
Furthermore, on the profile APIs, they are asking you what is the purpose for which you are going to use this data? The profile service is connected to the consent service, so it can look up what kinds of processing each customer has consented to, and they will only send you the data that you’re allowed to use for that purpose.
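The purpose-and-consent lookup described above might look something like the sketch below. All of the names, data shapes, and purposes here are hypothetical; the talk only describes the behavior (the profile service consults the consent service and returns only the fields you may use for the stated purpose):

```python
# Hypothetical in-memory stand-ins for the consent service, the
# profile service's data, and a purpose-to-fields policy. None of
# these names come from IBM's actual systems.
CONSENT = {
    "user-123": {"marketing": False, "order_fulfillment": True},
}
PROFILE = {
    "user-123": {"email": "user@example.com",
                 "shipping_address": "221B Baker St"},
}
FIELDS_BY_PURPOSE = {
    "marketing": ["email"],
    "order_fulfillment": ["email", "shipping_address"],
}

def get_profile(user_id: str, purpose: str) -> dict:
    """Return only the profile fields consented to for this purpose."""
    if not CONSENT.get(user_id, {}).get(purpose, False):
        return {}  # no consent recorded for this purpose
    allowed = FIELDS_BY_PURPOSE.get(purpose, [])
    return {k: v for k, v in PROFILE.get(user_id, {}).items()
            if k in allowed}
```

The design point is that callers never see fields they have no consented purpose for, so consent enforcement lives in one service instead of in every consumer.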
Many of our services did that, and then they deleted their copies of the data. We also had many services that maybe had personal data for some kind of fluffy function that we really didn’t care about anymore. So we just got rid of some features. We even shut down entire services because they would have been too difficult to remediate. This kind of dovetailed nicely with this server consolidation project that we were in the middle of.
It’s for that pseudonymized kind of data: those IP addresses and internal ID numbers. It wouldn’t be very helpful if you asked a company what data it has on you and we said, “Well, what’s your IP address?” No.
In fact, even if we tell them what their internal ID number is, it’s not very helpful to them. It’s more helpful for us to say, as a blanket statement, “If you visited our website, we’ve logged your IP address, and we use your internal ID number across our systems to track your sessions and so on.”
The other thing that makes that easier is that we consider those types of data don’t require consent. We need to keep your IP address in order to keep our systems secure, to prevent a denial of service attack, to respond to fraud or security problems.
Separation of duties is the practice of requiring more than one person to complete a task. Its intent is to prevent fraud and error.
With DevOps, you might not have a separate operations team to divide your duties with; it’s the same people.
Separation of duties is required for some sensitive applications by law and by best practice; teams handling things like healthcare data do still have it. But for things like websites and web servers, which is a lot of what we run, it’s overkill.
For us, it’s more about the spirit of the law. How are we going to prevent fraud and errors without actually having many people involved in the deployment process?
For example, on our web servers we don’t just have ping tests making sure the host is up, we also have tests that display the page and check for certain words on the page. We have visual regression checker tests. I don’t know if you’ve ever seen those, but it’s actually like an image of the page. It will raise an alert if the page has changed, so you can make sure that was intentional, and so on. And of course, we have our security checks on a regular basis.
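A content check that goes beyond a ping test can be as simple as verifying that expected words actually appear in the served page. In practice the HTML would come from an HTTP request against the production server; the function here just takes the page text so the idea stands on its own:

```python
# Sketch of a post-deployment content check: a ping test only proves
# the host is up, so we also verify that expected words appear in the
# rendered page. The sample page below is made up for illustration.
def page_contains(html: str, required_words: list[str]) -> bool:
    """Return True only if every required word appears in the page."""
    text = html.lower()
    return all(word.lower() in text for word in required_words)

sample = ("<html><body><h1>IBM Cloud</h1>"
          "<p>Pricing and checkout</p></body></html>")
```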
Accessibility is the design of products, devices, services, or environments for people who experience disabilities.
IBM takes accessibility very seriously. We follow all of the standards like the World Wide Web Consortium and the Web Accessibility Initiative and then have our own accessibility standards on top of that. We don’t just require that the applications that we sell for government bids are accessible. Our internal standard is that all of our websites will be accessible, and even our internal documentation and training. This is something that touches all of us who develop software at IBM.
Now with DevOps, this is a little bit more difficult, because again we have frequent deployments.
In the past, the standard was that you did accessibility tests pretty late in the product development cycle when the UI was fairly settled, but this doesn’t work anymore. We had to develop processes that made this lightweight and fast.
Fortunately, we have a website that’s publicly available to everybody, which is www.ibm.com/able. I strongly suggest that everybody go out there and take a look at that website. There’s a lot of best practices for accessibility. As you can see, there’s actually a title on that page about how to streamline your agile DevOps processes. We have open source tooling that’s available for everyone to use, where you can check your source code and your web pages for accessibility.
The easiest example is probably screen readers. It’s hard to tell whether something will make sense when you hear it through a screen reader unless you’ve actually tried it. We do very intensive accessibility tests the first time an application is deployed, usually right after it’s deployed.
We’ll maybe spend a week where the whole team is finding any accessibility problems and fixing them. We also have periodic manual checks, where somebody will go back and sort of spot check pages and see if they find any problems.
Two other things that we do, similar to our security checks, is we ask people to think about accessibility when they’re doing code reviews.
Part of testing a change to the user interface should be to actually bring it up in the browser with plugins like the Chrome DAP plugin, which show you whether the accessibility looks good. Is your contrast good? Can you tab through the page? Etc.
Modern package management tools like NPM make it trivially easy to pull open source into your project. They’re very widely used in the DevOps environment.
Furthermore, if you just pull in one package, you’re usually going to end up automatically pulling in several other packages, without even knowing that it’s happening. We actually have different processes for software that we sell and internal use software. Most of what we’re doing continuous delivery on is internal use software. But not all.
For software that we sell, we have a complete sign-off: for each release, teams have to list every single package and version, what its license is, and whether it was approved by our open source standards committee. For internal-use software, we allow teams to self-certify their compliance, and then we give them tools to make that easier.
The first thing is to educate everybody on open source, right?
It was kind of woven in there. One thing that we have for audit specifically is documentation. One thing that’s great is to standardize the documentation for all the areas that we talked about as much as possible.
A very simple thing that’s working for us is Box folders. Box is not an IBM company; it’s a third-party service we use that provides secure shared cloud storage. We have folders that roughly mirror the organizational structure.
For example, there’s one folder for the Commerce platform, which is my boss’s level. Then there’s a subfolder for each squad in that area. Within each squad’s folder, they’re responsible for gathering up the compliance documentation that they need.
The individual pieces of documentation may have additional access controls if they’re sensitive, but at least this gives our managers and our project managers a very easy place to go, in case they have a request for audit.
If you’re requiring somebody to assert that they’re compliant to a standard, you can actually just set up a readme file in a GitHub repository, and they can make a pull request to sign their name to it. It’s a very auditable and traceable sign-off.