The following is an excerpt from a presentation by John Willis, who needs no introduction, titled “Containers and Next Gen Infrastructure Ecosystem.”
You can watch the video of the presentation, which was originally delivered at the 2018 DevOps Enterprise Summit in London.
Four years ago, at the DevOps Enterprise Summit, I was asked to do a talk about Docker for managers. Back then nobody knew what was going on, like, ‘what is Docker? How does it work?’ I thought about this, and I probably should have called it ‘Next Generation Infrastructure for Managers.’ So, today, I’d like to share with you a high-level view of this technical infrastructure, as I think about it.
It’s Kubernetes in containers. You can leave now.
But notice I didn’t say Docker; I said containers.
I’m going to share a little about:
- The Open Container Initiative (OCI)
- I’m going to try to put in context “What is the container ecosystem?” There’s an incredible amount of confusion right now when people try to describe containers. So I’ll be breaking that down quite a bit.
- Service Mesh: If you’ve heard of Istio and Envoy, I’ll put them in perspective.
- Kubernetes API Extensibility: the single most interesting thing going on in this space, and a real bet on the future.
First, a little bit about the OCI
Basically, Docker originally used Linux containers, LXC. Then, as they were evolving, they wrote something called ‘libcontainer.’ Now, you’ve got to give Docker incredible credit for taking what only a very few companies were using at scale and commodifying it for the rest of us. Then, at some point, when the OCI was created, they donated the runtime built on libcontainer, which is runC.
Today runC is the predominant runtime; most of the players you would be interested in are running it. There’s also a lot of work, and a lot of arguing, about the image specification, which is owned by the OCI as well. Long story short, the OCI is a really good place to keep track of where things are going, because, like I said, all the players I would consider first tier in this game are pretty much running runC and arguing over the image spec.
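To make the runtime side of this concrete, an OCI runtime bundle is just a root filesystem plus a config.json. Here is a trimmed sketch of such a config; the field names come from the OCI runtime spec, and the values are illustrative:

```json
{
  "ociVersion": "1.0.0",
  "process": {
    "terminal": true,
    "user": { "uid": 0, "gid": 0 },
    "args": [ "sh" ],
    "cwd": "/"
  },
  "root": {
    "path": "rootfs",
    "readonly": true
  },
  "hostname": "runc-demo"
}
```

In practice, `runc spec` generates a full default version of this file, and `runc run <container-id>` starts a container from the bundle; this is the layer that Docker and the other engines drive under the hood.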
The Container Ecosystem
Let me tell you how this usually plays out. When I go to conferences, like the DevOps Enterprise Summit, I talk to a lot of people, and I ask two questions.
Question #1: “What container implementation are you using?”
And the fun begins.
The answer is always Docker. Then I’m like, “Which version, which type of Docker?” And then they start looking at me weird, and now you get into the real fun stuff: “The open source one.” I’m like, “There is no GitHub Docker/Docker anymore, so which one are you using?”
Nothing. And I get this even from some of the most mature shops, people who started with Docker four years ago!
Now, I appreciate the fact that the word Docker has become generic, like Frisbee or Coke, but it’s amazing how often I ask these questions and people don’t know.
Question #2: “What’s your long-term strategy?”
And even the best of the best will say of their container strategy, “We have no freaking idea,” or “We’re waiting; we’re going to see what Google does,” or “we’re going to see how these all converge,” etc.
It’s a mess. So, instead of us saying ‘Docker’ all the time, I think it’s time to have a more honest conversation.
I would say we need to break it down into three distinct questions, or categories, for how we think about the container ecosystem.
1. “What’s your container runtime?”
Truth is, it’s probably going to be runC, which is pretty much what most people have settled on.
I wanted to talk about more than one, so: Kata Containers is also interesting. It has some nice properties, it’s very lightweight, and there’s a buzz around it. I still think, unless you just want to experiment, for all intents and purposes when you talk about container runtimes it’s runC.
2. “What engine are you running?”
Which is how we get to ‘just Docker,’ right? But remember there are three versions of Docker.
- There’s Moby, which is the open source project; Docker literally renamed Docker/Docker to Moby/Moby on GitHub. To be honest, I don’t think many people are running that.
- There’s the community edition, which is not open source, but it’s free, and it has all the core functionality.
- Then there’s the enterprise edition that you pay for.
So, Docker really comes in three flavors.
Then there’s rkt, which came out of CoreOS.
And, finally, CRI-O, which is interesting because it implements the Container Runtime Interface (CRI), which is how Kubernetes is designed to run containers. For the most part I would put it in a Red Hat bucket, but Google was involved.
I also wanted to cover cloud-based container engines:
- ECS, from Amazon. In all honesty, Amazon does great things, but they didn’t do their initial container offering well.
- ACS, from Azure, which has done a pretty good job.
- And GKE, from Google.
Then there are the container orchestration tools:
- Kubernetes, which, now that the dust is settling, is the orchestration tool.
- Docker has Swarm, which was a good product; they just didn’t invest in it the way they should have.
- And it’s sort of the same thing with Mesosphere, which has gone all in on Kubernetes. I know less about that instantiation, but it seems like everybody is basically following Kubernetes’ lead.
3. “Okay, what is your orchestration engine?”
- Kubernetes: “Okay, which Kubernetes?” Not this again. There’s Kelsey Hightower, who is the Kubernetes god, an amazing guy. He has a Git repo called “Kubernetes the Hard Way,” and I’ve heard a couple of customers tell me they now just call it ‘the hard way.’ It’s the default distribution, if you want to roll up your sleeves.
- Heptio: They’re interesting because they call theirs the un-distribution. The founders of Heptio are two of the original developers of Kubernetes at Google. What they say is, ‘We are going to give you what we call an un-distribution, and we’re going to be open source end to end.’ Which basically means they’re going to try to do something that’s really difficult: give you that promise of enterprise-ready without the proprietary, closed enterprise functionality.
- OpenShift: You’ve got to give OpenShift credit because they’ve been running Kubernetes since literally the early betas. They put Kubernetes into OpenShift early, so they have more burn time. Also, I’m a big SDN fan, and they run Open vSwitch. They have, in my opinion, the best network solution, but there’s some confusion about their long-term strategy commitments. What if you want to run hybrid? Or use Kubernetes as a service versus OpenShift? Are you going to run OpenShift on all your clouds and on-prem?
I talk to a lot of analysts, so I tell them, “Right now OpenShift looks really good from an investment point of view because most of our infrastructure is on-prem.” Basically, if you’re a Red Hat shop, you already have a Red Hat enterprise license and you want to go Kubernetes, this is a safe bet, but it’s going to get really ugly when you start running GKE on Google.
- Docker: There are some good things in Docker; Solomon did some great things, he’s a brilliant young man.
Amazon is still a little short of what we need.
Azure’s doing a great job.
And of course, if you’re on the Google platform and you’re all in on cloud, then GKE is probably the best play right now.
The Service Mesh
The service mesh is being introduced everywhere now, but the concept has been around forever. When we talk about Kubernetes, we’re not only talking about clustering containers but also about how those containers invoke APIs and services. If we don’t think about this, it becomes a wild west of ‘this connecting there, and that connecting there.’
The service mesh model really is designed to be a layer for service to service communication. Remember, in this context, it’s important to think about how you would have pods with containers in them, (let’s just say clusters of Kubernetes with containers in them,) and how they would call other services.
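As a minimal sketch of what that sidecar layout looks like in a pod spec (the names, images, and ports here are illustrative; in a real mesh the proxy container is usually injected for you):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: billing-v1                # hypothetical service pod
spec:
  containers:
  - name: app                     # the actual service
    image: example/billing:1.0    # hypothetical image
    ports:
    - containerPort: 8080
  - name: proxy                   # sidecar: all ingress/egress flows through it
    image: envoyproxy/envoy:v1.6.0
```

The app container never knows the proxy is there; service-to-service calls leave through the sidecar, which is what makes the mesh’s observability and traffic control possible.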
Now, this is where it gets interesting: the service mesh capabilities start with observability. Although you won’t read this in most of the documentation, what you’re getting is all egress and ingress traffic being analyzed by the service mesh.
That opens up the ability to have traffic control, service discovery, load balancing, resilience, deployment strategies, security, and circuit breakers, although circuit breakers are actually more specific to the data plane aspect of this.
Istio is Google’s implementation of a service mesh for Kubernetes. It’s a data plane and control plane architecture, where the data plane is pretty clear but there are a lot of arguments about the control plane. It is still extremely, extremely early on, but this is how the cells are starting to form.
In Istio, the data plane runs as a sidecar model, which means that in a Kubernetes context it runs as a container that is a proxy.
It then sees all ingress and egress data and allows you to do all the magical things that you might have to do, like service discovery, etc.
The control plane, in turn, is the separate set of meta-services that manage and configure the proxies; the data gets sent up to it to be managed.
The control plane for Istio is made up of three services: Pilot, Mixer, and Auth.
- Pilot does service discovery, route rules, destination policy.
- Mixer is telemetry, ACLs, white lists, rate limits, custom metrics.
- Auth is basically all your security: CA, TLS, and encryption.
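As one concrete example of the traffic control Pilot pushes down to the proxies, here is a sketch of an Istio VirtualService (from Istio’s v1alpha3 networking API; the service name and subsets are hypothetical) that canaries 10% of traffic to a new version:

```yaml
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: reviews
spec:
  hosts:
  - reviews                # hypothetical in-mesh service
  http:
  - route:
    - destination:
        host: reviews
        subset: v1
      weight: 90
    - destination:
        host: reviews
        subset: v2         # canary: 10% of requests
      weight: 10
```

Nothing in the application changes; the sidecars enforce the split, which is the whole point of separating the control plane from the data plane.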
The real meat is in the proxy
The proxy, which runs as a sidecar, was actually developed by Lyft. It runs as a container in the pod with the other containers, and it operates at layer seven. It’s basically as if you said in 2017, “Hey, give me all the money in the world, I want to create the perfect proxy.” That’s what Lyft tried to do.
If you look at traffic patterns and how they have changed over the years: it used to be 90% North-South and 10 or 20% East-West. That world has flipped; now 90% of the traffic is East-West. That mandates a different way of thinking about a proxy, and Envoy is that kind of proxy.
In other words, right now, it’s Envoy. Hopping back to the data plane: personally, I think you should spend more time thinking about Envoy than Istio, but that’s an early guess.
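To give a feel for Envoy itself, here is a minimal sketch of a static Envoy v2 bootstrap config that proxies inbound TCP traffic to a local app (the names, ports, and addresses are illustrative):

```yaml
static_resources:
  listeners:
  - name: ingress
    address:
      socket_address: { address: 0.0.0.0, port_value: 10000 }
    filter_chains:
    - filters:
      - name: envoy.tcp_proxy    # L4 here; Envoy's HTTP filters add the L7 features
        config:
          stat_prefix: ingress
          cluster: local_app
  clusters:
  - name: local_app
    connect_timeout: 0.25s
    type: STATIC
    lb_policy: ROUND_ROBIN
    hosts:
    - socket_address: { address: 127.0.0.1, port_value: 8080 }
```

In a mesh you rarely write this by hand: the control plane generates and pushes the equivalent configuration to every sidecar dynamically.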
I will say that Nginx is not going to step out of the game, so they’ve got what they call nginMesh; this is their version of a competitor to Envoy that fits fully into the Istio model.
Kubernetes API Extensibility
You might have heard of this as the operator framework. If you want to see the lineage of how we got here, there was an original CoreOS article about operators that tried to address how you can run stateful apps in Kubernetes clusters.
Then there was a second generation of that discussion. Google adapted it, and you’re now seeing the discussion become less about operators and more about Kubernetes API extensibility.
Here’s what Joseph Jacks has to say, he’s my oracle when it comes to Kubernetes. This is a tweet he just put out recently, he said, “All complex software delivered as a service or behind the firewall should be implemented as a set of custom Kubernetes API extension controllers. Radical efficiencies abound.” I totally agree with this.
Here’s my radical hat: I think Kubernetes becomes the next Linux. I don’t know when that will happen, but I think it’s a 10 or 15 year run of a fabric that becomes how we run all our applications. I know this sounds crazy, but if that happens, consider this: Google has designed a sort of event loop that listens to every ingress and egress of a Kubernetes cluster’s API, at the millisecond level, and you can create your own custom resource definitions and custom resource controllers.
That’s how all of the stateful services, the Redises, the Mongos, the Cassandras, are going to start implementing, and already have.
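A sketch of what that extension point looks like: a CustomResourceDefinition teaches the API server a new resource type, and a controller you write watches instances of that type and reconciles the cluster toward them. Everything below, the group, kind, and fields, is a hypothetical example:

```yaml
apiVersion: apiextensions.k8s.io/v1beta1    # the 2018-era CRD API
kind: CustomResourceDefinition
metadata:
  name: redisclusters.example.com
spec:
  group: example.com
  version: v1
  scope: Namespaced
  names:
    kind: RedisCluster
    plural: redisclusters
---
apiVersion: example.com/v1
kind: RedisCluster     # an instance your custom controller reconciles
metadata:
  name: cache
spec:
  replicas: 3          # desired state; the controller makes it so
```

Once the CRD is registered, `kubectl get redisclusters` works like any built-in type, and your controller is just another client of the API server’s watch stream.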
If you believe this will be the foundation, it’s like being able to go back in time to Linux kernel modules, knowing what was going to happen over the next 20 years. This could be that, and even if we’re wrong, I think you should go investigate this and figure out this technology for your organization.