One of the more straightforward stories about the cloud – at least on the face of it – should have been the story of cloud networking. To understand how simple this story should have been, we only need to look at one number: the virtual LAN ID (VLAN ID). As you might already be aware, VLANs let network administrators divide a physical network into separate logical networks. Because the VLAN ID field in the Ethernet header is 12 bits wide, the maximum number of these logically isolated networks is 4,096. Usually, the first and last VLANs are reserved (0 and 4095), as is
So, basically, we’re left with 4,093 separate logical networks in a real-life scenario, which is probably more than enough for the internal infrastructure of any given company. However, this is nowhere near enough for public cloud providers. The same problem applies to companies that use hybrid-cloud types of services to – for example – extend their compute power to the cloud.
So, let’s focus on this network problem for a bit. Realistically, if we look at this problem from the cloud user perspective, data privacy is of utmost importance to us. If we look at this problem from the cloud provider perspective, then we want our network isolation problem to be a non-issue for our tenants. This is what cloud services are all about at a more basic level – no matter what the background complexity in terms of technology is, users have to be able to access all of the necessary services in as user-friendly a way as possible. Let’s explain this by using an example.
What happens if we have 5,000 different clients (tenants) in our public cloud environment? What happens if every tenant needs to have five or more logical networks? We quickly realize that we have a big problem as cloud environments need to be separated, isolated, and fenced. They need to be separated from one another at a network level for security and privacy reasons. However, they also need to be routable, if a tenant needs that kind of service. On top of that, we need the ability to scale so that situations in which we need more than 5,000 or 50,000 isolated networks don’t bother us. And, going back to our previous point – roughly 4,000 VLANs just isn’t going to cut it.
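The arithmetic behind this example can be sketched in a few lines of Python. This is only an illustration of the numbers used in the text (a 12-bit ID space, 4,093 usable VLANs, and 5,000 tenants with five networks each); the variable names are ours:

```python
# VLAN IDs are 12 bits wide, so the raw ID space is 2**12.
VLAN_ID_BITS = 12
total_vlan_ids = 2 ** VLAN_ID_BITS

# The text counts 4,093 usable VLANs once the reserved IDs are excluded.
usable_vlans = 4093

# Demand from the example: 5,000 tenants with five logical networks each.
tenants = 5000
networks_per_tenant = 5
required_networks = tenants * networks_per_tenant

# How many isolated networks we cannot provide with VLANs alone.
shortfall = required_networks - usable_vlans

print(f"VLAN ID space:     {total_vlan_ids}")
print(f"Required networks: {required_networks}")
print(f"Shortfall:         {shortfall}")
```

Even with this modest tenant count, the demand is roughly six times larger than what VLANs can offer.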
There’s a reason why we said that this should have been a straightforward story. The engineers among us see these situations in black and white – we focus on a problem and try to come to a solution. And the solution seems rather simple – we need to extend the 12-bit VLAN ID field so that we can have more available logical networks. How difficult can that be?
As it turns out, very difficult. If history teaches us anything, it’s that different interests, companies, and technologies compete for years for top-dog status in any area of IT. Just think of the good old days of DVD+R, DVD-R, DVD+RW, DVD-RW, DVD-RAM, and so on. To simplify things a bit, the same thing happened here when the initial standards for cloud networking were introduced. We usually call these network technologies cloud overlay network technologies. These technologies are the basis for software-defined networking (SDN), the principle that describes the way cloud networking works at a global, centralized management level. There are multiple standards on the market to solve this problem – VXLAN, GRE, STT, NVGRE, NVO3, and more.
Realistically, there’s no need to break them all down one by one. We are going to take a simpler route – we’re going to describe one of them that’s the most valuable for us in the context of today (VXLAN) and then move on to something that’s considered to be a unified standard of tomorrow (GENEVE).
First, let’s define what an overlay network is. When we’re talking about overlay networks, we’re talking about networks that are built on top of another network in the same infrastructure. The idea behind an overlay network is simple – we need to disentangle the physical part of the network from the logical part of the network. If we want to do that in absolute terms (configure everything without spending massive amounts of time in the CLI to configure physical switches, routers, and so on), we can do that as well. If we don’t want to do it that way and we still want to work directly with our physical network environment, we need to add a layer of programmability to the overall scheme. Then, if we want to, we can interact with our physical devices and push network configuration to them for a more top-to-bottom approach. If we do things this way, we’ll need a bit more support from our hardware devices in terms of capability and compatibility.
Now that we’ve described what an overlay network is, let’s talk about VXLAN, one of the most prominent overlay network standards. It also serves as a basis for developing some other network overlay standards (such as GENEVE), so – as you might imagine – it’s very important to understand how it works.
Let’s start with the confusing part. VXLAN (IETF RFC 7348) is an extensible overlay network standard that enables us to aggregate and tunnel multiple Layer 2 networks across Layer 3 networks. How does it do that? By encapsulating a Layer 2 frame inside a Layer 3 packet. In terms of transport protocol, it uses UDP, by default on port 4789 (more about that in just a bit). In terms of special requirements for VXLAN implementation – as long as your physical network supports MTU 1600, you can implement VXLAN as a cloud overlay solution easily. Almost all the switches you can buy (except for the cheap home switches, but we’re talking about enterprises here) support jumbo frames, which means that we can use MTU 9000 and be done with it.
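To see why an MTU of 1600 is comfortably enough, we can add up the encapsulation overhead. The following is a minimal sketch, assuming an IPv4 or IPv6 underlay without IP options and an untagged inner Ethernet frame; the function name is ours and is not part of any library:

```python
# Header sizes in bytes (untagged inner frame, no IP options).
INNER_ETHERNET = 14   # inner Ethernet header carried inside the tunnel
VXLAN_HEADER = 8
UDP_HEADER = 8
OUTER_IPV4 = 20
OUTER_IPV6 = 40


def required_underlay_mtu(inner_mtu: int, outer_ip_header: int = OUTER_IPV4) -> int:
    """Smallest underlay IP MTU that carries a full inner frame unfragmented."""
    return inner_mtu + INNER_ETHERNET + VXLAN_HEADER + UDP_HEADER + outer_ip_header


needed_ipv4 = required_underlay_mtu(1500)
needed_ipv6 = required_underlay_mtu(1500, OUTER_IPV6)

print(f"Underlay MTU needed over IPv4: {needed_ipv4}")
print(f"Underlay MTU needed over IPv6: {needed_ipv6}")
print(f"Fits in MTU 1600: {max(needed_ipv4, needed_ipv6) <= 1600}")
```

Under these assumptions, a standard 1500-byte guest MTU fits below 1600 with some headroom to spare, for example for an inner VLAN tag.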
From the standpoint of encapsulation, let’s see what it looks like:
In more simplistic terms, VXLANs use tunneling between two VXLAN endpoints (called VTEPs; that is, VXLAN tunneling endpoints) that check VXLAN network identifiers (VNIs) so that they can decide which packets go where.
If this seems complicated, then don’t worry – we can simplify this. From the perspective of VXLAN, a VNI is the same thing as a VLAN ID is to VLAN. It’s a unique network identifier. The difference is just the size – the VNI field has 24 bits, compared to VLAN’s 12. That means that we have 2^24 VNIs compared to VLAN’s 2^12. So, VXLANs – in terms of network isolation – are VLANs squared.
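To make the VNI a bit more tangible, here is a minimal sketch of how the 8-byte VXLAN header described in RFC 7348 could be packed and parsed in Python. The helper names (`build_vxlan_header`, `parse_vni`, `encapsulate`, and `decapsulate`) are our own, and the sketch ignores the outer Ethernet, IP, and UDP headers that a real VTEP would add:

```python
import struct

VXLAN_FLAG_VNI_VALID = 0x08   # the "I" flag: the VNI field is valid
MAX_VNI = 2 ** 24 - 1         # the VNI field is 24 bits wide


def build_vxlan_header(vni: int) -> bytes:
    """Pack flags (8 bits), reserved (24), VNI (24), reserved (8)."""
    if not 0 <= vni <= MAX_VNI:
        raise ValueError(f"VNI {vni} does not fit in 24 bits")
    return struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)


def parse_vni(header: bytes) -> int:
    """Extract the VNI from the first 8 bytes of a VXLAN payload."""
    flags_word, vni_word = struct.unpack("!II", header[:8])
    if not (flags_word >> 24) & VXLAN_FLAG_VNI_VALID:
        raise ValueError("VNI flag is not set")
    return vni_word >> 8


def encapsulate(vni: int, inner_frame: bytes) -> bytes:
    """Prefix an inner Layer 2 frame with a VXLAN header."""
    return build_vxlan_header(vni) + inner_frame


def decapsulate(payload: bytes) -> tuple[int, bytes]:
    """Split a VXLAN payload back into (VNI, inner frame)."""
    return parse_vni(payload), payload[8:]


vni, frame = decapsulate(encapsulate(5001, b"inner-ethernet-frame"))
print(vni, frame)
print(2 ** 24 == (2 ** 12) ** 2)   # the "VLANs squared" claim
```

The last line simply checks the "VLANs squared" remark: a 24-bit identifier space is exactly the square of a 12-bit one.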
Why does VXLAN use UDP?
When designing overlay networks, what you usually want to do is reduce latency as much as possible. Also, you want to add as little overhead as possible. When you consider these two basic design principles and couple that with the fact that VXLAN tunnels Layer 2 traffic inside Layer 3 (whatever the traffic is – unicast, multicast, broadcast), UDP is the natural choice. There’s no way around the fact that two of TCP’s mechanisms – three-way handshakes and retransmissions – would get in the way of these basic design principles. In the simplest of terms, TCP would be too complicated for VXLAN as it would mean too much overhead and latency at scale.
In terms of VTEPs, just imagine them as two interfaces (implemented in software or hardware) that can encapsulate and decapsulate traffic based on VNIs. From a technology standpoint, VTEPs map various tenants’ virtual machines and devices to VXLAN segments (VXLAN-backed isolated networks), perform packet inspection, and encapsulate/decapsulate network traffic based on VNIs. Let’s describe this communication with the help of the following diagram:
In our open source-based cloud infrastructure, we’re going to implement cloud overlay networks by using OpenStack Neutron or Open vSwitch, a free, open source distributed switch that supports almost all network protocols that you could possibly think of, including the already mentioned VXLAN, STT, GENEVE, and GRE overlay networks.
Also, there’s a kind of gentleman’s agreement in place in cloud networking regarding not using VNIs 1-4999 in most use cases. The reason for this is simple – we still want to keep our VLANs, with their range of 0-4095, separate in a way that is simple and not error-prone. In other words, by design, we leave network IDs 0-4095 for VLANs and start VXLANs with VNI 5000 so that it’s really easy to differentiate between the two. Not using 5,000 VXLAN-backed networks out of 16.7 million VXLAN-backed networks isn’t that much of a sacrifice for good engineering practices.
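This convention is easy to express in code. The following is a small sketch, not taken from any particular product: `classify_network_id` and `VniAllocator` are hypothetical helpers that only encode the ranges mentioned in the text (0-4095 for VLANs, VNIs starting at 5000):

```python
VLAN_ID_RANGE = range(0, 4096)   # IDs left to VLANs
FIRST_VNI = 5000                 # where VXLAN-backed networks start by convention
MAX_VNI = 2 ** 24 - 1            # largest value a 24-bit VNI can hold


def classify_network_id(network_id: int) -> str:
    """Tell which technology a network ID belongs to under the convention."""
    if not 0 <= network_id <= MAX_VNI:
        raise ValueError(f"{network_id} is outside the 24-bit ID space")
    if network_id in VLAN_ID_RANGE:
        return "vlan"
    if network_id >= FIRST_VNI:
        return "vxlan"
    return "unassigned"          # 4096-4999: deliberately left unused


class VniAllocator:
    """Hands out VNIs sequentially, starting at FIRST_VNI."""

    def __init__(self, first: int = FIRST_VNI) -> None:
        self._next = first

    def allocate(self) -> int:
        if self._next > MAX_VNI:
            raise RuntimeError("VNI space exhausted")
        vni = self._next
        self._next += 1
        return vni


allocator = VniAllocator()
first, second = allocator.allocate(), allocator.allocate()
print(first, second, classify_network_id(first), classify_network_id(100))
```

With a rule like this in place, an operator can tell at a glance whether an ID such as 100 or 5001 refers to a VLAN or a VXLAN segment.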
The simplicity, scalability, and extensibility of VXLAN also means more really useful usage models, such as the following:
- Stretching Layer 2 across sites: This is one of the most common problems regarding cloud networking, as we will describe shortly.
- Layer 2 bridging: Bridging a VLAN to a cloud overlay network (such as VXLAN) is very useful when onboarding our users to our cloud services as they can then just connect to our cloud network directly. Also, this usage model is heavily used when we want to physically insert a hardware device (for example, a physical database server or a physical appliance) into a VXLAN. If we didn’t have Layer 2 bridging, imagine all the pain that we would have. All our customers running the Oracle Database Appliance would have no way to connect their physical servers to our cloud-based infrastructure.
- Various offloading technologies: These include load balancing, antivirus, vulnerability and antimalware scanning, firewall, IDS, IPS integration, and so on. All of these technologies enable us to have useful, secure environments with simple management concepts.
We mentioned that stretching Layer 2 across sites is a fundamental problem, so it’s obvious that we need to discuss it. We’ll do that next. Without a solution to this problem, you’d have very little chance of creating multiple data center cloud infrastructures efficiently.
Stretching Layer 2 across sites
One of the most common sets of problems that cloud providers face is how to stretch their environment across sites or continents. In the past, when we didn’t have concepts such as VXLAN, we were forced to use some kind of Layer 2 VPN or MPLS-based technologies. These types of services are really expensive, and sometimes, our service providers aren’t exactly happy with our “give me MPLS” or “give me Layer 2 access” requests. They would be even less happy if we mentioned the word multicast in the same sentence, and this was a set of technical criteria that was often used in the past. So, having the capability to deliver Layer 2 over Layer 3 fundamentally changes that conversation. Basically, if you have the capability to create a Layer 3-based VPN between sites (which you can almost always do), you don’t have to be bothered with that discussion at all. Also, that significantly reduces the price of these types of infrastructure connections.
Consider the following multicast-based example:
Let’s say that the left-hand side of this diagram is the first site and that the right-hand side of this diagram is the second site. From the perspective of VM1, it doesn’t really matter that VM4 is in some other remote site as its segment (VXLAN 5001) spans across those sites. How? As long as the underlying hosts can communicate with each other over the VXLAN transport network (usually via the management network as well), the VTEPs from the first site can talk to the VTEPs from the second site. This means that virtual machines that are backed by VXLAN segments in one site can talk to the same VXLAN segments in the other site by using the aforementioned Layer 2-to-Layer 3 encapsulation. This is a really simple and elegant way to solve a complex and costly problem.
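The decision a VTEP makes here can be modeled with a toy lookup table. This is a simplified sketch and not how any real VTEP is implemented: VM1, VM4, and VNI 5001 come from the example, while VM2, VNI 5002, and all of the IP addresses are hypothetical values that we added for illustration:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Vm:
    name: str
    vni: int       # VXLAN segment the VM is attached to
    vtep_ip: str   # underlay IP of the VTEP on the VM's host (hypothetical)


VMS = {
    "VM1": Vm("VM1", 5001, "10.0.1.11"),   # first site
    "VM2": Vm("VM2", 5002, "10.0.1.11"),   # first site, different segment (hypothetical)
    "VM4": Vm("VM4", 5001, "10.0.2.14"),   # second site
}


def tunnel_for(src: str, dst: str) -> Optional[tuple[str, str, int]]:
    """Return (outer source IP, outer destination IP, VNI) for a frame,
    or None if the two VMs are not on the same VXLAN segment."""
    a, b = VMS[src], VMS[dst]
    if a.vni != b.vni:
        return None   # different segments are isolated at Layer 2
    return (a.vtep_ip, b.vtep_ip, a.vni)


print(tunnel_for("VM1", "VM4"))   # same segment, different sites
print(tunnel_for("VM1", "VM2"))   # same host, different segments
```

Note that the site a VM lives in never appears in the lookup. All that matters is the VNI and whether the two VTEP addresses can reach each other over Layer 3.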
We mentioned that VXLAN, as a technology, served as a basis for developing some other standards, with the most important being GENEVE. As most manufacturers work toward GENEVE compatibility, VXLAN will slowly but surely disappear. Let’s discuss what the purpose of the GENEVE protocol is and how it aims to become the standard for cloud overlay networking.
The basic problem that we touched upon earlier is the fact that history kind of repeated itself in cloud overlay networks, as it did many times before. We ended up with different standards, different firmware, and different manufacturers supporting one standard over another, where all of the standards are incredibly similar but still not compatible with each other. That’s why VMware, Microsoft, Red Hat, and Intel proposed GENEVE, a new cloud overlay standard that only defines the encapsulation data format, without interfering with the control planes of these technologies, which are fundamentally different. For example, VXLAN uses a 24-bit field width for VNI, while STT uses 64-bit. So, rather than fixing every field size up front, the GENEVE standard adds variable-length options to its header, as you can’t possibly know what the future brings. Also, taking a look at the existing user base, we can still happily use our VXLANs as we don’t believe that they will be influenced by future GENEVE deployments.
Let’s see what the GENEVE header looks like:
The authors of GENEVE learned from some other standards (BGP, IS-IS, and LLDP) and decided that the key to doing things right is extensibility. This is why it was embraced by the Linux community in Open vSwitch and VMware in NSX-T. VXLAN has been supported as the network overlay technology for Hyper-V Network Virtualization (HNV) since Windows Server 2016 as well. Overall, GENEVE and VXLAN seem to be two technologies that are surely here to stay – and both are supported nicely from the perspective of OpenStack.
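To illustrate what extensibility means in practice, here is a minimal sketch that packs a GENEVE base header followed by one option, based on our reading of the header layout in RFC 8926. The helper names are ours, and the option class, type, and data used in the example are made-up values rather than registered ones:

```python
import struct

ETHERNET_PROTOCOL_TYPE = 0x6558   # inner payload is an Ethernet frame


def build_option(option_class: int, option_type: int, data: bytes) -> bytes:
    """Pack one option: class (16 bits), type (8), reserved + length (8), data.
    The length is counted in 4-byte words and excludes the option header."""
    if len(data) % 4 != 0 or len(data) // 4 > 0x1F:
        raise ValueError("option data must be a multiple of 4 bytes, at most 124")
    return struct.pack("!HBB", option_class, option_type, len(data) // 4) + data


def build_geneve_header(vni: int, options: bytes = b"") -> bytes:
    """Pack the 8-byte base header, followed by the variable-length options."""
    if not 0 <= vni < 2 ** 24:
        raise ValueError(f"VNI {vni} does not fit in 24 bits")
    if len(options) % 4 != 0 or len(options) // 4 > 0x3F:
        raise ValueError("options must be a multiple of 4 bytes, at most 252")
    first_byte = (0 << 6) | (len(options) // 4)   # version 0, options length in words
    second_byte = 0                               # O and C flags clear, reserved bits zero
    base = struct.pack("!BBHI", first_byte, second_byte,
                       ETHERNET_PROTOCOL_TYPE, vni << 8)
    return base + options


plain = build_geneve_header(5001)
# Hypothetical option: class 0xFFF0, type 0x01, four bytes of metadata.
extended = build_geneve_header(5001, build_option(0xFFF0, 0x01, b"\x00\x00\x00\x2a"))
print(len(plain), len(extended))
```

The base header stays the same size whether or not options are present. New metadata is simply appended as extra options, and the options length field tells the receiver how much of it to expect.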
Now that we’ve covered the most basic problem regarding the cloud – cloud networking – we can go back and discuss OpenStack. Specifically, our next subject is related to OpenStack components – from Nova through to Glance and then to Swift, and others. So, let’s get started.