So today we're going to be talking about encapsulation. Encapsulation is a really important primitive which makes the Internet work. It's all based on this idea that if you have data and you want to send it in a network, it makes a lot of sense to take the data and chop it up into pieces called packets. You don't just take the data and send it continuously along a connection. This might seem obvious now, but in the days of telegraphs and early telephone systems, this wasn't clear. Back then, the way you made networks was you'd establish a connection, keep that connection open, and send data on it. It turns out that it's much more efficient to take data and chop it up into pieces. Then the problem becomes: you have these pieces, these data packets, and what do you do with them? You don't know where to send them. So encapsulation is about taking your data and putting something like an envelope on it. You take your data and put a header on it: control information embedded in the packet itself. It's a very powerful technique and observation that makes the Internet work. We're going to talk about encapsulation and the way it works. As a motivating example, I'm going to talk about TCP/IP, which is one of the most famous encapsulation stacks out there. So this is what data looks like when you send it on the Internet. You have your data, you take the data and you wrap it in a header. You actually wrap it in multiple headers. The reason for this is that when you're sending your data and it's going across the Internet, there's a bunch of intermediate devices that need to know how to forward your data, how to process your data, what quality of service to give it, things like that. So you take your data and you put some headers on it. But it turns out that in the Internet, there are multiple different layers. We talked about the Internet's architecture and how there are these different layers. These different layers need to talk to each other.
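To make the "chop it up and put a header on it" idea concrete, here is a minimal sketch in Python. The header layout (a destination ID, a sequence number, and a payload length) is made up for illustration and is not a real protocol format; the function names `packetize` and `reassemble` are likewise hypothetical.

```python
import struct

# Hypothetical header, NOT a real protocol format:
# destination id (2 bytes), sequence number (2 bytes), payload length (2 bytes).
HEADER = struct.Struct("!HHH")

def packetize(data: bytes, dest: int, max_payload: int = 4) -> list[bytes]:
    """Chop data into pieces and prepend control information to each piece."""
    packets = []
    for seq, start in enumerate(range(0, len(data), max_payload)):
        chunk = data[start:start + max_payload]
        packets.append(HEADER.pack(dest, seq, len(chunk)) + chunk)
    return packets

def reassemble(packets: list[bytes]) -> bytes:
    """Rebuild the data using only the sequence numbers carried in the headers."""
    chunks = {}
    for pkt in packets:
        _dest, seq, length = HEADER.unpack_from(pkt)
        chunks[seq] = pkt[HEADER.size:HEADER.size + length]
    return b"".join(chunks[seq] for seq in sorted(chunks))

packets = packetize(b"hello world", dest=7)
# Packets may arrive in any order; the headers are enough to restore the data.
restored = reassemble(list(reversed(packets)))
```

Because each packet carries its own control information, nothing outside the packet is needed to put the pieces back in order.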
So what we're going to do is something called recursive encapsulation. We're actually going to take the data and encapsulate it with multiple layers, with multiple headers. It's like if you take a letter and you're sending it using, say, the postal service. You take your letter, put it in an envelope, and put it in the mail. If you're going to send that letter by proxy, what you would do is take your letter, put it in an envelope, then put that in another envelope and send that to the proxy. The proxy would then unwrap it and send it to the next hop. Well, we do something like that in the Internet, because it turns out you don't just send data from one point to another point; you have to send it through different layers. So in the Internet, we encapsulate multiple times: we take the data and encapsulate it with multiple layers. So this is encapsulation using TCP/IP. You take your data and you wrap it in a TCP segment. The TCP segment contains port numbers, information about what application to deliver to, and so on. But you can't just send TCP segments in the Internet. You've got to encapsulate them in the Internet Protocol. So there's something called IP, the Internet Protocol, which is a mechanism for encoding packets to be delivered to and across the Internet. So that's the blue structure there. We take our TCP segment and we embed it in an IP packet, where you have the destination IP address, the sender's IP address, and other information. But to get this IP packet onto a local area network, say a regular physical connection to a local Ethernet, you need to wrap all this stuff in an Ethernet frame in order to send it out. So this is what your data actually looks like. It's your data, and then multiple layers of encapsulation that make it deliverable to the final destination.
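The nesting described here can be sketched in a few lines of Python. This is a deliberately simplified illustration: real TCP and IP headers carry many more fields (sequence numbers, checksums, TTL, and so on), and the addresses, ports, and function names below are assumptions chosen for the example.

```python
import socket
import struct

def tcp_wrap(data: bytes, src_port: int, dst_port: int) -> bytes:
    # Simplified "TCP header": just the two port numbers.
    return struct.pack("!HH", src_port, dst_port) + data

def ip_wrap(segment: bytes, src_ip: str, dst_ip: str) -> bytes:
    # Simplified "IP header": just the sender and destination addresses.
    return socket.inet_aton(src_ip) + socket.inet_aton(dst_ip) + segment

def eth_wrap(packet: bytes, src_mac: bytes, dst_mac: bytes) -> bytes:
    # Ethernet-style header: destination MAC, source MAC, EtherType 0x0800 (IPv4).
    return dst_mac + src_mac + struct.pack("!H", 0x0800) + packet

data = b"GET /"
segment = tcp_wrap(data, src_port=49152, dst_port=80)
packet = ip_wrap(segment, src_ip="192.0.2.1", dst_ip="198.51.100.7")
frame = eth_wrap(packet,
                 src_mac=bytes.fromhex("020000000001"),
                 dst_mac=bytes.fromhex("020000000002"))
```

Each function treats whatever it is handed as opaque payload and only prepends its own header, which is the sense in which the encapsulation is recursive.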
So the way this works inside your operating system, inside your kernel, is you have an application, and the application does a system call to the TCP part of the kernel, which has the code to do TCP. It wraps the data in a TCP segment and passes it down to the IP layer, which in turn passes it down to the Ethernet layer, the device driver and the NIC, which does the encapsulation in the Ethernet frame. So the way this is implemented is you have a set of layers, and each layer of the protocol stack encapsulates the data passed to it. Then when you forward the data packet out, the packet goes out and each forwarding device, each router, each switch, each Bluetooth node, each proxy, each node looks at this data and decides what to do with it based on the headers. There's a question of which headers it should look at. Because when you take data, you don't just take data and forward it; you forward it at a layer. So when we talk about forwarding data in the Internet, or routing data, we're forwarding it at a layer or processing it at a layer. So there are different layers of processing. Now, if you're doing switching, traditional switching only looks at the Ethernet header, whereas routing only looks at the IP header, and so on. This is the way we used to learn about networks. You'll hear people talk in this way. They'll say, "You know, a switch is layer two and a router is layer three." That used to be really true. Twenty years ago, if somebody sold a router, it was very clear the router would process things at layer three. It would only look at the layer three headers in the stack. So it would take the Ethernet frame in, strip off the Ethernet header and throw it away, and then look at the layer three headers to forward the packet. That's what a router would do, whereas a switch looks at the Ethernet header. It would only look at the Ethernet frame header and ignore everything inside of it; it wouldn't look at the IP packet and so on.
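The difference between forwarding at layer two and forwarding at layer three can be sketched as follows. This is an illustrative model, not how any particular device is implemented: the class names, table contents, and port names are hypothetical, and the router's longest-prefix match is the commonly used lookup rule rather than something spelled out in this lecture.

```python
import ipaddress
from dataclasses import dataclass

@dataclass
class IPPacket:
    src_ip: str
    dst_ip: str
    payload: bytes

@dataclass
class EthernetFrame:
    src_mac: str
    dst_mac: str
    payload: IPPacket

def switch_forward(frame: EthernetFrame, mac_table: dict) -> str:
    # Layer 2: consult only the Ethernet header; the IP packet inside is ignored.
    return mac_table.get(frame.dst_mac, "flood")

def router_forward(frame: EthernetFrame, routes: dict):
    # Layer 3: strip the Ethernet header, then decide using the IP header.
    packet = frame.payload
    dst = ipaddress.ip_address(packet.dst_ip)
    matches = [(ipaddress.ip_network(net), hop) for net, hop in routes.items()
               if dst in ipaddress.ip_network(net)]
    if not matches:
        return None
    # Prefer the most specific route (longest prefix).
    return max(matches, key=lambda m: m[0].prefixlen)[1]

frame = EthernetFrame("aa:aa:aa:aa:aa:aa", "bb:bb:bb:bb:bb:bb",
                      IPPacket("192.0.2.1", "198.51.100.7", b"data"))
mac_table = {"bb:bb:bb:bb:bb:bb": "port2"}
routes = {"0.0.0.0/0": "upstream", "198.51.100.0/24": "eth1"}

switch_port = switch_forward(frame, mac_table)
router_hop = router_forward(frame, routes)
```

The same frame yields two different decisions because each device reads a different header.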
So that was the way things were 20 years ago. Over time there came to be processing at different layers. So the terminology has shifted more recently to be more specific about which layer we're processing at. You'll hear terminology like layer three switch, meaning I have a switch and it's forwarding at layer three, or layer four switch, layer four load balancer, layer seven firewall, things like that. So there are different functions you do in the network, and they are applied at different layers. There's a lot of debate over how this should be done. If you talk to an Internet purist, somebody who really believes the founding principles of the Internet should be followed and the original RFCs should be followed, they don't like this, because it muddies the waters about how you process data. They believe that if you have devices in the network, those devices should be agnostic to the data that's contained in the packet. If you have an Ethernet switch, it should just forward based on the Ethernet header; it shouldn't look into the IP packet or the TCP segment or the application layer. Looking deeper makes things more complex, because changing an application is no longer just a matter of changing it at the endpoints; you've got to change devices in the middle and convince intermediate vendors and ISPs to change their rules about forwarding traffic. So there's this principle, known as the end-to-end principle, which says that if you have a function, it should be implemented as close to the edges as possible. That was followed for a long time. But nowadays we have debates over things like network neutrality. Say I'm an ISP and I'm forwarding data traffic, and Google is sending a lot of traffic through me. I can't see that if I'm forwarding only at layer two and layer three, but if I can peek into layer seven and look at the traffic, I can figure out that it's Google's, and I can go to Google and charge them a bunch of money for forwarding their traffic.
So it turns out there's money in violating this clean end-to-end principle on the Internet, and you're seeing a lot of devices like layer three switches, layer four load balancers, and so on go up the stack. It's not just network neutrality; there are other reasons to do this too. So for example, if I'm Amazon and I have users and they're storing data in shopping carts, I want to do load balancing across a huge server infrastructure. I don't want one user's packets sprayed over my whole set of servers. I don't want to load balance at layer three; I want to load balance at layer four, because I want every connection to be mapped to a single server. You can do that if you have a layer four load balancer. So these things are good, but they can be used for bad activities as well. So this is the idea behind encapsulation.
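The load-balancing point can be illustrated with a small sketch. It assumes one simple way to keep a connection on one server: hashing the connection's addresses and ports. The server names and function names are hypothetical, and a production load balancer would more likely use a connection table or consistent hashing, since a plain modulo remaps connections whenever the server set changes.

```python
import hashlib
import itertools

SERVERS = ["server-a", "server-b", "server-c"]  # hypothetical backend pool

def l4_pick(src_ip: str, src_port: int, dst_ip: str, dst_port: int,
            servers=SERVERS) -> str:
    # Layer 4 view: the choice depends on the connection (addresses plus ports),
    # so every packet of the same connection lands on the same server.
    key = f"{src_ip}:{src_port}->{dst_ip}:{dst_port}".encode()
    digest = hashlib.sha256(key).digest()
    return servers[int.from_bytes(digest[:8], "big") % len(servers)]

_round_robin = itertools.cycle(SERVERS)

def spray_pick() -> str:
    # Per-packet spraying with no notion of a connection: successive packets
    # rotate through every server.
    return next(_round_robin)

conn = ("192.0.2.1", 49152, "198.51.100.7", 443)
l4_choices = {l4_pick(*conn) for _ in range(10)}   # ten packets, one connection
sprayed = {spray_pick() for _ in range(3)}         # three packets, no stickiness
```

Ten packets from the same connection all map to a single server under `l4_pick`, while three consecutive packets under `spray_pick` touch all three servers.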