Building a Multiplay Network for Dummies, or Why this shit isn't as easy as it looks

[ This is part two of a two part document I'm in the middle of writing explaining how the MPUK iSeries event networks run. ]

There's a lot that goes on 'under the hood' at a Multiplay network event, and I get the impression that a lot of this is lost on the customers, many of whom have some network skills and can't understand (having invited a few friends round for a LAN party at home) why this stuff seems to be so complicated. Read on...

[ ... snip "Nik's basic guide to IP Networking" which goes here ... ]

After reading the above, you've probably got some ideas about how to build a network suitable for an iSeries event. They probably go something like this:

"OK, we need a network for 1,000+ hosts. Let's pick a network in the private address range (10.10.0.0, say) and just use that. With a subnet mask of 255.255.0.0 everybody will be able to see everybody else, and everything will be peachy."

Sadly, it's not that simple. Here are some of the reasons why.

1. ARP uses broadcasts to find the hardware (MAC) addresses that belong to other machines' IP addresses. If you could talk directly to 1,000 other machines your host would send a lot of broadcast traffic, as would every other host on the network. The broadcasts
   a) Reduce the amount of bandwidth available for other traffic.
   b) Consume additional CPU time on every host on the network, as each one has to look at the broadcast traffic to decide whether or not to respond to it.

2. DHCP uses broadcasts to find the DHCP servers. See (1).

3. Windows file sharing is an incredibly chatty protocol. It uses broadcasts a lot, and is quite anal about announcing its existence every now and then. Although we try and make sure that everybody has this turned off, we can't rely on it. See (1).

4. Many games have a "LAN" mode and an "Internet" mode when it comes to network play. Some badly written games (including the ones you want to play) decide whether or not they're on a LAN by looking at the network mask. If it's /24 they go into LAN mode; if it's anything else they go into Internet mode. In Internet mode they may
   a) Decrease the amount of data they send to and from the server, leading to increased ping times.
   b) Want to contact sites like won.net to verify your CD key, and refuse to work if they can't verify it.
   Not all games do this, just some. And we know it's an appalling way for them to behave. But there's nothing we can do about it.

5. Even if they don't do (4), the game's built-in 'browse for online servers' option may have been written to broadcast to try and find available servers. Again, not all games do this, but some do. See (1).

6. If we have one big network, a problem (either on one of the MPUK hosts or on one of the customer hosts) has the potential to bring down the entire network. For example, if a customer machine starts erroneously generating broadcast packets, or accidentally starts up a DHCP server, or begins advertising different routes -- all these things can seriously impact the ability of the network to work properly. By partitioning the network into smaller networks we are able to contain problems like these.

Note that it's not enough to hope that customer machines will be configured properly. We have to assume that they won't be, and prepare for the worst.

So, now you know why the network has to be divided into smaller networks, how do we do it?
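Before getting to that, here's a quick way to put rough numbers on the broadcast-domain point in (1) to (3). This is just a back-of-the-envelope sketch using Python's ipaddress module -- the choice of tool is mine and has nothing to do with the event kit itself:

    import ipaddress

    # One flat 10.10.0.0/16, as in the "just use one big network" plan.
    flat = ipaddress.ip_network("10.10.0.0/16")

    # One per-room /24, as actually used at the event.
    room = ipaddress.ip_network("10.10.10.0/24")

    # Usable hosts = total addresses minus the network and broadcast addresses.
    print(flat.num_addresses - 2)   # 65534 -- every machine at the event shares
                                    # one broadcast domain
    print(room.num_addresses - 2)   # 254 -- at most 254 machines hear each
                                    # other's ARP/DHCP/Windows chatter

With one flat network, every ARP request, DHCP discover and Windows announcement from any of the 1,000+ machines reaches all of them. With per-room /24s, at most 253 other machines ever hear yours.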
At i10, we had 11 different networks, all talking to one another:

192.168.1.0/24  "Administration" network for the servers (dns: private.event.multiplay.co.uk)
10.10.0.0/24    Customer-facing network for the servers (dns: public.event.multiplay.co.uk)
10.10.1.0/24    The Internet gateway was on this network
10.10.10.0/24   One half of the Concourse
10.10.11.0/24   The other half of the Concourse
10.10.20.0/24   Member's Dining Room
10.10.30.0/24   One half of the Long Room
10.10.31.0/24   The other half of the Long Room
10.10.40.0/24   One half of the boxes on floors 3, 4, and 5
10.10.41.0/24   The other half of the boxes
10.10.50.0/24   Staff machines (dns: punter.event.multiplay.co.uk)

By using a /24 netmask we only have room for 254 hosts on each network. Some of the rooms (the Concourse, the Long Room, and the boxes) can hold more than this many machines, which is why they're split over multiple networks.

In the middle of the network are two Extreme 48 port switches, linked together with a fibre optic connection. This gives us 96 100Mb ports in total. These switches are "layer 3", which means they understand IP as well as Ethernet. You can think of them as being smarter than regular switches, but not quite as smart as routers. Each switch appears as .1 or .2 on each of the 10.10.x.x networks listed above, and uses VLANs (don't ask) to "route" traffic between all the networks.

Each MPUK server (on 10.10.0.0/24) is plugged straight into one or other of these switches, giving them a guaranteed 100Mb/s to everything else on the network. Each of these servers also has a second NIC. We give these NICs 192.168.1.0/24 addresses and plug them into a 24 port Planet switch. This 192.168.1.0/24 network is used to administer the game servers -- administrators (such as the FTP site managers) can only connect in to the servers to do their work from the 192.168.1.0/24 network, which is physically located in the Staff Room.

Into the remaining ports on the Extremes we run uplinks to a large number of 24 port Planet switches. We call these "table switches", because these are the switches that physically sit out in the Concourse, Long Room, etc, next to the tables that customers put their equipment on. We used to use one uplink per Planet, which meant that each Planet had a 100Mb/s connection to an Extreme. Starting with i10 we now run two uplinks from each Planet to the Extremes, meaning that each table switch has a dedicated 200Mb/s link to the Extremes (and thence to the rest of the network).

Finally, you, the customer, plug your equipment into a port on the table switch. The net effect of this is that you get 100Mb/s (more or less) to the other 21 people who are on the same switch as you (24 ports on each table switch, minus the port you're using, minus the two ports that are used for the uplinks to the Extremes, equals 21). Your table, as a whole, gets a 200Mb/s connection to the Extremes, and on to the game servers, FTP servers, and so forth.
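For reference, here's the same layout written down as a small Python sketch. The dictionary simply restates the table above; the short names are labels I've made up for illustration, not official VLAN names:

    import ipaddress

    # The i10 subnet plan, restated from the table above.
    networks = {
        "admin":        "192.168.1.0/24",   # private.event.multiplay.co.uk
        "servers":      "10.10.0.0/24",     # public.event.multiplay.co.uk
        "gateway":      "10.10.1.0/24",
        "concourse-1":  "10.10.10.0/24",
        "concourse-2":  "10.10.11.0/24",
        "dining-room":  "10.10.20.0/24",
        "long-room-1":  "10.10.30.0/24",
        "long-room-2":  "10.10.31.0/24",
        "boxes-1":      "10.10.40.0/24",
        "boxes-2":      "10.10.41.0/24",
        "staff":        "10.10.50.0/24",
    }

    for name, prefix in networks.items():
        net = ipaddress.ip_network(prefix)
        # .1 and .2 on each 10.10.x.x network belong to the two Extremes.
        print(f"{name:12} {prefix:16} {net.num_addresses - 2} usable addresses")

    # Per-table arithmetic: 24 ports on a Planet, minus the port you're
    # plugged into, minus the two uplink ports, leaves 21 neighbours.
    print(24 - 1 - 2, "other machines share your table switch")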
A word about the FTP server. As I've already said, FTP is a very efficient protocol for sucking up all available bandwidth. If we let everyone download from the FTP server at full speed it would only take two people on a table switch downloading to saturate the 200Mb/s uplink to the Extremes, and affect everyone else on that table's bandwidth. So at i10 we did two things to the FTP server.

1. We implemented rate limiting. Uploads and downloads to the FTP server were capped at 10Mb/s per connection. So if you downloaded something from the FTP site and wondered why it seemed to be downloading slowly, that's why.

2. We only allowed one connection to the FTP server per IP address. Some FTP clients attempt to make multiple connections and download different parts of the same file in order to get around the bandwidth restriction described in (1). This neatly puts a stop to that, and ensures that a couple of people can't ruin the experience for everyone else on their table switch.

The game servers and other core servers (DNS, FTP, DHCP, etc.) sit on the 10.10.0.0/24 network. On each customer network (10.10.10.0/24 and up), addresses were handed out in the range .32 through to .254.

Why didn't we start at .1? One of the things we're considering is putting multiple IP addresses on the servers. So one of the CS servers might be 10.10.0.12; we might also want it to appear as 10.10.10.12, 10.10.11.12, 10.10.20.12, and so on. We're still debating the wisdom of doing this, and we didn't do it at i10, but we wanted to have the flexibility -- so we made sure that the .3 through to .31 addresses on each subnet were kept free if we needed them.
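To make that allocation concrete, here's a sketch of how the per-room pools could be written down. Nothing above says which DHCP server software we actually run, so the ISC dhcpd-style output (and the choice of the .1 Extreme as the default gateway) is an assumption for illustration only:

    # Generate example dhcpd-style pool declarations for the customer
    # networks (10.10.10.0/24 and up), matching the ranges described above.
    customer_nets = [
        "10.10.10.0", "10.10.11.0", "10.10.20.0", "10.10.30.0",
        "10.10.31.0", "10.10.40.0", "10.10.41.0", "10.10.50.0",
    ]

    for net in customer_nets:
        prefix = net.rsplit(".", 1)[0]   # e.g. "10.10.10"
        print(f"subnet {net} netmask 255.255.255.0 {{")
        # .1 and .2 are the Extremes, .3 to .31 are held back for possible
        # server aliases, so the dynamic pool runs from .32 to .254.
        print(f"    range {prefix}.32 {prefix}.254;")
        print(f"    option routers {prefix}.1;")
        print("}")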
So with all this in place, what can still go wrong, how does it affect you, and how do we work around it? In no particular order, here are some of the things we saw at i10.

1. People not configuring their hosts to use DHCP, and just picking an IP address out of thin air.

Depending on the IP address they pick they might interfere with someone else on their network. Things get especially bad if they pick a .1 or .2 address, because then everything else thinks they're one of the Extremes. Not a lot we can do about this except track down the customer and get them to reconfigure their machine. We can (centrally) track things down to the table switch that they're connected to. Then we need to go out there and talk to the 22 people on that switch and find out which one has ignored our instructions. This typically takes 10-15 minutes, by which point everyone on their switch (or sometimes, everyone on their network) has been severely inconvenienced.

We've thought about requiring customers to give us their MAC address first, so we know exactly which host is located where. Can you say "administrative nightmare"? Not to mention the fact that a good portion of the customers wouldn't know what their MAC address is anyway.

2. People unplugging the uplinks from the table switches.

Yes, this does happen. Unbelievable, isn't it? What's worse is when they unplug only one of the uplinks. Each table switch has two rows of 12 ports each. We plug the uplinks into one port on the top row and one port on the bottom row. If someone unplugs an uplink then all the ports on that row will lose connectivity to the Extremes. So you get half the people on the table having no problems, and the other half not being able to see anything except the other people on their own row of the switch.

Once people call for a yellow shirt to take a look at the problem this normally only takes a few minutes to diagnose. But it depends on how quickly people call for a yellow shirt. Faced with this scenario many people will try and ping their neighbour's machine, which is often on the same port row as they are. Naturally, this works. By the time people have done these diagnostics and scratched their heads a few times many minutes have passed. These things could be avoided if customers
   a) Didn't unplug cables they don't understand.
   b) Called for a yellow shirt as soon as they have a problem.

3. Flaky master browsers.

The network we've designed means that we absolutely have to have a master browser working so that things like GameSpy, and the in-game browsers for newer games, work properly. The master browser software we used at i10 turned out to be buggy and crash-prone. There's very little we could do about that at the event, except make sure that it got restarted each time it crashed. And each time it was restarted it took a few minutes for it to 'learn' about all the game servers that were running. This is incredibly frustrating, and we appreciate that. We're working on replacements for i11.

4. Misconfigured DHCP allocations.

Originally, when we configured the servers for i10, we didn't think we'd need the 10.10.31.0/24 network. Then the racecourse let us have some extra rooms at the last minute, which overflowed the 10.10.30.0/24 network. So we had to reconfigure the DHCP servers to hand out addresses on the new network, and reconfigure the Extremes to recognise it. This took about 10 minutes once we were aware of the problem.

5. Hardware failures.

Part way through Saturday one of the Extremes decided to lock up. Hard. At this point, roughly half the event loses connectivity to everything except the people on their own table. This is a bad thing. There's also nothing we can do about it except get the switch up and running again as quickly as possible. This takes about 10 minutes. Having a couple of redundant Extremes hanging around just in case isn't feasible...

Similarly, a couple of the game servers suffered from dodgy RAM during the event, leading to spontaneous crashes. Swapping out the RAM fixed the problem, but we're always faced with a dilemma -- do we shut down a server during the day for an unknown amount of time to try and troubleshoot it, or do we let it run through until the evening, when the load is lower and we can take it offline without having as serious an impact? We decided to let it run during the day, then take it down at the next opportunity to repair it. From then on it ran like a charm.

The core servers (DNS and DHCP) are run on two separate hosts. If one of them goes down the other picks up the load automatically.

6. People trying to crack the network.

We saw a number of attempts by people trying to deliberately crack the network. Some of these were as simple as changing IP addresses to match one of the central servers; some were more complex (like changing your MAC address to be one of the Extremes). Our logging systems show this pretty much as it happens. In some cases we just blackhole the culprit until they come to us to complain (at which point they get a stern talking to). In other cases we boot them out of the event for violating the AUP. Either way, these attempts will cause disruption for the people on the same table switch, and may cause problems for people on the same network, until they're resolved -- which normally takes 5-15 minutes depending on the problem.

7. Illicit file sharing.

As I've already said, we deliberately rate limit the FTP site and cap the number of connections that it accepts to ensure that people downloading files do not interfere with your gaming. However, anyone carrying out illicit file sharing is very unlikely to do any of this. If a couple of people on your table switch are sharing files with people on other table switches then it's entirely possible that the uplinks will get swamped with file sharing traffic, seriously impacting your gaming.

At the moment we take a reactive stance to this -- our logs show any excessive traffic between ports on the switches, and we blackhole the offenders and/or boot them out of the event for violating the AUP. But this still takes us some time to track down.
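To give a flavour of what "excessive traffic between ports" means, here's a minimal sketch of the kind of check involved. The per-port byte counters, port names and threshold below are entirely made up -- the real monitoring on the Extremes isn't described here -- but the principle is the same: compare each port's traffic over an interval against what normal gaming traffic looks like, and flag the outliers for a human to investigate.

    # Hypothetical per-port byte counters, sampled five minutes apart.
    earlier = {"1:3": 2_100_000_000, "1:4": 150_000_000, "2:7": 90_000_000}
    now     = {"1:3": 5_100_000_000, "1:4": 160_000_000, "2:7": 95_000_000}

    INTERVAL_SECS = 300
    LIMIT_BITS_PER_SEC = 50_000_000   # arbitrary example threshold, not a real policy

    for port, count in now.items():
        bits_per_sec = (count - earlier[port]) * 8 / INTERVAL_SECS
        if bits_per_sec > LIMIT_BITS_PER_SEC:
            # This is the point at which a human works out which customer is
            # on the other end of the port, and blackholes them if necessary.
            print(f"port {port}: sustained {bits_per_sec / 1e6:.0f} Mb/s -- investigate")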
For future events we're considering enforcing Quality of Service (QoS) rules on the Extremes to make this less of a problem.

8. Southern Electric warning us about a power surge coming in less than five minutes' time.

Ha ha. Just kidding.

A lot of you probably work in a big corporate environment and are thinking "We have hundreds of Windows machines, and we never have these problems. Clearly, you guys suck." All well and good. But remember that we have absolutely no control over how a customer configures their machine, which software they run on it, the version of Windows they're using, the quality of their network card, and innumerable other things. Each customer is 'root' on their own machine.