CS 268 - Class Notes (Internetworking)
Internetworking
Connecting heterogeneous networks
The Internet
Dates of Importance:
- 1969 - first hosts at UCLA
- 1982 - TCP/IP standardized
- 1991 - WWW started at CERN (Swiss Physics Laboratory)
Growth of the Internet: The Internet has grown very quickly.
In the following table, Hosts is the number of machines responding to
"ping" and does not include dial-up users. Networks is the number of
networks allocated. WWW Servers is the number of WWW servers that
respond to a request. Domains is an estimate of how many domains have
been registered in the DNS. This information was taken from Hobbes'
Internet Timeline at
http://www.zakon.com/robert/internet/timeline/
| Year | Hosts | Networks | WWW-Servers | Domains |
| 1969 | 4 | - | - | - |
| 1989 | 130,000 | 650 | - | 3,900 |
| 1993 | 1,776,000 | 13,761 | 130 | 26,000 |
| 1994 | 3,212,000 | 25,210 | 2,735 | 46,000 |
| 1995 | 6,642,000 | 61,538 | 23,500 | 120,000 |
| 1996 | 12,881,000 | 134,365 | 299,403 | 488,000 |
| 1997 | 19,540,000 | ? | 1,203,096 | 1,301,000 |
| 1998 | 36,739,000* | ? | 2,594,622 | ? |
| 1999 | 56,218,000 | ? | 6,598,697 | ? |
| 2000 | 93,047,000 | ? | 18,169,498 | ? |
| 2001 | 125,888,197 | ? | 31,299,572 | ? |
| 2002 | 147,344,723 | ? | 36,689,008 | ? |
| 2003 | 171,638,297 | ? | 35,424,956 | ? |
| 2004 | 233,101,481 | ? | 46,067,743 | ? |
| 2005 | ? | ? | 56,923,737 (as of 12/2004) | ? |
* - In 1998, a new method was developed for more accurately determining
the number of hosts.
Internet Architecture
Two networks can be interconnected by placing a router (historically
called a gateway) between them.
+-----------+ +--------+ +-----------+
| Network 1 |--------| Router |--------| Network 2 |
+-----------+ ^ +--------+ ^ +-----------+
Connection to Connection to
Network 1 Network 2
A router is a "multihomed" machine: it has multiple interfaces, one
for each network it is attached to. Routers route packets based on
destination networks, not destination hosts. Routing state therefore
scales with the number of networks on the Internet rather than the
number of hosts.
Routers used to be called gateways or bridges. Those terms are now
used for other entities, so using them for routers can be confusing.
Routers are also sometimes called switches.
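The idea of routing on destination networks rather than destination hosts can be sketched in Python. This is only an illustration under stated assumptions: the two table entries, the interface names, and the function `select_interface` are hypothetical and do not come from these notes.

```python
import ipaddress

# Hypothetical forwarding table: one entry per destination network,
# no matter how many hosts each network contains.
FORWARDING_TABLE = {
    ipaddress.ip_network("157.182.0.0/16"): "interface 0",
    ipaddress.ip_network("128.32.0.0/16"): "interface 1",
}


def select_interface(destination: str):
    """Return the outgoing interface for a destination host, or None."""
    address = ipaddress.ip_address(destination)
    for network, interface in FORWARDING_TABLE.items():
        if address in network:  # only the network part is compared
            return interface
    return None  # no route to that network


print(select_interface("157.182.194.28"))
print(select_interface("10.0.0.1"))
```

Every host on the 157.182.0.0 network shares a single table entry, which is why the table grows with the number of networks and not with the number of hosts.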
Tracing the routes packets take from WVU to Berkeley in CA, we
see that packets pass through 17 routers (or hops) in all.
The hops break down as follows:
- 2 routers at WVU
- 2 routers at WVNet
- 5 routers at AlterNet
- 4 routers at SprintLink
- 4 routers at Berkeley
The protocol suite defining the Internet, called TCP/IP, defines an
abstraction of "network" that hides the details of physical networks,
such as Ethernet, Token Ring, etc. To do this, a network architecture
should have several properties:
- Open System. The standards running the system must be public.
- Fault Tolerance. Decentralized control.
- Ability to be quickly deployed.
General network architectures usually share these properties as well.
The generally agreed-upon properties of a network architecture are:
- Scope: solve as general a problem as possible.
- Scalability: work efficiently with large networks and small
networks.
- Robustness: tolerant to failures of links and nodes. This usually
includes the following subproperties:
  - Firewalls: disruptions only affect a portion of the network.
  - Self-Stabilization: corruption should eventually "flush" out and
    the network return to normal operation without human intervention
    and in a reasonable amount of time.
  - Fault Detection: detect faults in a timely manner. Some degree of
    fault detection should exist in all networks.
  - Byzantine Robustness: ability to detect ongoing corrupting
    agents. Very difficult to do and usually not included in most
    networks.
- Autoconfigurability: come with reasonable default values for most
environments.
- Tweakability: ability to optimize performance through variable
adjustments.
- Determinism: identical conditions give identical results.
- Migration: ability to allow modifications easily.
Most network architectures have adopted a "layered" software architecture.
This is a typical software technique that uses simpler and usually more
general purpose functions to construct more complex functions. This is
ideal for communications where distinct functionality builds from other
functionality.
The International Organization for Standardization (ISO) came up with a
"complete" 7-layer network description called Open Systems
Interconnection (OSI). TCP/IP, however, is typically thought of as
having only 4 layers. The OSI layers, their responsibilities, and the
corresponding TCP/IP layers are given below. Each higher layer
builds on the layers below it to provide its services.
+----------------------+
| Application Layer * |
+----------------------+
| Presentation Layer |
+----------------------+
| Session Layer |
+----------------------+
| Transport Layer * |
+----------------------+
| Network Layer * |
+----------------------+
| Data Link Layer * |
+----------------------+
| Physical Layer |
+----------------------+
- Application: particular application needs
- Presentation: agreement on representation (encoding) of
data such as structures, records, floats, doubles, etc.
- Session: connection establishment/teardown and session
management.
- Transport: flow of data between two hosts; an end-to-end
(i.e. host-to-host) concern. TCP, UDP, etc.
- Network: routing and base protocol operation; hop-by-hop
concerns. IP, ICMP, etc.
- Data Link: lower level technology, media dependent protocols,
headers and trailers for media.
- Physical: even lower level aspects of technology, pin diagrams,
timing charts, etc.
* - Those layers also present in TCP/IP. TCP/IP combines the Physical
and Data Link layers into a single Link layer. TCP/IP also
combines the Session layer into the Transport layer.
Some handy definitions:
- Routers connect networks at the Network layer
- Bridges connect networks at the Link layer(s)
This means that bridges are transparent to everything above the Link
layer. In the OSI model, the layers are also numbered, starting at
the bottom with 1 and moving up. So, a Layer 2 (L2) switch would be
a bridge by the definition above.
Usually, protocols interact at the same layer. We will be looking at
the TCP/IP protocols in various layers, starting with the Link layer,
where the operation of Ethernet, Token Ring, and other technologies
is defined. In the Network layer, we will look at two protocols:
- Internet Protocol (IP)
  - Defines the "datagram", the basic unit of transmission in the
    Internet. Datagrams vary in size based on the technology they are
    carried over (Ethernet, Token Ring, phone line, etc.). The largest
    datagram a technology can carry is called the Maximum Transmission
    Unit (MTU) of the technology. Typically, the slower the media, the
    smaller the MTU.
  - Defines the Internet addressing scheme.
  - Defines how datagrams are routed.
  - Defines fragmentation (splitting up a single datagram into
    multiple datagrams) and reassembly (putting multiple datagrams back
    into a single datagram).
  - IP is unreliable. It makes a best effort, but it does not guarantee
    that a datagram arrives at its eventual destination.
  - IP is connectionless. There is no handshake between the source and
    destination, no state information is exchanged between them, and
    routers treat successive transmissions separately.
- Internet Control Message Protocol (ICMP)
  - Uses IP to send and receive datagrams.
  - Used to send control messages from a destination back to a
    source, such as "unreachable" messages for host, network, and
    port unreachable.
  - Used to redirect routes for datagrams.
  - Used for diagnostics such as ping and traceroute.
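The fragmentation and reassembly that IP defines can be sketched in Python. This is a simplified illustration, not a real IP implementation: it assumes a fixed 20-byte header with no options, models a fragment as a plain tuple, and the names `fragment` and `reassemble` are hypothetical. It does follow the IP rule that fragment offsets are counted in 8-byte units, so every fragment except the last carries a multiple of 8 data bytes.

```python
IP_HEADER_LEN = 20  # bytes; assumes a header without options


def fragment(payload: bytes, mtu: int):
    """Split a payload into (offset_in_8_byte_units, more_fragments, data)."""
    # Data per fragment must fit in the MTU and be a multiple of 8 bytes.
    max_data = (mtu - IP_HEADER_LEN) // 8 * 8
    if max_data <= 0:
        raise ValueError("MTU too small to carry any data")
    fragments = []
    for start in range(0, len(payload), max_data):
        data = payload[start:start + max_data]
        more_fragments = start + max_data < len(payload)
        fragments.append((start // 8, more_fragments, data))
    return fragments


def reassemble(fragments):
    """Put the fragments back in offset order and join their data."""
    ordered = sorted(fragments, key=lambda frag: frag[0])
    return b"".join(data for _, _, data in ordered)


payload = bytes(i % 256 for i in range(4000))
pieces = fragment(payload, mtu=1500)
for offset, more, data in pieces:
    print(offset, more, len(data))
print(reassemble(list(reversed(pieces))) == payload)
```

With a 1500-byte MTU, each fragment carries at most 1480 data bytes, so the 4000-byte payload becomes three fragments. Reassembly works even when the fragments arrive out of order, because each one carries its own offset.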
TCP/IP includes TCP and UDP in the Transport layer. These protocols
are described briefly below.
- User Datagram Protocol (UDP)
  - Uses IP to send and receive datagrams.
  - UDP is unreliable (just like IP).
  - UDP is connectionless (just like IP).
  - "Lighter weight" than TCP because it is unreliable and
    connectionless.
  - Uses 16-bit port numbers in datagrams for demultiplexing into
    applications.
- Transmission Control Protocol (TCP)
  - Uses IP to send and receive datagrams.
  - Adds reliable delivery through positive acknowledgements: the
    receiver sends a message back to the source letting it know it
    received the data.
  - TCP is connection-oriented. It has a handshake/teardown for
    session/connection management. The source and destination
    continually exchange state information.
  - TCP has a byte-stream abstraction. From the user's point of view
    this is a network "pipe": a continuous stream of data.
  - TCP is full-duplex. Data can flow in both directions.
  - Uses 16-bit port numbers in datagrams for demultiplexing into
    applications.
Internet Addresses
IPv4 (IP version 4, the most widely deployed version of IP) addresses
are 32-bit integers. Addresses are structured and unique to a single
host (or interface on a multihomed machine) on the Internet. IP addresses
are usually written in dotted decimal notation, a.b.c.d, where
each part is an 8-bit value.
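The relationship between dotted decimal notation and the underlying 32-bit integer can be sketched in Python. The function names `dotted_to_int` and `int_to_dotted` are hypothetical, chosen for illustration.

```python
def dotted_to_int(address: str) -> int:
    """Pack a dotted decimal string such as '157.182.194.28' into 32 bits."""
    parts = [int(part) for part in address.split(".")]
    if len(parts) != 4 or any(not 0 <= part <= 255 for part in parts):
        raise ValueError(f"not a dotted decimal IPv4 address: {address!r}")
    value = 0
    for part in parts:
        value = (value << 8) | part  # each part fills the next 8 bits
    return value


def int_to_dotted(value: int) -> str:
    """Unpack a 32-bit integer into dotted decimal notation."""
    return ".".join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))


value = dotted_to_int("157.182.194.28")
print(hex(value))            # 0x9db6c21c
print(int_to_dotted(value))  # 157.182.194.28
```

Each pair of hex digits in the integer corresponds to one decimal part of the address: 9D is 157, B6 is 182, C2 is 194, and 1C is 28.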
| Class | Leading bits | Structure after the leading bits | Address Range |
| A | 0 | Net ID (7 bits), Host ID (24 bits) | 0.0.0.0 - 127.255.255.255 |
| B | 10 | Net ID (14 bits), Host ID (16 bits) | 128.0.0.0 - 191.255.255.255 |
| C | 110 | Net ID (21 bits), Host ID (8 bits) | 192.0.0.0 - 223.255.255.255 |
| D | 1110 | Multicast Group (28 bits) | 224.0.0.0 - 239.255.255.255 |
| E | 11110 | Reserved (27 bits) | 240.0.0.0 - 247.255.255.255 |
The class of an IP address specifies where in the address the
network ID is and where the host ID is. The Net ID specifies the
network that the address is connected to. The Host ID identifies
the address on that network. Addresses
are allocated by a central agency called the InterNIC, the Internet
Network Information Center, to entities desiring an Internet
address. Internet addresses are tied to the type of transmission
the address specifies. The types of transmission are:
- Unicast: transmit to a single destination
- Multicast: transmit to a subset of all hosts
- Broadcast: transmit to all hosts
Class A, B, and C addresses specify unique destinations on the
Internet (unicasts) and may be used for source and destination
addresses. Class D addresses specify multicast addresses. These
addresses may only be destination addresses. Broadcast addresses
are based on the resolution of the broadcast. We will discuss
them shortly. An IP address with Host ID set to 0 indicates a
network address is being specified.
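The class rules above can be sketched in Python: the class follows from the leading bits of the first octet, and the network address is the address with its Host ID set to 0. The function names are hypothetical, and the "unlisted" result for first octets 248-255 is an assumption made here because the class table stops at 247.

```python
def address_class(address: str) -> str:
    """Return the class letter implied by the leading bits of the first octet."""
    first_octet = int(address.split(".")[0])
    if first_octet < 128:
        return "A"  # leading bit 0
    if first_octet < 192:
        return "B"  # leading bits 10
    if first_octet < 224:
        return "C"  # leading bits 110
    if first_octet < 240:
        return "D"  # leading bits 1110
    if first_octet < 248:
        return "E"  # leading bits 11110
    return "unlisted"  # assumption: outside the ranges in the class table


# Octets covered by the class bits plus the Net ID.
NETWORK_OCTETS = {"A": 1, "B": 2, "C": 3}


def network_address(address: str) -> str:
    """Zero the Host ID of a class A, B, or C address."""
    octets = address.split(".")
    keep = NETWORK_OCTETS[address_class(address)]  # KeyError for class D/E
    return ".".join(octets[:keep] + ["0"] * (4 - keep))


print(address_class("157.182.194.28"))    # B
print(network_address("157.182.194.28"))  # 157.182.0.0
```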
The Domain Name System (DNS) is a distributed database that maps
text strings representing IP addresses (hostnames) to IP address
values.
IP addresses specify network connections. A multihomed machine will
have multiple network connections and therefore will have an IP
address for every interface it has. Strictly speaking, it is therefore
incorrect to say that an IP address specifies a single host, although
the statement holds in most cases.
This addressing scheme allows efficient routing because the network
information is encoded in the address. However, this also makes it
difficult to move a host across networks. When this happens, the
IP address must change. This is normally not a problem unless the
host must keep operating while the change occurs.
Subnet Addressing (Netmasks)
Class A and class B networks allow a very large number of hosts per
network (up to 2^24 - 2 for class A and 2^16 - 2 for class B). Normal
networks cannot have, or do not want to have, so many hosts on them.
This has led to the use of subnet addressing. The concept is to split
the Host ID part up into a Subnet ID and a Host ID. The Host ID part
fills the same function. The Subnet ID part specifies the subnetwork
(within the larger network) on which the host resides. Subnets are
specified with a 32-bit mask value, the netmask, where 1's signify
Net ID and Subnet ID and where 0's signify Host ID. Netmasks are not
allocated and need not be unique within the Internet. Netmasks are
usually specified as either a hex value, FFFFFF00, or in dotted
decimal form, 255.255.255.0.
As an example, use the IP address of naur.csee.wvu.edu, 157.182.194.28.
The netmask used by naur and others on the same subnet is 255.255.255.0
or FFFFFF00. The class of naur's IP address is B, so the Net ID portion
is 14 bits (together with the 2 class bits, the first 16 bits of the
address). The Subnet ID is the 8 bits between the Net ID and the
Host ID. The Host ID is the last 8 bits.
An alternate way to specify the netmask and IP address together is to
follow the IP address with a / and the number of leading 1's in the netmask.
As an example, naur would be 157.182.194.28/24. The 24 stands for 24 leading
1's in the netmask.
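The naur example can be worked through in Python with the standard `ipaddress` module. This is a sketch that handles only class B addresses; the function name `split_class_b` is hypothetical.

```python
import ipaddress


def split_class_b(address_with_prefix: str):
    """Return (net_id, subnet_id, host_id, netmask) for a class B address."""
    interface = ipaddress.ip_interface(address_with_prefix)
    value = int(interface.ip)
    prefix = interface.network.prefixlen  # number of leading 1's in the netmask
    if value >> 30 != 0b10:
        raise ValueError("not a class B address")
    if prefix < 16:
        raise ValueError("netmask is shorter than the class B network part")
    host_bits = 32 - prefix
    net_id = (value >> 16) & 0x3FFF  # 14 bits after the '10' class bits
    subnet_id = (value >> host_bits) & ((1 << (prefix - 16)) - 1)
    host_id = value & ((1 << host_bits) - 1)
    return net_id, subnet_id, host_id, str(interface.netmask)


net_id, subnet_id, host_id, netmask = split_class_b("157.182.194.28/24")
print(netmask)             # 255.255.255.0
print(subnet_id, host_id)  # 194 28
```

The Subnet ID and Host ID come out as the third and fourth parts of the dotted decimal address because the /24 netmask splits the Host ID portion on an octet boundary.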
Given a source's IP address and netmask, it is easy to determine any
other host's relative location from its IP address. The choices are:
- The host may be on the same subnet as the source.
- The host may be on another subnet, but the same network as the source.
- The host may be on another network than the source.
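The three cases can be told apart with two mask comparisons, sketched here in Python. The helper names are hypothetical, and the destination addresses in the usage lines are made up for illustration. Only naur's address and netmask come from these notes.

```python
import ipaddress


def to_int(address: str) -> int:
    return int(ipaddress.IPv4Address(address))


def classful_mask(value: int) -> int:
    """Mask covering the class bits and Net ID of a class A, B, or C address."""
    if (value >> 31) == 0b0:
        return 0xFF000000
    if (value >> 30) == 0b10:
        return 0xFFFF0000
    if (value >> 29) == 0b110:
        return 0xFFFFFF00
    raise ValueError("not a class A, B, or C address")


def relative_location(source: str, netmask: str, other: str) -> str:
    src, mask, dst = to_int(source), to_int(netmask), to_int(other)
    # Same Net ID and Subnet ID: the netmask hides only the Host ID.
    if (src & mask) == (dst & mask):
        return "same subnet"
    # Same Net ID only: compare using the mask implied by the class.
    network_mask = classful_mask(src)
    if (src & network_mask) == (dst & network_mask):
        return "same network, different subnet"
    return "different network"


print(relative_location("157.182.194.28", "255.255.255.0", "157.182.194.77"))
print(relative_location("157.182.194.28", "255.255.255.0", "157.182.10.5"))
print(relative_location("157.182.194.28", "255.255.255.0", "128.32.0.1"))
```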
To better understand subnetting, let's look at typical subnet masks
and how many networks and nodes per network they give for any Class C
address. (The nodes per network count excludes the special addresses
given below.)
| Subnet Mask (last octet) | # of Networks | Nodes/Network |
| 255.255.255.0 (00000000) | 1 | 254 |
| 255.255.255.128 (10000000) | 2 | 126 |
| 255.255.255.192 (11000000) | 4 | 62 |
| 255.255.255.224 (11100000) | 8 | 30 |
| 255.255.255.240 (11110000) | 16 | 14 |
| 255.255.255.248 (11111000) | 32 | 6 |
| 255.255.255.252 (11111100) | 64 | 2 |
| 255.255.255.254 (11111110) | 128 | 0 |
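The rows of the table follow from simple arithmetic on the number of subnet bits taken from the last octet, sketched here in Python. The function name `class_c_subnet_row` is hypothetical. The two addresses subtracted per subnet are the all-0's and all-1's Host IDs, which have special meanings.

```python
def class_c_subnet_row(subnet_bits: int):
    """Return (netmask, number_of_networks, nodes_per_network) for a class C."""
    if not 0 <= subnet_bits <= 7:
        raise ValueError("subnet_bits must be between 0 and 7")
    host_bits = 8 - subnet_bits
    last_octet = (0xFF << host_bits) & 0xFF  # leading 1's in the last octet
    networks = 2 ** subnet_bits
    nodes = 2 ** host_bits - 2  # exclude all-0's and all-1's Host IDs
    return f"255.255.255.{last_octet}", networks, nodes


for bits in range(8):
    print(class_c_subnet_row(bits))
```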
Special Internet Addresses
The Internet address space has several addresses that have special
meanings and cannot be assigned to hosts. A general idiom you can use is:
0 means "this" and 1 means "all".
Broadcast Addresses
Broadcasts have several different resolutions, listed below.
Broadcast addresses may only be used as destination addresses.
- net-directed broadcast: An A, B, or C address with its
Host ID and Subnet ID part set to all 1's. This sends to all of the
hosts on the same network.
- subnet-directed broadcast: An A, B, or C address with
its Host ID part set to all 1's. This sends to all of the hosts on
the same subnet.
- limited broadcast: The IP address 255.255.255.255 (all 1's).
This sends to all of the hosts on the attached network. It may never
be forwarded by routers. This address is used if the machine does
not know its IP address or netmask, as can happen during bootstrapping.
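The net-directed and subnet-directed broadcast addresses can be computed by setting the relevant bits to all 1's, sketched here in Python for naur's address and netmask from the subnetting example. The function names are hypothetical.

```python
import ipaddress

LIMITED_BROADCAST = "255.255.255.255"  # all 1's, never forwarded by routers


def classful_mask(value: int) -> int:
    """Mask covering the class bits and Net ID of a class A, B, or C address."""
    if (value >> 31) == 0b0:
        return 0xFF000000
    if (value >> 30) == 0b10:
        return 0xFFFF0000
    if (value >> 29) == 0b110:
        return 0xFFFFFF00
    raise ValueError("not a class A, B, or C address")


def broadcast_addresses(address: str, netmask: str):
    """Return (net_directed, subnet_directed) broadcast addresses."""
    value = int(ipaddress.IPv4Address(address))
    mask = int(ipaddress.IPv4Address(netmask))
    # Host ID and Subnet ID set to all 1's.
    net_directed = value | (~classful_mask(value) & 0xFFFFFFFF)
    # Only the Host ID set to all 1's.
    subnet_directed = value | (~mask & 0xFFFFFFFF)
    return (str(ipaddress.IPv4Address(net_directed)),
            str(ipaddress.IPv4Address(subnet_directed)))


print(broadcast_addresses("157.182.194.28", "255.255.255.0"))
# ('157.182.255.255', '157.182.194.255')
```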
Normally, the source address used in a broadcast is the address of
the host sending the broadcast. However, if the host does not know
its IP address (due to initialization, for example), two additional
addresses are specially reserved. These are:
- 0.0.0.0 indicates the current host on the current network.
- Net ID set to 0, Host ID not 0 (and not all 1's) indicates the
designated host on the current network.
Loopback Addresses
TCP/IP is designed to work transparently no matter where the
source and destination are located, including when both are the
same machine. A whole class A network, 127.0.0.0, is reserved to
indicate the current host and may not be assigned. It may be used
as a source or destination address.
Packets destined for the 127.0.0.0 network will not go onto the
network, and packets coming from it have not come from the network.
Most systems use a specific host on the 127.0.0.0 network to
signify loopback. Typically, this is 127.0.0.1, but it may be different
on some systems.
IPv6, the next version of IP, modifies addressing by making addresses
128 bits in length. It has no broadcast transmissions and uses multicast
instead. IPv6 also introduces the concept of anycasting.
Port Numbers and Demultiplexing
Port numbers are 16-bit values used by TCP and UDP to identify
applications and services. Port numbers in the 1-1023 range are
managed by the Internet Assigned Numbers Authority (IANA). The
IETF document Request For Comments (RFC) 1700 lists some
of the currently allocated ports and what services they offer.
Ports are used as sources and destinations. Destination ports are
used for identifying services. Some common services and their
ports are:
- FTP (TCP port 21)
- Telnet (TCP port 23)
- TFTP (UDP port 69)
- echo (TCP port 7 and UDP port 7)
- daytime (TCP port 13 and UDP port 13)
- HTTP (TCP port 80)
Source ports are "ephemeral" (or short lived, temporary) identifiers
and may be recycled over a long period of time. Ports are used internally
to demultiplex packets from the protocol processing software to the
applications.
Demultiplexing
Demultiplexing is how a protocol stack fits together. At the lowest
level, the Link layer, a module reads in a packet. A field within that
packet indicates which protocol it is destined for (say IP), and
the module forwards the packet to the IP module for processing.
The IP module looks further into the packet, examines the
IP protocol field, and sends the packet to the ICMP, UDP, or
TCP module based on the value in the field. The TCP and UDP modules then
examine the port numbers and send the packet to the appropriate
application based on them.
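The demultiplexing chain described above can be sketched in Python for an Ethernet frame carrying IPv4. This is a simplified illustration under stated assumptions: the function names, the small application table, and the returned strings are hypothetical, and the test frame built by `build_udp_frame` has zeroed addresses and checksums. The field positions used (EtherType at bytes 12-13 of the frame, the protocol field at byte 9 of the IP header, the destination port at bytes 2-3 of the UDP or TCP header) are the standard ones.

```python
import struct

ETHERTYPE_IP = 0x0800
IP_PROTOCOLS = {1: "ICMP", 6: "TCP", 17: "UDP"}
# A few of the well-known ports listed earlier in these notes.
APPLICATIONS = {("TCP", 21): "FTP", ("TCP", 23): "Telnet",
                ("UDP", 69): "TFTP", ("TCP", 80): "HTTP"}


def demultiplex(frame: bytes) -> str:
    # Link layer: the EtherType field selects the network-layer module.
    (ethertype,) = struct.unpack("!H", frame[12:14])
    if ethertype != ETHERTYPE_IP:
        return "dropped: not IP"
    packet = frame[14:]
    # IP module: the protocol field selects ICMP, TCP, or UDP.
    header_len = (packet[0] & 0x0F) * 4  # IHL is counted in 32-bit words
    protocol = IP_PROTOCOLS.get(packet[9])
    if protocol is None:
        return "dropped: unknown IP protocol"
    if protocol == "ICMP":
        return "ICMP module"
    # TCP/UDP module: the destination port selects the application.
    segment = packet[header_len:]
    (dst_port,) = struct.unpack("!H", segment[2:4])
    application = APPLICATIONS.get((protocol, dst_port), "no listener")
    return f"{application} ({protocol} port {dst_port})"


def build_udp_frame(dst_port: int, payload: bytes = b"") -> bytes:
    """Build a minimal test frame; addresses and checksums are left as zero."""
    ethernet = bytes(12) + struct.pack("!H", ETHERTYPE_IP)
    ip_header = bytearray(20)
    ip_header[0] = 0x45  # version 4, header length 5 words
    ip_header[9] = 17    # protocol field: UDP
    udp_header = struct.pack("!HHHH", 49152, dst_port, 8 + len(payload), 0)
    return ethernet + bytes(ip_header) + udp_header + payload


print(demultiplex(build_udp_frame(69)))  # TFTP (UDP port 69)
```

Each layer reads only its own header field and hands the rest of the packet upward, which is what lets the layers be developed independently.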
Todd L. Montgomery (revised 08.23.1999)