Tuesday, February 21, 2012

Part I: Introduction to Networking and Internet Applications



Chapter 1: Introduction and Overview

There are five key aspects of networking:
  1. Network applications and network programming - a programmer who understands the underlying network mechanisms and technologies can write network applications that are more reliable and efficient.  
  2. Data communications - although it deals with many low-level details, it provides a foundation of concepts on which the rest of networking is built.
  3. Packet switching and networking technologies - because each technology is created to meet various requirements for speed, distance, and economic cost, many packet switching technologies exist. Technologies differ in details such as the size of packets and the method used to identify a recipient.
  4. Internetworking with TCP/IP - the Internet is formed by interconnecting multiple packet switching networks. Internetworking is substantially more powerful than a single networking technology because the approach permits new technologies to be incorporated at any time without requiring the replacement of old technologies.
  5. Additional networking concepts and technologies
    • Public and private parts of the Internet
      • A public network is owned by a service provider, and offers service to any individual or organization that pays the subscription fee.
      • The term public means a service is available to the general public; data transferred across a public network is not revealed to outsiders.
      • A network is said to be private if use of the network is restricted to one group. A private network can include circuits leased from a provider.
        • Networking vendors divide private networks into four categories:
          • Consumer
          • Small office/home office (SOHO)
          • Small-to-medium business (SMB)
          • Large enterprise
    • Networks, interoperability and standards
      • Communication involves multiple entities that must agree on details ranging from the electrical voltage used to the format and meaning of messages. To ensure that entities can interoperate correctly, rules for all aspects of communication are written down.
      • A communication protocol specifies the details for one aspect of computer communication, including actions to be taken when errors or unexpected situations arise. A given protocol can specify low-level details, such as the voltage and signals to be used, or high-level items, such as the format of messages that application programs exchange.
    • Protocol suites and layering models
      • TCP/IP Stack
        • Layer 5 - Application (specifications for email exchange, file transfer, web browsing, telephone services, and video conferencing)
        • Layer 4 - Transport (specifications that control the maximum rate a receiver can accept data, mechanisms to avoid network congestion, and techniques to ensure that all data is received in the correct order)
        • Layer 3 - Internet (Internet addressing structure, the format of Internet packets, the method for dividing a large Internet packet into smaller packets for transmission and mechanisms for reporting errors)
        • Layer 2 - Network interface (specifications about network addresses and the maximum packet size that a network can support, protocols used to access the underlying medium and hardware addressing)
        • Layer 1 - Physical (specifies details about the underlying transmission medium and associated hardware)
      • Headers and layers
        • When an application sends data, the data is placed in a packet, and the outgoing packet passes down through each layer of protocols. When it has passed through all layers of protocols on the sending computer, the packet leaves the computer and is transmitted across the underlying physical network. When it reaches the receiving computer, the packet passes up through the layers of protocols. If the application on the receiving computer sends a response, the process is reversed.
        • Each layer on the sending computer prepends extra information (called a header) onto the packet; the corresponding protocol layer on the receiving computer removes and uses the extra information. 
        • Headers are not of uniform size, and a physical layer header is optional.
      • ISO and the OSI Seven Layer Reference Model
        • International Organization for Standardization (ISO)
        • International Telecommunications Union, Telecommunication Standardization Sector (ITU-T)
        • Open Systems Interconnection Seven-Layer Reference Model
          • Layer 7 - Application
          • Layer 6 - Presentation (unfilled and unnecessary)
          • Layer 5 - Session (unfilled and unnecessary)
          • Layer 4 - Transport
          • Layer 3 - Network
          • Layer 2 - Data Link
          • Layer 1 - Physical
        • TCP/IP technology is technically superior to OSI. Applications are still often referred to as Layer 7 protocols.
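The way headers are prepended on the way down the stack and removed on the way up can be sketched in a few lines of Python. This is only an illustration: the 2-byte header used here (a layer number plus a payload length) is an assumption made for the example and does not match any real protocol header.

```python
import struct

# Hypothetical 2-byte header for illustration only: a 1-byte layer number
# followed by a 1-byte payload length. Real protocol headers differ.
LAYERS = [4, 3, 2]  # Transport, Internet, Network interface


def send_down(data: bytes) -> bytes:
    """Pass data down the stack; each layer prepends its own header."""
    packet = data
    for layer in LAYERS:
        packet = struct.pack("!BB", layer, len(packet)) + packet
    return packet


def pass_up(packet: bytes) -> bytes:
    """Pass a packet up the stack; each layer removes and checks its header."""
    for layer in reversed(LAYERS):
        got_layer, length = struct.unpack("!BB", packet[:2])
        if got_layer != layer or length != len(packet) - 2:
            raise ValueError(f"bad header at layer {layer}")
        packet = packet[2:]
    return packet
```

The outermost header belongs to the lowest layer, which is why the receiver processes the layers in the reverse order.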

Chapter 2: Internet Trends
Early computer networks were designed to permit sharing of expensive, centralized resources. However, the Internet has experienced exponential growth for the past 25 years, doubling in size every nine to fourteen months. The availability of high-speed computation and communication technologies shifted the focus of the Internet from resource sharing to general-purpose communication. Overall the Internet has transitioned from the transfer of static, textual documents to the transfer of high-quality multimedia content.
  • Telephone systems have switched from analog to VoIP
  • Cable TV has switched from analog to IP
  • Cellular has switched from analog to digital cellular services (3G)
  • Internet access has switched from wired to Wi-Fi
  • Data access has switched from centralized to distributed services (P2P)
New applications have also emerged:
  • High-quality teleconferencing (B2B communication)
  • Navigation systems (military, shipping industry, consumers)
  • Sensor networks (environment, security, fleet tracking)
  • Social networks (consumers, volunteer organizations)

Chapter 3: Internet Applications and Network Programming
The Internet provides a general purpose communication mechanism on which all services are built, and individual services are supplied by application programs that run on computers attached to the Internet.
  • Two basic Internet communication paradigms
    • Stream paradigm (connection oriented, 1-to-1 communication, sequence of individual bytes, arbitrary length transfer, used by most applications, built on TCP protocol)
      • Although it delivers all bytes in sequence, the stream paradigm does not guarantee that the chunks of bytes passed to a receiving application correspond to the chunks of bytes transferred by the sending application.
    • Message paradigm (connectionless, many-to-many communication, sequence of individual messages, each message limited to 64K, used for multimedia applications, built on UDP protocol)
      • Permits messages to be LOST, DUPLICATED (more than one copy arrives), and delivered OUT-OF-ORDER.
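Because the stream paradigm does not preserve chunk boundaries, an application that wants to exchange discrete messages over a stream has to mark the boundaries itself. The sketch below shows one common approach, a length prefix, using a connected pair of local stream sockets. The 4-byte prefix and the helper names are assumptions made for this example, not part of any standard.

```python
import socket
import struct


def send_message(sock: socket.socket, payload: bytes) -> None:
    """Prefix each message with a 4-byte big-endian length (assumed framing)."""
    sock.sendall(struct.pack("!I", len(payload)) + payload)


def recv_exactly(sock: socket.socket, count: int) -> bytes:
    """Call recv repeatedly; one recv may return fewer bytes than requested."""
    chunks = []
    remaining = count
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:
            raise ConnectionError("stream closed before the message completed")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def recv_message(sock: socket.socket) -> bytes:
    """Read the length prefix, then read exactly that many payload bytes."""
    (length,) = struct.unpack("!I", recv_exactly(sock, 4))
    return recv_exactly(sock, length)


def demo() -> list:
    a, b = socket.socketpair()  # a connected pair of stream sockets
    with a, b:
        send_message(a, b"first")
        send_message(a, b"second message")
        return [recv_message(b), recv_message(b)]
```

Without the length prefix, the receiver could see both messages arrive as one chunk or as several smaller ones.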
  • Client-Server Model of Interaction
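    • Server Application
      • Starts first, does not need to know which client will contact it, waits passively for contact from a client, communicates with a client by sending and receiving data, stays running after servicing one client and waits for another.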
    • Client Application
      • Starts second, must know which server to contact, initiates a contact whenever communication is needed, communicates with a server by sending and receiving data, may terminate after interacting with a server.
    • Although it provides basic communication, the Internet does not initiate contact with, or accept contact from, a remote computer; application programs known as clients and servers handle all services.
    • Information can flow in either or both directions between a client and server. Although many services arrange for the client to send one or more requests and the server to return responses, other interactions are possible.
    • A single, powerful computer can offer multiple services at the same time; a separate server program is needed for each service.
  • Server Identification and Demultiplexing
    • Internet protocols divide identification into two pieces:
      • An identifier for the computer on which a server runs
        • 32-bit Internet Protocol address (IP address; the 32-bit size applies to IPv4)
      • An identifier for a particular service on the computer
        • 16-bit protocol port number
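The two-part identification can be made concrete with Python's standard `ipaddress` module. The sketch below assumes IPv4, and the function names are hypothetical; it only shows that an address fits in 32 bits and a port in 16 bits.

```python
import ipaddress


def endpoint_as_integers(ip_text: str, port: int) -> tuple:
    """Return the 32-bit address and 16-bit port that identify a service."""
    address = ipaddress.IPv4Address(ip_text)
    if not 0 <= port <= 0xFFFF:
        raise ValueError("a protocol port number must fit in 16 bits")
    return int(address), port


def endpoint_as_bytes(ip_text: str, port: int) -> bytes:
    """Pack the pair into 4 address bytes followed by 2 port bytes."""
    address, checked_port = endpoint_as_integers(ip_text, port)
    return address.to_bytes(4, "big") + checked_port.to_bytes(2, "big")
```

For example, `endpoint_as_bytes("192.0.2.1", 80)` yields six bytes: four for the computer and two for the service.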
  • Concurrent servers
    • A concurrent server uses threads of execution to handle requests from multiple clients at the same time.
  • Peer-to-Peer Interactions
    • Traffic between each server and the Internet is 1/N of the traffic in a single-server architecture (N = number of distributed servers)
    • Server software can run on the same computers as clients
  • Network Programming and the Socket API
    • The interface an application uses to specify communication is known as an Application Program Interface (API).
      • The Socket API is the de facto standard for software that communicates over the Internet
      • When an application creates a socket, the operating system returns a small integer descriptor that the application uses to reference the socket.
      • Major functions in the Socket API include: 
        • accept: newsock=accept(socket, caddress, caddresslen)
        • bind: bind(socket, localaddr, addrlen)
        • close: close(socket)
        • connect: connect(socket, saddress, saddresslen)
        • getpeername
        • getsockopt
        • listen: listen(socket, queuesize)
        • recv: recv(socket, buffer, length, flags)
        • recvmsg: recvmsg(socket, msgstruct, flags)
        • recvfrom: recvfrom(socket, buffer, length, flags, sndraddr, saddrlen)
        • send (write): send(socket, data, length, flags)
        • sendmsg: sendmsg(socket, msgstruct, flags)
        • sendto: sendto(socket, data, length, flags, destaddress, addresslen)
        • setsockopt
        • shutdown
        • socket: descriptor = socket(protofamily, type, protocol)
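The order in which a server and a client typically call these functions can be sketched with Python's `socket` module, which wraps the same API. This is a minimal example under stated assumptions: it runs both sides in one process on the loopback address, serves a single connection, and is meant for short messages only.

```python
import socket
import threading


def run_echo_server(listener: socket.socket) -> None:
    """Accept one connection and echo bytes back until the client is done."""
    conn, _client_addr = listener.accept()            # accept
    with conn:                                        # close on exit
        while True:
            data = conn.recv(1024)                    # recv
            if not data:                              # client finished sending
                break
            conn.sendall(data)                        # send


def echo_once(message: bytes) -> bytes:
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)  # socket
    with listener:
        listener.bind(("127.0.0.1", 0))               # bind; port 0 lets the OS choose
        listener.listen(1)                            # listen
        server_addr = listener.getsockname()
        worker = threading.Thread(target=run_echo_server, args=(listener,))
        worker.start()

        client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        with client:
            client.connect(server_addr)               # connect
            client.sendall(message)
            client.shutdown(socket.SHUT_WR)           # shutdown: no more data to send
            chunks = []
            while True:
                data = client.recv(1024)
                if not data:
                    break
                chunks.append(data)
        worker.join()
    return b"".join(chunks)
```

The server side follows socket, bind, listen, accept; the client side follows socket, connect. Both then use send and recv on the connected socket.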
  • Sockets, Threads and Inheritance
    • Each new thread that is created inherits a copy of all open sockets from the thread that created it.
    • The original socket used to accept connections exists as long as the main server thread executes; a socket used for a specific connection exists only as long as the thread exists to handle that connection.
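A thread-per-connection server can be sketched as follows. The listening socket stays with the main server thread, while each accepted connection gets its own thread and its own socket. Note one difference from the model in the notes: Python threads share the process's sockets instead of receiving copies. The function names and the upper-casing "service" are assumptions made for the example.

```python
import socket
import threading


def handle_client(conn: socket.socket) -> None:
    """Runs in its own thread; the connection socket lives only this long."""
    with conn:
        while True:
            data = conn.recv(1024)
            if not data:
                break
            conn.sendall(data.upper())


def serve(listener: socket.socket, expected_clients: int) -> None:
    """Main server thread: the listening socket stays open across clients."""
    threads = []
    for _ in range(expected_clients):
        conn, _addr = listener.accept()
        thread = threading.Thread(target=handle_client, args=(conn,))
        thread.start()
        threads.append(thread)
    for thread in threads:
        thread.join()


def demo() -> list:
    words = (b"one", b"two", b"three")
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))
        listener.listen(5)
        addr = listener.getsockname()
        server = threading.Thread(target=serve, args=(listener, len(words)))
        server.start()
        # Open every connection before sending, so all clients are active at once.
        clients = [socket.create_connection(addr) for _ in words]
        replies = []
        for client, word in zip(clients, words):
            with client:
                client.sendall(word)
                client.shutdown(socket.SHUT_WR)
                reply = b""
                while True:
                    chunk = client.recv(1024)
                    if not chunk:
                        break
                    reply += chunk
                replies.append(reply)
        server.join()
    return replies
```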

Chapter 4: Traditional Internet Applications
  • Application-Layer Protocols
    • Whenever a programmer creates two applications that communicate over a network, the programmer specifies details such as:
      • The syntax and semantics of messages that can be exchanged
      • Whether the client or server initiates interaction
      • Actions to be taken if an error arises
      • How the two sides know when to terminate communication
    • Two types of application layer protocols that depend on the intended use:
      • Private communication
      • Standardized service
        • To allow applications for standardized services to interoperate, an application-layer protocol standard is created independent of any implementation.
    • Application layer protocols specify two aspects of interaction:
      • Data representation - syntax of data items that are exchanged, specific form used during transfer, translation of integers, characters and files between computers
      • Data transfer - interaction between client and server, message syntax and semantics, valid and invalid exchange error handling, termination of interaction
        • As a convention, the word Transfer in the title of an application layer protocol means that the protocol specifies the data transfer aspect of communication.
  • Web Protocols
    • HyperText Markup Language (HTML) - a representation standard used to specify the contents and layout of a web page
      • Uses a textual representation
      • Describes pages that contain multimedia
      • Follows a declarative rather than procedural paradigm
      • Provides markup specifications instead of formatting
      • Permits a hyperlink to be embedded in an arbitrary object
      • Allows a document to include metadata
    • Uniform Resource Locator (URL) - a representation standard that specifies the format and meaning of web page identifiers
      • protocol://computer_name:port/document_name?parameters
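The pieces of a URL can be pulled apart with Python's standard `urllib.parse` module, which follows standard URL syntax (parameters follow a question mark). The dictionary keys below reuse the names from the notes; the function name and the sample URL are assumptions for illustration.

```python
from urllib.parse import urlsplit


def describe_url(url: str) -> dict:
    """Split a URL into the components named in the general form."""
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,
        "computer_name": parts.hostname,
        "port": parts.port,          # None when the URL omits the port
        "document_name": parts.path,
        "parameters": parts.query,
    }
```

Calling `describe_url("http://www.example.com:8080/docs/index.html?lang=en")` separates the protocol, computer name, port, document name, and parameters.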
    • HyperText Transfer Protocol (HTTP) - a transfer protocol that specifies how a browser interacts with a web server to transfer data
      • Uses textual control messages
      • Transfers binary data files
      • Can download or upload data
      • Incorporates caching
        • Caching in browsers - a browser can reduce download times significantly by saving a copy of each image in a cache on the user's disk and using the cached copy.
      • Four major request types:
        • GET - requests a document; server responds by sending status information followed by a copy of the document 
        • HEAD - requests status information; server responds by sending status information, but does not send a copy of the document
        • POST - sends data to a server; the server appends the data to a specified item
        • PUT - sends data to a server; the server uses the data to completely replace the specified item
      • When using HTTP, a browser sends version information which allows a server to choose the highest version of the protocol that they both understand.
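Since HTTP control messages are textual, a request can be assembled and a status line parsed with ordinary string handling. The sketch below is a simplified illustration, assuming HTTP/1.1 and only the `Host` and `Content-Length` headers; the helper names are hypothetical, and a real client would normally use a library such as `http.client`.

```python
def build_request(method: str, host: str, path: str, body: bytes = b"") -> bytes:
    """Build a minimal HTTP/1.1 request for one of the four major types."""
    if method not in ("GET", "HEAD", "POST", "PUT"):
        raise ValueError(f"unsupported request type: {method}")
    lines = [f"{method} {path} HTTP/1.1", f"Host: {host}"]
    if body:
        lines.append(f"Content-Length: {len(body)}")
    # A blank line separates the textual header from any binary body.
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("ascii") + body


def parse_status_line(response: bytes) -> tuple:
    """Return (version, status code, reason) from the first response line."""
    status_line = response.split(b"\r\n", 1)[0].decode("ascii")
    version, code, reason = status_line.split(" ", 2)
    return version, int(code), reason
```

The version string appears in both the request line and the status line, which is how each side learns what the other supports.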
    • File Transfer Protocol (FTP)
      • The most widely deployed file transfer service on the Internet. It is characterized by:
        • Arbitrary file contents
        • Bidirectional transfer
        • Support for authentication and ownership
        • Ability to browse folders
        • Textual control messages
        • Accommodates heterogeneity
      • Inverts the client-server relationship for data connections. When opening a data connection the client acts like a server (waits for the data connection) and the server acts like a client (initiates the data connection). After it has been used for one transfer, the data connection is closed.
    • Electronic Mail
      • Mail software is divided into two conceptually separate pieces:
        • An email interface application
        • A mail transfer program
      • Transfer - a protocol used to move a copy of an email message from one computer to another
        • Simple Mail Transfer Protocol (SMTP)
          • Follows a stream paradigm
          • Uses textual control messages
          • Only transfers text messages
          • Allows a sender to specify recipients' names and check each name
          • Sends one copy of a given message
      • Access
        • Mail access follows one of two forms:
          • A special-purpose email interface application
          • A web browser that accesses an email web page
        • Mail access protocols:
          • Provide access to a user's mailbox
          • Permit a user to view headers, download, delete, or send individual messages
          • Client runs on user's personal computer
          • Server runs on a computer that stores user's mailbox
        • POP3 - post office protocol version 3
        • IMAP - Internet mail access protocol
      • Representation
        • Two important representation standards exist:
          • RFC2822 Mail Message Format
            • IETF request for comments 2822
            • A mail message is represented as a text file and consists of a header section, a blank line, and a body
            • Format = Keyword: information
              • Keywords include: from, to, subject, cc, etc.
            • Header lines that start with 'X' can be added without affecting mail processing
              • X-ColorCodeMyEmail: green
          • Multipurpose Internet Mail Extensions (MIME)
            • Extends the functionality of email to allow the transfer of non-text data in a message. 
            • The MIME standard inserts extra header lines to allow non-text attachments to be sent within an email message. An attachment is encoded as printable letters, and a separator line appears before each attachment.
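Both representation standards can be seen at work with Python's standard `email` package. The sketch below builds a message with `Keyword: information` header lines, an extra `X-` header, and a binary attachment that MIME encodes as printable characters. The addresses, subject, and file name are made up for the example.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage


def build_message() -> bytes:
    """Create a text message with an X- header and a binary attachment."""
    msg = EmailMessage()
    msg["From"] = "alice@example.com"
    msg["To"] = "bob@example.com"
    msg["Subject"] = "Chapter 4 notes"
    msg["X-ColorCodeMyEmail"] = "green"   # extra header; mail processing ignores it
    msg.set_content("The notes are attached.")
    msg.add_attachment(
        bytes(range(256)),                # arbitrary non-text data
        maintype="application",
        subtype="octet-stream",
        filename="data.bin",
    )
    return msg.as_bytes()


def inspect_message(raw: bytes) -> tuple:
    """Parse the raw text and recover the header, type, and attachment."""
    msg = message_from_bytes(raw, policy=policy.default)
    attachment = next(msg.iter_attachments())
    return msg["X-ColorCodeMyEmail"], msg.get_content_type(), attachment.get_content()
```

Printing `build_message().decode("ascii")` shows the header section, the blank line, and the separator lines that MIME inserts before each part.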
    • Domain Name System (DNS)
      • Provides a service that maps human-readable symbolic names to computer addresses. Names are hierarchical, and the values allowed for the top-level domain (TLD) are controlled by the Internet Corporation for Assigned Names and Numbers (ICANN).
        • TLDs include: aero, arpa, asia, biz, com, coop, edu, gov, info, int, jobs, mil, mobi, museum, name, net, org, pro, travel, [country code]
      • The DNS is largely autonomous: an organization can assign names to its computers without informing a central authority.
        • Replication is particularly important for root servers.
      • Name Resolution
        • The translation of a domain name into an address is called name resolution. Software to perform this service is known as a name resolver.
      • Caching in DNS Servers
        • The locality of reference principle applies to DNS in two ways:
          • Spatial: a user tends to look up the names of local computers more often than the names of remote computers.
          • Temporal: a user tends to look up the same set of domain names repeatedly.
      • Because each DNS resource record generated by an authoritative server specifies a cache timeout, items can be removed from a DNS cache when they become stale.
      • Each entry in a DNS server has a type. When a resolver looks up a name, the resolver specifies the type that is desired, and the DNS server returns only entries that match the specified type.
      • CNAME type entries provide an alias to another DNS entry.
      • The IDNA standard for international domain names encodes each label as an ASCII string, and relies on applications to translate between the character set a user expects and the encoded version stored in the DNS.
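The interaction between cache timeouts and typed entries can be sketched with a small in-memory cache. This is an illustrative model only, not how any particular DNS server is implemented; the class name, the method names, and the injectable clock are assumptions made so the behavior is easy to test.

```python
import time


class DnsCache:
    """Cache keyed by (name, record type); entries expire after their timeout."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # (name, record_type) -> (value, expiry_time)

    def put(self, name: str, record_type: str, value: str, ttl_seconds: float) -> None:
        """Store an answer together with the time at which it becomes stale."""
        key = (name.lower(), record_type)
        self._entries[key] = (value, self._clock() + ttl_seconds)

    def get(self, name: str, record_type: str):
        """Return a cached value of the requested type, or None if absent or stale."""
        key = (name.lower(), record_type)
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, expires = entry
        if self._clock() >= expires:
            del self._entries[key]  # stale: remove it from the cache
            return None
        return value
```

A lookup for a different type, such as `MX` when only an `A` entry is cached, returns nothing, mirroring how a server returns only entries of the requested type.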
    • Extensible Markup Language (XML)
      • Allows a sender to describe the form of the data instead of relying on a format fixed a priori; an XML document describes the structure of the data and provides a name for each field.
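A short example shows how XML names each field, so a receiver can read the data without knowing the layout in advance. The document and tag names below are invented for illustration; the parsing uses Python's standard `xml.etree.ElementTree` module.

```python
import xml.etree.ElementTree as ET

# A made-up record; the tag names describe the structure of the data.
DOCUMENT = """
<ADDRESS>
  <NAME>John Public</NAME>
  <STREET>1 Example Road</STREET>
  <CITY>Anytown</CITY>
</ADDRESS>
"""


def fields(xml_text: str) -> dict:
    """Map each field name in the record to its text value."""
    root = ET.fromstring(xml_text)
    return {child.tag: child.text for child in root}
```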
