Distributed Computing

I used the texts:

for notes for this section. Also web pages:

Networking - Introduction

Networking - making connections from your applet or application to a system over the network.

Java provides cross-platform abstractions for simple networking operations, including connecting and retrieving files by using Web protocols and creating basic Unix-like sockets.

Used in conjunction with I/O streams, reading and writing files over the net becomes very easy.

Restrictions: Java applets cannot read or write from the disk on the machine that is running them. An applet can only establish a socket connection to the server on which it resides. Thus, an applet cannot be used as the server program.

Classes for networking are in the package java.net.

Very very simple starting out

Do the example in Core Java, Volume 2, Chapter 3. On a command line, run:
telnet time-a.timefreq.bldrdoc.gov 13

I got back:
50914 98-04-11 22:10:07 50 0 0 50.0 UTC(NIST) *
This is a simple example of connecting to a service (the National Institute of Standards and Technology in Boulder, Colorado) and receiving information (a measurement of a Cesium atomic clock).

Before jumping into client/server programming for Java, one needs to understand some basics: Hosts, Ports, URLs and Sockets.

Ports

As a general (but far from absolute) rule each computer only has one Internet address. Addresses would be all one needs if each computer did no more than one thing at a time. However, on a given computer, there may be multiple ftp sessions, a few web connections, and a chat program all running at the same time.

To make this possible the computer's network interface is logically subdivided into 65,536 different ports. This is an abstraction. A port does not represent anything physical like a serial or parallel port. However as data traverses the Internet in packets, each packet carries not only the address of the host but also the port on that host to which it's aimed. The host is responsible for reading the port number from each packet it receives to decide which program should receive that chunk of data.

On Unix systems you must be root to listen for connections on ports below 1024 (that is, 1 through 1023). Anyone can listen for connections on ports 1024 to 65,535 as long as the port is not already occupied. (No more than one program can listen on a given TCP port at the same time.) However, on Windows NT, Windows 95, and the Mac any user can listen on any port. No special privileges are required.

Some examples (from "Java Network Programming", page 20). On UNIX machines a fairly complete listing of assigned ports is stored in the file /etc/services.

PROTOCOL   PORT   ENCODING   PURPOSE
echo       7      tcp/udp    Echo is a test protocol used to verify that two machines are able to connect by having one echo back the other's input
ftp-data   20     tcp        FTP uses two well-known ports. This port is used to transport files
ftp        21     tcp        This port is used to send FTP commands like "put" and "get"
telnet     23     tcp        Telnet is a protocol used for interactive, remote command-line sessions
smtp       25     tcp        The "Simple Mail Transfer Protocol" is used to send email between machines
whois      43     tcp        A simple directory service for Internet network administrators
finger     79     tcp        Gets information about a user or users
http       80     tcp        HyperText Transfer Protocol, the underlying protocol of the WWW

Any remote host can connect to a server that's listening on a port below 1024. Furthermore, multiple simultaneous connections may be made to a remote host on a remote port. For example, a high volume web server listening on port 80 may be processing several dozen connections at the same time, all connected to port 80.

In short, no more than one process on the local host can use a port at one time. However many remote hosts may connect to the same remote port.
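
The "one listener per port" rule can be checked on your own machine. The following is a minimal sketch, not taken from the texts: the class name PortInUse and the method secondListenerRefused() are made up for illustration, and it assumes the usual behavior where a second listener on an occupied TCP port fails with a BindException. It uses java.net.ServerSocket, the listening counterpart of Socket.

```java
import java.io.IOException;
import java.net.BindException;
import java.net.ServerSocket;

public class PortInUse {

    // Returns true if a second listener on an already occupied port is refused.
    public static boolean secondListenerRefused() throws IOException {
        // Port 0 asks the operating system to pick any free port,
        // so no special privileges are needed.
        try (ServerSocket first = new ServerSocket(0)) {
            int port = first.getLocalPort();
            try (ServerSocket second = new ServerSocket(port)) {
                return false; // both listeners were bound: not what we expect
            } catch (BindException e) {
                return true;  // the port is already occupied by "first"
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("Second listener refused: " + secondListenerRefused());
    }
}
```

Many remote clients could still connect to the first listener at the same time; the restriction applies only to listening on the local port.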

Protocols and Internet Standards

Loosely speaking, a protocol defines how two hosts talk to each other. For example, in radio communications a protocol might say that when one participant is finished speaking, he or she says "Over" to tell the other end that it's OK to start talking. In networking a protocol defines what is and is not acceptable for one participant in a conversation to say to the other at a given moment in time.

If you need detailed information about any protocol, the definitive source is the collection of Internet drafts and requests for comments (RFCs). For a table of selected RFCs, see "Java Network Programming", page 29, or "Proposed Standards and Informational Material".

Internet Addresses

Every computer on the Internet is identified by a unique, four-byte IP address. This is typically written in dotted quad format like 132.241.2.90 where each byte is an unsigned value between 0 and 255.

Since humans have trouble remembering numbers like this, these addresses are mapped to names like "plasticman.ecst.csuchico.edu". However it's the numeric address that's fundamental, not the name.

Java's java.net.InetAddress class represents such addresses. Among other things, it contains methods to convert numeric addresses to host names and host names to numeric addresses. For example, getByName(String host) determines the IP address of a host given the host's name, and getHostAddress() returns the IP address as a string of the form "%d.%d.%d.%d".

The InetAddress class is a little unusual in that it doesn't have any public constructors. (See factory method design pattern) Instead you pass the host name or string format of the dotted quad address to the static InetAddress.getByName() method like this:

try {
  InetAddress pip = InetAddress.getByName("pip.ecst.csuchico.edu");
  InetAddress mycin = InetAddress.getByName("132.241.7.225");
}
catch (UnknownHostException e) {
  System.err.println(e);
}  
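
To make the "four unsigned bytes" point concrete, here is a small sketch. The class name DottedQuad and its format() helper are hypothetical; getByName(), getAddress() and getHostAddress() are the standard InetAddress methods. Java bytes are signed, so each one is masked with 0xFF to read it as a value between 0 and 255. Passing a literal address to getByName() means no name lookup is performed.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

public class DottedQuad {

    // Formats raw address bytes as an unsigned dotted quad.
    public static String format(byte[] raw) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < raw.length; i++) {
            if (i > 0) sb.append('.');
            sb.append(raw[i] & 0xFF); // mask so the signed byte reads as 0..255
        }
        return sb.toString();
    }

    public static void main(String[] args) throws UnknownHostException {
        // A literal address is parsed locally; only its format is checked.
        InetAddress addr = InetAddress.getByName("132.241.2.90");
        byte[] raw = addr.getAddress();
        System.out.println(raw.length + " bytes");
        System.out.println(format(raw));           // built by hand from the bytes
        System.out.println(addr.getHostAddress()); // the library's own formatting
    }
}
```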

URLs

A URL, short for "Uniform Resource Locator", is a way to unambiguously identify the location of a resource on the Internet. Some typical URLs look like:

http://www.javasoft.com/
file:///Macintosh%20HD/Java/Docs/JDK%201.1.1%20docs/api/java.net.InetAddress.html#_top_
http://www.macintouch.com:80/newsrecent.shtml
ftp://ftp.info.apple.com/pub/
mailto:elharo@sunsite.unc.edu
telnet://utopia.poly.edu
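
Most URLs can be broken into about five pieces, not all of which are necessarily present in any given URL. These are the protocol, the host, the port, the file, and the ref (anchor), matching the getProtocol(), getHost(), getPort(), getFile(), and getRef() methods of java.net.URL.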

The java.net.URL class

The java.net.URL class represents a URL.
java.net.URLConnection represents a communications link between the application and a URL.

There are constructors to create new URLs and methods to parse the different parts of a URL. However the heart of the class are the methods that allow you to get an InputStream from a URL so you can read data from a server.

The URL class is closely tied to protocol and content handlers. The objective is to separate the data being downloaded from the protocol used to download it. The protocol handler is responsible for communicating with the server, that is moving bytes from the server to the client. It handles any necessary negotiation with the server and any headers. Its job is to return only the actual bytes of the data or file requested. The content handler takes those bytes and translates them into some kind of Java object such as an InputStream or ImageProducer.

When you construct a URL object, Java looks for a protocol handler that understands the protocol part of the URL such as "http" or "mailto". If no such handler is found, the constructor throws a MalformedURLException. The exact protocols that are supported vary from implementation to implementation though http and file are supported pretty much everywhere. Sun's JDK 1.1 understands ten:

The last four are custom protocols defined by Sun and used internally by the JDK and HotJava.
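
Since the set of supported protocols varies, one way to find out what your own Java implementation understands is simply to try constructing a URL and see whether it throws. This is only a sketch: the class ProtocolCheck, the method isSupported(), the placeholder host "host.example", and the deliberately bogus protocol name are all made up for illustration. Constructing a URL does not contact the network.

```java
import java.net.MalformedURLException;
import java.net.URL;

public class ProtocolCheck {

    // Returns true if this Java implementation has a handler for the protocol.
    public static boolean isSupported(String protocol) {
        try {
            // "host.example" is a placeholder; no connection is made here.
            new URL(protocol + "://host.example/");
            return true;
        } catch (MalformedURLException e) {
            return false; // no protocol handler was found
        }
    }

    public static void main(String[] args) {
        String[] candidates = {"http", "file", "nosuchprotocol"};
        for (String p : candidates) {
            System.out.println(p + " supported: " + isSupported(p));
        }
    }
}
```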

Example 7.1 (in Nutshell Ed1 book) shows a simple example of creating a URL from a specified WWW address and then downloading its contents with the getContent() method.

// This example is from the book _Java in a Nutshell_ by David Flanagan.
// Written by David Flanagan.  Copyright (c) 1996 O'Reilly & Associates.
// You may study, use, modify, and distribute this example for any purpose.
// This example is provided WITHOUT WARRANTY either expressed or implied.

import java.awt.*;
import java.io.*;
import java.net.*;

// The fetch() method in this class only works for fetching text/plain 
// data.  If you specify a file: URL, you may well need to specify a
// file that ends with a .txt extension so that the internal content
// handlers can tell it is a plain text file.  The standard Java 
// distribution doesn't contain content handlers for other types (such
// as text/html), and this application exits with an exception if it
// doesn't recognize the type or doesn't know how to load the type.
// The fetchimage() method works for .gif and a few other common image
// formats for which content handlers have been written.  See the
// FetchImageTest class for a demonstration of the fetchimage() method
// defined here.
//
// This class serves to demonstrate the URL.getContent() method.  In
// general, however, there are much better ways to load files and images
// over the net.  See Applet.getImage() for example:
public class Fetch {
    // Get the contents of a URL and return it as a string.
    public static String fetch(String address)
        throws MalformedURLException, IOException
    {
        URL url = new URL(address);
        return (String) url.getContent();
    }

    // Get the contents of a URL and return it as an image
    public static Image fetchimage(String address, Component c) 
        throws MalformedURLException, IOException 
    {
        URL url = new URL(address);
        return c.createImage((java.awt.image.ImageProducer)url.getContent());
    }

    // Test out the fetch() method.
    public static void main(String[] args) 
        throws MalformedURLException, IOException 
    {
        System.out.println(fetch(args[0]));
    }
}
These plain text and image fetching examples rely on content handlers internal to the Java implementation. They work on JDK systems, but may not work on other Java implementations, if those implementations do not include the appropriate content handlers. If you attempt to download a file with an unsupported content type, the example will generate an exception and exit.

Note the author's own caveat: this example is intended only to demonstrate the use of URL.getContent(). In general, this is not the best way to load text or image files over the net. See, for example, Applet.getImage() or java.awt.Toolkit.createImage():
public Image createImage(byte imagedata[])

GetURLInfo.java (example 7-2) demonstrates how to obtain more information about a URL, and how to have more control over downloading the contents of the URL. URLInfoGetter rather than GetURLInfo as class name?

net API

// This example is from the book _Java in a Nutshell_ by David Flanagan.
// Written by David Flanagan.  Copyright (c) 1996 O'Reilly & Associates.
// You may study, use, modify, and distribute this example for any purpose.
// This example is provided WITHOUT WARRANTY either expressed or implied.

import java.net.*;
import java.io.*;
import java.util.*;

public class GetURLInfo {
    public static void printinfo(URLConnection u) throws IOException {
        // Display the URL address, and information about it.
        System.out.println(u.getURL().toExternalForm() + ":");
        System.out.println("  Content Type: " + u.getContentType());
        System.out.println("  Content Length: " + u.getContentLength());
        System.out.println("  Last Modified: " + new Date(u.getLastModified()));
        System.out.println("  Expiration: " + u.getExpiration());
        System.out.println("  Content Encoding: " + u.getContentEncoding());
        
        // Read and print out the first five lines of the URL.
        System.out.println("First five lines:");
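        // DataInputStream.readLine() is deprecated; a BufferedReader reads text lines properly.
        BufferedReader in = new BufferedReader(new InputStreamReader(u.getInputStream()));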
        for(int i = 0; i < 5; i++) {
            String line = in.readLine();
            if (line == null) break;
            System.out.println("  " + line);
        }
    }
    
    // Create a URL from the specified address, open a connection to it,
    // and then display information about the URL.
    public static void main(String[] args) 
        throws MalformedURLException, IOException
    {
        URL url = new URL(args[0]);
        URLConnection connection = url.openConnection();
        printinfo(connection);
    }
}
URLConnectionTest from text.

More examples are found on the tutorial for Java Network Programming (mirrored below)

There are four constructors in the java.net.URL class. All can throw MalformedURLExceptions.

 public URL(String u) 
               throws MalformedURLException
 public URL(String protocol, String host, String file) 
               throws MalformedURLException
 public URL(String protocol, String host, int port, String file) 
               throws MalformedURLException
 public URL(URL context, String u) 
               throws MalformedURLException
Given a complete absolute URL like http://www.poly.edu/schedule/fall97/bgrad.html#cs, you construct a URL object for that URL like this:

  URL u = null;
  try {
    u = new URL("http://www.poly.edu/schedule/fall97/bgrad.html#cs");
  }
  catch (MalformedURLException e) {
    System.err.println(e);
  }
You can also construct the URL by passing its pieces to the constructor, like this:

  URL u = null;
  try {
    u = new URL("http", "www.poly.edu", "/schedule/fall97/bgrad.html#cs");
  }
  catch (MalformedURLException e) {
    System.err.println(e);
  }
You don't normally need to specify a port for a URL because most protocols have default ports; for instance, the default http port is 80. To give the port explicitly, as you must when a server listens on a non-default port, use the third constructor:

  URL u = null;
  try {
    u = new URL("http", "www.poly.edu", 80, "/schedule/fall97/bgrad.html#cs");
  }
  catch (MalformedURLException e) {
    System.err.println(e);
  }
Finally, many HTML files contain relative URLs. For example, this (mirrored) page's URL is http://sunsite.lanet.lv/javafaq/course/week12/07.html
However, this entire site is mirrored in several places around the world. Rather than having to rewrite the internal links for each mirror site, relative URLs inherit the host, port, protocol, and possibly directory from the current page. Thus on this page a link to
"08.html"
refers to
http://sunsite.lanet.lv/javafaq/course/week12/08.html.

However when this same page is loaded from
http://sunsite.unc.edu/javafaq/course/week12/07.html,
then the link to "08.html" refers to
http://sunsite.unc.edu/javafaq/course/week12/08.html
instead.

The fourth constructor above creates URLs relative to a given URL. For example,

  URL u1, u2;
  try {
    u1 = new URL("http://sunsite.unc.edu/javafaq/course/week12/07.html");
    u2 = new URL(u1, "08.html");
  }
  catch (MalformedURLException e) {
    System.err.println(e);
  }
This is particularly useful when parsing HTML.
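
As a rough sketch of how an HTML link checker might use the fourth constructor, the helper below resolves a link found on a page against that page's own URL. The class name RelativeLinks and the method resolve() are made up for illustration; only the URL(URL context, String spec) constructor and toExternalForm() come from java.net.URL.

```java
import java.net.MalformedURLException;
import java.net.URL;

public class RelativeLinks {

    // Resolves a link as it appears on a page against the page's own URL.
    public static String resolve(String page, String link) throws MalformedURLException {
        URL context = new URL(page);
        return new URL(context, link).toExternalForm();
    }

    public static void main(String[] args) throws MalformedURLException {
        String page = "http://sunsite.unc.edu/javafaq/course/week12/07.html";
        // A bare file name inherits protocol, host, and directory.
        System.out.println(resolve(page, "08.html"));
        // A path starting with "/" inherits only protocol and host.
        System.out.println(resolve(page, "/javafaq/"));
        // An absolute URL ignores the context entirely.
        System.out.println(resolve(page, "http://www.poly.edu/"));
    }
}
```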

Parsing URLs

The java.net.URL class has five methods to split a URL into its component parts. These are:

 public String getProtocol()
 public String getHost()
 public int getPort() 
 public String getFile()
 public String getRef()
For example,
  try {
    URL u = new URL("http://www.poly.edu/schedule/fall97/bgrad.html#cs");
    System.out.println("The protocol is " + u.getProtocol());
    System.out.println("The host is " + u.getHost());
    System.out.println("The port is " + u.getPort());
    System.out.println("The file is " + u.getFile());
    System.out.println("The anchor is " + u.getRef());
  }
  catch (MalformedURLException e) {
    System.err.println(e);
  }
If a port is not explicitly specified in the URL it's set to -1. This does not mean that the connection is attempted on port -1 (which doesn't exist) but rather that the default port is to be used.

If the ref doesn't exist, it's just null, so watch out for NullPointerExceptions. Better yet, test to see that it's non-null before using it.

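Finally, if the file is left off completely, e.g. http://www.javasoft.com, then JDK 1.1 sets it to "/". Later Java versions return an empty string instead, so don't rely on the leading slash being there.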
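
The two pitfalls above (a port of -1 and a null ref) can be wrapped in small helpers. This is a sketch under stated assumptions: the class EffectivePort and both method names are hypothetical, and it relies on URL.getDefaultPort(), which was added to java.net.URL after JDK 1.1 and so is not available on the oldest implementations discussed in these notes.

```java
import java.net.MalformedURLException;
import java.net.URL;

public class EffectivePort {

    // The port a connection would actually use: the explicit one if present,
    // otherwise the protocol's default.
    public static int effectivePort(URL u) {
        int port = u.getPort();
        return (port != -1) ? port : u.getDefaultPort();
    }

    // The ref (anchor) of the URL, or an empty string if there is none.
    public static String anchorOrEmpty(URL u) {
        String ref = u.getRef();
        return (ref != null) ? ref : "";
    }

    public static void main(String[] args) throws MalformedURLException {
        URL u = new URL("http://www.poly.edu/schedule/fall97/bgrad.html#cs");
        System.out.println("getPort() returns " + u.getPort());
        System.out.println("Effective port is " + effectivePort(u));
        System.out.println("Anchor is \"" + anchorOrEmpty(u) + "\"");
    }
}
```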

Reading Data from a URL

The following example reads a series of URLs from the command line. It attempts to form a URL object from each command line argument, connect to the specified server, and download the data, which is then printed on System.out.

import java.net.*;
import java.io.*;

public class Webcat {

  public static void main(String[] args) {

    for (int i = 0; i < args.length; i++) {
      try {
        URL u = new URL(args[i]);
        InputStream is = u.openStream();
        InputStreamReader isr = new InputStreamReader(is);
        BufferedReader br = new BufferedReader(isr);
        String theLine;
        while ((theLine = br.readLine()) != null) {
          System.out.println(theLine);
        }
      }
      catch (MalformedURLException e) {
        System.err.println(e);
      } 
      catch (IOException e) {
        System.err.println(e);      
      } 
    }
  }
}

Sockets

Before data is sent across the Internet from one host to another using TCP/IP, it is split into packets of varying but finite size called datagrams. Each datagram contains a header and a payload. The header contains the address and port the packet is going to, the address and port the packet is coming from, and various other housekeeping information. The payload contains the data itself.

Datagrams range in size from a few dozen bytes to about 60,000 bytes. Anything larger than this, and often things smaller than this, needs to be split into smaller pieces before it can be transmitted. The advantage is that if one packet is lost, it can be retransmitted without requiring redelivery of all other packets.

The socket represents a reliable connection for the transmission of data between two hosts. It isolates you from the details of packet encodings, lost and retransmitted packets, and packets that arrive out of order.
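
To get a feel for what the socket hides, here is a toy sketch of splitting data into numbered pieces and putting them back together after they arrive out of order. It is purely illustrative: the Packet class, the sequence field, and the chunk size are invented for this example and do not reflect the real TCP/IP header layout or how the operating system implements it.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

public class PacketSketch {

    // Hypothetical packet: a sequence number (the "header") plus a slice of data.
    public static final class Packet {
        public final int sequence;
        public final byte[] payload;

        public Packet(int sequence, byte[] payload) {
            this.sequence = sequence;
            this.payload = payload;
        }
    }

    // Splits the data into packets carrying at most maxPayload bytes each.
    public static List<Packet> split(byte[] data, int maxPayload) {
        List<Packet> packets = new ArrayList<>();
        int sequence = 0;
        for (int offset = 0; offset < data.length; offset += maxPayload) {
            int end = Math.min(offset + maxPayload, data.length);
            packets.add(new Packet(sequence++, Arrays.copyOfRange(data, offset, end)));
        }
        return packets;
    }

    // Restores the original bytes even if the packets arrived out of order.
    public static byte[] reassemble(List<Packet> received) {
        List<Packet> ordered = new ArrayList<>(received);
        ordered.sort(Comparator.comparingInt((Packet p) -> p.sequence));
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (Packet p : ordered) {
            out.write(p.payload, 0, p.payload.length);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] data = "A message longer than one packet".getBytes(StandardCharsets.US_ASCII);
        List<Packet> packets = split(data, 8);
        Collections.shuffle(packets); // simulate out-of-order arrival
        System.out.println(packets.size() + " packets");
        System.out.println(new String(reassemble(packets), StandardCharsets.US_ASCII));
    }
}
```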

Socket objects are used to establish connections. They are endpoints of communication links between processes. They appear as file descriptors to a process, so data can be exchanged with another process by transmitting or receiving through the socket.

The type of socket used describes the way the data is transferred through the socket.

Back to Sockets in general: all of this is transparent to the Java programmer. The host's native networking software transparently handles the splitting of data into packets on the sending end of a connection, and the reassembly of packets on the receiving end.

Instead, the Java programmer is presented with a higher level abstraction called a socket. Specifically, for networking applications beyond the URL and URLConnection, Java uses the classes Socket and ServerSocket.

There are four fundamental operations a socket performs. These are:

  1. Connect to a remote machine
  2. Send data
  3. Receive data
  4. Close the connection
A socket may not be connected to more than one host at a time.
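
The four operations can be tried without any outside server by talking to a throwaway echo server on the local machine. This is a minimal sketch, not an example from the texts: the class FourOperations and the method roundTrip() are made up, and the one-shot echo server built from ServerSocket exists only so the client has something to connect to.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

public class FourOperations {

    // Sends one line to a local one-shot echo server and returns the reply.
    public static String roundTrip(String message) throws IOException, InterruptedException {
        InetAddress local = InetAddress.getLoopbackAddress();
        // Port 0 lets the operating system choose a free port.
        try (ServerSocket server = new ServerSocket(0, 1, local)) {
            Thread echo = new Thread(() -> {
                try (Socket peer = server.accept();
                     BufferedReader in = new BufferedReader(
                         new InputStreamReader(peer.getInputStream()));
                     PrintWriter out = new PrintWriter(peer.getOutputStream(), true)) {
                    out.println(in.readLine()); // echo one line back
                } catch (IOException e) {
                    System.err.println(e);
                }
            });
            echo.start();

            // 1. Connect to the remote machine (here: the loopback address).
            try (Socket s = new Socket(local, server.getLocalPort())) {
                PrintWriter out = new PrintWriter(s.getOutputStream(), true);
                BufferedReader in = new BufferedReader(
                    new InputStreamReader(s.getInputStream()));
                out.println(message);          // 2. Send data
                String reply = in.readLine();  // 3. Receive data
                echo.join();
                return reply;
            } // 4. Close the connection (done by try-with-resources)
        }
    }

    public static void main(String[] args) throws IOException, InterruptedException {
        System.out.println("Server replied: " + roundTrip("hello"));
    }
}
```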

Java and Sockets

The java.net.Socket class allows you to perform all four fundamental socket operations. You can connect to remote machines; you can send data; you can receive data; you can close the connection.

There are four public constructors in the Socket class. (There are also two protected constructors and two deprecated constructors.) You need to at least specify the remote host and port you want to connect to. The host may be specified as either a string like "pip.ecst.csuchico.edu" or as an InetAddress object. The port should be an int between 1 and 65535, and it has to be the specific port the service listens on: you need to know the port just as much as the hostname.

SocketTest.java (the newer version uses the Java 5.0 Scanner). Below is the older version, which wraps an InputStreamReader in a BufferedReader:

/*
 * Cay S. Horstmann & Gary Cornell, Core Java
 * Published By Sun Microsystems Press/Prentice-Hall
 * Copyright (C) 1997 Sun Microsystems Inc.
 * All Rights Reserved.
 *
 * Permission to use, copy, modify, and distribute this 
 * software and its documentation for NON-COMMERCIAL purposes
 * and without fee is hereby granted provided that this 
 * copyright notice appears in all copies. 
 * 
 * THE AUTHORS AND PUBLISHER MAKE NO REPRESENTATIONS OR 
 * WARRANTIES ABOUT THE SUITABILITY OF THE SOFTWARE, EITHER 
 * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE 
 * IMPLIED WARRANTIES OF MERCHANTABILITY, FITNESS FOR A 
 * PARTICULAR PURPOSE, OR NON-INFRINGEMENT. THE AUTHORS
 * AND PUBLISHER SHALL NOT BE LIABLE FOR ANY DAMAGES SUFFERED 
 * BY LICENSEE AS A RESULT OF USING, MODIFYING OR DISTRIBUTING 
 * THIS SOFTWARE OR ITS DERIVATIVES.
 */
 
/**
 * @version 1.10 27 Jun 1997
 * @author Cay Horstmann
 */
import java.io.*;
import java.net.*;
 
class SocketTest
{  public static void main(String[] args)
   {  try
      {  // Socket t = new Socket("time-a.timefreq.bldrdoc.gov", 13);

         Socket t = new Socket("132.163.4.101", 13);

         BufferedReader is = new BufferedReader
            (new InputStreamReader(t.getInputStream()));
         boolean more = true;
         while (more)
         {  String str = is.readLine();
            if (str == null) more = false;
            else
               System.out.println(str);
         }
         
      } 
      catch(IOException e) 
      { System.out.println("Error" + e); }
   }
}

The Socket() constructors do not just create a Socket object. They also attempt to connect the underlying socket to the remote server. All the constructors throw an IOException if the connection can't be made for any reason.

Note that, just like with URLs, sending and receiving data is accomplished with output and input streams. There are methods to get an input stream for a socket and an output stream for the socket. Exactly what the data you send and receive means often depends on the protocol.

Most of the time you'll want to chain the InputStream to some other input stream or reader class to handle the data more easily (as was done in one line above and explicitly in the next example).

For example, the following code fragment connects to the daytime server on port 13 of sunsite.unc.edu, and displays the data it sends.

    try {
      Socket s = new Socket("sunsite.unc.edu", 13);
      InputStream is = s.getInputStream();
      InputStreamReader isr = new InputStreamReader(is);
      BufferedReader br = new BufferedReader(isr);
      String theTime = br.readLine();
      System.out.println(theTime);
    }
    catch (IOException e) {
      // Fall back to the local clock if the daytime server can't be reached.
      System.out.println(new java.util.Date());
    }
Port Scanner

You cannot just connect to any port on any host. The remote host must actually be listening for connections on that port. Because the Socket constructors throw an IOException when the connection cannot be made, you can use them to determine which ports on a host are listening for connections.

Harold, in "Java Network Programming" and in his tutorial, provides the following:

import java.net.*;
import java.io.IOException;


public class PortScanner {

  public static void main(String[] args) {
    for (int i = 0; i < args.length; i++) {
      try {
        InetAddress ia = InetAddress.getByName(args[i]);
        scan(ia);
      }
      catch (UnknownHostException e) {
        System.err.println(args[i] + " is not a valid host name.");
      }
    }
  }

  public static void scan(InetAddress remote) {
    // Do I need to synchronize remote?
    // What happens if someone changes it while this method
    // is running?

    String hostname = remote.getHostName();
    for (int port = 0; port < 65536; port++) {
     try {
        Socket s = new Socket(remote, port); 
        System.out.println("A server is listening on port " + port
         + " of " + hostname);
        s.close();
      }
      catch (IOException e) {
        // The remote host is not listening on this port
      }
    }
  }

  public static void scan(String remote) throws UnknownHostException {

    // Why throw the UnknownHostException? Why not catch it like I did
    // in the main() method?
    InetAddress ia = InetAddress.getByName(remote);
    scan(ia);
  }
}

Warning: Pointing PortScanner at a machine you do not own would generally be considered a hostile act by the owner of the machine.
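
A safer way to experiment is to probe only your own machine, and to use a connect timeout so that an unresponsive port doesn't stall the loop. The following is a hedged sketch rather than Harold's code: the class LocalPortCheck, the method isListening(), and the 500 millisecond timeout are assumptions made for illustration. It uses the no-argument Socket constructor together with connect(SocketAddress, int), which lets you set the timeout before connecting.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;

public class LocalPortCheck {

    // Returns true if something accepts a TCP connection on host:port
    // within the given timeout.
    public static boolean isListening(InetAddress host, int port, int timeoutMillis) {
        try (Socket s = new Socket()) {
            s.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Refused, timed out, or unreachable: treat as not listening.
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        InetAddress local = InetAddress.getLoopbackAddress();
        // Start a listener of our own so there is a known open port to find.
        try (ServerSocket server = new ServerSocket(0, 1, local)) {
            int port = server.getLocalPort();
            System.out.println("Port " + port + " listening: "
                + isListening(local, port, 500));
        }
    }
}
```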

Next: Client Server examples. Use Socket and ServerSocket to make Client Server tools.