CS 268 - Assignment 1: HTTP Retrieval Program


Given: August 26, 1999
Due: September 16, 1999
Language Options: C, C++, or Java
OS Options: Must run on Solaris 2.5.1

Your assignment is to construct a command-line HTTP retrieval program. The program retrieves an HTTP URL and stores the body of the response in a local file. It must accept a command-line argument that specifies the HTTP URL to retrieve. Informative error messages should be displayed for HTTP errors. Upon successful retrieval, the program must display the number of bytes downloaded and the HTTP headers returned by the server. The program should work for files of arbitrary length.
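
As an illustration of the HTTP error reporting described above, the following C sketch parses the first line of a server response (for example, "HTTP/1.0 404 Not Found") into a numeric status code and a reason phrase. The function names parse_status_line and status_is_success are hypothetical and not part of the assignment. Treating only 2xx codes as success is an assumption, and the sketch does not handle old-style responses that omit the status line.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not specified by the assignment): parse a status
 * line such as "HTTP/1.0 404 Not Found" into its numeric code and reason
 * phrase. Returns 0 on success, -1 if the line is not a status line. */
int parse_status_line(const char *line, int *code,
                      char *reason, size_t reason_len)
{
    int major, minor, consumed = 0;
    const char *p;
    size_t n;

    if (reason_len == 0)
        return -1;
    /* %n records how many characters were consumed through the code. */
    if (sscanf(line, "HTTP/%d.%d %d%n", &major, &minor, code, &consumed) != 3)
        return -1;
    if (*code < 100 || *code > 599)
        return -1;

    /* Skip the space(s) between the code and the reason phrase. */
    p = line + consumed;
    while (*p == ' ')
        p++;

    /* Copy the reason phrase, stopping at CR/LF and truncating if needed. */
    n = strcspn(p, "\r\n");
    if (n >= reason_len)
        n = reason_len - 1;
    memcpy(reason, p, n);
    reason[n] = '\0';
    return 0;
}

/* Assumption: only 2xx codes count as a successful retrieval. */
int status_is_success(int code)
{
    return code >= 200 && code <= 299;
}
```

A caller could use the code and reason phrase to print a message such as "server returned 404 Not Found" and exit without creating the local file whenever status_is_success returns 0.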

The HTTP URL will be of the form http://machinename[:portnum]/pathname, where the part in [] is optional. The machinename part may be a hostname or a numeric dotted-decimal address, such as 157.182.194.28. The pathname portion may be any pathname. URLs that do not match this description should produce an informative error message and must not be processed. The following example retrieves the SRL.gif file and stores it as SRL.gif in the current directory. Sample output from the program is also shown.


$ http_get http://naur.csee.wvu.edu/~tmont/SRL.gif
Connected to 157.182.194.28:80, asking for /~tmont/SRL.gif
Retrieving SRL.gif to ./SRL.gif:
Headers from server:
 Date: Mon, 23 Aug 1999 16:40:48 GMT
 Server: Apache/1.3.4 (Unix) PHP/3.0.10
 Last-Modified: Mon, 15 Jul 1996 20:41:39 GMT
 Accept-Ranges: bytes
 Content-Length: 21209
 Connection: close
 Content-Type: image/gif
Body:
 Retrieved 21209 bytes

HTTP 1.0 Specification: http://www.w3.org/pub/WWW/Protocols/HTTP/1.0/spec.html
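
The URL form described earlier, http://machinename[:portnum]/pathname, can be validated and split with a few standard string functions. The following C sketch shows one possible approach, not a required design. The names struct http_url, parse_http_url, local_filename, URL_HOST_MAX, and URL_PATH_MAX are hypothetical. The default port of 80 matches the sample output above. Falling back to "index.html" when the pathname ends in a slash is an assumption, since the assignment does not say what to do in that case.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

#define URL_HOST_MAX 256   /* illustrative limits, not from the handout */
#define URL_PATH_MAX 1024

struct http_url {
    char host[URL_HOST_MAX];
    int  port;
    char path[URL_PATH_MAX];
};

/* Returns 0 on success, -1 if url is not http://host[:port]/path. */
int parse_http_url(const char *url, struct http_url *out)
{
    const char *prefix = "http://";
    const char *host, *slash, *colon;
    size_t host_len;

    if (strncmp(url, prefix, strlen(prefix)) != 0)
        return -1;
    host = url + strlen(prefix);

    slash = strchr(host, '/');
    if (slash == NULL)
        return -1;              /* the required form includes /pathname */

    /* Search for ':' only within the host part, never in the pathname. */
    colon = memchr(host, ':', (size_t)(slash - host));
    host_len = (size_t)((colon ? colon : slash) - host);
    if (host_len == 0 || host_len >= sizeof out->host)
        return -1;
    memcpy(out->host, host, host_len);
    out->host[host_len] = '\0';

    out->port = 80;             /* default HTTP port */
    if (colon != NULL) {
        char *end;
        long port;
        if (!isdigit((unsigned char)colon[1]))
            return -1;
        port = strtol(colon + 1, &end, 10);
        /* The digits must run right up to the slash and fit a TCP port. */
        if (end != slash || port < 1 || port > 65535)
            return -1;
        out->port = (int)port;
    }

    if (strlen(slash) >= sizeof out->path)
        return -1;
    strcpy(out->path, slash);
    return 0;
}

/* Last component of the pathname, used as the local file name. */
const char *local_filename(const char *path)
{
    const char *last = strrchr(path, '/');
    const char *name = last ? last + 1 : path;
    return *name ? name : "index.html";  /* assumption: fallback name */
}
```

For the example URL above, this sketch would yield host naur.csee.wvu.edu, port 80, path /~tmont/SRL.gif, and local file name SRL.gif. The host string can then be resolved as either a hostname or a dotted-decimal address before connecting.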

Extra Credit

Value: up to +7% added to the final grade

For extra credit, make the program support WWW basic authentication and send If-Modified-Since headers for files that already exist locally [+4%], and/or add support for the following optional arguments [+3%]: (1) a percent-completion indicator for URLs that return Content-Length headers (-i), (2) a filename in which to store the URL body (-o filename), (3) an If-Modified-Since argument (-m "Mon, 15 Jul 1996 00:00:00 GMT"), and (4) an arbitrary number of HTTP URLs.
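
For the basic authentication option, HTTP/1.0 sends the credentials as the base64 encoding of "userid:password" in an Authorization request header. The following C sketch shows one way that header might be built. The names base64_encode, basic_auth_header, and CREDS_MAX are hypothetical, and how the program obtains the user name and password is left open because the assignment does not specify it.

```c
#include <stdio.h>
#include <string.h>

#define CREDS_MAX 256   /* illustrative limit on "user:password" length */

static const char b64_table[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/* Encode len bytes of src as base64 into dst, which must have room for
 * 4 * ((len + 2) / 3) + 1 bytes including the terminating NUL. */
void base64_encode(const unsigned char *src, size_t len, char *dst)
{
    size_t i;

    /* Full 3-byte groups become 4 output characters each. */
    for (i = 0; i + 2 < len; i += 3) {
        *dst++ = b64_table[src[i] >> 2];
        *dst++ = b64_table[((src[i] & 0x03) << 4) | (src[i + 1] >> 4)];
        *dst++ = b64_table[((src[i + 1] & 0x0f) << 2) | (src[i + 2] >> 6)];
        *dst++ = b64_table[src[i + 2] & 0x3f];
    }
    /* One or two leftover bytes are padded with '='. */
    if (i < len) {
        *dst++ = b64_table[src[i] >> 2];
        if (i + 1 < len) {
            *dst++ = b64_table[((src[i] & 0x03) << 4) | (src[i + 1] >> 4)];
            *dst++ = b64_table[(src[i + 1] & 0x0f) << 2];
        } else {
            *dst++ = b64_table[(src[i] & 0x03) << 4];
            *dst++ = '=';
        }
        *dst++ = '=';
    }
    *dst = '\0';
}

/* Write "Authorization: Basic <base64>\r\n" into out.
 * Returns 0 on success, -1 if the credentials or the header do not fit. */
int basic_auth_header(const char *user, const char *pass,
                      char *out, size_t out_len)
{
    char creds[CREDS_MAX];
    char encoded[4 * ((CREDS_MAX + 2) / 3) + 1];
    const char *label = "Authorization: Basic ";

    if (strlen(user) + 1 + strlen(pass) >= sizeof creds)
        return -1;
    sprintf(creds, "%s:%s", user, pass);
    base64_encode((const unsigned char *)creds, strlen(creds), encoded);

    if (strlen(label) + strlen(encoded) + 2 >= out_len)
        return -1;
    sprintf(out, "%s%s\r\n", label, encoded);
    return 0;
}
```

The resulting line would be sent along with the other request headers, before the blank line that ends the request. An If-Modified-Since header can be added in the same position, using a date string in the format shown for the -m argument.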


Todd L. Montgomery (revised 08.23.1999)