COMP211 – Internet Principles Assignment 1

[PART ONE]
PART TWO
With the ability to send mail through a mail server over an SMTP connection established, the second main part of the assignment was to add the ability to download objects over an HTTP connection from a URL entered by the user. This involved first splitting the URL into ‘host’ and ‘path’. A URL usually consists of three parts: scheme, host and path. For example, in http://www.miriamjennings.co.uk/uncategorized/comp211-internet-principles-assignment-1/, we have the scheme: http://, telling us which protocol is used; the host: www.miriamjennings.co.uk, the domain holding the resource; and the path: /uncategorized/comp211-internet-principles-assignment-1/, identifying the specific resource at that host.
In this assignment, it was presumed that the Hypertext Transfer Protocol (HTTP) was being used, so the user is only asked to input the rest of the URL (the host and path). I split this using a simple substring approach based on indexOf().

The indexOf() method returns the position of the first instance of the specified character (“/”) within the String it is applied to, or -1 if it does not occur. If -1 is returned, the host is the whole input and the path is simply “/”; otherwise, the path is the rest of the URL from the first “/” to the end, and the host is the URL from the beginning up to (but not including) the first “/”. Unfortunately, when I received my feedback on the assignment, I was told “paths with more than one ‘/’ not handled correctly”.
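The original snippet is not reproduced here, so the following is only a minimal sketch of the split as described, not the code that was submitted. The class and method names (UrlSplitSketch, split) are hypothetical. Calling substring() with a single argument keeps everything from the first “/” onwards, including any later slashes.

```java
// Hypothetical reconstruction of the host/path split described in the text.
class UrlSplitSketch {
    /** Returns {host, path} for a URL given without its scheme, e.g. "www.example.com/a/b". */
    static String[] split(String url) {
        int slash = url.indexOf('/');              // index of the first '/', or -1 if there is none
        if (slash == -1) {
            return new String[] { url, "/" };      // no path given: ask for the root
        }
        String host = url.substring(0, slash);     // everything before the first '/'
        String path = url.substring(slash);        // the first '/' through to the end
        return new String[] { host, path };
    }

    public static void main(String[] args) {
        String[] parts = split("www.miriamjennings.co.uk/uncategorized/comp211-internet-principles-assignment-1/");
        System.out.println("host = " + parts[0]);
        System.out.println("path = " + parts[1]);
    }
}
```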

I then constructed the request message to the server. HTTP request messages take the form:
GET path HTTP/1.1 (if using HTTP version 1.1)
followed by the header lines:
Host: host identified in URL split
Connection: close (as we were told to “add a header line so that server closes connection after one response”)
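As a rough illustration of the message format above, the request can be assembled as a single String. This is a sketch with hypothetical names (RequestSketch, buildGet), not the assignment's own code. HTTP ends each line with a carriage return and line feed, and an empty line marks the end of the headers.

```java
// Hypothetical helper that builds the GET request described in the text.
class RequestSketch {
    static final String CRLF = "\r\n";   // HTTP line terminator

    static String buildGet(String host, String path) {
        return "GET " + path + " HTTP/1.1" + CRLF   // request line
             + "Host: " + host + CRLF               // host identified in the URL split
             + "Connection: close" + CRLF           // ask the server to close after one response
             + CRLF;                                // empty line ends the headers
    }

    public static void main(String[] args) {
        System.out.print(buildGet("www.miriamjennings.co.uk", "/"));
    }
}
```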

Another socket is then established with the URL host as the host and 80 as the port (port 80 is the default for HTTP connections, whereas port 25 was used for SMTP). Again, a BufferedReader and a DataOutputStream are connected to the socket to allow reading from and writing to it; an InputStream and InputStreamReader are created for this socket…
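The wiring of the socket to its reader and writer might look something like the sketch below. The names HttpConnectionSketch, sendAndReadStatusLine and fetchStatusLine are assumptions for illustration, as is the choice of ISO-8859-1 for decoding. The stream handling is kept separate from the Socket so it can be tried without a network connection.

```java
import java.io.BufferedReader;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch of connecting a reader and a writer to an HTTP socket.
class HttpConnectionSketch {
    static final int HTTP_PORT = 80;   // default port for HTTP

    /** Writes the request to out and returns the first line read back from in. */
    static String sendAndReadStatusLine(InputStream in, OutputStream out, String request)
            throws IOException {
        BufferedReader fromServer = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.ISO_8859_1));
        DataOutputStream toServer = new DataOutputStream(out);
        toServer.writeBytes(request);   // writes the low byte of each char
        toServer.flush();
        return fromServer.readLine();   // the status line, without its line terminator
    }

    /** Opens a socket to host on port 80, sends the request and returns the status line. */
    static String fetchStatusLine(String host, String request) throws IOException {
        // try-with-resources closes the socket, which also closes its streams
        try (Socket socket = new Socket(host, HTTP_PORT)) {
            return sendAndReadStatusLine(socket.getInputStream(), socket.getOutputStream(), request);
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 1) {   // needs a host name and network access
            String request = "GET / HTTP/1.1\r\nHost: " + args[0] + "\r\nConnection: close\r\n\r\n";
            System.out.println(fetchStatusLine(args[0], request));
        }
    }
}
```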

To send an HTTP request, a buffer is first created to read the object into. The request message is then sent along the DataOutputStream to the HTTP server and the status line of the response is read in using the BufferedReader. The status code is extracted using a substring and checked; if it is not an expected value, an exception is thrown. If the status code is 301 or 302, the object we are attempting to get can be found at a new location. In this case, the Location line is found in the response headers and the new location extracted. This is then fed back into HttpInteract from the beginning as the URL to be used.
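The status-line handling can be sketched roughly as follows. The names StatusSketch, statusCode, isRedirect and location are hypothetical, and the sketch assumes a well-formed status line such as "HTTP/1.1 200 OK" and headers already collected into one String.

```java
// Hypothetical sketch of extracting the status code and a redirect location.
class StatusSketch {
    private static final String LOCATION = "Location:";

    /** Returns the three-digit code that follows the first space of the status line. */
    static int statusCode(String statusLine) {
        int space = statusLine.indexOf(' ');
        return Integer.parseInt(statusLine.substring(space + 1, space + 4));
    }

    /** True for the two redirect codes handled in the text. */
    static boolean isRedirect(int code) {
        return code == 301 || code == 302;
    }

    /** Returns the value of the Location header, or null if there is none. */
    static String location(String headers) {
        for (String line : headers.split("\r\n")) {
            // header names are case-insensitive, so compare ignoring case
            if (line.regionMatches(true, 0, LOCATION, 0, LOCATION.length())) {
                return line.substring(LOCATION.length()).trim();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        int code = statusCode("HTTP/1.1 301 Moved Permanently");
        if (isRedirect(code)) {
            System.out.println(location("Server: demo\r\nLocation: http://www.example.com/new\r\n"));
        }
    }
}
```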
The lines of the header can then be read in using the BufferedReader, extracting the length from the Content-Length line of the header and assigning it to a variable called “bodyLength”. The lines are read in until an empty line is found (signifying the end of the header). If the bodyLength is found to be greater than the maximum object size constant (102400), the Socket is closed and the user is informed that the message is too long.
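A minimal sketch of this header loop is given below. HeaderSketch, readBodyLength and tooLarge are hypothetical names, and returning -1 when no Content-Length line is present is an assumption rather than something the assignment specified.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

// Hypothetical sketch of reading headers up to the empty line and picking out Content-Length.
class HeaderSketch {
    static final int MAX_OBJECT_SIZE = 102400;            // maximum object size from the text
    private static final String CONTENT_LENGTH = "Content-Length:";

    /** Reads header lines until an empty line; returns the Content-Length value, or -1 if absent. */
    static int readBodyLength(BufferedReader fromServer) throws IOException {
        int bodyLength = -1;
        String line;
        while ((line = fromServer.readLine()) != null && !line.isEmpty()) {
            if (line.regionMatches(true, 0, CONTENT_LENGTH, 0, CONTENT_LENGTH.length())) {
                bodyLength = Integer.parseInt(line.substring(CONTENT_LENGTH.length()).trim());
            }
        }
        return bodyLength;
    }

    /** True if the object is larger than the allowed maximum. */
    static boolean tooLarge(int bodyLength) {
        return bodyLength > MAX_OBJECT_SIZE;
    }

    public static void main(String[] args) throws IOException {
        BufferedReader demo = new BufferedReader(
                new StringReader("Content-Type: text/html\r\nContent-Length: 42\r\n\r\n"));
        int bodyLength = readBodyLength(demo);
        System.out.println(bodyLength + (tooLarge(bodyLength) ? " is too long" : " is acceptable"));
    }
}
```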
Finally, the body (object) is read in using the BufferedReader method read(bufferName, offset, length) and the buffer set up earlier. The data can then be extracted from the buffer 4096 chars at a time and added to the body[] array of characters. “At this point body[] should hold the body of the downloaded object and bytesRead should hold the number of bytes read from the BufferedReader.”
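The read loop might be sketched like this. BodySketch and readBody are hypothetical names, and the caller is assumed to supply a body[] array at least bodyLength long. The counter is called bytesRead to match the assignment's wording, although a BufferedReader strictly counts chars rather than bytes.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

// Hypothetical sketch of copying the body into body[] one buffer at a time.
class BodySketch {
    static final int BUF_SIZE = 4096;   // chunk size mentioned in the text

    /** Reads up to bodyLength chars into body and returns how many were actually read. */
    static int readBody(BufferedReader fromServer, char[] body, int bodyLength) throws IOException {
        char[] buf = new char[BUF_SIZE];
        int bytesRead = 0;
        while (bytesRead < bodyLength) {
            int want = Math.min(BUF_SIZE, bodyLength - bytesRead);
            int n = fromServer.read(buf, 0, want);          // may return fewer chars than requested
            if (n == -1) {
                break;                                      // stream ended before bodyLength was reached
            }
            System.arraycopy(buf, 0, body, bytesRead, n);   // append this chunk to body[]
            bytesRead += n;
        }
        return bytesRead;
    }

    public static void main(String[] args) throws IOException {
        String demo = "hello world";
        char[] body = new char[demo.length()];
        int bytesRead = readBody(new BufferedReader(new StringReader(demo)), body, demo.length());
        System.out.println(bytesRead + " read: " + new String(body, 0, bytesRead));
    }
}
```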
Before returning the gathered information, a message is printed to tell the user that the file has been read, and the Socket, BufferedReader and DataOutputStream are all closed.