What is HTTP in Java


13.9 How a web server works *

This section covers the basics of a web server. We will first discuss the HTTP protocol on which the web is based.


13.9.1 The Hypertext Transfer Protocol (HTTP)

HTTP is a protocol for hypermedia systems and is described in detail in RFC 2616. HTTP has been in widespread use since the advent of the World Wide Web (see http://www.w3.org/ for more information). The web is based on the page description language HTML, which was developed in 1992 by Tim Berners-Lee at CERN. Since the first prototypes, development has advanced not only in HTML but also in the protocol. Berners-Lee did not get rich from his invention, however, because he only wanted to provide a tool for publishing scientific reports, free of patent or copyright claims.

HTTP defines the communication between client and server. Typically the server listens on port 80 (or 8080) for requests from the client. The protocol uses a TCP/IP socket connection and is purely text-based. All HTTP messages have the same general format:

  • a start line: either a request line (a message from the client) or a status line (a response from the server)

  • some headers: information about the client or server, or about the content, for example. The header block always ends with a blank line.

  • a body: the content of the message; either user data from the client or the response from the server

This protocol is very simple and also independent of data types, which makes it ideal for use in distributed hypermedia information systems.
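The three-part message format described above can be sketched in a few lines of Java. This is only an illustration, not a complete HTTP parser: the class name HttpMessageParts is hypothetical, and the sketch assumes that the whole message is already available as a string with CRLF line endings.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper for illustration: splits a raw HTTP message
// into start line, headers and body.
class HttpMessageParts {
  final String startLine;
  final Map<String, String> headers = new LinkedHashMap<>();
  final String body;

  HttpMessageParts(String raw) {
    // The first blank line (CRLF CRLF) separates the header block from the body.
    int split = raw.indexOf("\r\n\r\n");
    String head = split >= 0 ? raw.substring(0, split) : raw;
    body = split >= 0 ? raw.substring(split + 4) : "";

    String[] lines = head.split("\r\n");
    startLine = lines[0];
    for (int i = 1; i < lines.length; i++) {
      int colon = lines[i].indexOf(':');
      if (colon > 0) {
        // Header names are stored in lower case so lookups ignore case.
        String name = lines[i].substring(0, colon).trim().toLowerCase(Locale.ROOT);
        headers.put(name, lines[i].substring(colon + 1).trim());
      }
    }
  }

  public static void main(String[] args) {
    HttpMessageParts msg = new HttpMessageParts(
        "HTTP/1.0 200 OK\r\nContent-Type: text/html\r\nContent-Length: 5\r\n\r\nHello");
    System.out.println(msg.startLine);
    System.out.println(msg.headers);
    System.out.println(msg.body);
  }
}
```

The same class works for requests and responses, because both share the layout of start line, headers, blank line and body.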


13.9.2 Requests to the server

Once the connection has been established, the client sends a request that names the object (document or program) to be accessed. In addition to the object, the request specifies the protocol version used for transmission. HTTP defines several methods, of which three are the most important:

  • GET: A simple request for information. The client can append data to the URL and thus send it to the server.

  • POST: The POST method allows the client to send data to the server via a data stream.

  • HEAD: Works in a similar way to GET, except that the document itself is not sent, only information about the object. The server answers with the status line and the headers (content type, length or modification date, for example) but without a body.

A sample request to a web server

Here you can see a typical example of a GET request from the client to the standard web server:

GET /directory/index.html HTTP/1.1

The first word is the method of the request. In addition to the three methods GET, POST and HEAD listed above, there are others. Usually, however, these are only used for special applications.

Method   Task
GET      Returns a file.
HEAD     Provides file information only.
POST     Sends data to the server.
PUT      Sends files to the server.
DELETE   Deletes a resource.

Table 13.1 The main request methods of HTTP

The second entry in the request line is the file path. It follows the method and can be seen as a path relative to the root of the server.

URL                                   Generated request
www.tutego.com/                       GET / HTTP/1.1
www.tutego.com/../index.html          GET /../index.html HTTP/1.1
www.tutego.com/classes/applet.html    GET /classes/applet.html HTTP/1.1

Table 13.2 GET requests
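How a client gets from a URL to the request line can be sketched with java.net.URI. The class and method names below are hypothetical, and the sketch assumes that the URL is given with a scheme (http://), which the table omits.

```java
import java.net.URI;

// Hypothetical helper for illustration: derives the GET request line for a URL.
class RequestLines {
  static String getRequestLine(String url) {
    URI uri = URI.create(url);
    String path = uri.getRawPath();
    // A URL without a path refers to the root of the server.
    if (path == null || path.isEmpty()) {
      path = "/";
    }
    String query = uri.getRawQuery();
    return "GET " + path + (query != null ? "?" + query : "") + " HTTP/1.1";
  }

  public static void main(String[] args) {
    System.out.println(getRequestLine("http://www.tutego.com/"));
    System.out.println(getRequestLine("http://www.tutego.com/classes/applet.html"));
  }
}
```

The host name does not appear in the request line; in HTTP/1.1 it is sent separately in the Host header.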

The request ends with a blank line (a line that contains only a carriage return (\r) and a linefeed (\n)).

Example

We can easily test this in a Telnet session. (From Windows 7 onwards, this tool has to be activated first. [118] (Described, for example, at http://praxistipps.chip.de/telnet-client-unter-windows-7-aktivieren_3601.)) Let us first start a Telnet session:

$ telnet www.tutego.com 80

Then we request a file by typing a request line such as GET / HTTP/1.0, followed by two returns.

Optional lines can be sent after the line with the request. For example, an HTTP client makes the following request to the server:

GET / HTTP/1.0
Connection: Keep-Alive
User-Agent: SuperBrowser 2000
Host: merlin
Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, */*

Here we see that this exchange, in particular the User-Agent line, makes it easy to generate statistics on browser usage.
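What the Telnet session does by hand can be sketched with a plain java.net.Socket. The class RawHttpClient and its methods are hypothetical names for this illustration; the sketch assumes an HTTP/1.0 server that closes the connection after the response, so the client can simply read until the end of the stream.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration: sends a text request over a socket.
class RawHttpClient {
  // Builds the request text; every line ends with CRLF,
  // and an empty line marks the end of the request.
  static String buildRequest(String host, String path) {
    return "GET " + path + " HTTP/1.0\r\n"
        + "Host: " + host + "\r\n"
        + "User-Agent: SuperBrowser 2000\r\n"
        + "\r\n";
  }

  // Sends the request and returns status line, headers and body as one string.
  static String fetch(String host, int port, String path) throws IOException {
    try (Socket socket = new Socket(host, port)) {
      OutputStream out = socket.getOutputStream();
      out.write(buildRequest(host, path).getBytes(StandardCharsets.US_ASCII));
      out.flush();
      // Assumption: the server closes the connection when it is done.
      byte[] reply = socket.getInputStream().readAllBytes();
      return new String(reply, StandardCharsets.ISO_8859_1);
    }
  }

  public static void main(String[] args) throws IOException {
    if (args.length == 0) {
      // Without a host argument, only show what would be sent.
      System.out.print(buildRequest("localhost", "/"));
    } else {
      System.out.println(fetch(args[0], 80, "/"));
    }
  }
}
```

Started with a host name as argument, the program prints the raw response, including the status line and headers that a browser normally hides.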


13.9.3 The responses from the server

The server also replies with a status line, a header (with information about itself) and the content. The web browser must therefore process the response from the web server. A response generated by the Microsoft web server can look something like this:

HTTP/1.0 200 OK
Server: Microsoft-PWS/2.0
Date: Sat, 10 Feb 2002 19:03:45 GMT
Content-Type: text/html
Accept-Ranges: bytes
Last-Modified: Sat, 09 May 1998 09:52:22 GMT
Content-Length: 26

Here comes the HTML page

The header and the content are again separated by a blank line. The HTTP header is made up of three parts: the general header (this includes Date, for example), the response header (this includes Server) and the entity header (Content-Length and Content-Type). The client can also use a request header.

Each field in the header consists of a name followed by a colon. Then the value follows. The field names are not case-sensitive.

The first line is called the status line and contains the version of the protocol and the status code. The following text is optional and describes the status code in a human-readable form.
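Splitting the status line into its parts takes only a few lines of Java. The following is a minimal sketch with a hypothetical class name StatusLine; it assumes that the three parts are separated by single spaces.

```java
// Hypothetical helper for illustration: parses a status line
// such as "HTTP/1.0 200 OK".
class StatusLine {
  final String version;
  final int code;
  final String reason;

  StatusLine(String line) {
    // At most three parts, so the optional text may itself contain spaces.
    String[] parts = line.split(" ", 3);
    if (parts.length < 2) {
      throw new IllegalArgumentException("Not a status line: " + line);
    }
    version = parts[0];
    code = Integer.parseInt(parts[1]);
    // The human-readable text is optional.
    reason = parts.length == 3 ? parts[2] : "";
  }

  public static void main(String[] args) {
    StatusLine status = new StatusLine("HTTP/1.1 404 Not Found");
    System.out.println(status.version + " | " + status.code + " | " + status.reason);
  }
}
```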

The version

The first version of the protocol (HTTP/0.9) only provided for a simple transmission of data over the Internet. HTTP/1.0 extended this: the data could be sent as MIME-like messages, and metadata (such as the length of the message) became available. However, since HTTP/1.0 has weaknesses in caching and establishes a new connection for each file, that is, it does not support persistent connections, HTTP/1.1 was introduced.

The status code

The status code provides information about the result of the request. It consists of a three-digit number. The additional optional text is for humans only.

The first digit of the status code defines the response class (similar to the reply codes of FTP). The remaining two digits have no categorizing role. There are five classes for the first digit:

  • 1xx: Informational
    The request has been received and processing continues.

  • 2xx: Success
    The action was successfully received, understood and accepted.

  • 3xx: Redirection
    Further action is required to complete the request.

  • 4xx: Client error
    The syntax of the request is incorrect or the request cannot be fulfilled.

  • 5xx: Server error
    The server failed to fulfill an apparently valid request.
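Because only the first digit determines the class, an integer division by 100 is enough to classify a status code. A minimal sketch with hypothetical names:

```java
// Hypothetical helper for illustration: maps a status code to its response class.
class StatusClasses {
  static String statusClass(int code) {
    // Only the first digit matters, so 404 / 100 == 4 selects the client errors.
    switch (code / 100) {
      case 1: return "Informational";
      case 2: return "Success";
      case 3: return "Redirection";
      case 4: return "Client error";
      case 5: return "Server error";
      default: throw new IllegalArgumentException("Not an HTTP status code: " + code);
    }
  }

  public static void main(String[] args) {
    for (int code : new int[] { 200, 304, 404, 503 }) {
      System.out.println(code + ": " + statusClass(code));
    }
  }
}
```

A client can use this to react sensibly to a code it does not know in detail, for example by treating any unknown 4xx code as a client error.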

Let's take a closer look at the individual status codes:

Status code   Optional text
200           OK
201           Created
202           Accepted
204           No Content
300           Multiple Choices
301           Moved Permanently
302           Moved Temporarily
304           Not Modified
400           Bad Request
401           Unauthorized
403           Forbidden
404           Not Found
500           Internal Server Error
501           Not Implemented
502           Bad Gateway
503           Service Unavailable

Table 13.3 Some status codes for responses from the HTTP server

The most common return values are:

  • 200 OK: The request from the client was correct and the response from the server provides the requested information.

  • 404 Not Found: The referenced document cannot be found.

  • 500 Internal Server Error: Mostly caused by bad CGI programs.

The text that a server actually sends with a status code can differ from the text in the table.

General header fields

Some fields can appear in every transmitted message and apply to the message itself, not to the entity. They are valid for both the client and the server. These include: Cache-Control, Connection, Date, Pragma, Transfer-Encoding, Upgrade and Via. The header information thus also includes the time at which the message was sent. The date can be sent in three different formats, the first of which is part of the Internet standard and is therefore preferred. It has the advantage over the second format that it has a fixed length and represents the year with four digits:

Sun, 06 Nov 1994 08:49:37 GMT  ; RFC 822, updated by RFC 1123
Sunday, 06-Nov-94 08:49:37 GMT ; RFC 850, obsoleted by RFC 1036
Sun Nov  6 08:49:37 1994       ; ANSI C asctime() format

An HTTP/1.1 client that reads date values must therefore accept all three formats.
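Accepting all three date formats can be sketched with java.time by trying one formatter after the other. The class HttpDates is a hypothetical name; mapping two-digit years to the range 1970 to 2069 is an assumption of this sketch, not something the protocol prescribes.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.format.DateTimeParseException;
import java.time.temporal.ChronoField;
import java.util.Locale;

// Hypothetical helper for illustration: reads a date in any of the three formats.
class HttpDates {
  // RFC 850 style; two-digit years are mapped to 1970..2069 (assumption).
  private static final DateTimeFormatter RFC_850 = new DateTimeFormatterBuilder()
      .appendPattern("EEEE, dd-MMM-")
      .appendValueReduced(ChronoField.YEAR, 2, 2, 1970)
      .appendPattern(" HH:mm:ss 'GMT'")
      .toFormatter(Locale.ENGLISH);

  // asctime() style; carries no zone and is interpreted as GMT here.
  private static final DateTimeFormatter ASCTIME =
      DateTimeFormatter.ofPattern("EEE MMM d HH:mm:ss uuuu", Locale.ENGLISH);

  static Instant parse(String text) {
    // asctime() pads single-digit days with a second space; collapse such runs.
    String s = text.trim().replaceAll("\\s+", " ");
    try {
      return ZonedDateTime.parse(s, DateTimeFormatter.RFC_1123_DATE_TIME).toInstant();
    } catch (DateTimeParseException e) {
      // Not the preferred format; try the older ones.
    }
    try {
      return LocalDateTime.parse(s, RFC_850).toInstant(ZoneOffset.UTC);
    } catch (DateTimeParseException e) {
      // Fall through to the last format.
    }
    return LocalDateTime.parse(s, ASCTIME).toInstant(ZoneOffset.UTC);
  }

  public static void main(String[] args) {
    System.out.println(parse("Sun, 06 Nov 1994 08:49:37 GMT"));
    System.out.println(parse("Sunday, 06-Nov-94 08:49:37 GMT"));
    System.out.println(parse("Sun Nov  6 08:49:37 1994"));
  }
}
```

If none of the three formats matches, the last attempt throws a DateTimeParseException, which the caller has to handle.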

Fields in the response header

The response header allows the server to transmit additional information that is not coded in the status line. The fields provide information about the server. RFC 2616 defines the following fields: Accept-Ranges, Age, ETag, Location, Proxy-Authenticate, Retry-After, Server, Vary and WWW-Authenticate.

Entity header fields

An entity is information that is sent as a result of a request. The entity consists of meta information (entity header) and the message itself (transmitted in the entity body). The meta information transmitted in the entity header fields is, for example, the length of the body or the time of the last modification. If no entity body is present, the fields provide information about the resource without actually sending it in the entity body. RFC 2616 defines the following entity header fields: Allow, Content-Encoding, Content-Language, Content-Length, Content-Location, Content-MD5, Content-Range, Content-Type, Expires and Last-Modified.

The file content and the content type

HTTP uses the Internet media types (related to the MIME types) in the content type. This content type indicates the data type of the transmitted data stream. Some examples with references:

Type          Subtype             Description
text          plain               ASCII text (RFC 1521)
              html                Hypertext Markup Language (RFC 1866)
multipart     mixed               Multi-part content (RFC 1521)
              form-data           Form data from HTML (RFC 1867)
application   octet-stream        General binary data (RFC 1521)
              postscript          PostScript from Adobe (RFC 1521)
              rtf                 Rich Text Format
              pdf                 PDF from Adobe
              vnd.ms-excel        Microsoft Excel
              vnd.ms-powerpoint   Microsoft PowerPoint

Table 13.4 Content types

Every message with an entity body transmitted via HTTP/1.1 should contain a Content-Type header. If this type is not given, the client tries to find out which type it is based on the URL extension or by looking at the data stream. If this remains unclear, the content is assigned the type "application/octet-stream". Content is often compressed for transmission; it does not lose its media type as a result. In this case, the "Content-Encoding" field is set in the entity header, and with the GNU ZIP packing method (gzip) the following line is then included in the header:

Content-Encoding: gzip

After the headers, the response contains the file. Once it has been transmitted, the socket connection is closed. Since each request-response pair needs its own socket connection, this procedure is not particularly fast and is not easy on the network either, since many packets have to be sent just to establish the connection.
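The fallback rule for a missing content type described above can be sketched as a lookup by file extension. The class ContentTypes and its small extension table are assumptions made for this illustration; real clients and servers use much larger tables and may also inspect the data itself.

```java
import java.util.Locale;
import java.util.Map;

// Hypothetical helper for illustration: guesses a content type from a path.
class ContentTypes {
  // A deliberately small table, limited to types from Table 13.4.
  private static final Map<String, String> BY_EXTENSION = Map.of(
      "txt", "text/plain",
      "html", "text/html",
      "ps", "application/postscript",
      "rtf", "application/rtf",
      "pdf", "application/pdf",
      "xls", "application/vnd.ms-excel",
      "ppt", "application/vnd.ms-powerpoint");

  static String guess(String path) {
    int dot = path.lastIndexOf('.');
    int slash = path.lastIndexOf('/');
    // Only a dot inside the last path segment marks a file extension.
    if (dot > slash && dot < path.length() - 1) {
      String extension = path.substring(dot + 1).toLowerCase(Locale.ROOT);
      return BY_EXTENSION.getOrDefault(extension, "application/octet-stream");
    }
    // Unknown or missing extension: fall back to general binary data.
    return "application/octet-stream";
  }

  public static void main(String[] args) {
    System.out.println(guess("/classes/applet.html"));
    System.out.println(guess("/download/archive.bin"));
  }
}
```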


13.9.4 Web server with com.sun.net.httpserver.HttpServer

Since version 6, Java has offered an API for calling web services and also for defining new web services and publishing them on your own computer. However, in order to offer web services and grant access to remote clients, a web server is always required. For this reason, Sun integrated a simple HTTP(S) server that can also be used on its own. Although the classes are declared more or less privately in the package com.sun.net.httpserver, we still want to venture an example. [119] (The class does not appear in the normal API documentation. In the ZIP with the documentation, however, it can be found under docs\jre\api\net\httpserver\spec.)

The focus is on the class HttpServer/HttpsServer, an abstract superclass for which the factory methods HttpServer.create() and HttpsServer.create() provide a concrete instance. In the next step, the method createContext() connects a path (such as "/webapp1/") with a specific HttpHandler that takes over the requests for the path:

Listing 13.19 com / tutego / insel / httpserver / HttpServerDemo.java

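import java.io.IOException;
import java.net.InetSocketAddress;
import com.sun.net.httpserver.HttpServer;

public class HttpServerDemo {
  public static void main(String[] args) throws IOException {
    // Listen on port 80; a backlog of 0 selects the system default
    HttpServer server = HttpServer.create(new InetSocketAddress(80), 0);
    // Route every request below "/" to the DateHandler
    server.createContext("/", new DateHandler());
    server.start();
  }
}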

Each handler implements the interface HttpHandler with the method handle(), which enables access to the headers, request body and result via the HttpExchange parameter.

A simple handler that answers requests with an HTML page containing the date and the request path can look like this:

Listing 13.20 com / tutego / insel / httpserver / HttpServerDemo.java, DateHandler

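// Imports needed by DateHandler (they belong at the top of the file)
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Date;
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpHandler;

class DateHandler implements HttpHandler {
  @Override public void handle(HttpExchange httpExchange) throws IOException {
    httpExchange.getResponseHeaders().add("Content-Type", "text/html");
    String response = new Date() + " for " + httpExchange.getRequestURI();
    // Send the length in bytes, which can differ from the length in characters
    byte[] bytes = response.getBytes(StandardCharsets.UTF_8);
    httpExchange.sendResponseHeaders(200, bytes.length);

    try (OutputStream os = httpExchange.getResponseBody()) {
      os.write(bytes);
    }
  }
}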

The method getResponseHeaders() returns a Headers object (a Map<String, List<String>>) for setting the response headers. The example sets the content type to "text/html". The method sendResponseHeaders() concludes the header information with a status code (response code) and the content length. getRequestURI() returns the request path as a URI object and enables us to break it down into its elements, such as directory, file, file extension, anchors and parameters.

getResponseBody() provides an OutputStream that we use to write the result document. The method getRequestBody() returns an InputStream for what the client is sending. The example writes the bytes of the string to the OutputStream and then closes it.

After starting the server, we can enter URLs such as http://localhost/ or http://localhost/webapp/bla in the web browser, and we get the date and the path.
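Instead of a browser, a small Java client can also send the request. The following sketch is self-contained and makes some assumptions that differ from the listings above: it uses its own hypothetical class LocalRoundTrip, a simple handler written as a lambda, and port 0 so that the operating system picks a free port and no special rights for port 80 are needed.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URI;
import java.nio.charset.StandardCharsets;

// Hypothetical demo for illustration: starts a server, sends one request, stops it.
class LocalRoundTrip {
  static String roundTrip(String path) {
    try {
      // Port 0 lets the operating system choose a free port.
      HttpServer server = HttpServer.create(new InetSocketAddress("localhost", 0), 0);
      server.createContext("/", exchange -> {
        byte[] body = ("path: " + exchange.getRequestURI())
            .getBytes(StandardCharsets.UTF_8);
        exchange.getResponseHeaders().add("Content-Type", "text/plain");
        exchange.sendResponseHeaders(200, body.length);
        try (OutputStream os = exchange.getResponseBody()) {
          os.write(body);
        }
      });
      server.start();
      try {
        int port = server.getAddress().getPort();
        HttpURLConnection connection = (HttpURLConnection)
            URI.create("http://localhost:" + port + path).toURL().openConnection();
        int status = connection.getResponseCode();
        try (InputStream in = connection.getInputStream()) {
          return status + " " + new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
      } finally {
        server.stop(0);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(roundTrip("/webapp/bla"));
  }
}
```

The method returns the status code followed by the body, so the client side shows the same two pieces of information that the handler sets with sendResponseHeaders() and getResponseBody().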
