2016-04-23

Woomera

Introduction

Woomera is a Dart package for implementing Web servers.

It is used to create server-side Dart programs that function as a Web
server. A Web server listens for HTTP requests and responds to them
with HTTP responses: a simple task, but one that can get complicated
(and difficult to maintain) when the program has many different pages
to display, errors to handle and state to maintain. This package aims
to reduce that complexity.

Main features include:

URL pattern matching inspired by the
Sinatra Web framework;

Pipelines of patterns to allow sophisticated processing, if needed;

Exception handling to ensure error pages are reliably generated;

Session management using cookies or URL rewriting;

Responses can be generated into a buffer;

Responses can be read from a stream of data.

The following tutorial provides an overview of the main
features of the package. For details about the package and its
advanced features, please see the API documentation.

Tutorial

1. A basic Web server

1.1. Overview

This is a basic Web server that serves up one page. It creates a
server with one response handler.

The most important feature of the package is to organise response
handlers, so that HTTP requests can be matched to Dart code to process
them and to generate a HTTP response.

A Server has a sequence of pipelines, and each pipeline has a
sequence of rules. Each rule consists of the HTTP method (e.g. GET or
POST), a path pattern, and a request handler method.

When a HTTP request arrives, the pipelines are searched (in order) for a
rule that matches the request. A match is when the HTTP method is the
same and the pattern matches the request URL's path. If found, the
corresponding handler is invoked to produce the HTTP response. If no
rule is found (after searching through all the rules in all the
pipelines), the resource is treated as not found.
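The search order described above can be sketched in plain Dart. This is only an illustration, not the package's code: Rule, Pipeline and findHandler are hypothetical stand-ins, handlers are reduced to functions returning a String, and a pattern is assumed to match only when it equals the path exactly.

```dart
// Hypothetical stand-ins for illustration; not the package's classes.
typedef Handler = String Function(String path);

class Rule {
  final String method;
  final String pattern;
  final Handler handler;
  Rule(this.method, this.pattern, this.handler);

  // Simplification: a rule matches when method and path are both identical.
  bool matches(String reqMethod, String reqPath) =>
      method == reqMethod && pattern == reqPath;
}

class Pipeline {
  final List<Rule> rules = [];
}

/// Searches the pipelines in order and returns the first matching handler,
/// or null when no rule in any pipeline matches (resource not found).
Handler? findHandler(List<Pipeline> pipelines, String method, String path) {
  for (final pipeline in pipelines) {
    for (final rule in pipeline.rules) {
      if (rule.matches(method, path)) return rule.handler;
    }
  }
  return null;
}

void main() {
  final first = Pipeline()..rules.add(Rule('GET', '/', (p) => 'home page'));
  final second = Pipeline()
    ..rules.add(Rule('GET', '/', (p) => 'never reached'))
    ..rules.add(Rule('POST', '/login', (p) => 'login'));
  final pipelines = [first, second];

  // The rule in the first pipeline wins over the one in the second.
  print(findHandler(pipelines, 'GET', '/')?.call('/')); // home page
  // No rule matches, so the resource is treated as not found.
  print(findHandler(pipelines, 'GET', '/nosuchpage')); // null
}
```

Because the search stops at the first match, the order in which pipelines and rules are registered matters.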

1.2. Importing the package

Any program that uses the framework must first import the package:

1.3. The server

For the Web server, a Server object is created and configured for
the TCP/IP address and port it will listen for HTTP requests on.

Typically, when deployed in production, the bind address will be
InternetAddress.LOOPBACK_IP_V6 (the default) and the service is
running behind a reverse Web proxy (e.g. Apache or Nginx). The service
will only be accessed from the same host it is running on.

For testing, it can instead be set to InternetAddress.ANY_IP_V6,
so the service can be accessed from any external machine.

A port number of 1024 or greater should be used, because the lower port
numbers require special permission to use.

1.4. The pipeline

The Server (by default) automatically creates one pipeline, since
that is the most common scenario. The pipelines member is a List
of ServerPipeline objects, so retrieve it from the server using
something like:

1.5. The rules

Rules are registered with the pipeline. The get method on the
ServerPipeline object will register a rule for the HTTP GET method,
and the post method will register a rule for the HTTP POST
method. The first parameter is the pattern. The second parameter is
the handler method: the method that gets invoked when the rule matches
the HTTP request.

The tilde ("~") indicates this is relative to the base path of the
server. The default base path is "/". See the API documentation for
information about changing the base path. For now, all paths should
begin with "~/".

1.6. Running the server

After configuring the Server, start it using its run method. The
run method returns a Future that completes when the Web server
finishes running; but normally a Web server runs forever without
stopping.

1.7. Request handlers

A request handler method is used to process the HTTP request to
produce a HTTP response. It is passed the HTTP request as a Request
object; and it returns a HTTP response as represented by a Response
object.

There are different types of Response objects. The commonly used one
for generating HTML pages is the ResponseBuffered. It acts as a
buffer to which the contents are appended using the write
method. After the response is returned from the request handler, the
framework uses it to generate the HTTP response that is sent back to
the client.

This first example request handler returns a simple HTML page.

The "name" query parameter is retrieved from the request. If it is the
empty string, a default constant value is used instead. The square
bracket operator returns the empty string if the parameter does not
exist.

The name is used in the HTML heading. The HEsc.text method is used
to escape any special characters, to prevent accidental or malicious
HTML injection.

When a Web browser sends a request to the site's URL, the HTML page is
returned. In this document, the example URLs will show the hostname of
the server as "localhost"; if necessary, change it to the hostname or
IP address of the machine running your server.

Run the server and try visiting:

http://localhost:1024/

http://localhost:1024/?name=friend

http://localhost:1024/?name=me,+%3Cbr%3Emyself+%26+I

The last example demonstrates the importance of using HEsc.text
to escape values.

Also visit something like http://localhost:1024/nosuchpage and the
basic built-in error page appears. To customize the error page, a
custom exception handler is used.

1.8. Exception handler

An exception handler processes any exceptions that are raised: either
by one of the request handlers or by the framework.

It is similar to a request handler, because it is a method that
returns a Response object. But it is different, because it is also
passed the exception and sometimes a stack trace.

When setting up the server, set its exception handler in main
(anywhere before the server is run):

And define the exception handler method as:

This exception handler customizes the error page when the
NotFoundException is encountered: it is raised when none of the
rules matched the request. Notice that it reports a different status
code if no rules for the method could be found (405
method not allowed), versus when some rules for the method exist but
their pattern did not match the requested path (404 not found).
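The choice between the two status codes can be sketched as follows. This is a plain Dart illustration of the decision only: statusForNoMatch is a hypothetical helper, not part of the package, and the numeric codes are written out rather than taken from HttpStatus.

```dart
/// Picks the status for a request that matched no rule.
/// [ruleMethods] lists the HTTP method of every registered rule.
int statusForNoMatch(List<String> ruleMethods, String requestMethod) {
  const notFound = 404;
  const methodNotAllowed = 405;
  // Some rules exist for this method, so only the path failed to match.
  // Otherwise there are no rules at all for the method.
  return ruleMethods.contains(requestMethod) ? notFound : methodNotAllowed;
}

void main() {
  final ruleMethods = ['GET', 'GET', 'POST'];
  print(statusForNoMatch(ruleMethods, 'GET')); // 404
  print(statusForNoMatch(ruleMethods, 'DELETE')); // 405
}
```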

Other exceptions can be detected and handled differently. But in this
example, they all produce the same error page.

Run this server and visit http://localhost:1024/nosuchpage to see
the custom error page.

2. HTML escaping methods

The HEsc class defines three static methods which are useful for
converting objects into Strings that are then escaped for embedding
into HTML.

attr for escaping values to be inserted into attributes.

text for escaping values to be inserted into element content.

lines which is the same as text, but adds line break elements
(i.e. <br/>) where newlines exist in the original value.

These methods will be used to escape values which might contain
characters with special meaning in HTML.
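To show the kind of transformation involved, here is a minimal sketch in plain Dart. The functions escapeText, escapeAttr and escapeLines are hypothetical stand-ins and are not the HEsc implementation; in particular, the exact set of characters HEsc escapes is an assumption here.

```dart
/// Escapes characters that are special in element content.
String escapeText(Object? value) => value
    .toString()
    .replaceAll('&', '&amp;') // must come first, before adding new '&'s
    .replaceAll('<', '&lt;')
    .replaceAll('>', '&gt;');

/// Also escapes quotes, so the result is safe inside a quoted attribute.
String escapeAttr(Object? value) =>
    escapeText(value).replaceAll('"', '&quot;').replaceAll("'", '&#39;');

/// Like escapeText, but turns newlines into line break elements.
String escapeLines(Object? value) =>
    escapeText(value).replaceAll('\n', '<br/>');

void main() {
  print(escapeText('me, <br>myself & I')); // me, &lt;br&gt;myself &amp; I
  print(escapeAttr('say "hi"')); // say &quot;hi&quot;
  print(escapeLines('line 1\nline <2>')); // line 1<br/>line &lt;2&gt;
}
```

Note that escapeLines escapes first and inserts the break elements afterwards, so the inserted tags are not themselves escaped.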

3. Parameters

The request handler methods can receive three different types of
parameters:

path parameters;

query parameters; and

post parameters.

3.1. Path parameters

The path parameters are extracted from the path of the URL being
requested.

The path parameters are defined by the rule's pattern, which is made
up of components separated by a slash ("/"). Path parameters are
represented by a component starting with a colon (":") followed by the
name of the parameter.

The path parameters are made available to the handler via the
pathParams member of the Request object.

This is an example of a rule with a fixed path, where each component
must match the requested URL exactly and there are no path
parameters.

This is an example with a single parameter:

This is an example with two parameters:

The wildcard is a special path parameter that will match zero or more
segments in the URL path.
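To illustrate how such patterns could be matched against a request path, here is a minimal sketch in plain Dart. It is not the package's implementation: matchPattern is a hypothetical helper, the parameter names (name, orderNumber) are made up for illustration, and it assumes the default base path, a wildcard only in the last position, a wildcard value stored under the key "*", and no repeated parameter names.

```dart
/// Returns the path parameters if [path] matches [pattern], otherwise null.
Map<String, String>? matchPattern(String pattern, String path) {
  // "~/" is relative to the base path, assumed here to be the default "/".
  final want = _segments(pattern.substring(1));
  final got = _segments(path);
  final params = <String, String>{};

  for (var i = 0; i < want.length; i++) {
    final component = want[i];
    if (component == '*') {
      // Simplification: the wildcard is only supported as the last
      // component, where it captures the zero or more remaining segments.
      if (i != want.length - 1) return null;
      params['*'] = got.skip(i).join('/');
      return params;
    }
    if (i >= got.length) return null; // the path is too short
    if (component.startsWith(':')) {
      params[component.substring(1)] = got[i]; // a named path parameter
    } else if (component != got[i]) {
      return null; // a fixed component must match exactly
    }
  }
  // Without a wildcard, the path must not have extra segments.
  return want.length == got.length ? params : null;
}

List<String> _segments(String path) =>
    path.split('/').where((s) => s.isNotEmpty).toList();

void main() {
  print(matchPattern('~/foo/bar/baz', '/foo/bar/baz')); // {}
  print(matchPattern('~/user/:name', '/user/jsmith')); // {name: jsmith}
  print(matchPattern('~/user/:name/:orderNumber', '/user/jsmith/123'));
  print(matchPattern('~/product/*', '/product/abc/def/ghi'));
  print(matchPattern('~/user/:name', '/nosuchpage')); // null
}
```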

Here is an example request handler that shows the parameters in
the request.

Here are a few URLs to try:

http://localhost:1024/foo/bar/baz

http://localhost:1024/user/jsmith

http://localhost:1024/user/jsmith/123

http://localhost:1024/product/widget

http://localhost:1024/product/abc/def/ghi

3.2. Query parameters

The query parameters are the name-value pairs after the question
mark ("?") in the URL.

The query parameters are made available to the handler via the
queryParams member of the Request object. They are not (and
cannot be) specified in the rule.

Here are a few URLs to try:

http://localhost:1024/foo/bar/baz?a=b

http://localhost:1024/foo/bar/baz?greeting=Hello&name=World

http://localhost:1024/foo/bar/baz?item=a&item=b&item=c&code=123
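To see what name-value pairs the last URL carries, it can be parsed with the Uri class from the Dart core library. This sketch deliberately uses Uri rather than the package's RequestParams, so it only shows the raw structure of the query string, including the repeated "item" parameter.

```dart
void main() {
  final uri = Uri.parse(
      'http://localhost:1024/foo/bar/baz?item=a&item=b&item=c&code=123');

  // Every value of every parameter, keyed by parameter name.
  print(uri.queryParametersAll); // {item: [a, b, c], code: [123]}

  // A repeated name yields a list with more than one entry.
  print(uri.queryParametersAll['item']?.length); // 3
}
```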

3.3. Post parameters

The post parameters are extracted from the contents of a HTTP POST
request. Obviously, they are only available when processing a POST
request.

The post parameters are made available to the handler via the
postParams member of the Request object, which is null unless it
is a POST request. They are not (and cannot be) specified in the
rule.

For example, try this form:

processed by the above handler prints out:

3.4. Common aspects

The three parameter members are instances of the RequestParams
class.

It is important to remember that parameters can be repeated. For
example, checkboxes on a form will result in one instance of the named
parameter for every checkbox that is checked. This can apply to path
parameters, query parameters and post parameters.

3.4.1. Retrieving parameters

The RequestParams class can be thought of as a Map, where each key is
the name of a parameter and maps to a List of values. If
there is only one value, there is still a list: a list containing only
one value.

The names of all the available parameters can be obtained using the
keys method.

All the values for a given key can be obtained using the values method.

If your request handler is expecting only one value, the
square-bracket operator can be used to retrieve a single value instead
of a list.

3.4.2. Raw vs processed values

The methods described above for retrieving values return a cleaned-up,
processed version of each value. The processing:

removes all leading whitespace;

removes all trailing whitespace;

collapses multiple whitespace characters in a row into a single one; and

converts all whitespace characters into the space character.
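The combined effect of these four steps can be sketched in one line of plain Dart. The function processValue is a hypothetical name used for illustration; it approximates the processing described above and is not the package's code.

```dart
/// Trims the ends, then replaces each run of whitespace with one space.
String processValue(String raw) =>
    raw.trim().replaceAll(RegExp(r'\s+'), ' ');

void main() {
  print('[${processValue('  hello \t\n world  ')}]'); // [hello world]
  print('[${processValue('tab\tseparated')}]'); // [tab separated]
  print('[${processValue('   ')}]'); // []
}
```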

To obtain the unprocessed value, set raw to true with the values method:

3.4.3. Expecting the unexpected

To make a robust application, do not make any assumptions about what
parameters may or may not be present: check everything and fail
gracefully. The parameters might be different from what is expected
because of programming errors, misuse or (worst case, but very
important to deal with) the application is under malicious attack.

If a parameter is missing, the square bracket operator returns an
empty string, and the values method returns an empty list when it is
returning processed values. In raw mode, the values method returns
null if the value does not exist, which is the only way to detect the
difference between the presence of a blank/empty parameter and the
absence of the parameter.

An application might be designed to expect exactly one instance of a
parameter, but a malicious client might try to send two or more values
to break it. The square bracket operator, which is used when only one
value is expected, will return the empty string if multiple copies
of the parameter exist.

Both the names and values are always strings.
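The lookup behaviour described in this section can be modelled with a small class. This is a sketch under stated assumptions, not the RequestParams class itself: Params and its add method are hypothetical, and only the behaviours stated above (empty string, empty list, null in raw mode) are reproduced.

```dart
// Hypothetical model of the lookup rules; not the package's RequestParams.
class Params {
  final Map<String, List<String>> _data = {};

  void add(String name, String rawValue) =>
      _data.putIfAbsent(name, () => []).add(rawValue);

  Iterable<String> get keys => _data.keys;

  /// Processed mode: an empty list when the name is absent.
  /// Raw mode: null when the name is absent, values left untouched.
  List<String>? values(String name, {bool raw = false}) {
    final found = _data[name];
    if (raw) return found;
    return (found ?? const <String>[]).map(_clean).toList();
  }

  /// The single value, or '' when there are zero or several values.
  String operator [](String name) {
    final found = values(name)!;
    return found.length == 1 ? found.first : '';
  }

  static String _clean(String s) =>
      s.trim().replaceAll(RegExp(r'\s+'), ' ');
}

void main() {
  final params = Params()
    ..add('name', '  J.  Smith ')
    ..add('item', 'a')
    ..add('item', 'b')
    ..add('blank', '');

  print(params['name']); // J. Smith
  print(params['item']); // empty string: more than one value
  print(params.values('item')); // [a, b]
  print(params.values('missing')); // []
  // Only raw mode can tell a blank parameter from an absent one.
  print(params.values('blank', raw: true)); // []
  print(params.values('missing', raw: true)); // null
}
```

In the last two lines, the blank parameter produces a list holding one empty string (printed as []), while the absent one produces null.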

4. Exceptions

Exception handlers are a type of handler used to process exceptions
that are raised. They are passed the request and the exception, and
are expected to generate a Response. The exception handler should
create a response that serves as an error page for the client.

Exception handlers can be attached to the pipelines and the server.

A hierarchy determines which exception handler is invoked. If an
exception occurs inside a request handler method (and has not been
caught and processed within the handler) it is passed to the exception
handler attached to the pipeline: the pipeline with the rule that
invoked the request handler method. If no exception handler was
attached to the pipeline, the exception handler attached to the server
is used. If no exception handler was attached to the server, a default
exception handler is used.
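The fallback order can be sketched in plain Dart. This is an illustration only: handleRequest, ExceptionHandler and defaultHandler are hypothetical names, responses are reduced to Strings, and handlers are synchronous, unlike the package's handlers.

```dart
// Hypothetical types for illustration; not the package's API.
typedef ExceptionHandler = String Function(Object exception);

String defaultHandler(Object exception) => 'default error page';

/// Runs the request handler; on failure, uses the pipeline's exception
/// handler if set, else the server's, else the default one.
String handleRequest(
  String Function() requestHandler, {
  ExceptionHandler? pipelineHandler,
  ExceptionHandler? serverHandler,
}) {
  try {
    return requestHandler();
  } catch (e) {
    final handler = pipelineHandler ?? serverHandler ?? defaultHandler;
    return handler(e);
  }
}

void main() {
  String failing() => throw StateError('something broke');

  print(handleRequest(failing,
      pipelineHandler: (e) => 'pipeline error page',
      serverHandler: (e) => 'server error page')); // pipeline error page
  print(handleRequest(failing,
      serverHandler: (e) => 'server error page')); // server error page
  print(handleRequest(failing)); // default error page
}
```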

The hierarchy is also used if an exception handler itself throws an
exception. (Though, hopefully, exception handlers will not throw an
exception). In that situation, an ExceptionHandlerException is
thrown.

4.1. Standard exceptions

The framework throws exceptions that are also processed by the same
exception handling hierarchy.

The NotFoundException is thrown when a matching rule is not found.
The exception handler should produce a "page not found" error page
with a HTTP response status of either HttpStatus.NOT_FOUND or
HttpStatus.METHOD_NOT_ALLOWED.

Other exceptions defined in the package are subclasses of
WoomeraException.

5. Responses

The request handlers and exception handlers must return a Future
that completes with a Response object. The Response class is an abstract
class and three subclasses of it have been defined in the package:

ResponseBuffered

ResponseStream

ResponseRedirect

5.1. ResponseBuffered

This is used to write the contents of the response into a buffer,
which is used to create the HTTP response after the request handler
returns.

The HTTP response is only created after the request handler finishes.
If an error occurs while generating the response, the partially
created ResponseBuffered object can be discarded and a new response
created. The new response can be created in the response handler or in
an exception handler. The new response can show an error page, instead
of trying to output an error message at the end of a partially
generated page.
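The benefit of buffering can be illustrated with the StringBuffer class from the Dart core library. This is a sketch of the idea only, assuming a hypothetical renderPage helper; it does not use ResponseBuffered.

```dart
/// Builds a page into a buffer. If building fails part-way, the partial
/// content is discarded and a complete error page is returned instead.
String renderPage(void Function(StringBuffer out) build) {
  final buffer = StringBuffer();
  try {
    build(buffer);
    return buffer.toString();
  } catch (e) {
    // Nothing has been sent to the client yet, so the half-written page
    // in the buffer can simply be dropped.
    return '<html><body><h1>Error</h1></body></html>';
  }
}

void main() {
  print(renderPage((out) => out.write('<html><body>Hello</body></html>')));

  print(renderPage((out) {
    out.write('<html><body><p>Partial');
    throw StateError('database unavailable');
  }));
}
```

The second call prints only the error page; the partial content never reaches the output.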

5.2. ResponseRedirect

This is used to generate a HTTP redirect, which tells the client to go
to a different URL.

5.3. ResponseStream

This is used to produce the contents of the response from a stream.

5.4. Common features

With all three types of responses, the application can:

Set the HTTP status code;

Create HTTP headers; and/or

Create or delete cookies.

5.5. Static file response

The package includes a request handler for serving up files and
directories from the local disk. It can be used to serve static files
for all or some of the Web server (for example, the images and
stylesheets).

See the API documentation for the StaticFiles class.

6. Sessions

The framework provides a mechanism to manage sessions. HTTP is a
stateless protocol, but sessions have been added to support the
tracking of state.

A session can be created and attached to a HTTP request. That session
will be attached to subsequent Request objects. The framework
handles the preserving and restoration of the session using either
session cookies or URL rewriting. The application can terminate a
session, or it will terminate automatically once a nominated
timeout period has passed since it was last used.
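The idle timeout can be sketched as follows. This is a hypothetical model, not the package's session API: Session and SessionStore are illustrative names, the current time is passed in explicitly to keep the example deterministic, and real session identifiers would need to be random and unguessable.

```dart
// Hypothetical model of an idle timeout; not the package's session classes.
class Session {
  final String id;
  DateTime lastUsed;
  final Map<String, Object> data = {};
  Session(this.id, this.lastUsed);
}

class SessionStore {
  final Duration timeout;
  final Map<String, Session> _sessions = {};
  SessionStore(this.timeout);

  Session create(String id, DateTime now) => _sessions[id] = Session(id, now);

  /// Returns the session for [id] and refreshes its last-used time, or null
  /// if it does not exist or has been idle for longer than the timeout.
  Session? find(String id, DateTime now) {
    final session = _sessions[id];
    if (session == null) return null;
    if (now.difference(session.lastUsed) > timeout) {
      _sessions.remove(id);
      return null;
    }
    session.lastUsed = now;
    return session;
  }

  /// Explicit termination by the application.
  void terminate(String id) => _sessions.remove(id);
}

void main() {
  final store = SessionStore(const Duration(minutes: 30));
  final start = DateTime(2016, 4, 23, 12, 0);
  store.create('abc123', start).data['user'] = 'jsmith';

  // Used again after 10 minutes: still valid, and the timer restarts.
  print(store.find('abc123', start.add(const Duration(minutes: 10)))?.data);
  // Next use is 40 minutes after that: the session has expired.
  print(store.find('abc123', start.add(const Duration(minutes: 50))));
}
```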

7. References

Dart tutorial on Writing HTTP clients and servers
https://www.dartlang.org/docs/tutorials/httpserver/ (the package
Woomera is built on top of).

Open Web Application Security Project
https://www.owasp.org/index.php/Guide_Table_of_Contents
