Nginx.com

Using NGINX Amplify Custom Dashboards and Filters for Better NGINX Monitoring

2016-12-08

This blog post is the third in a series about how to monitor NGINX and make the most out of your web delivery infrastructure with NGINX Amplify:

Setting Up NGINX Amplify in 10 Minutes

Improving Server Configuration with NGINX Amplify Reports

Using NGINX Amplify Custom Dashboards and Filters for Better NGINX Monitoring (this post)

In a recent blog post, we introduced and briefly described custom dashboards. This blog post describes how you can use custom dashboards and filters to focus on the metrics that most accurately indicate how well NGINX and your applications are functioning. By alerting you to potential outages and performance bottlenecks, the improved monitoring data helps you improve uptime and NGINX performance.

Architectural Components of NGINX Amplify

The dashboards and filters in NGINX Amplify interact with these other NGINX Amplify components:

NGINX Amplify Agent – A lightweight software client that you install alongside your NGINX and NGINX Plus instances. The agent, which is open source, Python‑based software, collects metadata and metrics from the operating system, and all metrics available from NGINX.

NGINX Amplify Receiver – Located in the cloud and managed by NGINX, Inc.

NGINX Amplify Web UI – Located in the cloud and managed by NGINX, Inc.

For a more detailed description of NGINX Amplify’s architecture, see Improving Server Configuration with NGINX Amplify Reports.

Metric and Metadata Collection

The NGINX Amplify Agent collects metrics from several sources.

System‑level tools provide data about CPUs, disk, and networking via the /proc filesystem and sysfs. The NGINX Amplify Agent collects the data using the psutil library.

System metrics are prefixed with the word system in the NGINX Amplify metric list. Examples of system metric names include system.cpu.user, system.disk.free, and system.net.bytes_sent.

The stub_status interface (Stub Status module) provides HTTP access to a set of general NGINX metrics, including the number of active connections and requests. You can see sample output here. To make this page available in your NGINX environment, include the stub_status directive in the configuration.
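As a minimal sketch, a stub_status location restricted to local access could look like the following. The listen address, server_name, and the /nginx_status path are illustrative assumptions, not values from this post.

```nginx
server {
    listen 127.0.0.1:80;
    server_name 127.0.0.1;

    location /nginx_status {
        stub_status;        # write "stub_status on;" for NGINX versions before 1.7.5
        allow 127.0.0.1;    # let the locally running agent query the page
        deny all;           # refuse all other clients
    }
}
```

Restricting the location to 127.0.0.1 keeps the metrics page private while still letting an agent on the same host read it.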

The NGINX Plus live activity monitoring API (Status module) provides additional metrics for NGINX Plus users. You can see an example of the API output in JSON format here.

Metrics that are available only in NGINX Plus are prefixed with the word plus, for example: plus.cache.hit.bytes, plus.http.status.2xx, and plus.upstream.fails.count.

Note: When the NGINX Amplify Agent parses NGINX configuration files, it looks for the location blocks where the stub_status and status directives are included, and verifies that it can access the corresponding URLs via HTTP. For example, with the following configuration, the agent requests extended NGINX Plus status information from http://status.example.com:8070/status_info:
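A configuration consistent with that URL might look like the sketch below. The port, server_name, and location are taken from the URL above; the allow and deny rules are assumptions added to illustrate local-only access.

```nginx
server {
    listen 8070;
    server_name status.example.com;

    location /status_info {
        status;             # NGINX Plus Status module (extended status in JSON)
        allow 127.0.0.1;    # assumed: only the local agent may query it
        deny all;
    }
}
```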

Make sure that the server_name and location directives are properly configured and grant localhost access.

Log Collection

Log files allow you to collect the most precise information on every request that is processed by NGINX. Custom log formats can include additional variables related to both requests and responses.

The NGINX Amplify Agent parses the configuration files to find the access_log and log_format directives. It understands the log format for multiple NGINX configuration files and properly assigns logged values to the variable names.

The following sample log_format directive includes variables related to request and response times, in addition to the variables from the default combined format:
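One possible shape for such a directive is sketched here. The format name main_ext, the key prefixes (rt=, uct=, uht=, urt=), and the log file path are illustrative assumptions; the variables themselves are standard NGINX variables.

```nginx
# Place in the http context.
# The first three lines reproduce the predefined "combined" format.
log_format main_ext '$remote_addr - $remote_user [$time_local] "$request" '
                    '$status $body_bytes_sent "$http_referer" '
                    '"$http_user_agent" '
                    'rt=$request_time '                 # total request processing time
                    'uct="$upstream_connect_time" '     # time to connect to the upstream
                    'uht="$upstream_header_time" '      # time to receive the upstream response header
                    'urt="$upstream_response_time"';    # time to receive the full upstream response

access_log /var/log/nginx/access.log main_ext;
```

The upstream timing variables are quoted because they can contain several comma-separated values when a request is passed to more than one upstream server.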

The NGINX Amplify Agent opens the log files and collects aggregated metrics in real time, while NGINX is adding lines to the files.

Syslog Collection

What if you cannot store logs in local files? Reasons for this may include:

High number of HTTP requests

Slow disk arrays

Remote file systems

NGINX can send logs to syslog servers. The NGINX Amplify Agent can accept syslog messages directly from NGINX, effectively acting as a syslog server.

Follow these instructions to configure the syslog functionality of the NGINX Amplify Agent:

Make sure you are using NGINX version 1.9.5 or later (or NGINX Plus Release 8 or later). The latest version or release is always preferred (for NGINX, the latest mainline version).

Edit the NGINX Amplify Agent configuration file, /etc/amplify-agent/agent.conf, and specify the address and port that you want to use for the syslog listener:

Restart the NGINX Amplify Agent to have it reparse the configuration and start listening on the specified IP address and port.

Edit the NGINX configuration file and specify the syslog listener address as the first parameter to the access_log directive. Include the amplify tag, and your preferred log format:
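For illustration, assuming the agent's syslog listener was configured on 127.0.0.1:12000 and that a log format named main_ext is defined elsewhere in the http context (both are hypothetical values), the directive could look like this:

```nginx
# Send access log entries to the agent's syslog listener, tagged "amplify".
access_log syslog:server=127.0.0.1:12000,tag=amplify,severity=info main_ext;
```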

Note: To send the logs to both the existing logging facility and the NGINX Amplify Agent, include a separate access_log directive for each destination.
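A sketch of logging to both destinations follows. The server_name, file path, listener address, and the main_ext format name are hypothetical, and main_ext is assumed to be defined in the http context.

```nginx
server {
    listen 80;
    server_name example.com;    # hypothetical virtual server

    # Existing file-based logging stays in place.
    access_log /var/log/nginx/access.log combined;

    # A second directive at the same level also sends entries to the agent.
    access_log syslog:server=127.0.0.1:12000,tag=amplify,severity=info main_ext;
}
```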

Reload NGINX so that the new access_log setting takes effect, for example with nginx -s reload.

Custom Dashboards

The NGINX Amplify Web UI enables you to create custom dashboards where you can display only data you have identified as critical, or the data related to a specific server, application, or deployment. You can create a dashboard using the Dashboards menu item in the top navigation bar.

A dashboard can display graphs and current values of any of the metrics collected by the NGINX Amplify Agent.

Dashboards can be fully customized. Click the Edit button in the top right corner to add, remove, or modify graphs and values. You can drag and drop items in the dashboard:

When you add a graph, you need to specify a name and a metric that you want to display. You can add multiple metrics to the same graph.

Each metric that you specify can be displayed for the aggregate of all NGINX instances or a subset of NGINX instances. If you use NGINX Plus with live activity monitoring, for metrics related to server zones or upstream zones you can drill down to the specific zone, as shown in this screenshot:

To finish adding or modifying the graph, click the Save button in the dialog box and then the Done button in the top right corner of the dashboard.

Filters

Filters allow you to apply additional matching conditions to measurements from metrics. When a filter is applied, the metric is collected and aggregated into the custom dashboard graph or value only when the filter conditions are met. Filters are an additional custom dimension to a metric.

Filters can only be added to metrics collected from NGINX logs. You cannot assign a filter to a metric that was aggregated by NGINX itself, such as ones from the NGINX stub_status interface or the NGINX Plus live activity monitoring feature.

NGINX has a pre‑defined log format called combined. As of today, when you use this log format, the following metrics support filters:

nginx.http.method.delete

nginx.http.method.get

nginx.http.method.head

nginx.http.method.options

nginx.http.method.post

nginx.http.method.put

nginx.http.request.body_bytes_sent

nginx.http.request.malformed

nginx.http.status.1xx

nginx.http.status.2xx

nginx.http.status.3xx

nginx.http.status.4xx

nginx.http.status.5xx

nginx.http.status.discarded

nginx.http.v0_9

nginx.http.v1_0

nginx.http.v1_1

nginx.http.v2

You can enable the following additional metrics by using a custom log_format:

nginx.cache.bypass

nginx.cache.expired

nginx.cache.hit

nginx.cache.miss

nginx.cache.revalidated

nginx.cache.stale

nginx.cache.updating

nginx.http.gzip.ratio

nginx.http.request.bytes_sent

nginx.http.request.length

nginx.http.request.time

nginx.http.request.time.count

nginx.http.request.time.max

nginx.http.request.time.median

nginx.http.request.time.pctl95

nginx.upstream.connect.time

nginx.upstream.connect.time.count

nginx.upstream.connect.time.max

nginx.upstream.connect.time.median

nginx.upstream.connect.time.pctl95

nginx.upstream.header.time

nginx.upstream.header.time.count

nginx.upstream.header.time.max

nginx.upstream.header.time.median

nginx.upstream.header.time.pctl95

nginx.upstream.next.count

nginx.upstream.request.count

nginx.upstream.response.length

nginx.upstream.response.time

nginx.upstream.response.time.count

nginx.upstream.response.time.max

nginx.upstream.response.time.median

nginx.upstream.response.time.pctl95

nginx.upstream.status.1xx

nginx.upstream.status.2xx

nginx.upstream.status.3xx

nginx.upstream.status.4xx

nginx.upstream.status.5xx

nginx.upstream.status.discarded

You can access a full description of the available NGINX and NGINX Plus metrics.

Configuring a Filter

Here, we provide the high‑level steps for configuring a filter. Because filters are so powerful in NGINX Amplify, you will want to use these steps as a starting point for your own experimentation.

To configure a filter, follow these steps:

Create a new graph or a new value in a custom dashboard.

Give the graph or the value a name in the Title field.

Select a metric that supports filters (see above) from the Choose metric to display drop‑down menu.

Select one or more NGINX instances.

Click Apply filter.

Click the Click to select filter key drop‑down menu that appears and select or type a variable name. Then click the Type or select filter value drop‑down menu that appears and type a static value or regular expression. (In the screenshot, we’ve specified $http_host as the key and example.net as the value.)

Click Use “your‑expression” as a value.

Add additional filters if needed.

Click the Save button in the dialog box and then the Done button in the top right corner of the dashboard.

Configuration Examples for Custom Dashboards and Filters

Following are some examples of how to use NGINX Amplify custom dashboards and filters to address common concerns in NGINX web server deployments.

Monitor 500 Internal Server Errors for Critical URLs

Most large deployments have an acceptable rate of 500 errors (Internal Server Error). These errors happen for multiple reasons, and some of the errors are not critical to the operation of your website. However, business‑critical URLs need a special monitoring level. For ecommerce sites, this usually includes URLs for adding items to a shopping cart or specifying the payment method.

Solution: If your business‑critical URL is /cart/add.php, create a graph for metric nginx.upstream.status.5xx, and add the filter $request_uri ~ /cart/add.php.

Monitor NGINX Response Time for Database‑Intensive Requests

Different requests impact databases differently. A simple average of response time does not always show the impact of specific requests on your database performance. Monitoring the nginx.upstream.response.time metric, which is derived from the $upstream_response_time log variable, gives you initial data for database optimization.

Solution: If your database‑intensive requests fetch the URL /reports/view.do, create a graph for metric nginx.upstream.response.time and a filter for your database‑intensive requests by URL: $request_uri ~ /reports/view.do.

You can also monitor just the 95th percentile of these requests with the metric nginx.upstream.response.time.pctl95.

Monitor POST Requests for Multiple Virtual Servers

If you host multiple virtual servers (vhosts) and need to know how many POST requests come from their users, NGINX Amplify can help you.

Solution: Add the $http_host variable to your log_format. Then, create one or more graphs for the metric nginx.http.method.post, filtering on the $http_host variable.
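As a sketch of the first step, the combined format can be extended with $http_host as shown below. The format name vhost_ext and the log file path are hypothetical.

```nginx
# Place in the http context: the "combined" fields plus the Host request header.
log_format vhost_ext '$remote_addr - $remote_user [$time_local] "$request" '
                     '$status $body_bytes_sent "$http_referer" '
                     '"$http_user_agent" "$http_host"';

access_log /var/log/nginx/access.log vhost_ext;
```

With the Host header logged on every request, a filter such as $http_host ~ example.net can then separate POST requests per virtual server.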

Monitor Mobile Browser Usage

Many websites use the same infrastructure for multiple clients. In most cases the clients identify themselves through the User‑Agent HTTP header. You can create a filter on that header and monitor mobile browser usage.

Solution: Add a graph charting the metric nginx.http.status.2xx and create a filter $http_user_agent ~ Mobile.

Monitor 401 Errors

Many monitoring systems provide aggregated metrics for 2xx, 3xx, 4xx, and 5xx status and error code ranges. When you need to drill down to a specific error code, you can create a custom filter.

Solution: Create a graph on the metric nginx.http.status.4xx and the filter $status ~ 401. In addition, you can create a filter on a specific URI.

Use Regular Expressions to Monitor 404 Errors for GIF Files

With regular expressions, you can select specific file types or other patterns in the logged variables.

Solution: Create a graph with the metric nginx.http.status.4xx and a combination of filters, $status ~ 404 and $request_uri ~ .*\.gif.

Summary

With NGINX Amplify, you can quickly and easily configure advanced monitoring for your NGINX deployment. Set up multiple custom dashboards with multiple graphs, then extend the monitoring configuration with a set of filters. Filter and dashboard configuration is done via the web interface, and additional configuration of the NGINX host is minimal.

Sign up for free and start using NGINX Amplify today.
