2016-10-24

A while back, I wrote about ‘Helpful Postgres logging and defaults’. The real question is: how helpful are these settings, really? So, let’s take a quick look at how we can use the increased logging to do something useful. If the previous post is TL;DR, here’s the quick list:
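These are the parameters in question (the values shown here are the ones this walkthrough relies on; the exact values you use may vary):

logging_collector = on
log_checkpoints = on
log_autovacuum_min_duration = 0
log_temp_files = 0
log_lock_waits = on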

Essentially, these cover the basics when it comes to monitoring, without adding a whole lot of noise to your logs.  These settings are the defaults in the BigSQL sandboxes, so you’re ready to go right out of the gate.  If you’ve installed PostgreSQL another way, you can simply modify your postgresql.conf file to include these items.

Follow along

Quick Install

If you’d like to follow along, you can quickly set up a test pg96 environment using the ‘standalone’ installer for Postgres that BigSQL provides; a sketch of the commands follows the steps below.  If you already have PostgreSQL, you can skip this section.

Grab BigSQL

Install PostgreSQL 9.6.current

Init pg96

Start pg96
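Roughly, those steps map to something like the following, assuming you’ve downloaded and unpacked the BigSQL package and are running its ‘pgc’ command-line tool from that directory (exact paths and component names may differ by version):

./pgc install pg96
./pgc init pg96
./pgc start pg96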

Since the BigSQL package automatically sets the appropriate logging defaults, we’re set to go!

Modify Existing PG

open up your postgresql.conf file

place the above values at the end of the file

save & close postgresql.conf

restart postgres (if you already had ‘logging_collector=on’, you can just reload).
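For example, something along these lines, assuming pg_ctl is on your PATH and substituting your own data directory (any restart/reload method you normally use is fine):

pg_ctl -D /path/to/your/data restart

# or, if logging_collector was already on, a reload is enough:
psql -U postgres -c "SELECT pg_reload_conf();"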

Workload

Now that you’re logging, it’s time to generate some workload.  My database isn’t actually serving clients at this point, so I’m going to use the included pgbench tool to generate some data and traffic. (NB: If you run this step, you’ll use up about 2.5 GB of disk space.)

pgbench -i -s 100 -U postgres

# This initializes a new pgbench database with a ‘scaling factor’ (i.e. sizing factor) of 100 bank branches (pgbench simulates debit/credit transactions).  You’ll see a bunch of output fly by as you initialize 10,000,000 rows (tuples) of data.  Don’t worry, this only translates to about 2.5 GB of space.

pgBadger-ing

Great!  Now that we’re logging at a verbosity that lets us see what’s going on in PostgreSQL, how do we take advantage of it?  Enter pgBadger: this tool is essentially a high-performance log parser for the PostgreSQL information logs.  If you’re following along with my BigSQL install from above, you can grab pgBadger with:
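(This assumes the ‘pgc’ command-line tool from the BigSQL install above, and that pgbadger is offered as a component in your version.)

./pgc install pgbadger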

If you’re not using the BigSQL package, you can:
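(pgBadger is just a Perl script, so cloning the upstream project, or downloading a release tarball from the same place, is all you need.)

git clone https://github.com/darold/pgbadger.git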

Now, let’s run pgBadger against the logs.

If you’re following along:
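perl pgbadger/pgbadger data/logs/pg96/*.log

# The log path above assumes the BigSQL sandbox layout; point pgBadger at wherever your postgresql.conf log_directory actually lives.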

By default, pgBadger puts its output in a file called ‘out.html’ (see perl pgbadger/pgbadger --help for the flag to change this).

I’m going to open up the ‘out.html’ file in my web browser and view my report (I’ve attached the out.html report I generated for this blog so you can download it and play with it if you’d like).



Hmmm, looks like I have an empty report.  Strange?  Well, keep in mind, we’ve only told Postgres to log certain things:

checkpoints

vacuums

temp files

lock waits

We never told Postgres to log queries (we’ll come back to this in a minute).  For now, let’s take a look at ‘Checkpoints’ and see what’s there.



Once we choose the ‘Checkpoints’ item from the top menu, we’ll see that our data-load created some activity for us (your data will most likely look somewhat different).



Now I’m starting to see data about my checkpoint buffer utilization!  Cool!  I can actually see how memory is being used in the server.  This is coming out of the ‘log_checkpoints’ parameter that we set earlier.

To get richer data, we need to push more activity against our DB server.  With pgBench:
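pgbench -c 10 -T 300 -U postgres

# The -U postgres flag assumes the same user we initialized with above; adjust if needed.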

This will run a 5 minute (300 second) test with 10 concurrent client sessions.

Once that’s complete, re-run pgBadger:
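perl pgbadger/pgbadger data/logs/pg96/*.log

# Same invocation as before; again, substitute your own log directory if it differs.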

(Your output may appear slightly different based on the number of log files that you have)

Now, your ‘out.html’ will have data based on your workload.

We also have some interesting data surrounding VACUUM behavior.  It should be noted that these are just quick screen grabs; take a look at the attached ‘out.html’ for more.

The point is, using some wise logging defaults gets you some awesome pgBadger joy, without paying a TON in logging overhead.  Stay tuned for part 2 where I’ll crank up logging even more and show you how to get per-statement data.
