2014-08-13

Posted by Everett

This is Inflow's process for doing content audits.
It may not be the "best" way to do them every time, but we've managed to keep it fairly agile in terms of how you choose to analyze, interpret and make recommendations on the data. The fundamental parts of the process remain about the same across numerous types of websites no matter what their business goals are: Collect all of the content URLs on the site and fetch the data you need about each URL. Then analyze the data and provide recommendations for each content URL. Theoretically it's simple. In practice, however, it can be a daunting exercise if you don't have a plan or process in place. By the end of this post we hope you'll have a good start on both.

Table of Contents

The many purposes of a content audit

A content audit case study

50,000-foot overview of the process

Our documents

Content audit scenarios

Content audit dashboard spreadsheet

Content strategy

Recommended exports and data sources

A step-by-step example of our process

Step 1: Assess the situation and choose a scenario

Step 2: Scan the site

Step 3: Import the URLs and start the tool

Step 4: Import the tool output into the dashboard

Step 5: Import GWT data

Step 6: Perform keyword research

Step 7: Tying the keyword data together

Step 8: Time to analyze and make some decisions!

Step 9: Content gap analysis and other value-adds

Step 10: Writing up the content audit strategy document

Resources, links and post-scripts...

The many purposes of a content audit

A content audit can help in a variety of different ways, and the approach can be customized for any given scenario. I'll write more about potential "scenarios" and how to approach them below. For now, here are some things a content audit can help you accomplish...

Determine the most effective way to escape a Panda penalty

Determine which pages need copywriting / editing

Determine which pages need to be updated and made more current, and prioritize them

Determine which pages should be consolidated due to overlapping topics

Determine which pages should be pruned off the site, and what the approach to pruning should be

Prioritize content based on a variety of metrics (e.g. visits, conversions, PA, copyscape risk score...)

Find content gap opportunities to drive content ideation and editorial calendars

Determine which pages are ranking for which keywords

Determine which pages "should" be ranking for which keywords

Find the strongest pages on a domain and develop a strategy to leverage them

Uncover content marketing opportunities

Auditing and creating an inventory of content assets when buying/selling a website

Understanding the content assets of a new client (i.e. what you have to work with)

And many more...

A content audit case study



Inflow's technical SEO specialist Rick Ramos performed an earlier version of our content audit last year for Phases Design Studio, who graciously permitted us to share their case study. After taking an inventory of all content URLs on the domain, Rick outlined a plan to noindex/follow and remove from their sitemap many of the older blog posts that were no longer relevant and weren't candidates for a content refresh. The site also had a series of campaign-based landing pages dating back to 2006. These pages typically had a life cycle of a few months, but were never removed from the site or Google's index. Rick recommended that these pages be 301 redirected to a few evergreen landing pages that would be updated whenever a new campaign was launched, a tactic that works particularly well on seasonal pages for eCommerce sites (e.g. 2014 New Year's Resolution Deals). Still more pages were candidates to be updated / refreshed, or improved in other ways.

The results

Shortly after the recommendations were implemented, the client called to ask if we knew why they were suddenly seeing eight times the number of leads they were used to seeing month over month.



Why we think it worked

There are several probable reasons why this approach worked for our client. Here are a few of them...

The ratio of useful, relevant, unique content to thin, irrelevant, duplicate content was greatly improved.

The PageRank from dozens of expired campaign landing pages was consolidated into a relatively few evergreen pages (via 301 redirects and consolidation of internal linking signals).

Crawl budget is now being used more efficiently.

This improved the overall customer experience on the site, as well as organic search rankings for important topic areas that were consolidated.

Since then we have refined and improved the process and have been performing these audits on a variety of sites with great success. The process works particularly well for Panda recoveries on large-scale content websites, and for prioritizing which eCommerce product copy needs to be rewritten first.

A 50,000-foot overview of our process

Inflow's content auditing process changes depending on the client's goals, needs and budget. Generally speaking, however, here is how we approach it...

Gather all available URLs on the site

Use Screaming Frog (or another crawl tool), CMS Exports, Google Analytics and Webmaster Tools

Import URLs into a tool that gathers KPIs and other data for each URL

Use URL Profiler, a custom in-house tool, or other data-gathering resources

Things to gather: Moz Metrics, Google Analytics KPIs, GWT Data, Majestic SEO metrics, Titles, Descriptions, Word Counts, canonical tags...

Analyze the content

Choose to keep as-is, improve, remove or consolidate.

Write detailed strategies for each.

Perform keyword research

Optional: Provide relevancy scores, topic buckets and buying stage/s for each keyword

Match keywords to pages that already rank within a keyword matrix

Match non-ranking keywords to the best page for guiding on-page changes

Do content gap ideation

Use keywords that did not have an appropriate page match to fill in the Content Gap tab.

Optional: Incorporate buying cycles into content gap ideation

Write the content strategy

Summarize the findings and present a strategy for optimizing existing pages and creating new pages to fill gaps, and explain how many pages are being removed, redirected, etc.

Each piece of the process can be customized for the needs of a particular website.

For example, when auditing a very large content site with lots of duplicate/thin/overlapping content issues we may skip the entire keyword research and content gap analysis part of the process and focus on pruning the site of these types of pages and improving the rest. Alternatively, a site without much content may need to focus on keyword research and content gaps. Other sites may be looking specifically for content assets that they can improve, repeat in new ways or leverage for newer content. One example of a very specific goal would be to identify interlinking opportunities from strong, older pages to promising, newer pages. For now it is sufficient to know that the framework can be changed as needed in a way that could dramatically affect where you spend your time in the process, or even which steps you may want to skip altogether.

Our documents

There are several major steps in the content auditing process that require various documents. While I'm not providing links to our internal SOP documentation (mainly because it's still evolving), I will describe each document and provide screenshots and links to examples / templates so you can have a foundation around which to customize one for your own needs.

Content audit scenarios

We keep a list of recommendations for common scenarios to guide our approach to content audits. While every situation is unique in its own ways, we find this helps us get 90% of the way to the appropriate strategy for each client much faster. I discuss this in more detail later, but if you'd like to take a peek click here.

Content audit dashboard spreadsheet

We were originally working within Google Docs, but as we started pulling in from more sources and performing more vLookups the spreadsheet would load so slowly on big sites as to make it nearly impossible to complete an audit. For this reason we have recently moved the entire process over to Excel, though this template we're providing is in Google Docs format. Below are some of the tabs you may want in this spreadsheet...

The "Content Audit" tab

This tab within the dashboard is where most of the work is done. Other tabs pull data from this one by VLookup. Whether the data is fetched by API and compiled by one tool (e.g. URL Profiler) or exported manually from many tools and compiled manually (by VLookup), the end result should be that you have all of the metrics needed for each URL in one place so you can begin sorting by various metrics to discern patterns, spot opportunities and make educated decisions on how to handle each piece of content, and the content strategy of the site as a whole.



You can customize the process to include whatever metrics you'd like to use. Here are the ones we've ended up with after some experimentation, as well as the source of the data:

Action (internal)

Leave As-Is

Improve

Consolidate

Remove

Strategy (internal)

A more detailed version of "action". Example: Remove and 301 redirect to /another-page/.

Page Type (internal via URL patterns or CMS export)

This is an optional step for certain situations. Example: Article, Product, Category…

Source (original source of the URL, e.g. Google Analytics, Screaming Frog)

CopyScape Risk Score (copyscape API)

Title Tag (Screaming Frog)

Title Length (Screaming Frog)

Meta Description (Screaming Frog)

Word Count (Screaming Frog)

GA Entrances (Google Analytics API)

GA Organic Entrances (Google Analytics API)

Moz Links (Moz API)

Moz Page Authority (Moz API)

MozRank (Moz API)

Moz External Equity Links (Moz API)

Stumbleupon (Social Count API)

Facebook Likes (Social Count API)

Facebook Shares (Social Count API)

Google Plus One (Social Count API)

Tweets (Social Count API)

Pinterest (Social Count API)

Our recommendations typically fall into one of four "Action" categories: "Leave As-Is", "Remove", "Improve", or "Consolidate". Further details (e.g. remove and 404, or remove and 301? If 301, to where?) are provided in a column called "Strategy". Some URLs (the important ones) will have highly customized strategies, while others may have been bulk processed, meaning thousands could share the same strategy (e.g. rewriting duplicate product description copy). The "Action" column is limited in choices so we can sort the data effectively (e.g. see all pages marked as "Remove") while the "Strategy" column can be more free-form and customized to the URL (e.g. consolidate /buy-blue-widgets/ content into /buying-blue-widgets/ and 301 redirect the former to the latter to avoid duplicating the same topic).
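To make the Action/Strategy split concrete, here is a minimal Python sketch of the same idea: a fixed set of actions that can be filtered reliably, plus a free-form strategy string per URL. The class names, the helper function and the example rows are hypothetical and only for illustration; they are not part of our dashboard or any tool.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    """The four fixed choices for the "Action" column."""
    LEAVE_AS_IS = "Leave As-Is"
    IMPROVE = "Improve"
    CONSOLIDATE = "Consolidate"
    REMOVE = "Remove"


@dataclass
class AuditRow:
    url: str
    action: Action      # constrained, so rows can be sorted and filtered
    strategy: str = ""  # free-form detail, customized per URL


# Hypothetical example rows (not real client data).
rows = [
    AuditRow("/buy-blue-widgets/", Action.CONSOLIDATE,
             "Consolidate into /buying-blue-widgets/ and 301 redirect."),
    AuditRow("/buying-blue-widgets/", Action.IMPROVE,
             "Absorb the useful content from /buy-blue-widgets/."),
    AuditRow("/old-campaign/", Action.REMOVE, "Remove and allow to 404."),
]


def rows_with_action(rows, action):
    """Return every row marked with the given action."""
    return [row for row in rows if row.action is action]


removed_urls = [row.url for row in rows_with_action(rows, Action.REMOVE)]
print(removed_urls)
```

Keeping the action as an enumerated value is what makes "show me everything marked Remove" a one-line filter, while the strategy text stays as detailed as each URL needs.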

The "Keyword Research" tab

This tab includes keywords gathered from a variety of sources, including brainstorming for seed keywords, mining Google Webmaster Tools, PPC campaigns, the AdWords Keyword Planner and several other tools. Search Volume and Ad Competition (not shown in this screenshot) are pulled from Google's Keyword Planner. The average ranking position comes from GWT, as does the top ranking page. The relevancy score is something we typically ask the client to do once we've cleaned out most of the obvious junk keywords.

The "Keyword Matrix" tab

This tab includes URLs for important pages, and those that are ranking for - or are most qualified to rank for - important topics. It essentially matches up keywords with the best possible page to guide our copywriting and on-page optimization efforts.

Sometimes the KWM tab plays an important role in the process, like when the site is relatively new or unoptimized. Most of the time it takes a back-seat to other tabs in terms of strategic importance.

The "Content Gaps" tab

This is where we put content ideas for high-volume, highly relevant keywords for which we could not find an appropriate page. Often it involves keywords that represent stages in the buying cycle or awareness ladder that have been overlooked by the company. Sometimes it plays an important role, such as with new and/or small sites. Most of the time this also takes a back-seat to more important issues, like pruning.
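As a rough illustration of how candidates for this tab could be shortlisted, the sketch below keeps keywords that clear a volume and relevancy bar but have no matched page. The field names, the thresholds and the sample data are all assumptions made for the example; choose cut-offs that fit the site.

```python
def find_content_gaps(keywords, keyword_matrix, min_volume=500, min_relevancy=4):
    """Return keywords worth targeting that have no matching page.

    keywords: list of dicts with "keyword", "volume" and "relevancy" keys.
    keyword_matrix: dict mapping a keyword to its best page URL (or None).
    The default thresholds are placeholders, not recommendations.
    """
    gaps = []
    for kw in keywords:
        has_page = bool(keyword_matrix.get(kw["keyword"]))
        worth_targeting = (kw["volume"] >= min_volume
                           and kw["relevancy"] >= min_relevancy)
        if worth_targeting and not has_page:
            gaps.append(kw["keyword"])
    return gaps


# Hypothetical sample data.
keywords = [
    {"keyword": "blue widgets", "volume": 2400, "relevancy": 5},
    {"keyword": "how to choose a widget", "volume": 900, "relevancy": 4},
    {"keyword": "free widget clipart", "volume": 5000, "relevancy": 1},
]
keyword_matrix = {"blue widgets": "/blue-widgets/"}

gaps = find_content_gaps(keywords, keyword_matrix)
print(gaps)
```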

The "Prune" tab

If it was marked for "Remove" or "Consolidate" it should be on this tab. Whether it is supposed to be removed and 301 redirected, canonicalized elsewhere, consolidated into another page, allowed to stay up but with a robots "noindex" meta tag, removed and allowed to 404/410... or any number of "strategies" you might come up with, these are the pages that will no longer exist once your recommendations have been implemented. I find this to be a very useful tab. For example, one could export this tab, send it to a developer (or a company like WP Curve), and have someone get started on most or all of the implementation. Our mantra for low-quality, under-performing content on sites that may have a Panda-related traffic drop is to improve it or remove it.

"Imported Data" tabs

In addition to the tabs above, we also have data tabs that are in the spreadsheet to house exported data from the various sources so we can perform Vlookups based on the URL to populate data in other tabs. These data tabs include:

GWT Top Queries

GWT Top Pages

CopyScape Scores (typically for up to 1,000 URLs)

Keyword Data

The more data that can be compiled by a tool like URL Profiler, the fewer data tabs you'll need and the faster this entire process will go. Before we built the internal tool to automate parts of the process, we also had tabs for GA data, Moz data, and the initial Screaming Frog export.

If you don't know how to do a Vlookup there are plenty of online tutorials for Excel and Google Docs spreadsheets. Here's one I found useful for Excel. Alternatively, you could import all of the data into the tabs and ask someone more spreadsheet-savvy on your team to do the lookups. Our resident spreadsheet guru is Caesar Barba, and he has great hair. Below is an example of a simple Vlookup used to bring the "Action" over from the Content Audit tab for a URL in the Keyword Matrix tab...

=VLOOKUP(A2,'Content Audit'!A:C,3,FALSE)
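If you would rather script the lookups than maintain spreadsheet formulas, the same exact-match behavior can be sketched in a few lines of Python. The `vlookup` helper and the column layout below (URL, Page Type, Action) are assumptions made for the example, not a description of our actual template.

```python
def vlookup(key, table, col_index):
    """Exact-match lookup, like VLOOKUP(key, table, col_index, FALSE).

    The key is compared against the first column of each row and
    col_index is 1-based, as in the spreadsheet function. Returns None
    where the spreadsheet would show #N/A.
    """
    for row in table:
        if row[0] == key:
            return row[col_index - 1]
    return None


# Hypothetical "Content Audit" rows: URL, Page Type, Action.
content_audit = [
    ("/blue-widgets/", "Category", "Improve"),
    ("/blog/widget-news-2009/", "Article", "Remove"),
]

action = vlookup("/blue-widgets/", content_audit, 3)
print(action)
```

For large sheets a dictionary keyed by URL would be faster than scanning every row, but the loop mirrors what the formula does most directly.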

Content Strategy

The Content Audit Dashboard is just what we need internally: A spreadsheet crammed with data that can be sliced and diced in so many useful ways that we can always go back to it for more insight and ideas. Some clients appreciate it as well, but most are going to find the greater benefit in our final content strategy, which includes a high-level overview of our recommendations from the audit.

Recommended exports and data sources

There are many options for getting the data you need into one place so you can simultaneously see a broad view of the entire content situation, as well as detailed metrics for each URL. For URL gathering we use Screaming Frog and Google Analytics. For data we use Google Webmaster Tools (GWT), Google Analytics (GA), Social Count (SC), Copyscape (CS), Moz, CMS exports, and a few other data sources as needed.

However, we've been experimenting with using URL Profiler instead of our internal tool to pull all of these data sources together much faster. URL Profiler is a few hundred bucks and is very powerful. It's also somewhat of a pain to set up the first time, so be prepared for several hours of wrangling down API keys before getting all of the data you need.

No matter how you end up pulling it all together in the end, doing it yourself in Excel is always an option for the first few times.

A step-by-step example of our process

Below is the step-by-step process for an "average" client - whatever that means. Let's say it is a medium-sized eCommerce client with about 800-900 pages indexed by Google, including category, product, blog posts and other pages. They don't have an existing penalty that we know of, but could certainly be at risk of being affected by Panda due to some thin, overlapping, duplicate, outdated and irrelevant content on the site.

Step 1: Assess the situation and choose a scenario

Every situation is different, but we have found common patterns based on two primary factors: the size of the site and its content-based penalty risk. Below is a screenshot from our list of recommended strategies for common content auditing scenarios, which can be found here on GoInflow.com.

Each of the colored boxes drops down to reveal the strategy for that scenario in more detail.

Hat tip to Ian Lurie's Marketing Stack for design inspiration.

The site described above would fall into the second box within the purple column (Focus: Content Audit with an eye to Improve and/or Prune, followed by KWM for key pages). Here is the reasoning behind that...

The site is in danger of a penalty (though it does not appear to have one "yet") so we follow the Panda mantra: Improve it or Remove it. The size of the site determines which of those two (improve or remove) gets the most attention. Smaller sites need less pruning (scalpel), while larger sites need much more (hatchet). Smaller sites often need some keyword research to determine if they are covering all of the topic areas for various stages in the customer's buying cycle, while larger sites typically have the opposite problem: too many pages covering overlapping topic areas with low-quality (thin, duplicate, irrelevant, outdated, poorly written, automated...) content. Such a site would not require the keyword research, and would therefore not be getting a keyword matrix or content gap analysis, as the focus would be primarily on pruning the site.

Our focus in this example will be to audit the content with an eye to improve and/or remove low-performing pages, followed by keyword research and a keyword matrix for the primary pages, including the home page, categories, blog home and key product pages, as well as certain other topical landing pages.

As it turns out, this hypothetical website has lots of manufacturer-supplied product descriptions. We're going to need to prioritize which ones get rewritten first because the client does not have the cash flow to do them all at once. When budget and time are concerns, we typically shoot for the 80/20 rule: Write great content for the top 20% of pages right away, and do the other 80% over the course of 6-12 months as time/budget permit.

Because this site doesn't have an existing penalty, we will recommend that all pages stay indexed. If they had a penalty already, we would recommend they noindex,follow the bottom 80% of pages, gradually releasing them back into the index as they are rewritten. This may not be the way you choose to handle the same situation, which is fine, but the point is you can easily sort the pages by any number of metrics to determine a relative "priority". The bigger the site and tighter the budget, the more important it is to prioritize what gets worked on first.
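One simple way to turn "sort by a metric and work on the top 20% first" into something repeatable is sketched below. The function, the `organic_entrances` field and the sample pages are hypothetical; any metric from the Content Audit tab could stand in as the sort key.

```python
import math


def split_by_priority(pages, metric, top_share=0.2):
    """Sort pages by a metric (highest first) and split off the top share.

    Returns (rewrite_now, rewrite_later). With top_share=0.2 this is the
    80/20 split described above.
    """
    ranked = sorted(pages, key=lambda page: page[metric], reverse=True)
    cutoff = math.ceil(len(ranked) * top_share)
    return ranked[:cutoff], ranked[cutoff:]


# Hypothetical product pages with made-up traffic numbers.
pages = [
    {"url": f"/product-{n}/", "organic_entrances": n * 10}
    for n in range(1, 11)
]

rewrite_now, rewrite_later = split_by_priority(pages, "organic_entrances")
print([page["url"] for page in rewrite_now])
```

Swapping the metric (for example, conversions or Page Authority) changes the priority order without changing the process.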

Causes of Content-Related Penalties

For the purpose of a content audit we are only concerned with content-related penalties (as opposed to links and other off-page issues), which typically fall under three major categories: Quality, Duplication, and Relevancy. These can be further broken down into other issues, which include - but are not limited to:

Typical low quality content

Poor grammar, written primarily for search engines (includes keyword stuffing), unhelpful, inaccurate...

Completely irrelevant content

OK in small amounts, but often entire blogs are full of it.

A typical example would be a "linkbait" piece circa 2010.

Thin / Short content

Glossed over the topic, too few words, all image-based content...

Curated content with no added value

Composed almost entirely of bits and pieces of content that exists elsewhere.

Misleading Optimization

Titles or keywords targeting queries for which content doesn't answer or deserve to rank

Generally not providing the information the visitor was expecting to find

Duplicate Content

Internally duplicated on other pages (e.g. categories, product variants, archives, technical issues...)

Externally duplicated (e.g. manufacturer product descriptions, product descriptions duplicated in feeds used for other channels like Amazon, shopping comparison sites and eBay, plagiarized content...)

Stub Pages (e.g. "No content is here yet, but if you sign in and leave some user-generated-content then we'll have content here for the next guy". By the way, want our newsletter? Click an AD!)

Indexable internal search results

Too many indexable blog tag or blog category pages

And so forth and so on...

If you are unsure about the scale of the site's content problems, feel free to do step 2 before deciding on a scenario...

Step 2: Scan the site

We use Screaming Frog for this step, but you can adapt this process to whatever crawler you want. This is how we configure the spider's "Basic" and "Advanced" tabs...

And the advanced tab...

Notice that "crawl all subdomains" is checked. This is optional, depending on what you're auditing. We are respecting "meta robots noindex", "rel = canonical" and robots.txt. Also notice that we are not crawling images, CSS, JS, Flash, external links... This type of stuff is what we look at in a Technical SEO Audit, but would needlessly complicate a "Content" Audit. What we're looking for here are all of the indexable HTML pages that might lead a visitor to the site from the SERPs, though it may certainly lead to the discovery of technical issues.

Export the complete list of URLs and related data from Screaming Frog into a CSV file.
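If you want to sanity-check the export before importing it anywhere, a short script can pull out just the HTML pages that returned a 200. This is only a sketch: it assumes the CSV has a single header row with "Address", "Content" and "Status Code" columns, which may not match your crawler or version, and the sample data is made up.

```python
import csv
import io


def indexable_html_urls(csv_file):
    """Return URLs of HTML pages that responded with a 200 status.

    Assumes one header row containing "Address", "Content" and
    "Status Code" columns; adjust the names to match your export.
    """
    urls = []
    for row in csv.DictReader(csv_file):
        is_html = "text/html" in row["Content"]
        is_ok = row["Status Code"].strip() == "200"
        if is_html and is_ok:
            urls.append(row["Address"])
    return urls


# Made-up stand-in for a crawl export file.
sample_export = io.StringIO(
    "Address,Content,Status Code\n"
    "http://example.com/,text/html; charset=UTF-8,200\n"
    "http://example.com/logo.png,image/png,200\n"
    "http://example.com/old-page/,text/html; charset=UTF-8,404\n"
    "http://example.com/blue-widgets/,text/html,200\n"
)

urls = indexable_html_urls(sample_export)
print(urls)
```

With a real file you would pass `open("export.csv", newline="")` instead of the in-memory sample.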

Step 3: Import the URLs and start the tool

We have our own internal "Content Auditing Tool", which takes URLs and data from Screaming Frog and Google Analytics, de-dupes them, and pulls in data from Google Webmaster Tools, Moz, Social Count and Copyscape for each URL. The tool is a bit buggy at times, however, so I've been experimenting with URL Profiler, which can essentially accomplish the same goal with fewer steps and less upkeep. We need the "Agency" version, which is about $400 per year, plus tax. That's not too bad, considering we'd already spent several thousand on our internal tool by the time Gareth Brown released URL Profiler publicly. :-/
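For anyone compiling the list by hand, the de-duping part is easy to sketch. The example below merges URL lists from several sources and remembers where each URL was first seen, which is what the "Source" column records. It is not our internal tool or URL Profiler; the function name, the light normalization (trimming whitespace and dropping #fragments) and the sample URLs are assumptions for illustration.

```python
from urllib.parse import urldefrag


def merge_url_sources(sources):
    """Merge URL lists from several sources and remove duplicates.

    sources: dict mapping a source name to a list of URLs.
    Returns a dict mapping each unique URL to the first source that
    contained it, in the order the URLs were first seen.
    """
    merged = {}
    for source, urls in sources.items():
        for url in urls:
            clean = urldefrag(url.strip()).url  # drop any #fragment
            if clean:
                merged.setdefault(clean, source)
    return merged


# Hypothetical URL lists.
sources = {
    "Screaming Frog": [
        "http://example.com/",
        "http://example.com/blue-widgets/",
    ],
    "Google Analytics": [
        "http://example.com/blue-widgets/ ",
        "http://example.com/old-landing-page/#top",
    ],
}

merged = merge_url_sources(sources)
for url, source in merged.items():
    print(source, url)
```

URLs that only show up in Google Analytics are often the interesting ones, since they may be orphaned pages the crawler could not reach.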

Below is a screenshot of what you'll see after downloading the tool. I've highlighted the boxes we currently check, though it depends on the tools/APIs to which you already subscribe and will differ by user. We've only just started playing with uClassify for the purpose of semi-automating our topic bucketing of pages, but I don't have a process to share yet (feel free to comment with advice)...

Right-click on the URL List box and choose "Import From File", then choose the Screaming Frog export or any other list of URLs. There are also options to import from the clipboard or XML sitemap. Full documentation for URL Profiler can be found here. Below are two output screenshots to give you an idea of what you're going to end up with...

The output changes depending on which boxes you check and what API access you have.

Step 4: Import the tool output into the dashboard

As described in the 50,000-foot overview above, we have a spreadsheet template with multiple tabs, one of which is the "Content Audit" tab. The tool output gets brought into the Content Audit tab of the dashboard. Our internal tool automatically adds columns for Action, Strategy, Page Type and Source (of the URL). You can also add these to the tab after importing the URL Profiler output. Page Type and URL Source are optional, but Action and Strategy are key elements of the process.

Our hypothetical client requires a Keyword Matrix. However, if your "scenario" does not involve keyword research (i.e. if it is a big site with content penalty risks) you can skip steps 5-7 and move straight to "Step 8 - Time to Analyze and Make Some Decisions".

Step 5: Import GWT data

Match existing URLs from the content audit to keywords for which they already rank in Google Webmaster Tools

There may be a way to do this with URL Profiler. If so, I haven't found it yet. Here is what we do to grab the landing page and associated keyword/query data from Google Webmaster Tools, which we then import into two tabs (GWT Top Queries and GWT Top Pages). These tabs are helpful when filling out the Keyword Matrix because they tell you which pages Google is already associating with each ranking keyword. This step can actually be skipped altogether for huge sites with major content problems because the "Focus" is going to be on pruning the site of low quality content, rather than doing any keyword research or content gap analysis.

Instructions for Importing Top Pages from GWT

Log into GWT from a Chrome browser

Go to Search Traffic ---> Search Queries

Switch the view to "Top pages" (default is "Top queries")

Change the date range to start as far back as possible (i.e. 3 months)

Expand the number of rows to show to the maximum of 500 rows

This will put the s=500 parameter in the URL. Change s=500 to s=10000 or however many rows of data are available

See bottom of GWT page (e.g. 1-500 of ####).

In the Chrome menu go to View ---> Developer ---> Javascript Console

Copy and paste the script into the console window and press Enter.

This action should expand all of the drop-downs to show the keywords under each "page" URL and then open up a dialog window that will ask you to save a CSV file: (more info here and here).

The script is also available in a javascript bookmarklet on Lunametrics.com...

Ignore any dialog windows that pop up.

You can check "Prevent this page from creating additional dialogs" to disable them.

Import the resulting download.csv file from GWT into the "GWT Top Pages" tab in the Content Auditing Dashboard.

Instructions for Importing Top Queries from GWT

Within GWT switch back to Top Queries.

Adjust the date to go back as far as you can.

Expand the number of rows to show to the maximum of 500 rows

This will put the s=500 parameter in the URL. Change s=500 to s=10000 or however many rows of data are available

See bottom of GWT page (e.g. 1-500 of ####).

Select "Download this table" as a CSV file

Import the resulting TopSearchQueries.csv file from GWT into the "GWT Top Queries" tab in the Content Auditing Dashboard.

Step 6: Perform keyword research

This is another optional step, depending on the focus/objective of the audit. It is also highly customizable to your own KWR process. Use whatever methods you like for gathering the list of keywords (e.g. brainstorming, SEMRush, Google Trends, Uber Suggest, GWT, GA...). Ensure all "junk" and irrelevant keywords are removed from the list, and run the rest through a single tool that collects search volume and competition metrics. We use the Google Adwords Keyword Planner, which is outlined below.

Go to www.google.com/sktool/ while logged into the Google account associated with AdWords.

Select "Get search volume for a list of keywords or group them into ad groups", paste in your list of keywords and click "Get search volume".

Note: At this point you should have already expanded the list as much as you need/want to so you're just gathering data and organizing them now.

Note: The copy/paste method is limited to 1,000 keywords. You can get up to 3,000 by uploading your simple .txt file.

Go to the "Keyword Ideas" tab on the next screen and Add All keywords to the plan.

Go to the "Ad Group Ideas" tab and choose to Add All of the ad groups to the plan.

Download the plan, as seen in the screenshot below.

Import the data into the AdWords Data tab of the Content Auditing Dashboard

Use the settings below when downloading the plan:

Step 7: Tying the keyword data together

Again, you don't need to do this step if you're working on a large site and the focus is on pruning out low quality content. The GWT Queries and KWR steps provide data needed to develop a "Keyword Matrix" (KWM), which isn't necessary unless part of your focus is on-page optimization and copywriting of key pages. Sometimes you just need to get a client out of a penalty, or remove the danger of one. The KWM comes in handy for the important pages marked as "Improve" within the Content Audit tab just so the person writing the copy understands which keywords are important for that page. It's SEO 101 and you can do it any way you like using whatever tools you like.

Google Adwords has given you the keyword, search volume and competition. Google Webmaster Tools has given you the ranking page, average position, impressions, clicks and CTR for each keyword. Pull these together into a tab called "Keyword Research" using Vlookups. You should end up with something like this:

The purpose of these last few steps was to help with the KWM, an example of which is shown below:

Step 8: Time to analyze and make some decisions!

All of the data is right in front of you, and your path has been laid out using the Content Audit Scenarios tool. From here on the actual step-by-step process becomes much more open to interpretation and your own experience / intuition. Therefore, do not consider this a linear set of instructions meant to be carried out one after another. You may do some of them and not others. You may do them a little differently. That is all fine as long as you are working toward the goal of determining what to do, if anything, for each piece of content on the website.

Sort by Copyscape Risk Score

Which of these pages should be rewritten?

Rewrite key/important pages, such as categories, home page, top products

Rewrite pages with good Link and Social metrics

Rewrite pages with good traffic

After selecting "Improve" in the Action column, elaborate in the Strategy column:

"Improve these pages by writing unique, useful content to improve the Copyscape risk score."

Which of these pages should be removed / pruned?

Remove guest posts that were published elsewhere

Remove anything the client plagiarized

Remove content that isn't worth rewriting, such as:

No external links, no social shares, and very few or no entrances / visits

After selecting "Remove" from the Action column, elaborate in the Strategy column:

"Prune from site to remove duplicate content. This URL has no links or shares and very little traffic. We recommend allowing the URL to return 404 or 410 response code. Remove all internal links, including from the sitemap."

Which of these pages should be consolidated into others?

Presumably none, since the content is already externally duplicated

Which of these pages should be marked "Leave As-Is"

Important pages which have had their content stolen

In the Strategy column provide a link to the CopyScape report and instructions for filing a DMCA / Copyright complaint with Google.

Sort by Entrances or Visits (filtering out any that were already finished)

Which of these pages should be marked as "Improve"?

Pages with high visits / entrances but low conversion, time-on-site, pageviews per session...

Key pages that require improvement determined after a manual review of the page

Which of these pages should be marked as "Consolidate"?

When you have overlapping topics that don't provide much unique value of their own, but could make a great resource when combined.

Mark the page in the set with the best metrics as "Improve" and in the Strategy column outline which pages are going to be consolidated into it. This is the canonical page.

Mark the pages that are to be consolidated into the canonical page as "Consolidate" and provide further instructions in the Strategy column, such as:

Use portions of this content to round out /canonicalpage/ and then 301 redirect this page into /canonicalpage/ Update all internal links.

Campaign-based or seasonal pages that could be consolidated into a single "Evergreen" landing page (e.g. Best Sellers of 2012 and Best Sellers of 2013 ---> Best Sellers).

Which of these pages should be marked as "Remove"?

Pages with poor link, traffic and social metrics related to low-quality content that isn't worth updating

Typically these will be allowed to 404/410.

Irrelevant content

The strategy will depend on link equity and traffic as to whether it gets redirected or simply removed.

Out-of-Date content that isn't worth updating or consolidating

The strategy will depend on link equity and traffic as to whether it gets redirected or simply removed.

Which of these pages should be marked as "Leave As-Is"?

Pages with good traffic, conversions, time on site, etc... that also have good content.

These may or may not have any decent external links

Another Way of Thinking About It...

For big sites it is best to use a hatchet approach as much as possible and finish up with a scalpel at the end. Otherwise you'll spend way too much time on the project, which eats into the ROI.

This is not a process that can be documented step-by-step. For the purpose of illustration, however, here are a few different examples of hatchet approaches and when to consider using them.

Parameter-based URLs that shouldn't be indexed

Defer to the Technical Audit, if applicable. Otherwise, use your best judgement:

e.g. /?sort=color, &size=small

Assuming the Tech Audit didn't suggest otherwise, these pages could all be handled in one fell swoop. Below is an example action and an example strategy for such a page:

Action = Consolidate

Strategy = Rel canonical to the base page without the parameter
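
To make the canonical strategy concrete, here is a minimal Python sketch that strips sort and filter parameters from a URL and builds the matching `rel="canonical"` link element. The set of parameter names (`FACET_PARAMS`) and the example.com URL are assumptions; every site needs its own list of parameters that do not change the page's core content.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical list of parameters that only sort or filter a listing.
FACET_PARAMS = {"sort", "size", "color"}


def canonical_target(url, facet_params=FACET_PARAMS):
    """Return the URL with sort/filter parameters and any fragment removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in facet_params
    ]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


def canonical_tag(url):
    """Build the link element to place in the <head> of the parameterized page."""
    return f'<link rel="canonical" href="{canonical_target(url)}" />'


print(canonical_tag("https://www.example.com/widgets/?sort=color&size=small"))
```

Parameters that are not in the list (pagination, for example) are kept, so they can be reviewed separately instead of being collapsed by accident.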

Internal search results

Defer to the Technical Audit if applicable. Otherwise, use your best judgement:

e.g. /search/keyword-phrase/

Assuming the Tech Audit didn't suggest otherwise:

Action = Remove

Strategy = Apply a noindex meta tag. Once they are removed from the index, disallow /search/ in the robots.txt file.
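
The order of the two steps matters: a crawler that is blocked by robots.txt never fetches the page, so it never sees the noindex tag. The sketch below shows one way to write both pieces and uses Python's standard `urllib.robotparser` to confirm that the robots.txt rule blocks what it should. The tag, the robots.txt content and the example.com URLs are illustrative assumptions, not copied from a real site.

```python
from urllib.robotparser import RobotFileParser

# Step 1: tag placed in the <head> of each internal search results page.
NOINDEX_TAG = '<meta name="robots" content="noindex">'

# Step 2: rule added to robots.txt only after the pages have dropped out of the index.
ROBOTS_TXT = """\
User-agent: *
Disallow: /search/
"""


def is_blocked(robots_txt, url, user_agent="*"):
    """Return True if robots_txt disallows user_agent from fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


# Internal search results should be blocked; regular content should not.
print(is_blocked(ROBOTS_TXT, "https://www.example.com/search/keyword-phrase/"))
print(is_blocked(ROBOTS_TXT, "https://www.example.com/blog/"))
```

Running a check like this against a handful of important URLs before deploying a robots.txt change is a cheap way to avoid blocking more than intended.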

Blog tag pages

Defer to the Technical Audit if applicable. Otherwise...:

e.g. /blog/tag/green-widgets/, /blog/tag/blue-widgets/ ...

Assuming the Tech Audit didn't suggest otherwise:

Action = Remove

Strategy = Apply a noindex meta tag. Once they are removed from the index, disallow /blog/tag/ in the robots.txt file.

eCommerce Product Pages with Manufacturer Descriptions

In cases where the "Page Type" is known (i.e. it's in the URL or was provided in a CMS export) and Risk Score indicates duplication...

e.g. /product/product-name/

Assuming the Tech Audit didn't suggest otherwise:

Action = Improve

Strategy = Rewrite to improve product description and avoid duplicate content

eCommerce Category Pages with No Static Content

In cases where the "Page Type" is known...

e.g. /category/category-name/ or /category/cat1/cat2/

Assuming NONE of the category pages have content...

Action = Improve

Strategy = Write 2-3 sentences of unique, useful content that explains choices, next steps or benefits to the visitor looking to choose a product from the category.

Out-of-Date Blog Posts, Articles and Other Landing Pages

In cases where the Title tag includes a date or...

In cases where the URL indicates the publishing date....

Action = Improve

Strategy = Update the post to make it more current if applicable. Otherwise, change Action to "Remove" and customize the Strategy based on links and traffic (i.e. 301 or 404)
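
The URL-based hatchet approaches above can be sketched as a small set of pattern rules that pre-fill the Action and Strategy columns in bulk. This is only an illustration under stated assumptions: the regular expressions mirror the example URLs used in this post and will differ on a real site, and the product and category rules are left out because they also depend on data that isn't in the URL (Risk Score, whether the page has static content).

```python
import re

# (pattern, action, strategy), checked in order; the first match wins.
# Patterns are hypothetical and based on the example URLs in this post.
HATCHET_RULES = [
    (re.compile(r"[?&](sort|size)="), "Consolidate",
     "Rel canonical to the base page without the parameter"),
    (re.compile(r"^/search/"), "Remove",
     "Apply a noindex meta tag; once deindexed, disallow /search/ in robots.txt"),
    (re.compile(r"^/blog/tag/"), "Remove",
     "Apply a noindex meta tag; once deindexed, disallow /blog/tag/ in robots.txt"),
    (re.compile(r"/(19|20)\d{2}/"), "Improve",
     "Update the post to make it more current if applicable"),
]


def apply_hatchet(path):
    """Return (action, strategy) for the first matching rule, else None."""
    for pattern, action, strategy in HATCHET_RULES:
        if pattern.search(path):
            return action, strategy
    return None  # no rule matched; leave for manual, scalpel-style review


for path in ["/widgets/?sort=color", "/search/keyword-phrase/",
             "/blog/tag/green-widgets/", "/blog/2012/05/widget-news/", "/about/"]:
    print(path, "->", apply_hatchet(path))
```

Anything that returns `None` falls through to the manual review described earlier in this step, which keeps the bulk rules from overriding decisions that need human judgement.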

Step 9: Content gap analysis and other value-adds

Although most of these could be included as optional items during the keyword research process, I prefer to save them until last because I never know how much time I'll have after taking care of more pressing issues.

Content gaps

If you've gone through the trouble of identifying keywords and the pages already ranking for them, it isn't much of a step further to figure out which keywords could lead to ideas about how to fill content gaps.

At Inflow we like to use the "Awareness Ladder" developed by Ben Hunt, as featured in his book Convert!. You can learn more about it here.

Content levels

If time permits, or the situation dictates, we may also add a column to the Keyword Matrix or Content Audit which identifies which level of content the page would need to compete in its keyword space. We typically choose from Basic, Standard and Premium. This goes a long way in helping the client allocate copywriting resources to work where they're needed the most (i.e. best writers do the Premium content).

Landing page or keyword topic buckets

If time permits, or the situation dictates, we may provide topic bucketing for landing pages and/or keywords. More than once this has resulted in recommendations for adding to or changing existing taxonomy with great results. The most frequent example is in the "How To" or "Resources" space for any given niche.

Keyword relevancy scores

This is a good place to enlist the help of a client, especially in complicated niches with a lot of jargon. Sometimes the client can be working on this while the strategist is doing the content audit.

Step 10: Writing up the content audit strategy document

The Content Strategy, or whatever you decide to call it, should be delivered at the same time as the audit and should summarize the findings, recommendations and next steps from the audit. It should start with an Executive Summary and then drill deeper into each section outlined therein.

Here is a real example of an executive summary from one of Inflow's Content Audit Strategies:

As a result of our comprehensive content audit, we are recommending the following, which will be covered in more detail below:

Removal of about 624 pages from Google index by deletion or consolidation:

203 Pages were marked for Removal with a 404 error (no redirect needed)

110 Pages were marked for Removal with a 301 redirect to another page

311 Pages were marked for Consolidation of content into other pages

Followed by a redirect to the page into which they were consolidated

Rewriting or improving of 668 pages

605 Product Pages are to be rewritten due to use of manufacturer product descriptions (duplicate content), these being prioritized from first to last within the Content Audit.

63 "Other" pages to be rewritten due to low-quality or duplicate content.

Keeping 26 pages as-is with no rewriting or improvements needed unless the page exists in the Keyword Matrix, in which case it requires on-page optimization best practices be reviewed/applied.

On-Page optimization focus for 25 pages with keywords outlined in the Keyword Matrix tab.

These changes reflect an immediate need to "improve or remove" content in order to avoid an obvious content-based penalty from Google (e.g. Panda) due to thin, low-quality and duplicate content, especially concerning Representative and Dealers pages with some added risk from Style pages.
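
Counts like the ones in a summary of this kind can be tallied straight from the Action column of the dashboard once the audit is finished. The sketch below assumes a CSV export with `URL` and `Action` columns; the five rows are invented for illustration and are not taken from the audit quoted above.

```python
import csv
import io
from collections import Counter

# Hypothetical export of the dashboard; a real one would have many more columns.
DASHBOARD_CSV = """\
URL,Action
/dealers/acme/,Remove
/dealers/bolt/,Remove
/styles/modern/,Improve
/styles/rustic/,Consolidate
/about/,Leave As-Is
"""


def tally_actions(csv_text):
    """Count how many URLs were assigned to each Action."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row["Action"] for row in reader)


counts = tally_actions(DASHBOARD_CSV)
for action, total in counts.most_common():
    print(f"{total} pages marked {action}")
```

Splitting "Remove" into 404 and 301 groups, as the summary above does, would need a further column or a convention in the Strategy text.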

The Content Strategy should end with recommended next steps, including action items for the consultant and the client. Here is a real example from one of our documents:

We recommend the following actions in order of their urgency and/or potential ROI for the site:

Remove or consolidate all pages in the "Prune" tab of the Content Audit Dashboard

Detailed instructions for each page can be found in the "Strategy" column of the Prune tab

Begin a copywriting project to improve/rewrite content on Style pages to ensure unique, robust content and proper keyword targeting.

Inflow can provide support for your own copywriters, or we can use our in-house copywriters, depending on budget and other considerations. As part of this process, these items can also be addressed:

Improve/rewrite all pages in the Keyword Matrix to match assigned keywords.

Include on-page optimization (e.g. Title, description, alt attributes, keyword use, etc.)

See the "Strategy" column for more complete instructions for each page.

Improve/rewrite all remaining pages from the "Content Audit" tab listed as "Improve".

Resources, links, and post-scripts...

Example Content Auditing Dashboard

Make a copy of this Google Docs spreadsheet, which is a basic version of how we format ours at Inflow.

Content Audit Strategies for Common Scenarios

This page/tool will help you determine where to start and what to focus on for the majority of situations you'll encounter while doing comprehensive content audits.

How to Conduct a Content Audit on Your Site by Neil Patel of QuickSprout

Oh wait, I can't send everyone to a page that makes them navigate a gauntlet of pop-ups to see the content, and another one to leave. So never mind...

How to Perform a Content Audit by Kristina Kledzik of Distilled

This one focuses mostly on categorizing pages by buying cycle stage.

Expanding the Horizons of eCommerce Content Strategy by Dan Kern of Inflow

Dan wrote an epic post recently about content strategies for eCommerce businesses, which includes several good examples of content on different types of pages targeted toward various stages in the buying cycle.

Distilled's Epic Content Guide

See the section on Content Inventory and Audit.

The Content Inventory is Your Friend by Kristina Halvorson on BrainTraffic

Praise for the life-changing powers of a good content audit inventory.

How to Perform a Content Marketing Audit by Temple Stark on Vertical Measures

Temple did a good job of spelling out the "how to" in terms of a high-level overview of his process to inventory content, assess its performance and make decisions on what to do next.

Why Traditional Content Audits Aren't Enough by Ahava Leibtag on Content Marketing Institute's blog

While not a step-by-step "How To" like this post, Ahava's call for marketing analysts to approach these projects from both a quantitative (content inventory) and a qualitative (content quality audit) perspective resonated with me the first time I read it, and is partly responsible for how I've approached the process outlined above.
