Leave No Content Behind Part 2: Conducting a Content Inventory

In Part 1 of this series, we established that content migration is its own stand-alone project, and usually a big one. We outlined a solid migration strategy for your content migration project team using the 3/7 method. And we showed you how to identify the four pieces of your content puzzle.

Now that you understand what content you are looking for, you need to decide how you want to create your master content inventory list. How you approach this tactically is really up to you. There are three basic methods for conducting a good content inventory, each with its own pros and cons. When deciding which one is best for your organization, you need to consider the hygiene of your current repositories, your confidence in the accuracy of your database layers, and how much human and technical bandwidth is at your disposal.

Option 1: The UX Approach

With this option, you will use scraping tools to crawl current sites and create an inventory list by page URL.
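If you don't already have a scraping tool in place, the idea can be sketched in a few lines of Python. This is a minimal illustration, not a production crawler: it assumes you supply your own `fetch` function (hypothetical) that returns a page's HTML for a given URL, it stays on the starting host, and it records only the URL and page title for each row.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class _PageParser(HTMLParser):
    """Collects the page title and every link target found in one HTML page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.links = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data.strip()


def crawl_inventory(start_url, fetch, max_pages=500):
    """Breadth-first crawl that stays on the starting host.

    `fetch` is a caller-supplied function (an assumption of this sketch)
    that takes a URL and returns the page HTML as a string, or None if
    the page cannot be retrieved. Returns one {"url", "title"} row per page.
    """
    host = urlparse(start_url).netloc
    queue = deque([start_url])
    seen = {start_url}
    rows = []
    while queue and len(rows) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        parser = _PageParser()
        parser.feed(html)
        rows.append({"url": url, "title": parser.title})
        for href in parser.links:
            # Resolve relative links and drop #fragments before de-duplicating
            target = urljoin(url, href).split("#")[0]
            if urlparse(target).netloc == host and target not in seen:
                seen.add(target)
                queue.append(target)
    return rows
```

In practice you would likely add more columns per row (template name, last-modified date, word count), but even a bare URL-and-title list gives you a starting inventory to review.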

This is a good approach when:

  • Your sites are based on highly standardized page templates
  • Your content is mostly lengthy knowledge-based copy
  • You are unsure of asset sources or suspect you have asset duplication within or across multiple repositories
  • You want to focus only on high-priority content that is being shown to users today (Note: You may be able to DIFF this list against your repository lists later to find your lower-priority and legacy content.)

This approach won't: 

  • Provide a componentized inventory of what is on each page; it will help, but you will still need to conduct an object-based inventory to align to your new page templates
  • Tell you where the copy objects shown to viewers are actually stored
  • Capture legacy, seasonal, social media, 3rd party-owned, or syndicated content
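The DIFF mentioned in the note above can be as simple as a set comparison. The sketch below assumes you have already normalized both lists to a shared identifier, such as an asset file name or object ID; that normalization step is the hard part and is not shown. The function name and the result labels are illustrative only.

```python
def diff_inventories(crawled_ids, repository_ids):
    """Compare identifiers found by the crawl with identifiers in the repositories.

    Both inputs are assumed to use the same identifier scheme
    (for example, asset file names).
    """
    live = set(crawled_ids)
    stored = set(repository_ids)
    return {
        # Shown to users today and found in a repository
        "live_and_stored": sorted(live & stored),
        # In a repository but not on any crawled page: candidate legacy or lower-priority content
        "stored_not_live": sorted(stored - live),
        # On a page but not in any known repository: source needs investigating
        "live_not_stored": sorted(live - stored),
    }
```

The "live_not_stored" bucket is worth a close look, since it often points to a repository nobody told you about.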

Option 2: The Repository Approach

Depending on your systems, it may be more practical to extract a list of all objects in your current repositories. This is the most comprehensive approach of the three as it surfaces every asset or piece of copy you’ve ever created.

This is a good approach when:

  • You know all possible content repositories across your organization (no original content is living on someone’s local drive)
  • You are confident in the integrity of the content in the repositories (no manipulation or “fixing” was ever done to the content at the presentation layer)
  • You need to find priority content not currently visible to users due to seasonality or product availability
  • You are conducting a DAM migration with minimal copy assets

This approach won’t:

  • Tell you if or where the content object is being shown to users (For this reason, we suggest you still provide your QA and UAT teams a URL list that cross-references objects to target pages)
  • Tell you if copy was changed or ‘fixed’ by an author in the final CMS or syndication systems
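One way to build the cross-reference list for your QA and UAT teams is to invert a page-to-objects mapping. This sketch assumes you already have such a mapping from somewhere (a crawl, a CMS export, or a manual audit); the `page_objects` structure and the function name are hypothetical.

```python
from collections import defaultdict


def cross_reference(repository_ids, page_objects):
    """Map each repository object to the page URLs that display it.

    `page_objects` is assumed to look like {url: [object ids on that page]}.
    Objects that appear on no page map to an empty list.
    """
    pages_by_object = defaultdict(set)
    for url, object_ids in page_objects.items():
        for object_id in object_ids:
            pages_by_object[object_id].add(url)
    return {
        object_id: sorted(pages_by_object.get(object_id, set()))
        for object_id in repository_ids
    }
```

Objects that come back with an empty list are not necessarily disposable. They may be the seasonal or out-of-stock content this approach is meant to surface.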

Option 3: The Expectations Approach

This approach starts by making no assumptions. Instead, you are going to rely on SMEs to tell you what pages, assets and copy they expect to be on the sites in their region. These local experts will provide you an inventory list which you will analyze and align in order to create inheritance hierarchies in your new system. After alignment, you will ask them to provide the assets or copy required to populate the component, page, and site templates in your new system.

This is a good approach when:

  • You have multiple global sites where content is managed wholly or in significant part by regional web editors
  • You want to divide your inventory and analysis workload across a broader team
  • You have low confidence in the integrity of content in your repositories or are unsure who owns repositories in each region
  • Translation quality has been a problem
  • There are significant discrepancies between corporate e-Commerce sites and 3rd party seller sites
  • The local marketing team leverages content channels that corporate does not (they do more promotion on local social media or blogs than website maintenance)

Like the other two approaches, this one won’t show you where content is displayed or where it is stored unless you ask. When reaching out to SMEs, be sure to include a list or table asking them to provide everything from page URL to object repository to region-specific copyright information. When working with global markets, I like to send a list of open-ended questions about their content creation processes, system tools, marketing channels, and regional content laws before I request their actual inventory lists.
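If you are sending the same request to many regions, it can help to generate the table rather than build it by hand. The sketch below writes a CSV template with one starter row per region. Only the page URL, object repository, and copyright columns come from the list above; the `region` and `content_owner` columns and all of the column names are assumptions to adapt to your own project.

```python
import csv
import io

# Column names are illustrative; adjust them to what you need from each SME.
COLUMNS = ["region", "page_url", "object_repository", "copyright_info", "content_owner"]


def build_sme_template(regions):
    """Return CSV text with a header row and one empty starter row per region."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(COLUMNS)
    for region in regions:
        # Pre-fill the region and leave the remaining cells for the SME to complete
        writer.writerow([region] + [""] * (len(COLUMNS) - 1))
    return buffer.getvalue()
```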

In the end, it’s not unusual to use more than one approach outlined above. Maybe you take a UX approach for your navigation and footer pages but rely on local expectations for your subdomain and subdirectory pages. Be flexible but persistent in your pursuit of your content inventory master list.

Leave No Content Behind

No matter which approach you take to arrive at your master inventory list, make sure you do a final check that you’ve included:

  • All P1, P2 and legacy content
  • All social media images and videos
  • All digital and print artwork if those assets are in scope
  • All agency-created content like microsite copy or splash page assets
  • All content pushed out via syndication
  • Any unique copy or assets used exclusively by partners
  • Any support documents
  • All mobile SDK strings including error message copy

Ask yourself: do I have all objects, from all sources, used in all channels, in all markets?

Now that you are fairly sure you’ve left nothing behind, there is one last content type I encourage you to consider including in your migration project - contextual content. My colleague John Kottcamp writes about the value of this unseen content in his paper Digital Transformation in an Ever-Changing World.

In summary, it’s content that goes unseen but will be key to delivering quality customer experiences from your new systems. Think of it as “ride-along assets”: small strings of important data that are tightly coupled with your visible content. These contextual snippets are key to making your content searchable and relevant and include things like page descriptions and keywords; image alt text; subtitle files; DRM and IPTC data; and semantic tools like ontologies, vocabularies, glossaries, and structured content schemas. If you haven’t leveraged contextual content before or are unsure if you are doing it well, your content migration project provides a unique opportunity to add this ride-along data to your assets in bulk during the content enrichment stage.
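A quick way to see how much contextual content you are missing is to scan your inventory for empty metadata fields. The sketch below assumes each inventory row is a dictionary with a `type` and an `id`; the field names and the required-fields table are hypothetical and should be replaced with whatever your own schema uses.

```python
# Which contextual fields each content type is expected to carry (illustrative only).
REQUIRED_CONTEXT = {
    "page": ["description", "keywords"],
    "image": ["alt_text"],
    "video": ["subtitle_file"],
}


def missing_context(records):
    """Return one entry per record that lacks any of its expected contextual fields."""
    gaps = []
    for record in records:
        required = REQUIRED_CONTEXT.get(record.get("type"), [])
        # A field counts as missing if it is absent, None, or blank
        missing = [
            field for field in required
            if not str(record.get(field) or "").strip()
        ]
        if missing:
            gaps.append({"id": record.get("id"), "missing": missing})
    return gaps
```

The resulting gap list can feed directly into your content enrichment backlog.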

Marli Larimer

Senior Content Strategist
