The Content Audit Nobody Does Before a CMS Migration (And Why It Costs You Months)

March 24, 2026 6 min read

A company decides to migrate their website to a new CMS. They pick the platform. They hire a team or agency. They set a launch date. And then, somewhere around week three or four, everything starts falling apart.

Pages are missing. Metadata is incomplete. Nobody knows which blog posts are still relevant and which ones should have been retired two years ago. The dev team is migrating content that nobody even wants anymore. Redirects are a mess. And the timeline? It just doubled.

The root cause is almost always the same. They skipped the content audit.

What a content audit actually means in a migration context

Most people hear "content audit" and think of a spreadsheet with URLs and traffic numbers. That is part of it, sure. But in a migration context, a content audit is something much more specific.

It is the process of understanding every single piece of content on your current site: what it is, what it does, whether it still matters, and how it maps to the new system you are moving to.

This includes pages, blog posts, landing pages, images, PDFs, embedded videos, forms, metadata, redirects, authors, tags, categories, HubDB tables, and anything else your CMS is storing. If it lives in your CMS, it needs to be accounted for before you move it anywhere.

The reason this matters so much during a migration is simple. You are not just copying files from one folder to another. You are moving structured content from one architecture to a completely different one. If you do not know exactly what you have, you cannot plan how to move it.

What happens when you skip it

I will give you a real scenario. A company with around 800 blog posts decides to move from a traditional CMS to a headless platform. They assume the migration will take six weeks. The team starts building content types and templates on the new platform while another team begins exporting content from the old one.

Two weeks in, they realize that 200 of those blog posts have embedded custom modules that do not have equivalents in the new CMS. Another 150 have broken internal links pointing to pages that were deleted years ago. The author field references people who no longer work at the company. Alt text is missing on about 60 percent of images. And metadata is inconsistent across the entire site, with some posts having custom meta descriptions and others pulling auto-generated ones.

Now the team has to stop, go back, and figure all of this out while the migration is already in progress. That is where timelines blow up. Not because the migration itself is hard, but because the preparation was skipped.
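Two of the problems in that scenario, missing alt text and broken internal links, are the kind a short script can surface before anyone starts building. The following is a minimal sketch, not a finished tool. It assumes you already have each post's HTML body as a string and a set of URL paths known to be live; the names `PostScanner`, `scan_post`, `site_host`, and `live_paths` are made up for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class PostScanner(HTMLParser):
    """Collects images without alt text and internal link paths from one post body."""

    def __init__(self, site_host):
        super().__init__()
        self.site_host = site_host
        self.images_missing_alt = []
        self.internal_paths = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "img":
            # Treat an absent, empty, or whitespace-only alt attribute as missing.
            if not (attr.get("alt") or "").strip():
                self.images_missing_alt.append(attr.get("src") or "")
        elif tag == "a" and attr.get("href"):
            parsed = urlparse(attr["href"])
            is_web_link = parsed.scheme in ("", "http", "https")
            # Relative links and links to our own host count as internal.
            is_internal = parsed.netloc in ("", self.site_host)
            if is_web_link and is_internal and parsed.path:
                self.internal_paths.append(parsed.path)


def scan_post(html, site_host, live_paths):
    """Return the audit findings for a single post body."""
    scanner = PostScanner(site_host)
    scanner.feed(html)
    broken = [p for p in scanner.internal_paths if p not in live_paths]
    return {"missing_alt": scanner.images_missing_alt, "broken_links": broken}
```

Run across all 800 posts, a check like this turns "we found problems in week two" into a count you have before the project plan is written.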

The three things a content audit answers before migration

A proper pre-migration content audit answers three core questions.

First, what do you actually have? This means building a complete inventory of every content asset on your site. Not just pages and posts, but every content type, every field, every relationship between content items. If your CMS has 50 content types with 15 fields each, that is 750 data points you need to understand before you start mapping anything.

Second, what is worth keeping? Not everything deserves to be migrated. Low-traffic pages with outdated information, duplicate content, test pages that were never unpublished, and blog posts from 2018 that reference discontinued products can all be retired. Moving them to your new CMS just adds noise and complexity. This is the step where you sort content into three buckets: migrate as is, migrate with updates, or retire.
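The three buckets can be written down as an explicit rule so that the first pass over the inventory is consistent. This is only a sketch under assumed inputs: the `Page` fields and the thresholds (`min_views`, `stale_years`) are hypothetical, and the right cutoffs depend on your site. A human should still review anything the rule marks for retirement.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Page:
    url: str
    yearly_views: int
    last_updated: date
    is_duplicate: bool = False
    needs_edits: bool = False


def bucket(page, today, min_views=50, stale_years=3):
    """Assign a page to one of the three migration buckets."""
    age_years = (today - page.last_updated).days / 365.25
    stale = age_years > stale_years
    # Duplicates, and pages that are both stale and unread, do not make the trip.
    if page.is_duplicate or (stale and page.yearly_views < min_views):
        return "retire"
    # Stale pages that still get traffic are worth refreshing on the way over.
    if page.needs_edits or stale:
        return "migrate_with_updates"
    return "migrate_as_is"
```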

Third, how does it map to the new system? This is the technical part. Your current CMS organizes content one way. Your new CMS organizes it differently. Fields might have different names, different types, or different relationships. A "module" in HubSpot is not the same thing as a "content type" in a headless CMS. A "global group" is not the same as a "global field." You need a mapping document that connects every piece of your current architecture to its equivalent in the new system. Without this, your dev team is guessing.

The content model mapping problem

This is where most guides stop. They tell you to "do a content audit" and move on. But the hardest part of a migration audit is not counting your pages. It is understanding your content model.

Your content model is the underlying structure that defines how your CMS stores and organizes content. In a traditional CMS like HubSpot, this might include modules, module groups, templates, theme settings, HubDB tables, and various field types. In a headless CMS like Contentstack, Contentful, or Sanity, the equivalent structures are content types, global fields, modular blocks, taxonomies, and references.

The problem is that these structures do not map one to one. You might have five different accordion modules in your current CMS that all do slightly different things. In your new CMS, you probably want one standardized accordion component. That decision, consolidating five into one, needs to happen during the audit phase. Not during development.
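One way to make the consolidation decision concrete is to record it as a lookup table and total up how many pages each target component will absorb. The module names below are invented for illustration, and the sketch assumes you already have a count of pages per legacy module from your export.

```python
from collections import Counter

# Hypothetical legacy module names, all folded into one standardized component.
CONSOLIDATION = {
    "accordion_v1": "accordion",
    "accordion_v2": "accordion",
    "accordion_faq": "accordion",
    "accordion_dark": "accordion",
    "faq_toggle": "accordion",
}


def consolidate(module_usage, plan=CONSOLIDATION):
    """Sum page counts per target component and list modules with no decision yet."""
    totals = Counter()
    undecided = []
    for module, page_count in module_usage.items():
        target = plan.get(module)
        if target is None:
            undecided.append(module)
        else:
            totals[target] += page_count
    return dict(totals), sorted(undecided)
```

The undecided list is the useful output here: every name on it is a decision that still has to be made before development starts.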

Similarly, you might discover that your current CMS stores author information as a simple text field on each blog post. But your new CMS expects author to be a separate content type with its own fields, linked by reference. If you do not catch this during the audit, your migration script will either break or produce entries with missing author data.
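As a rough sketch of that author transformation, the function below turns free-text author names into a deduplicated set of author entries plus a reference on each post. It assumes posts arrive as dictionaries with an `author` text field; the `uid` and `display_name` keys are placeholders, not the schema of any particular CMS.

```python
import re


def slugify(name):
    """Lowercase a name and collapse anything non-alphanumeric into hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", name.strip().lower()).strip("-")


def extract_authors(posts):
    """Split free-text authors into author entries and per-post references."""
    authors = {}
    linked_posts = []
    for post in posts:
        raw = (post.get("author") or "").strip()
        uid = slugify(raw) if raw else None
        if uid and uid not in authors:
            # The first spelling seen becomes the display name.
            authors[uid] = {"uid": uid, "display_name": raw}
        # A None reference flags the post for manual review instead of failing silently.
        linked_posts.append({**post, "author": uid})
    return authors, linked_posts
```

Running this during the audit also exposes the data quality problem early: posts with no author, and the same person spelled three different ways.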

These are the kinds of issues that add weeks to a project. And they are entirely preventable with a thorough content audit.

How to actually do it

Here is a practical approach that works.

Start with a crawl to get a full picture of every URL on your site. Export this into a spreadsheet. This gives you your baseline inventory.
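If you do not have a crawler handy, an XML sitemap is a reasonable first pass at the baseline. Note that this is a substitute for a crawl, not the same thing: a sitemap only lists what the site chooses to advertise, so orphaned pages will be missing. The sketch below assumes a standard sitemap using the sitemaps.org namespace and adds two empty columns, `decision` and `notes`, which are our own convention for the later audit steps.

```python
import csv
import xml.etree.ElementTree as ET

SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_to_rows(xml_text):
    """Extract one row per <url> entry from sitemap XML."""
    root = ET.fromstring(xml_text)
    rows = []
    for url in root.findall("sm:url", SITEMAP_NS):
        loc = url.findtext("sm:loc", default="", namespaces=SITEMAP_NS).strip()
        lastmod = url.findtext("sm:lastmod", default="", namespaces=SITEMAP_NS).strip()
        if loc:
            rows.append({"url": loc, "lastmod": lastmod})
    return rows


def write_inventory(rows, out):
    """Write the baseline inventory as CSV to any writable text file object."""
    writer = csv.DictWriter(out, fieldnames=["url", "lastmod", "decision", "notes"])
    writer.writeheader()
    for row in rows:
        writer.writerow({**row, "decision": "", "notes": ""})
```

To produce a file, open it with `open("inventory.csv", "w", newline="")` and pass the handle as `out`.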

Next, go deeper into your CMS. Export your content types, fields, and relationships. If you are on HubSpot, this means looking at every template, every module, every HubDB table, and understanding how they connect. A bulk export tool can save you significant time here, especially if you are dealing with hundreds or thousands of pages. Being able to pull all your metadata, authors, tags, and content fields into a single spreadsheet view makes the audit dramatically faster than clicking through each item individually.

Then build your mapping document. For every content type and field in your current CMS, define where it goes in the new system. Be specific. Do not just write "maps to blog post content type." Write "HubSpot blog post title field maps to Contentstack Blog Entry title field (single line text, required, unique)." That level of detail saves your dev team from making assumptions.
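A mapping document does not have to live only in a spreadsheet. Kept as data, it can be checked automatically for gaps. The sketch below models the title mapping described above; the type and field identifiers (`blog_post`, `blog_entry`, `single_line_text`, and so on) are illustrative placeholders, not the real API names of either platform.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldMapping:
    source_type: str
    source_field: str
    target_type: str
    target_field: str
    target_kind: str
    required: bool = False
    unique: bool = False


# Placeholder identifiers; replace with the names from your own export.
MAPPINGS = [
    FieldMapping("blog_post", "title", "blog_entry", "title",
                 "single_line_text", required=True, unique=True),
    FieldMapping("blog_post", "body", "blog_entry", "body",
                 "rich_text", required=True),
]


def unmapped_fields(source_schema, mappings):
    """List every (content type, field) in the source that has no mapping yet."""
    mapped = {(m.source_type, m.source_field) for m in mappings}
    return sorted(
        (ctype, field)
        for ctype, fields in source_schema.items()
        for field in fields
        if (ctype, field) not in mapped
    )
```

When `unmapped_fields` returns an empty list, every field in the inventory has a destination, which is a reasonable definition of "the mapping is done."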

Flag your exceptions. Every site has content that does not fit neatly into the standard migration path. Custom modules, one-off landing pages, embedded third-party widgets, legacy URL structures. Document these separately. They need individual attention during migration, and knowing about them upfront lets you plan for them.

Finally, make your retire decisions early. Go through your inventory and tag everything that should not make the trip. Low-traffic, outdated, duplicate, broken, or irrelevant content should be flagged for retirement before the migration starts. This reduces the total volume of content your team needs to handle, which directly impacts timeline and cost.

The 90/10 rule

In our experience, about 90 percent of content on any given site follows predictable patterns. Standard blog posts with standard fields. Regular pages with consistent templates. These can be migrated programmatically once the mapping is solid.

The remaining 10 percent is where the complexity lives. Custom layouts, one-off modules, edge cases in how fields were used, content that was entered inconsistently over the years. This 10 percent is what blows up timelines if you do not identify it early.

A good content audit separates the 90 from the 10. It lets your team automate what can be automated and plan carefully for what cannot.
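One simple way to draw that line is to check each entry against the set of modules your mapping already covers. This is a sketch under assumptions: entries are dictionaries with a `url` and a list of `modules`, and the module names in `STANDARD_MODULES` are hypothetical.

```python
# Hypothetical names for modules the migration script already knows how to handle.
STANDARD_MODULES = {"rich_text", "image", "cta"}


def split_standard_from_exceptions(entries, standard_modules=STANDARD_MODULES):
    """Separate entries that can be migrated by script from those needing manual planning."""
    standard, exceptions = [], []
    for entry in entries:
        unknown = sorted(set(entry["modules"]) - standard_modules)
        if unknown:
            exceptions.append({"url": entry["url"], "unknown_modules": unknown})
        else:
            standard.append(entry["url"])
    return standard, exceptions
```

The split will not be exactly 90/10 on every site, but the exceptions list tells you how big your "10" really is before the timeline is set.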

The audit is not overhead. It is the project.

There is a mindset shift that needs to happen in how teams approach CMS migrations. The audit is not a preliminary step you rush through to get to the "real work." The audit is the foundation of the entire project.

Every decision your team makes during migration flows from the audit: which content types to build, how to structure fields, what transformation rules to apply, and how to handle edge cases. If the audit is incomplete or inaccurate, every downstream decision is built on shaky ground.

The teams that migrate smoothly are not the ones with the best developers or the most expensive tools. They are the ones that took the time to understand exactly what they were working with before they wrote a single line of migration code.

If you are planning a CMS migration, start with the audit. Invest the time upfront. Document everything. Map your content model properly. Separate the 90 from the 10. And then, when you actually start migrating, you will know exactly what you are dealing with.

That is how you hit your timeline. That is how you avoid the surprise discoveries in week four. And that is how you end up with a clean, well-structured site on your new platform instead of a messy copy of your old one.


Smuves helps content teams audit and manage HubSpot CMS content at scale, making pre-migration content audits faster and more reliable. If you are planning a migration and need to get a clear picture of your content before you move, start with a bulk export of your site and see what you are actually working with.