You've got ten thousand pages. Maybe more. Every listing, every category, every city, every tag, every filtered view — the directory generates them automatically, which felt like a superpower right up until you opened Search Console and saw that Google has indexed maybe a fifth of them.
The rest sit in that purgatory of “Discovered – currently not indexed” and “Crawled – currently not indexed.” Google knows the pages exist. It has decided they're not worth keeping. And your traffic, which you assumed would climb with your page count, has flatlined.
Here's the thing nobody tells directory operators: more pages does not mean more search traffic. Past a certain size, it can mean less. Google gives every site a rough budget of attention (how much it will crawl, how much it will bother indexing), and a big directory blows through that budget fast. If most of what it finds is thin or repetitive, it doesn't just ignore the weak pages. It appears to trust the whole site a little less. You're spending your budget teaching Google that your pages mostly aren't worth it.
And directories are practically engineered to produce weak pages:
- Thin pages by the thousand. A category with one listing in it. A city page with two. An empty tag. Each one is a real URL Google has to crawl, and each one says “not much here.”
- Near-duplicate permutations. “Plumbers in Springfield,” “Plumbing in Springfield,” “Springfield plumbers”: category-times-location math, multiplied by every synonym and word order, generates pages that are 90% the same, competing with each other and diluting all of them.
- Faceted navigation gone feral. Every filter combination — category plus price plus rating plus sort order — can mint its own crawlable URL. Left unchecked, a few filters generate effectively infinite pages, and Google can spend your entire crawl budget wandering through filter combinations that should never have been indexable in the first place.
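To see how quickly filters multiply, here is a rough back-of-the-envelope sketch. The facet names and option counts are made up for illustration, and it assumes each filter is either unset or set to exactly one value, which is the simplest case. Multi-select filters would grow faster still.

```python
from math import prod


def count_filter_urls(facets: dict[str, int]) -> int:
    """Count distinct filtered views of ONE listing page.

    Each facet is either unset or set to one of its values, so a facet
    with n values contributes (n + 1) choices. Subtract 1 to exclude
    the unfiltered page itself.
    """
    return prod(n + 1 for n in facets.values()) - 1


# Hypothetical facets: option counts are illustrative, not measured.
facets = {"category": 50, "price": 5, "rating": 5, "sort": 4}

print(count_filter_urls(facets))  # 9179
```

Four modest filters produce more than nine thousand crawlable variants of a single page. Repeat that across every city page and the filter URLs can outnumber your real listings many times over.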
So the fix isn't “write more content” or “build more pages.” It's a design decision you probably never got to make on purpose: what on this site actually deserves to be in Google, and how do we make sure that's all Google spends its time on?
That means deciding which pages are your real assets (the listings worth ranking, the categories with genuine depth) and deliberately keeping Google out of the rest: the thin permutations, the filter URLs, the empty tags. The tools are noindex for pages that shouldn't appear in results, canonical tags for near-duplicates, and robots.txt rules for URL patterns Google should never crawl. They don't stack freely: Google has to crawl a page to see its noindex or canonical tag, so a URL blocked in robots.txt can't also be noindexed. It means consolidating the thin stuff instead of letting a hundred one-listing categories each beg for attention. It means templates that give your real pages a reason to exist, with actual differentiated content rather than the same boilerplate with the city name swapped. And it means internal linking that points crawl attention at the pages that matter instead of spreading it evenly across ten thousand of them.
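As a minimal sketch of what "deciding on purpose" can look like in a template layer, the function below picks a robots directive and a canonical URL for each page. Everything here is an assumption for illustration: the `Page` shape, the `MIN_LISTINGS` threshold, and the `FILTER_PARAMS` names are hypothetical and would need to match how your platform actually builds URLs.

```python
from dataclasses import dataclass
from urllib.parse import parse_qs, urlsplit

# Hypothetical settings: tune these against your own data.
MIN_LISTINGS = 5                              # below this, treat a page as thin
FILTER_PARAMS = {"price", "rating", "sort"}   # query params that only filter or reorder


@dataclass
class Page:
    url: str
    listing_count: int


def indexation_policy(page: Page) -> dict[str, str]:
    """Return the robots meta value and canonical URL for a page."""
    parts = urlsplit(page.url)
    params = parse_qs(parts.query)
    clean_url = f"{parts.scheme}://{parts.netloc}{parts.path}"

    # Filtered view: leave it crawlable, but canonicalize to the unfiltered page.
    if params.keys() & FILTER_PARAMS:
        return {"robots": "index, follow", "canonical": clean_url}

    # Thin page: keep it out of the index, but let its links be followed.
    if page.listing_count < MIN_LISTINGS:
        return {"robots": "noindex, follow", "canonical": page.url}

    # Real asset: indexable, canonical to itself.
    return {"robots": "index, follow", "canonical": page.url}


print(indexation_policy(Page("https://example.com/plumbers/springfield?sort=rating", 40)))
print(indexation_policy(Page("https://example.com/tags/left-handed-plumbers", 1)))
```

The sketch keeps noindex and a cross-page canonical on separate branches rather than combining them, since the two ask for different things. Whether filter URLs should be canonicalized, noindexed, or blocked from crawling outright depends on how many there are and how they are linked, so treat the branches as a starting point rather than a rule.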
This is a different kind of work than fixing a broken import. It's architecture — taxonomy design and indexation strategy — and it's the part of directory SEO that almost nobody does well, because it requires understanding both how the platform generates pages and how search engines decide what to trust. Get it right and the same content suddenly performs, because Google's finally spending its budget on your good pages instead of drowning in your thin ones.
If your page count keeps climbing and your traffic won't, that gap is the tell, and sorting out what should and shouldn't be indexable is one of the highest-return things you can do to a large directory. It's also most of what a real directory SEO audit is for — mapping the page explosion and deciding, deliberately, what Google gets to see.
