Why Your Directory Import Keeps Creating Duplicate Listings

You ran the import. It finished. You go look at the site, and there are two of everything.

Two listings for the same business. Same title, same description, sometimes the same phone number sitting right next to a slightly different one. Now your category pages are padded with junk, your search results look broken, and somewhere an advertiser is about to email you asking why their listing shows up twice.

Here's the thing: the import didn't malfunction. It did exactly what you told it to. The problem is that it had no way to know that “Joe's Plumbing” and “Joe's Plumbing LLC” are the same business — and you never told it how to decide.

That's what's actually happening underneath almost every duplicate-import mess. The importer matches on one field — usually the listing title, or nothing at all — and treats anything that doesn't match exactly as a brand-new record. So the moment your source data is even slightly inconsistent, you don't get an update. You get a clone.

A few of the usual culprits:

No unique key. If your import isn't matching on a stable ID — something that never changes, like a business ID or a clean email — it's matching on the title. And titles are the least stable thing in any listing dataset. One trailing space, one “LLC,” one ampersand instead of “and,” and the system thinks it's new.
The spreadsheet changed shape between runs. The advertiser sent you a fresh export, and this time the column is called “Business Name” instead of “Company.” The importer can't find the field it's supposed to match on, so it matches on nothing, and every row comes in fresh.
You're re-importing the whole file instead of the changes. A lot of duplicate piles are just the same file run twice because nobody was sure the first one took.
“Update existing” was off. Plenty of directory plugins default to “create” and bury the “update if it already exists” setting three screens deep.
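
To make a couple of those culprits concrete, here is a minimal Python sketch of an import that refuses to run when the match column is missing and updates existing records instead of cloning them. Everything named here is an assumption for illustration: the `business_id` column, the `HEADER_ALIASES` table, and the in-memory `existing` dict standing in for your directory's database. Your plugin will have its own way of doing this.

```python
import csv
import io

# Hypothetical alias table: header variants that mean the same field.
HEADER_ALIASES = {"company": "name", "business name": "name"}


def canonical_header(header):
    """Lowercase and trim a header, then map known variants to one name."""
    cleaned = header.strip().lower()
    return HEADER_ALIASES.get(cleaned, cleaned)


def read_rows(csv_text, key_field="business_id"):
    """Parse CSV text; fail loudly if the match column is not present."""
    reader = csv.DictReader(io.StringIO(csv_text))
    headers = [canonical_header(h) for h in (reader.fieldnames or [])]
    if key_field not in headers:
        raise ValueError(f"match column {key_field!r} not found; aborting import")
    rows = []
    for raw in reader:
        rows.append({
            canonical_header(h): (v or "").strip()
            for h, v in raw.items()
            if h is not None  # ignore stray extra cells in a row
        })
    return rows


def upsert(existing, rows, key_field="business_id"):
    """Update a record when its key is already known; create it otherwise."""
    created = updated = 0
    for row in rows:
        key = row[key_field]
        if key in existing:
            existing[key].update(row)
            updated += 1
        else:
            existing[key] = dict(row)
            created += 1
    return created, updated
```

The design choice that matters is the early `ValueError`: a file whose key column went missing stops the import instead of quietly coming in as all-new rows.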

So what do you actually do about it?

Before the next import, decide what makes a listing the same listing. Pick the field that's truly unique and truly stable, and make the import match on that. Not the title. If your source genuinely has no stable ID, you build one — a normalized key from a couple of fields (name + address, cleaned and lowercased) so “Joe's Plumbing LLC” and “joe's plumbing” collapse to the same thing on purpose.
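
A normalized key along those lines could be sketched like this in Python. The suffix list and the exact cleaning rules are my assumptions, not a standard; tune them to your own data, because a rule like "drop `co`" can collapse names you wanted to keep apart.

```python
import re
import unicodedata

# Assumed list of legal suffixes to ignore when comparing names.
LEGAL_SUFFIXES = {"llc", "inc", "ltd", "co", "corp"}


def normalize(text):
    """Lowercase, unify '&' with 'and', drop apostrophes and punctuation."""
    text = unicodedata.normalize("NFKC", text).lower()
    text = text.replace("&", " and ")
    text = re.sub(r"['\u2019]", "", text)    # straight and curly apostrophes
    text = re.sub(r"[\W_]+", " ", text)      # any other punctuation becomes a space
    return " ".join(text.split())            # collapse runs of whitespace


def listing_key(name, address):
    """Build a match key from name + address, ignoring legal suffixes."""
    name_tokens = [t for t in normalize(name).split() if t not in LEGAL_SUFFIXES]
    return " ".join(name_tokens) + "|" + normalize(address)
```

With these rules, `listing_key("Joe's Plumbing LLC", "12 Main St.")` and `listing_key("joe's plumbing ", "12 main st")` both come out as `joes plumbing|12 main st`, so the import sees one business instead of two.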

Then clean the data before it ever touches the site, not after. Trim the whitespace, standardize the obvious variants, flag the near-matches a human should eyeball. It's tedious, and it's exactly the kind of tedious that compounds: skip it once and you're hand-deleting duplicates for an hour every cycle forever.
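
For the "flag the near-matches" step, one rough approach is to compare cleaned names pairwise and hand anything suspiciously similar to a human. This sketch uses Python's standard-library `difflib.SequenceMatcher`; the `0.85` threshold is a guess to adjust against your own data, and comparing every pair gets slow on large lists.

```python
from difflib import SequenceMatcher
from itertools import combinations


def clean(value):
    """Trim the ends and collapse internal runs of whitespace."""
    return " ".join(value.split())


def flag_near_matches(names, threshold=0.85):
    """Return (name_a, name_b, similarity) for pairs a human should review."""
    unique = sorted({clean(n) for n in names})  # exact repeats collapse here
    flagged = []
    for a, b in combinations(unique, 2):
        ratio = SequenceMatcher(None, a.lower(), b.lower()).ratio()
        if ratio >= threshold:
            flagged.append((a, b, ratio))
    return flagged
```

The function only flags. It never merges or deletes, because a near-match is a question for a person, not a decision for a script.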

And if you've already got the duplicate pile staring at you — don't bulk-delete on instinct. Some of those “duplicates” are the version with the good data, and the one you keep might be the empty one. Match them up first, decide which record wins, merge the good fields, then delete. In that order.
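
That order (match, pick a winner, merge, then delete) can be sketched in a few lines of Python. The "most filled-in record wins" rule is my assumption for illustration; you might prefer newest-wins or a manual pick. The records here are plain dicts, not any particular plugin's data model.

```python
def is_blank(value):
    """Treat None, empty strings, and whitespace-only values as missing."""
    return value is None or not str(value).strip()


def completeness(record):
    """Count how many fields actually carry data."""
    return sum(1 for v in record.values() if not is_blank(v))


def merge_group(records):
    """Pick the fullest record, fill its gaps from the others, list what to delete."""
    winner = max(records, key=completeness)
    merged = dict(winner)
    for other in records:
        if other is winner:
            continue
        for field, value in other.items():
            # Only fill holes; never overwrite data the winner already has.
            if is_blank(merged.get(field)) and not is_blank(value):
                merged[field] = value
    losers = [r for r in records if r is not winner]
    return merged, losers
```

Nothing gets deleted inside `merge_group`. It hands back the merged record and the list of losers, so the delete stays a separate, deliberate step you run after checking the result.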

This is the part of running a directory nobody warns you about. The platform sells you on “import your listings in minutes,” and the minutes are real — it's the next cycle, when the data drifts, that costs you the afternoon.

If you're staring at two of everything right now and you'd rather not spend the weekend on it, that's literally the thing I do — send me the mess and I'll send back clean, import-ready data with the edge cases flagged. But even if you never talk to me: match on a stable key, clean before you import, and you'll stop manufacturing clones.
