What is Duplicate Content?: What you don’t know may in fact be hurting you

For many businesses, the number one headache is maintaining a regular stream of content for the site, in the form of web pages, blog posts, whitepapers, videos, data sheets, etc. SEO experts constantly extol the virtues of re-purposing content, in the hopes of easing the need for a constant flow of new ideas. With that in mind, people who re-purpose content are usually careful not to plagiarize themselves by reusing the same text across multiple assets. But did you know that duplicate content is much more than that, and can, in fact, be hurting your website?

Superficially, duplicate content is defined as content that appears on more than one web page. But this “duplication” can be caused by many factors. Let’s take a look at a few common causes:

Non-original content in two different places on the site

I know, this sounds like a no-brainer. But you would be surprised to learn that more than half of my clients face this challenge. Say you serve several distinct vertical markets, and you intend that a person who comes to your site and looks under their market will find every relevant solution page under that market category. Problem is, you provide the same exact solution for several different markets. And so, you write the content for the page once, and then copy and paste it into each vertical segment, thinking you are doing right by your website visitors. Let’s use an example:

You sell feed bags, harnesses, and brushes for animals. The animals you have products for are pigs, horses and chickens. So, you have a solution section where you list “bags”, “harnesses”, and “brushes”. You also have a customer section for “chickens”, “horses”, and “pigs”. The smart architecture says you speak in the customer pages about the products you offer (“bags”, “harnesses”, and “brushes”) and point to a common page in the solution section. But, instead, you only want the horse brush people to see a horse brush page. Same with pigs. But the content is identical, so you copy and paste the same exact information on three pages. BUZZZZ! That’s duplicate content.

If your content cannot be differentiated, the problem is not with the content, but with your site architecture. Consider re-examining your “silos” and looking at actual Google Analytics data that will dictate whether you should architect your site with a solutions focus or an industry focus. I am willing to wager that more people search on what they need than on who they are.
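Copy-and-paste duplication of this kind can be spotted with a simple comparison of page text. The following is a minimal sketch in Python, not a full crawler: it assumes you already have the body text of each page in a dictionary, and the function name and the page paths in the usage note are hypothetical. It only catches text that is identical after whitespace and letter case are normalized, not near-duplicates.

```python
import hashlib
from collections import defaultdict


def find_duplicate_pages(pages):
    """Group URLs whose body text is identical after normalization.

    `pages` maps a URL (or path) to the page's body text.
    Returns a sorted list of URL groups, each group sorted alphabetically.
    """
    groups = defaultdict(list)
    for url, text in pages.items():
        # Collapse runs of whitespace and ignore case so trivial
        # formatting differences do not hide a copy-and-paste.
        normalized = " ".join(text.lower().split())
        digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
        groups[digest].append(url)
    # Keep only texts that appear on more than one URL.
    return sorted(sorted(urls) for urls in groups.values() if len(urls) > 1)
```

Fed the brush example above (the same brush copy pasted under hypothetical /horses/brushes/, /pigs/brushes/ and /chickens/brushes/ paths), the function would return those three paths as one group, which is the signal to point them all at a single solution page instead.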

A hitch in your content management system (CMS)

Sometimes, depending on how a CMS is set up, there can be multiple URLs that go to the exact same page. Consider the example:

http://www.example.com/keyword-x/

http://www.example.com/article-category/keyword-x/

In this circumstance, the writer has done everything right, but the CMS creates this “ghost” page, which, to a search engine crawling your site, is considered duplicate content. The most common circumstance of this is when the home page (example.com) can also be found as example.com/index.

The best fix is to remove the other instances of the page and create a permanent 301 redirect from the offending page URL (/index) to the page you want to be found. You can also add a canonical link element (rel=canonical) that points to the main page from any of the offending pages.
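Both fixes come down to keeping a list of duplicate URLs and the one page each should resolve to. Here is a rough Python sketch of that idea. The mapping, the function names, and the way the result is used are illustrative assumptions, not the API of any particular CMS or web server, where the redirect would normally be set up in the server or CMS configuration.

```python
import html
from typing import Optional
from urllib.parse import urlsplit, urlunsplit

# Hypothetical map of duplicate paths to the path you want to be found.
DUPLICATE_PATHS = {
    "/index": "/",
    "/article-category/keyword-x/": "/keyword-x/",
}


def redirect_target(url: str) -> Optional[str]:
    """Return the URL to 301-redirect to, or None if no redirect is needed."""
    parts = urlsplit(url)
    preferred_path = DUPLICATE_PATHS.get(parts.path)
    if preferred_path is None:
        return None
    return urlunsplit(
        (parts.scheme, parts.netloc, preferred_path, parts.query, parts.fragment)
    )


def canonical_link_tag(url: str) -> str:
    """Build the rel=canonical element for a page served at `url`."""
    target = redirect_target(url) or url
    return '<link rel="canonical" href="{}">'.format(html.escape(target, quote=True))
```

With this map, a request for http://www.example.com/index would be answered with a 301 to http://www.example.com/, and any duplicate that has to stay online would carry a canonical link element pointing at the main page.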

Duplication in <META> tags

Your on-page content isn’t the only place where duplicate content lies. Another common area that can trigger duplicate content issues is your title tags and meta descriptions. Each page must have a unique title tag (no longer than 70 characters with spaces) and a unique meta description (around 165 characters with spaces).


You can determine duplicate, long, short and missing meta tags in Google’s Webmaster Tools. Go to the HTML improvements section under “Optimization” and you will be given a list of offending tags for each circumstance.
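If you would rather check a batch of pages yourself before publishing, the same categories can be approximated locally. The sketch below is an assumption-laden illustration in Python: it takes the 70 and 165 character limits quoted above, expects you to supply each page's title and description (it does not fetch or parse HTML), and its function name and labels are made up for the example. It does not flag short tags.

```python
from collections import defaultdict

TITLE_MAX = 70          # title limit quoted in this article
DESCRIPTION_MAX = 165   # approximate description limit quoted in this article


def audit_meta(pages):
    """Report missing, long, and duplicate titles and descriptions.

    `pages` maps a URL to a (title, description) tuple.
    Returns a sorted list of (url, problem) tuples.
    """
    problems = []
    seen = {"title": defaultdict(list), "description": defaultdict(list)}
    limits = {"title": TITLE_MAX, "description": DESCRIPTION_MAX}

    for url, (title, description) in pages.items():
        for kind, value in (("title", title), ("description", description)):
            value = (value or "").strip()
            if not value:
                problems.append((url, "missing " + kind))
                continue
            if len(value) > limits[kind]:
                problems.append((url, "long " + kind))
            # Compare case-insensitively so "Brushes" and "brushes" collide.
            seen[kind][value.lower()].append(url)

    for kind, values in seen.items():
        for urls in values.values():
            if len(urls) > 1:
                problems.extend((url, "duplicate " + kind) for url in urls)
    return sorted(problems)
```

Any URL that shows up with a "duplicate" label shares its tag with at least one other page and needs its own wording.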

Canonicalization issues

Most people cannot even pronounce this issue, yet the vast majority of sites suffer from it. Websites can be reached with or without www: emagineusa.com vs www.emagineusa.com. Trouble is, if one doesn’t redirect (here we go with the 301 redirect again) to the other, search engines think you have two exact duplicates of your site. Therefore, every page of content is duplicated.

In some scenarios, you may have purchased multiple domains, all pointing to the same content, but not redirecting to one another. In this scenario, you don’t just have two duplicate sites; you have one duplicate for every domain (and every www and non-www variant) that points to the same content. There is a clear distinction between “pointing” and “redirecting”. Make sure your hosting provider has a 301 redirect from all domains to a single domain, and from all non-www domains to the www domain (or vice versa, if you prefer).
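The difference between “pointing” and “redirecting” is easier to see in code. In the Python sketch below, every alias host answers with a 301 to one preferred host instead of serving the content itself. The host names and the function are hypothetical placeholders; in practice this rule usually lives in the web server or hosting control panel rather than in application code.

```python
from typing import Optional, Tuple
from urllib.parse import urlsplit, urlunsplit

# Assumed for illustration: the single host you want search engines to index,
# and the other hosts that currently "point" at the same content.
PREFERRED_HOST = "www.example.com"
ALIAS_HOSTS = {"example.com", "example.net", "www.example.net"}


def canonical_redirect(url: str) -> Optional[Tuple[int, str]]:
    """Return (301, new_url) for alias hosts, or None if no redirect applies."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host == PREFERRED_HOST or host not in ALIAS_HOSTS:
        return None
    # Keep the path and query so deep links land on the matching page.
    new_url = urlunsplit(
        (parts.scheme, PREFERRED_HOST, parts.path or "/", parts.query, "")
    )
    return 301, new_url
```

The path and query string are carried over on purpose: a visitor or crawler asking for a product page on an alias domain should end up on the same product page of the preferred domain, not on its home page.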

Printer-friendly and email-page versions of your page

Many sites offer a stripped-down version of each web page that is easier to print or to email to someone. Problem is, that version lives at its own URL and carries the same text as the original page. And that means duplicate content.

So, what can you do? You can create a print/email style sheet. Or, you can disallow the search engines from crawling the email and print versions in your robots.txt file.
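For the robots.txt route, it is worth confirming that the rules block what you expect before relying on them. The sketch below uses Python’s standard urllib.robotparser module to test a robots.txt against sample URLs. The /print/ and /email/ paths are assumptions made for the example; substitute whatever URL pattern your own site uses for these versions.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt, assuming print and email versions
# live under /print/ and /email/ on your site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /print/
Disallow: /email/
"""


def is_crawlable(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Return True if `user_agent` may crawl `url` under the given rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

Calling is_crawlable with the main page URL should return True, and calling it with the matching print or email URL should return False. If both come back True, the Disallow lines do not match your real URL structure.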

Bottom line: duplicate content happens, and it happens often. But with some common sense, and a little know-how, it is always fixable. Seek out the help of your webmaster, graphics person or web agency to help you identify and fix these problems. When you do, you improve the overall SEO of your site, and can increase your site’s visibility on the search engine results pages.

Have you had duplicate content issues? What steps did you take to solve the problem?
