Credits to the Authors




Ray "Catfish" Comstock has spent over 9 years in the search engine optimization field and has successfully ranked numerous companies in the top 10 of all major search engines for ultra-competitive markets such as travel, real estate, computer software, retail, broadband Internet and various manufacturing sectors. Visit his site.

Quote:

"My other recommendation to Microsoft... buy Twitter. At least then you would own something that people actually cared about."

Duplicate Penalties and rel="canonical"

Author: Ray "Catfish" Comstock



Removing Duplicate Penalties

Upon further review of the new rel=canonical tag, which is used to let search engines know which URL you want indexed for any given page of your site, I can't find any reason why you wouldn't want to make it a standard part of every Web page on your site. This tag proactively handles a number of potential duplicate content issues.
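
To make the discussion concrete, here is a minimal sketch of the tag itself, using this blog's URL purely as an example (the full syntax is covered further down):

<link rel="canonical" href="http://www.businessol.com/seo-blog/" />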

*special note about "duplicate content penalties": Some people feel that it is very important to be clear that a duplicate content "penalty" is not really a penalty. Google is not punishing your site for having a Web page that has the same content as another Web page.

But what is happening is this: if your page is not the original version of the content, it is highly likely that your page will be put into the supplemental index, which makes it less likely to rank well for competitive terms. So in effect it's the same end result as incurring a penalty, even though technically no "penalty" has been applied. For some reason this game of semantics is very important to some members of the SEO community, so I wanted to make sure I was clear with everyone that I know the phrase "duplicate content penalty" is not entirely accurate, but it is how most people view the situation.

Having said that, here are some of the issues that the rel=canonical tag can help with:

• Capitalization: The search engines treat each unique URL string as a separate entity. So any variation in a URL, no matter how slight, creates a new URL in the eyes of the search engine. This includes differences in capitalization, although Google specifically has been much better at figuring this out on its own since the "Big Daddy" update. So for example, when an engine indexes these URLs:
http://www.businessol.com/seo-blog/ and http://www.businessol.com/Seo-Blog/, it sees two different URLs for the same content.

This happens often because Webmasters are not consistent in the way they link to third-party websites. Some Webmasters and blogs capitalize letters in URLs out of coding practice or habit.


In the old days, Google would not have been able to figure out on its own that these two URLs are the same page. Therefore one of the two URLs (most likely the one with the least amount of Page Rank) would have been put into the supplemental results, and the links that pointed to it would essentially be lost (because they now point to a page in the supplemental results that isn't going to rank for much, and they are not helping the other URL that is listed in Google). Nowadays, Google is pretty good about figuring this stuff out, although not perfect, and the other engines are not very good at this kind of thing. So by including the rel=canonical tag on every page of the site, you make it easy for all the engines to consolidate URLs that have capitalization problems.

• Dynamic URL Strings: Whether it's tracking codes like www.domain.com?tracking-code or CMS systems that generate multiple URLs for the same page, the issues with these pages are the same as the issues with capitalization. But now, instead of an elaborate 301 redirect strategy or costly adjustments to your backend system, this simple tag solves the problem.

• Other Canonical Issues: Some other issues that Google already handles fairly well, but that can still cause problems, include the www versus non-www version of the Web site (domain.com versus www.domain.com), session IDs and linking using IP addresses. This tag, if correctly implemented, should fix all of these problems.

Given the number of potential issues that this tag can correct, it should be added to every page of your site. If your site is dynamic, this should be a pretty easy addition.
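
To illustrate (a sketch using made-up example URLs), every variant of a page, whether it differs by capitalization, a tracking code or the subdomain, would carry the same tag pointing at the one URL you want indexed:

<!-- served at http://www.example.com/Seo-Blog/ (capitalization variant) -->
<!-- served at http://www.example.com/seo-blog/?trackingid=1234 (tracking-code variant) -->
<!-- served at http://example.com/seo-blog/ (non-www variant) -->
<link rel="canonical" href="http://www.example.com/seo-blog/" />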


Google SEO Tool to Stop Duplicate Content Problems!
Good news from Google, Yahoo and MSN/Live, which have announced support for the rel="canonical" tag.

This new tag has the following syntax:
<link rel="canonical" href="http://www.example.com/product.php?item=whatever" />

It should be placed in the head section of each page of the site, with the href set to the URL that should be treated as the primary URL for that page.

So if, for example, this page is linked to somewhere in the site with this URL:
http://www.example.com/product.php?item=whatever&trackingid=1234

where a tracking id has been appended, Google, Yahoo and MSN will now understand that this page is really the original:
http://www.example.com/product.php?item=whatever
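
In other words (a sketch using the example URLs above), the head section of the tracking-id version of the page would carry this tag, telling the engines which URL is the master copy:

<link rel="canonical" href="http://www.example.com/product.php?item=whatever" />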


Therefore the duplicate URL is not indexed and the link connectivity is assigned to the right page. This is a wonderful new tool that now allows sites with extensive tracking codes to use that technology without duplicate content issues. Additionally, sites that have CMS systems or dynamic Web platforms that generate multiple URL combinations for the same page can now be brought under control without a huge list of 301 redirects. Cheers to the search engines for getting this one right.

Search Engine Optimization Tips on avoiding supplemental results from Google.


1) Always have a unique Page Title and Meta Description for EVERY page of the Web site.

2) Try to limit the number of dynamic parameters in your URL strings to three or fewer. I know Google will index more than that, but Google is not the only search engine and the others are less consistent. Additionally, the longer the URL, the less likely it is that Google will always get it right.

3) ***IMPORTANT. LOL. Make sure you only have ONE distinct URL path to any page of content on your site, and that you are consistent in the way you link to that page (file name versus folder name). Don't use capital letters in your URL string. Often Webmasters will link to you with only lowercase letters because of their coding standards, and you will end up with both the URL with the capital letters and the one without indexed in the engines. They are seen as separate pages by the search engines, which can cause duplicate content issues and, at best, will split your Page Rank. (See the sketch after this list.)

4) Make sure you have your pages cross-linked in such a way that all relevant pages link to each other. Not only does this increase the usability of your site, but it increases the opportunity for deep-level pages to get Page Rank. A page that is 8 levels deep in your site with only one inbound link is not likely to be seen as being very important.

5) Make sure the content on all your pages is unique. Pages with similar or identical content, whether they appear on your own site or on a different Web site, are actively filtered by Google. Google is not going to show the same content 10 times for a specific keyword phrase. So in the past, the document Google considered to be the original was included in the main index and the copies were indexed in the supplemental results. Thus the phrase "duplicate content penalty." I am sure that a similar mechanism still exists, because nothing has changed: Google does not like showing similar results for the same query.
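
As a quick sketch of point 3 (the URLs here are made up for illustration), pick one form of each URL and link to it the same way everywhere:

<!-- link to the one lowercase URL you want indexed -->
<a href="http://www.example.com/widgets/">Widgets</a>
<!-- avoid mixing in variants like /Widgets/ or /widgets/index.php, which the engines treat as separate pages -->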

P.S.: This post is aimed at educational purposes only. Do not duplicate other authors' work without credit.