Cloaking Overview

Cloaking is the process of delivering one version of a page to one visitor and a different version to another, such as a search engine. The main benefit is code and copyright protection. Cloaking has other uses as well, such as custom language delivery, IP-based delivery for broadband users, and geotargeted advertising based on the user's geographic location.

Your browser connects to a website by sending a request. That request is a few lines of text formatted according to the HTTP protocol.

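For example, a request for the root index page of a site:

    GET /index.html HTTP/1.1
    Host: www.webmasterworld.com
    User-Agent: Mozilla/4.71 (Windows 98;US) Opera 5.50 [en]
    Referer: http://foobar.com

Together, the request line and the fields that follow it are commonly called the HTTP "headers". The visitor's IP address (123.145.125.125 in this example) is not sent as a header. The server reads it from the network connection and makes it available to scripts as REMOTE_ADDR.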
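To make the request format concrete, here is a minimal Python sketch that splits the text of such a request into its method, path, and header fields. The function name parse_request and the HTTP/1.1 version token are assumptions added for illustration. A real web server does far more validation than this.

```python
def parse_request(raw):
    """Split raw HTTP request text into (method, path, headers)."""
    lines = raw.strip().splitlines()
    # The first line is the request line: method, path, protocol version.
    method, path, _version = lines[0].split()
    headers = {}
    for line in lines[1:]:
        if not line.strip():
            break  # a blank line ends the header section
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so store them in lowercase.
        headers[name.strip().lower()] = value.strip()
    return method, path, headers


# The example request from the text, written out in wire format.
raw_request = (
    "GET /index.html HTTP/1.1\r\n"
    "Host: www.webmasterworld.com\r\n"
    "User-Agent: Mozilla/4.71 (Windows 98;US) Opera 5.50 [en]\r\n"
    "Referer: http://foobar.com\r\n"
    "\r\n"
)
method, path, headers = parse_request(raw_request)
```

After this runs, method and path hold the two pieces the server uses to find the file, and headers["user-agent"] holds the browser string that a cloaking script later inspects.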

When web server software such as Apache receives the request, it consults its configuration file to see how the request should be processed. It looks at the host name and determines that the files for www.webmasterworld.com are actually stored in /home/webmasterworld/. It then looks at the GET portion and sees that the file it needs is "index.html". So it fetches /home/webmasterworld/index.html and serves it back to the user.
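The lookup described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how Apache is implemented: the DOCUMENT_ROOTS table and the resolve_path function are hypothetical stand-ins for the server's configuration file and its internal path mapping.

```python
import posixpath

# Hypothetical host-to-directory table standing in for the server configuration.
DOCUMENT_ROOTS = {"www.webmasterworld.com": "/home/webmasterworld"}


def resolve_path(host, request_path):
    """Map a Host value and a requested path to a file path on disk."""
    root = DOCUMENT_ROOTS.get(host.lower())
    if root is None:
        return None  # no site configured for this host name
    # Normalize the path so that ".." segments cannot climb above the root.
    cleaned = posixpath.normpath("/" + request_path.lstrip("/"))
    return root + cleaned
```

For the request in the text, resolve_path("www.webmasterworld.com", "/index.html") returns "/home/webmasterworld/index.html".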

When Apache gets the index.html file from disk, it determines what type of file it is and whether it should do anything special with it. Sometimes the file contains SSI (server-side include) directives. If Apache finds SSI directives, it processes them. Other times a request may be for "index.cgi", and Apache sees that .cgi files are to be executed as scripts. Once Apache executes our script, we have control over what is sent back to the user.
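The decision about what to do with a file can be pictured as a lookup on the file extension. The sketch below is a simplification under stated assumptions: the HANDLERS table, the handler names, and the use of the .shtml extension for SSI pages are illustrative choices, and a real server takes these rules from its configuration.

```python
import posixpath

# Hypothetical extension-to-handler table; a real server reads this from its configuration.
HANDLERS = {
    ".cgi": "execute",  # run the file as a script and send back its output
    ".shtml": "ssi",    # scan the file for server-side include directives
}


def choose_handler(file_path):
    """Return the name of the handler for a file, defaulting to plain static delivery."""
    _, extension = posixpath.splitext(file_path)
    return HANDLERS.get(extension.lower(), "static")
```

The "execute" case is the one that matters for cloaking, because the script, not the server, then decides what content goes back to the visitor.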

With that as a backdrop, this is where the real cloaking process starts.

As part of the file serving process, Apache sets "environment" variables that match the HTTP header values shown above. Our script can read the user agent (HTTP_USER_AGENT under CGI) and REMOTE_ADDR, which give us the user's browser and IP address. With those two bits of information, plus a reverse DNS lookup on the address to get the hostname, we know who the visitor is. Identifying visitors by their address this way is called IP cloaking.
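As a rough sketch of what such a script sees, the Python function below reads the two values from the environment and attempts a reverse DNS lookup to get a hostname. HTTP_USER_AGENT and REMOTE_ADDR are the standard CGI variable names. The describe_visitor function and its resolve parameter are assumptions added here so the lookup can be swapped out or tested without network access.

```python
import os
import socket


def describe_visitor(environ=os.environ, resolve=socket.gethostbyaddr):
    """Collect the browser string, IP address, and hostname of the visitor."""
    user_agent = environ.get("HTTP_USER_AGENT", "")
    ip_address = environ.get("REMOTE_ADDR", "")
    hostname = None
    if ip_address:
        try:
            # Reverse DNS: ask which hostname the IP address maps back to.
            hostname = resolve(ip_address)[0]
        except OSError:
            hostname = None  # many addresses have no reverse DNS entry
    return {
        "user_agent": user_agent,
        "ip_address": ip_address,
        "hostname": hostname,
    }
```

Keep in mind that the user agent is supplied by the client and can be set to anything, while REMOTE_ADDR comes from the connection itself, which is why the address is the more dependable of the two.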

With that information, we can custom build a page for the visitor. If the visitor is a search engine, we want to give it our best, most optimized content. If it is a person, we want to give them an attractive page built for navigation and usability.
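The selection step itself can be very small. The sketch below shows one possible shape for it, with every specific value invented for illustration: the crawler token "examplebot", the address 192.0.2.10 (from a range reserved for documentation), and the two file names are all hypothetical and are not real search engine data.

```python
# Hypothetical values for illustration only; these are not real crawler data.
CRAWLER_AGENT_TOKENS = ("examplebot",)
CRAWLER_ADDRESSES = {"192.0.2.10"}


def choose_page(user_agent, ip_address):
    """Pick which version of the page to send to this visitor."""
    agent = user_agent.lower()
    known_address = ip_address in CRAWLER_ADDRESSES
    known_agent = any(token in agent for token in CRAWLER_AGENT_TOKENS)
    if known_address or known_agent:
        return "optimized.html"  # version prepared for the search engine
    return "visitor.html"  # version designed for navigation and usability
```

The same structure serves the alternative uses mentioned earlier: replace the crawler lists with a language or region lookup and the function picks a localized page instead.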

Why Cloak?

So much goes into a well-optimized page these days that it is appropriate to protect your investment. Often, when you get a page ranking highly for quality keywords, the first thing that happens is that the page gets stolen (called PageJacking). Often the page is stolen just to put up a duplicate somewhere and reduce your rankings. Yes, that is what happens with many engines. They see the duplicate, flag your page as a duplicate, and your page mysteriously disappears in the next search engine update.

On still other engines you will see your competition suddenly move up next to your high-ranking page because they copied all your best ideas. Often those ideas are as simple as keyword density or keyword location and frequency on the page - it doesn't have to be a duplicate.

And that's the short story of how and why we IP cloak using cloaking scripts.

Cloaking Related