Cloaking is the technique of returning different pages according to who or what is requesting them; e.g. a surfer would receive the actual page whereas a search engine spider would receive a different page, but would assume that it is the actual page that surfers see. Here's a look at some of the details:
In search engine optimization, cloaking serves two purposes: to hide highly optimized pages from people, so that they can't be stolen, and to give search engine spiders highly optimized pages that wouldn't look particularly good in a browser.
There are three ways of cloaking. One is "IP delivery", where the IP addresses of spiders are recognized at the server and handled accordingly; another is "User-Agent delivery", where the spiders' User-Agents are recognized; and the third is a combination of the two.
Is cloaking ethical?
Let me put it this way: search engines do it all the time. For instance, Google delivers different pages according to where in the world the surfer is located. People in the UK receive AdWords advertisements that are relevant to the UK, and people in the U.S. receive AdWords that are relevant to the U.S. Google delivers different pages according to the surfer's IP address. That's IP delivery, and that's cloaking.
Also, from time to time, search engines prevent certain people from gaining access to their .com versions. Instead, by checking the surfers' IP addresses, they redirect people to their localized versions - even when the surfers really do want to go to the .com version! Again, that's IP delivery, and that's cloaking.
Because the search engines do it, it is clear that cloaking isn't intrinsically unethical or wrong. If it's OK for the search engines to do it, then it must be OK for everyone else. So what is it about cloaking that some people are so dead set against?
There is no sensible answer to that. The general idea is that serving people one page and serving the search engines a different page is simply wrong, because the engines are ranking the page according to what they believe it to be and not according to what it actually is. That idea is purely a matter of principle, and has nothing at all to do with common sense.
The common-sense view is that, if a page is ranked correctly according to its topic, then surfers will find it in the search results, click on it, and land exactly where the listing led them to expect. It doesn't matter how the page came to be ranked in that position, and it doesn't matter if another page took its place when the engine was evaluating it. As long as the page is ranked correctly for its topic, surfers are perfectly happy.
The fact that cloaking can be used to send people to sites and topics that they did not expect when they clicked on a listing in the search results is an excellent reason to be against the misuse of cloaking, but it is no reason at all to be against the technique in general.
How Cloaking Actually Helps Google
Google's crawlers won't spider pages that have anything that looks like session IDs in their URLs. If they did, they would run the risk of spidering a potentially infinite number of pages: each page that is requested would contain links to other pages, the link URLs would contain the current session ID, and that would make them different from the URLs seen the last time the page was requested. And so it would go on and on, producing a vast number of unique URLs to spider and index.
It means that Google won't spider most of the pages on some websites. But Google actually wants to spider most or all of each website's pages. The solution is to cloak the pages. By spotting page requests from the Google spiders and delivering modified pages without the usual session IDs in the link URLs, a site lets Google spider all of its pages. That is precisely what Google wants, what the website owner wants and, if asked, what surfers would want. It helps everybody and harms no-one.
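As a rough sketch of how that can be arranged, assume the site runs on Apache (using the .htaccess approach described in the "How is cloaking done?" section below) and keeps a spider-friendly copy of each page whose internal links carry no session IDs. The file names and the single Googlebot check are illustrative assumptions, not a complete recipe:

    RewriteEngine On
    # Requests identifying themselves as Googlebot get the spider copy,
    # whose internal links contain no session IDs.
    RewriteCond %{HTTP_USER_AGENT} Googlebot [NC]
    RewriteRule ^catalogue\.php$ /spider/catalogue.html [L]
    # Every other request falls through and receives the normal,
    # session-tracked catalogue.php page.

Spiders see stable, session-free URLs that they are happy to crawl, while ordinary visitors keep their session tracking.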
This example alone demonstrates that cloaking is not intrinsically wrong or unethical. The technique can be used unethically, but it has various perfectly ethical uses.
How is cloaking done?
A simple way of doing it on an Apache server is to use an .htaccess file with the mod_rewrite module. With an .htaccess file in place, every file request to the site is subject to it.
The .htaccess file has many uses but, for cloaking purposes, it employs Apache's mod_rewrite module to check for the search engine spiders' IP addresses, User-Agents, or both. If a spider is detected, then mod_rewrite is used to return a page that has been specially designed for spiders. If the requester is not a spider, then the request goes through as normal and the normal page is returned. Spiders are not aware of the switch. As far as they are concerned, they are getting the page that they requested.
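A minimal sketch of such an .htaccess file might look like the following. The spider name, the IP address range and the file names are assumptions made for illustration; a real setup would check a much longer list of spiders and addresses:

    RewriteEngine On
    # User-Agent delivery: the requester says it is Googlebot...
    RewriteCond %{HTTP_USER_AGENT} Googlebot [NC,OR]
    # ...or IP delivery: the request comes from an address range used by a spider.
    RewriteCond %{REMOTE_ADDR} ^66\.249\.
    # Either way, quietly serve the page that was written for spiders.
    RewriteRule ^index\.html$ /spider/index.html [L]
    # Requests matching neither condition fall through to the normal index.html.

With the [OR] flag the rule fires when either check matches; remove it and both the User-Agent and the IP address have to match, which is the combined method mentioned earlier.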
Fantomas is widely recognized as the best cloaking software around.
Stay tuned for our next installment, "auto-redirection"...