This short tutorial covers the cloaking of web page META tags, which follows a different procedure from the IP delivery and full-page cloaking methods commonly employed for high-grade web page and search engine stealthing.
Server Requirements
To take advantage of this procedure, you must be able to use Server Side Includes (SSI) on your web server. Note for IIS/4.0 users: the code presented here uses an extended SSI expression, which is not supported under IIS/4.0.
META tag cloaking is effected by excluding browsers from receiving certain parts of a web page, specifically the header, where META tags are normally placed. Browsers are identified by their UserAgent variable. Once the page is properly cloaked, it makes no difference whether you read the source code online or download it for viewing offline: the server never sends the META tag code to the browser, so it remains hidden in both cases. Here is a list of UserAgent strings used by popular browsers:
- "Lynx": Lynx text browser
- "Mozilla": Netscape browsers
- "MSIE": Microsoft Internet Explorer
- "NCSA Mosaic": Mosaic technology based browsers like Spry, Spyglass, etc.
- "Opera": Opera browser
- "WebTV": WebTV's proprietary browser
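The matching step can be tried out offline before touching the server. The following is a minimal Python sketch, not part of the SSI setup itself: it assumes the same alternation of UserAgent tokens listed above, and the sample UserAgent strings (including the made-up "ExampleSpider/1.0") are illustrative assumptions only.

```python
import re

# Same alternation of tokens as in the list above; matching is case-sensitive.
BROWSER_PATTERN = re.compile(r"Mozilla|MSIE|Opera|Lynx|WebTV|NCSA Mosaic")


def is_listed_browser(user_agent: str) -> bool:
    """Return True if the UserAgent string contains any of the listed tokens."""
    return BROWSER_PATTERN.search(user_agent) is not None


# Sample strings are illustrative assumptions, not an authoritative list.
samples = [
    "Mozilla/4.0 (compatible; MSIE 5.0; Windows 98)",
    "Lynx/2.8.3rel.1 libwww-FM/2.14",
    "ExampleSpider/1.0",
]

for ua in samples:
    verdict = "listed browser" if is_listed_browser(ua) else "not listed"
    print(f"{ua} -> {verdict}")
```

A UserAgent counts as a browser as soon as any one token appears anywhere in the string, which is why a substring search is used rather than an exact comparison.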
Activating SSI via .htaccess
If your web server is not configured for SSI by default, you will need to upload a file named ".htaccess" (please note the period/dot at the beginning of the file name!) to your server directory. This can be done by Telnet or FTP. The .htaccess file should have the following content:
Options +Includes +ExecCGI
AddType text/x-server-parsed-html .html
Note that many web servers do not require the "Includes" specification, so you could omit it altogether. However, since keeping it in your file does no harm, we suggest you leave the above entry unchanged. That way, should you switch servers some day, you will not have to readjust your .htaccess file. After you have uploaded the modified .htaccess file (it MUST be transferred in ASCII mode!), you're ready to go.
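If you prefer to script the upload rather than use an FTP client, the ASCII-mode transfer can be sketched in Python with the standard ftplib module. This is only a sketch under assumptions: the function names upload_ascii and upload_htaccess are made up for this example, and the host, login and directory shown in the final comment are placeholders you would replace with your own.

```python
from ftplib import FTP


def upload_ascii(ftp, fileobj, remote_name=".htaccess"):
    """Send fileobj to the server in ASCII (text) mode under remote_name."""
    # storlines() transfers the data line by line in ASCII mode.
    # The file object must have been opened in binary mode.
    return ftp.storlines(f"STOR {remote_name}", fileobj)


def upload_htaccess(host, user, password, remote_dir, local_path=".htaccess"):
    """Log in, change to remote_dir and upload local_path in ASCII mode."""
    with FTP(host) as ftp:
        ftp.login(user, password)
        ftp.cwd(remote_dir)
        with open(local_path, "rb") as fh:
            return upload_ascii(ftp, fh)


# Example call with placeholder values (not real credentials):
# upload_htaccess("ftp.example.com", "username", "password", "/public_html")
```

Nothing is sent until upload_htaccess is actually called, so the block can be pasted into a script and adapted safely.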
In the HEAD section of the web page whose META tags you wish to protect, place the following:
<!--# if expr="\"$HTTP_USER_AGENT\" != /Mozilla|MSIE|Opera|Lynx|WebTV|NCSA Mosaic/" -->
VERY IMPORTANT: The directive above must be on one SINGLE line! Page formatting tends to wrap lines that are too long for display, but make no mistake: the code MUST be free of line breaks, or it won't work! Below this opening entry, you may now add the actual META tags you wish to protect. When you are done, you must close the protected section with the following closing entry, or the rest of your page won't be displayed either!
<!--# endif -->
VERY IMPORTANT: If you have other entries in your page header (e.g. for an external CSS style sheet, an external JavaScript file, etc.), you MUST place these OUTSIDE the protected area (but WITHIN the header tags). Otherwise they will only work in browsers whose UserAgent is not included in the code above.
So What Does It Do?
The SSI code outlined above identifies the accessing browser by its UserAgent variable. If the UserAgent is recognized, the server skips the content between the exclusion tags, so the META tags are never sent to the browser. Search engine spiders that do not use common browser UserAgent variables (most don't) will still get to read the META tags, which is, of course, what you want them to do.
The method outlined above may well qualify as "poor man's cloaking". It is NOT an industrial-strength protection against code snoops, all the more so as UserAgents can easily be forged ("spoofed"), but it will cover about 95% of all ordinary browsers and their users without putting undue strain on the server and, hence, system performance. Bear in mind, too, that META tags are gradually losing importance, as many search engines have stopped indexing them because of past massive abuse through keyword spamming ("spamdexing") and irrelevant description tags.
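The behaviour just described can be mimicked in a few lines of Python to see what each kind of visitor would receive. This is a simulation under assumptions, not the server's actual code: render_head is a hypothetical helper, and the title, style sheet name and META contents are placeholder markup.

```python
import re

# Same token list as the SSI expression in this tutorial.
BROWSER_PATTERN = re.compile(r"Mozilla|MSIE|Opera|Lynx|WebTV|NCSA Mosaic")

# Placeholder markup; the META contents are illustrative only.
PROTECTED_META = (
    '<meta name="description" content="Example description">\n'
    '<meta name="keywords" content="example, keywords">\n'
)


def render_head(user_agent: str) -> str:
    """Build the HEAD section the way the SSI conditional would for this UserAgent."""
    parts = ["<head>\n", "<title>Example page</title>\n"]
    # Outside the protected area: always sent, so style sheets and scripts keep working.
    parts.append('<link rel="stylesheet" type="text/css" href="style.css">\n')
    # Inside the protected area: sent only when the UserAgent does NOT match.
    if not BROWSER_PATTERN.search(user_agent):
        parts.append(PROTECTED_META)
    parts.append("</head>\n")
    return "".join(parts)


print(render_head("Mozilla/4.0 (compatible; MSIE 5.0; Windows 98)"))
print(render_head("ExampleSpider/1.0"))
```

The first call prints a header without the META tags, the second one includes them. The style sheet link appears in both because it sits outside the conditional.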
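To illustrate how little effort spoofing takes, here is a minimal Python sketch using the standard urllib module. The URL and the UserAgent string "ExampleSpider/1.0" are hypothetical, and the actual network request is left commented out so that the example runs offline.

```python
import re
import urllib.request

# Same token list as the SSI expression in this tutorial.
BROWSER_PATTERN = re.compile(r"Mozilla|MSIE|Opera|Lynx|WebTV|NCSA Mosaic")

# Hypothetical URL and UserAgent string, chosen only for illustration.
URL = "http://www.example.com/page.html"
SPOOFED_AGENT = "ExampleSpider/1.0"

# Any HTTP client can set the User-Agent header to whatever it likes.
request = urllib.request.Request(URL, headers={"User-Agent": SPOOFED_AGENT})

# urllib stores header names in capitalized form, hence "User-agent".
sent_agent = request.get_header("User-agent")
would_see_meta = BROWSER_PATTERN.search(sent_agent) is None

print(f"User-Agent sent: {sent_agent}")
print(f"Protected META tags would be served: {would_see_meta}")

# To actually fetch the page (requires network access):
# with urllib.request.urlopen(request) as response:
#     html = response.read().decode("latin-1")
```

Because the forged string contains none of the listed tokens, the server-side check would treat the visitor like a spider and send the protected section.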
This technique can also be used to prevent email harvester bots (address extractors) from culling email addresses from textarea fields. You can read more about protecting textarea fields from email harvesters here.