
Robot and Spider Control

Editor’s note: Search engine spiders are typically the only kind of spiders that Webmasters want to see hanging around. These robots quietly crawl their way around the World Wide Web seeking out every page they can find, and reporting their contents back to their search engine masters. This is usually a welcome operation as it often leads to more ‘free’ traffic – but occasionally robots find their way into places we wish they wouldn’t, exposing sensitive information for the world to see… Here’s how to help prevent this from happening: ~ Stephen

Before submitting your site to the search engines, decide which pages and links you want the search engine "robot" (the program that indexes your site) to "spider" (follow), and which ones you want it to skip. You may have pages with sensitive information, a ‘scrap directory’ full of "work in progress," or a protected "members area" that you would not like listed.

This goal can be achieved in two ways. The first is a robots.txt file placed in the root directory of your Website. This only works if you have a domain of your own and can write to its root directory, because that is the one place robots look for the file. While this article is not meant to deal with the intricacies of the robots.txt file, a quick word of warning is in order: do not leave this file empty. Under the robots exclusion standard an empty file places no restrictions on robots, but some robots may take it to mean that you do not want any part of your site indexed.
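To make the robots.txt approach concrete, here is a minimal sketch in Python that feeds a small robots.txt to the standard library's `urllib.robotparser` and asks which URLs a robot may fetch. The directory names `/scrap/` and `/members/`, the robot name `ExampleBot` and the `example.com` URLs are assumptions made up for illustration, and real robots may interpret the file differently from this parser.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: keep every robot out of two directories.
ROBOTS_TXT = """\
User-agent: *
Disallow: /scrap/
Disallow: /members/
"""


def is_allowed(robots_txt, url, user_agent="ExampleBot"):
    """Return True if this parser would let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in (
        "https://www.example.com/index.html",
        "https://www.example.com/scrap/draft.html",
        "https://www.example.com/members/login.html",
    ):
        print(url, is_allowed(ROBOTS_TXT, url))
```

Note that this particular parser treats an empty robots.txt as "no restrictions"; that says nothing about how any individual search engine's robot behaves.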

The other way to stop most ‘bots’ from searching or indexing your page is to use META exclusion tags. This is often the only way that Webmasters on virtual or free hosts without full server access can hope to control a spider’s wanderings and reports on a page-by-page basis. The syntax is simple:

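<META name="ROBOTS" content="value1,value2">

Each value is one of ALL, NONE, INDEX, FOLLOW, NOINDEX or NOFOLLOW. The tag takes a single value or a comma-separated combination, not the full list at once.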

The default value for the robots tag is "ALL", which allows the robot to index the page and then spider all of its links, indexing the linked pages too. "NONE" does the opposite: the robot may neither index the page nor spider the links on it, in essence ignoring the page altogether.

"INDEX" indicates that robots should include this page in their search engines, while "FOLLOW" means that robots should follow (spider) the links on this page. The negative forms switch off one behavior each: "NOINDEX" keeps the page itself out of the index but still allows its links to be spidered, while "NOFOLLOW" allows the page to be indexed but tells the robot not to spider any links from it.
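The value rules above can be sketched as a small Python function that turns a `content` string into two yes/no answers: may the page be indexed, and may its links be followed. The function name `resolve_robots` is hypothetical, and the choice to let a later value override an earlier one when they conflict is an assumption of this sketch; actual robots may resolve conflicts differently.

```python
def resolve_robots(content=""):
    """Return (index, follow) for a robots META content value."""
    index, follow = True, True  # a missing or empty tag behaves like "ALL"
    for token in content.upper().split(","):
        token = token.strip()
        if token == "ALL":
            index, follow = True, True
        elif token == "NONE":
            index, follow = False, False
        elif token == "INDEX":
            index = True
        elif token == "NOINDEX":
            index = False
        elif token == "FOLLOW":
            follow = True
        elif token == "NOFOLLOW":
            follow = False
        # Unrecognized values are ignored in this sketch.
    return index, follow


if __name__ == "__main__":
    for value in ("ALL", "NONE", "NOINDEX", "NOFOLLOW", "NOINDEX,NOFOLLOW"):
        print(value, resolve_robots(value))
```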

Some Sample Snippets
Here are some example robot-controlling META tags, which go between your document’s <HEAD> and </HEAD> tags:

<META name="ROBOTS" content="NOINDEX">
- This will prevent the bot from indexing that page.

<META name="ROBOTS" content="NOFOLLOW">
- This allows the page to be indexed, but any hyperlinks in that page will not be spidered.

<META name="ROBOTS" content="NOINDEX,NOFOLLOW">
- This combines the two: the page will not be indexed, and the links on it will not be followed. This tag may also prevent some mirroring software from downloading the page.
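To check that a page really carries the tag you intended, a short script can read the page the way a robot might. The following is a minimal sketch using Python's standard `html.parser`; the class name `RobotsMetaFinder`, the helper `robots_directives` and the sample page are all made up for illustration.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Collect the values of every robots META tag in a document."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        # The parser lowercases tag and attribute names, but not values.
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            for value in content.split(","):
                if value.strip():
                    self.directives.append(value.strip().upper())


def robots_directives(html):
    """Return the robots directives found in an HTML string."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    finder.close()
    return finder.directives


# Hypothetical page used only to exercise the sketch.
PAGE = """<HTML><HEAD>
<TITLE>Members Area</TITLE>
<META name="ROBOTS" content="NOINDEX,NOFOLLOW">
</HEAD><BODY>Private content</BODY></HTML>"""

if __name__ == "__main__":
    print(robots_directives(PAGE))
```

A page with no robots META tag yields an empty list, which corresponds to the default "ALL" behavior.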

There are many other META tags that can be used to improve your rankings, but controlling what gets ranked is the first step. After that, it is wiser to invest your time in optimizing your description and keywords tags in order to boost your search engine rankings, which is the subject of my next article…

Copyright © 2024 Adnet Media. All Rights Reserved. XBIZ is a trademark of Adnet Media.
Reproduction in whole or in part in any form or medium without express written permission is prohibited.
