by Get Real! » Mon Apr 23, 2018 9:42 am
Sotos, we can easily prove this by adding this simple test to our website...
if(navigator.userAgent=='I am Get Real Bot'){alert('Sotos just paid me a visit!');}
The alert above will ALWAYS FIRE when you visit me using one of your browsers (which has set this UAS), but a bot that only makes HTTP requests will NEVER fire it, even with the same UAS, because it never runs the page's JavaScript.
For example, a "header retrieving bot" can just...
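var Bot=new XMLHttpRequest(), S='';
// Attach the handler before sending (the usual, safer order).
Bot.onreadystatechange=function(){
  // HEADERS_RECEIVED (readyState 2): the headers have arrived, the body has not.
  if(this.readyState==this.HEADERS_RECEIVED){
    S+=this.getAllResponseHeaders();
  }
};
Bot.open('GET','http://mysite.com/',true);
Bot.send();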
Our bot above grabs the response headers from a site and appends them to a string... a useless bot, but it proves the point: you can fetch stuff without loading the site.
Did I load the website into a browser window? Nope!
Can the "trap" javascript at the top of my site catch it? Nope... the bot hasn't even loaded it!
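To make the difference concrete, here is a rough simulation you can run anywhere JavaScript runs. It is only a sketch: pageScript, fired, browserVisit and botVisit are made-up names, and new Function merely stands in for a browser's script engine. The only thing taken from above is the trap line itself.

```javascript
// Page source as the server would send it: just text until something executes it.
const pageScript =
  "if(navigator.userAgent=='I am Get Real Bot'){alert('Sotos just paid me a visit!');}";

const fired = []; // records every alert the trap raises

// Hypothetical stand-in for a browser: it runs the script it downloaded.
function browserVisit(userAgent) {
  const run = new Function('navigator', 'alert', pageScript);
  run({ userAgent: userAgent }, function (msg) { fired.push(msg); });
}

// Hypothetical stand-in for a bot: it treats the download as data and runs nothing,
// so its UAS never reaches the trap.
function botVisit(userAgent) {
  return pageScript.length;
}

browserVisit('I am Get Real Bot'); // the trap fires
botVisit('I am Get Real Bot');     // same UAS, the trap stays silent
console.log(fired.length);         // prints 1
```

Both visitors carry the same UAS, but only the one that executes the script ever reaches the alert.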
So to summarize things... human users always request index.htm through a browser, which loads the page and runs its scripts, whereas bots can just make specific HTTP requests to fetch things.
Of course, a bot could also retrieve the entire website script without ever loading (running) it, so the trap would never see it... which I can demonstrate.
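A rough sketch of that last point, assuming an environment with a global fetch (modern browsers, Node 18+). fetchSource and its fetchImpl parameter are hypothetical names added for illustration, and the User-Agent header may be ignored by some browsers.

```javascript
// Hypothetical helper (not from the post): downloads a page's source as plain text.
// fetchImpl defaults to the global fetch; pass a stub to try it without a network.
async function fetchSource(url, fetchImpl = fetch) {
  const response = await fetchImpl(url, {
    // Same UAS the trap checks for; some browsers may ignore this header.
    headers: { 'User-Agent': 'I am Get Real Bot' }
  });
  // text() hands back the markup and scripts as a string; nothing is parsed or executed.
  return response.text();
}
```

Calling fetchSource('http://mysite.com/') would resolve to the whole page as a string, trap included, and since the trap is never executed the alert never fires.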
That's it basically...