Thursday, July 9, 2015

web - Which sites reject crawler requests?

Is there any site that rejects crawler requests? I am currently using the Burp Suite crawler to crawl sites.



I want to know when, and in which cases, a crawler fails to retrieve results, because I have to build a site that rejects crawler requests.



I have been running the crawler mentioned above against random sites, but could not find any site that rejected its requests. Burp Suite managed to retrieve all the data from every site I tried.



Is this possible? Which sites reject these crawler requests?
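
To make "rejecting crawler requests" concrete, here is a minimal sketch of the kind of server-side check such a site might apply. It is only an illustration under stated assumptions: the function name `check_request`, the blocked User-Agent tokens, and the rate-limit numbers are all hypothetical, and a crawler that sends a browser-like User-Agent and crawls slowly would pass both rules.

```python
import time
from collections import defaultdict, deque

# Hypothetical values chosen for illustration only.
BLOCKED_AGENT_TOKENS = ("bot", "crawler", "spider")
MAX_REQUESTS = 5        # requests allowed per window, per client
WINDOW_SECONDS = 10.0   # length of the sliding window

# Client id (for example an IP address) -> timestamps of recent requests.
_recent = defaultdict(deque)


def check_request(user_agent, client_id, now=None):
    """Return the HTTP status to answer with: 200 to serve, 403 or 429 to reject."""
    now = time.monotonic() if now is None else now

    # Rule 1: reject a missing or self-identified crawler User-Agent.
    agent = (user_agent or "").lower()
    if not agent or any(token in agent for token in BLOCKED_AGENT_TOKENS):
        return 403

    # Rule 2: sliding-window rate limit per client.
    window = _recent[client_id]
    while window and now - window[0] >= WINDOW_SECONDS:
        window.popleft()  # forget requests that fell out of the window
    if len(window) >= MAX_REQUESTS:
        return 429
    window.append(now)
    return 200
```

A web framework's request hook could call `check_request` with the request's `User-Agent` header and client address, and return the resulting status before any page content is generated.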

