How can I restrict the crawler to indexing only my website or domain?

To keep the crawler within a single website or domain, enter the matching pattern in the Allow Paths box, found under Collections > Paths > Allow Paths.

For example, to index all URLs within www.cnn.com and have the spider crawl only that website, enter www.cnn.com in the Allow Paths box.
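To illustrate the idea, here is a minimal Python sketch of how an allow-path check might decide whether a URL is in scope. This article does not describe the crawler's actual matching rules, so the prefix-style matching below is an assumption, and the function name is_allowed is hypothetical and not part of the product.

```python
from urllib.parse import urlsplit


def is_allowed(url, allow_paths):
    """Return True if the URL falls under any of the allow patterns.

    Assumption: a pattern such as "www.cnn.com" matches the URL's
    host plus path as a prefix that ends on a path boundary. The real
    crawler may match differently.
    """
    parts = urlsplit(url)
    # urlsplit lowercases the hostname; the path is kept as-is.
    target = (parts.hostname or "") + parts.path
    for pattern in allow_paths:
        prefix = pattern.rstrip("/")
        # Match the site root itself or anything beneath it.
        if target == prefix or target.startswith(prefix + "/"):
            return True
    return False


allow_paths = ["www.cnn.com"]
print(is_allowed("https://www.cnn.com/world/index.html", allow_paths))
print(is_allowed("https://edition.cnn.com/world", allow_paths))
```

Under these assumptions, pages on www.cnn.com are in scope, while other hosts, including other subdomains of cnn.com, are not.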

 
