There are a few reasons:

Reason 1: You tried to crawl a domain without entering www.

For example, you enter:


http://shenandoahconnection.com/


but the links on that page all point to URLs starting with http://www.

http://www.shenandoahconnection.com/aboutus.htm
http://www.shenandoahconnection.com/worldnews.htm
http://www.shenandoahconnection.com/testimonials.htm
http://www.shenandoahconnection.com/post-here.htm
http://www.shenandoahconnection.com/shenandoahadvertising.htm
http://www.shenandoahconnection.com/make-payment.htm


Our crawler is very STRICT in this sense and sees those as two different domains.
It follows a simple rule: ONLY crawl within the exact domain you set it to crawl.

So those links are skipped!
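The strict rule above can be sketched in a few lines of Python (a minimal illustration of host-equality matching, not the crawler's actual code):

```python
from urllib.parse import urlparse

def same_site(start_url: str, link: str) -> bool:
    """Strict rule: a link is crawled only if its host matches the
    start URL's host exactly -- so 'example.com' and 'www.example.com'
    count as two different domains."""
    return urlparse(start_url).netloc == urlparse(link).netloc

start = "http://shenandoahconnection.com/"
link = "http://www.shenandoahconnection.com/aboutus.htm"

print(same_site(start, link))                                   # False: skipped
print(same_site("http://www.shenandoahconnection.com/", link))  # True: crawled
```

Starting the crawl from the www URL makes every on-page link match the start host, so nothing is skipped.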

Solution: Crawl with www
http://www.shenandoahconnection.com/


Reason 2:
Our crawler only crawls URLs from the domain you set it to crawl.
All external links (including subdomains) are ignored ❌
All dynamic JS URLs are ignored (here is the technical reason & solution 🤓)

Example: https://desyr.co.uk
If you view the page source, you will see only the standard URLs written directly into the HTML. URLs that are generated dynamically by JavaScript do not appear in the source, so the crawler never sees them.
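To see why JS-generated URLs are invisible, here is a minimal Python sketch of what a crawler does when it reads raw page source (the page snippet and URLs below are hypothetical, for illustration only):

```python
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collects href values from <a> tags -- the only URLs visible
    in the raw page source, before any JavaScript runs."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href":
                    self.links.append(value)

# Hypothetical page source: one standard link in the HTML, plus a
# script that would only build a URL when run in a real browser.
page_source = """
<a href="/about.htm">About</a>
<script>location.href = '/generated-by-js.htm';</script>
"""

parser = LinkCollector()
parser.feed(page_source)
print(parser.links)  # ['/about.htm'] -- the JS-built URL is never seen
```

A crawler that only parses HTML finds `/about.htm` but not `/generated-by-js.htm`, because the latter exists only after a browser executes the script.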
Reason 3: You entered a URL with a /directory/ in it.
More here >
