👀 Getting Started - How to generate your first sitemap!


Written by Artur M
Updated over a week ago


Start here: watch "The Basics" video to get a full overview.

Creating your first map:

IMPORTANT : Do a PRE-CRAWL sample first 🤓
Don't say we didn't warn you ;-)


When starting a new crawl, we highly recommend crawling just 10 pages as a sample test. To limit your crawl amount, use the 'Max Pages' field:


Benefits:

1 - It protects you from wasting all your monthly credits.

2 - It lets you test whether the site is fully crawlable.

If you don't see anything out of the ordinary (e.g., a popup or other cookie content, which you can easily remove like so), and everything looks great: fantastic.

It's now safe to run your full crawl (but again, crawl responsibly, and check in every few minutes to see how it is looking).

Running a FULL CRAWL via URL


Step 1: Enter a valid URL (http or https)

You can enter the main URL (site.com) or a URL with a /directory/
(e.g., site.com/products/food/)

If you use a URL + /directory/, it will be used as the base starting point.

All other /directories/ outside it will be ignored by default, unless you tick the 'Crawl pages outside of /products/food/' checkbox.
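Conceptually, that default scoping rule works like the minimal Python sketch below. The function name, base URL, and `crawl_outside` flag are illustrative only, not the product's internals:

```python
from urllib.parse import urlparse

def in_scope(url, base="https://site.com/products/food/", crawl_outside=False):
    """Decide whether a discovered link should be crawled.

    By default only links under the base /directory/ are followed;
    the 'Crawl pages outside of ...' option lifts that restriction
    (while still staying on the same site)."""
    base_parts = urlparse(base)
    url_parts = urlparse(url)
    if url_parts.netloc != base_parts.netloc:
        return False  # never leave the site itself
    if crawl_outside:
        return True
    return url_parts.path.startswith(base_parts.path)

# in_scope("https://site.com/products/food/snacks/")            -> True
# in_scope("https://site.com/blog/post-1/")                     -> False
# in_scope("https://site.com/blog/post-1/", crawl_outside=True) -> True
```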

Step 2: Enter Max Pages (optional)

If you enter 100, it will crawl up to 100 pages and then STOP cold.

If you leave it blank, it will:

1 - crawl up to your plan's sitemap limit (500 for Mini, 1,500 for Freelance, 3,000 for Team)
2 - auto-pause the crawl, allowing you to upgrade (if needed) to finish crawling the full site (if more pages are found)

Step 3: Enter Page Depth (optional)

Config: Max Depth = 2
URL: site.com/products/shoes/
Result: it will crawl up to /shoes/


If you leave it blank, it will crawl all levels (up to your plan's limits) and auto-pause the crawl once it reaches your plan's limit, allowing you to upgrade, buy more credits, and restart the crawl to finish it.
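Taken together, Max Pages and Max Depth bound a crawl roughly like the breadth-first sketch below. This is illustrative Python only; `fetch_links` is a hypothetical stand-in for the real crawler, and none of these names are the product's internals:

```python
from collections import deque

def crawl(start_url, fetch_links, max_pages=None, max_depth=None):
    """Breadth-first crawl that stops once max_pages pages are
    crawled and never follows links more than max_depth levels
    below the start URL."""
    queue = deque([(start_url, 0)])  # (url, depth below start)
    seen = {start_url}
    crawled = []
    while queue:
        if max_pages is not None and len(crawled) >= max_pages:
            break  # page budget exhausted (where auto-pause would kick in)
        url, depth = queue.popleft()
        crawled.append(url)
        if max_depth is not None and depth >= max_depth:
            continue  # at the depth limit: keep the page, follow no links
        for link in fetch_links(url):
            if link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))
    return crawled

# Toy site: with max_pages=3, the crawl stops before reaching /shoes/.
site = {
    "site.com/": ["site.com/products/", "site.com/about/"],
    "site.com/products/": ["site.com/products/shoes/"],
}
pages = crawl("site.com/", lambda u: site.get(u, []), max_pages=3, max_depth=2)
```

Leaving both limits as `None` mirrors leaving the fields blank: the crawl keeps going until the plan limit pauses it.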

>> More on how best to use Max Depth and Max Pages here <<

Step 4: Advanced Settings (optional)

Use-Cases:

  • Filter Out/In Specific Directories
    You can enter multiple directories to filter out.

  • Filter Out/In Specific Keywords/Parameters in URL
    You can enter multiple keywords to filter out.

☞ See the full range of Advanced filters and examples here >

  • Hide Annoying Modal Windows or Cookie Notices


  • Screenshot Delay

    Waits a set number of seconds for the page to fully load before taking a screenshot. Minimum 5 seconds, maximum 10 seconds. Defaults to 5.


  • Over-Crawl Protection (optional)
    Highly recommended. More info here >
    *Only available for regular paid plans.
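The directory and keyword filters above can be pictured as a simple pre-crawl check. The function and parameter names in this Python sketch are hypothetical, not the actual filter implementation:

```python
def passes_filters(url, exclude_dirs=(), exclude_keywords=()):
    """Return True if the URL survives the 'filter out' rules:
    it is skipped when it sits under an excluded /directory/ or
    contains an excluded keyword/parameter."""
    if any(d in url for d in exclude_dirs):
        return False  # URL lives under a filtered-out directory
    if any(k in url for k in exclude_keywords):
        return False  # URL contains a filtered-out keyword/parameter
    return True

# passes_filters("site.com/blog/post-1/", exclude_dirs=("/blog/",))        -> False
# passes_filters("site.com/products/?ref=ad", exclude_keywords=("?ref=",)) -> False
# passes_filters("site.com/products/food/")                                -> True
```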

Step 5: Select a URL to crawl, then click Crawl Now


Now go grab a coffee or chai and enjoy your new free time!



DURING THE CRAWL

Crawling Statuses you will encounter:

Pending = your crawl is in the queue. *This should only take a few seconds.
If it stays stuck for more than 5 minutes, please let us know via chat.


In Progress = The pages are being crawled and the sitemaps are being generated. Meanwhile, you can click View  to see the map progress.


Crawl Failed = this could happen for several reasons, e.g., our crawler can't load your site, or a temporary internal failure. Either way, please let us know via chat and we will investigate it.

PRO TIP - At any time during the crawl, give it a sneak peek via the Refresh button to see how things are going (in the rare case that you discover some weird results and wish to stop and adjust your settings for a better crawl). *Once you STOP a crawl, you will not be able to restart it. Pausing a crawl lets you reset only the Max Pages/Max Depth settings and then resume the crawl.

Click the Refresh icon to load the new updates.


Done!
