Sitemap Max Page & Max Depth Limits

Everything you need to know about how these settings work

Written by Jeff

Sitemap crawling is limited by three dimensions:

Max Pages : the total number of pages crawled
Max Pages per /Directory/ : the total number of pages crawled per URL /directory/
Max Depth : page-level depth
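
As a mental model, you can think of these three settings as one small config object. The sketch below is illustrative only: the field names are hypothetical, not the product's API, and `None` stands for a field left blank (which falls back to your plan's limits).

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical illustration of the three crawl-limit settings described above.
# None means "left blank": the crawl then falls back to your plan's limits.
@dataclass
class CrawlLimits:
    max_pages: Optional[int] = None                # total pages for the whole crawl
    max_pages_per_directory: Optional[int] = None  # pages kept per URL /directory/
    max_depth: Optional[int] = None                # how many levels deep to crawl

# Example: a small exploratory crawl capped at 3 pages, any depth.
limits = CrawlLimits(max_pages=3)
```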



Depth is interpreted like so:

Configuration: Max Depth = `2`
URL: Site.com/products/shoes/
Output: crawls up to /shoes/


If you leave it blank, it will crawl all levels (up to the limit allowed by your plan).
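
To make the depth counting concrete, here is a minimal sketch of how a page's depth can be derived from its URL path, assuming the homepage counts as level 0 and each path segment adds one level. This is purely illustrative, not the crawler's actual code:

```python
from typing import Optional
from urllib.parse import urlparse

def page_depth(url: str) -> int:
    """Count non-empty path segments: the homepage is 0, /products/ is 1, /products/shoes/ is 2."""
    return len([seg for seg in urlparse(url).path.split("/") if seg])

def within_depth(url: str, max_depth: Optional[int]) -> bool:
    # A blank Max Depth (None here) means no explicit cap; the plan's limit applies instead.
    return max_depth is None or page_depth(url) <= max_depth

# With Max Depth = 2, /products/shoes/ is crawled and anything deeper is ignored.
assert within_depth("https://site.com/products/shoes/", 2)
assert not within_depth("https://site.com/products/shoes/red-sneakers/", 2)
```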

When you create a sitemap you can specify these limits or leave them blank, in which case they default to your plan's limits.

Take a sitemap like Stripe.com:

You can enter Max pages 3 to get something like:

Or like this:

The outcome will always be slightly different, because the crawler takes the first set of links it finds on the homepage, crawls them in random order, and keeps discovering more links on the child pages until the limit is reached.

The exact result depends on your site's URL structure; this is just one possible outcome. The important thing is that you will always end up with 3 pages and no more.



Limiting results to a specific depth via Max Depth:

You can enter “Max depth 1” to ignore pages deeper than level 1 and get something like:


Leaving Max Pages or Max Depth empty

So what happens when you leave any of those fields empty?
Does it mean that there is no limit? Well, no.

In that case, a sitemap is going to use the workspace’s plan limit.

For example, if you are on a Free plan, leaving "Max pages" empty will crawl up to 25 pages, and leaving "Max depth" empty will crawl up to 2 levels, while a Freelancer plan will crawl up to 1500 pages and unlimited levels.



Max Pages per /Directory/
Sometimes you don't need to crawl all of your thousands of blog posts or product pages, but you do want to show their presence on the map in a limited way without spending all of your credits/time. This setting lets you do just that!
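
A rough sketch of the idea, purely illustrative (the function and its behaviour are assumptions for the example, not the product's implementation): keep at most N pages from each top-level /directory/ and skip the rest.

```python
from collections import defaultdict
from urllib.parse import urlparse

def filter_per_directory(urls, max_per_dir: int):
    """Keep at most `max_per_dir` pages from each top-level /directory/."""
    counts = defaultdict(int)
    kept = []
    for url in urls:
        segments = [s for s in urlparse(url).path.split("/") if s]
        directory = segments[0] if segments else ""  # "" = the homepage / root level
        if counts[directory] < max_per_dir:
            counts[directory] += 1
            kept.append(url)
    return kept

# Example: only 2 blog posts make it onto the map, no matter how many exist.
urls = [f"https://site.com/blog/post-{i}/" for i in range(100)]
print(len(filter_per_directory(urls, max_per_dir=2)))  # -> 2
```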


Reaching limits: Setting them manually vs plan's hard limits

What happens to the sitemap when the crawl reaches a limit?
Well, naturally that depends on crawling settings!

If you explicitly enter a max-page limit:
- the crawler will stop, and the sitemap is finished when it hits that limit.

If the limit is left empty (so the crawl runs under the plan's limits):
- the crawl pauses when it reaches the plan's limit, giving you the opportunity to upgrade to a higher plan, or buy more credits, to finish the crawl.
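
The rule can be summarised in a few lines of pseudologic; this is an illustrative sketch of the behaviour described above, not the crawler's actual code:

```python
from typing import Optional

def crawl_status(pages_crawled: int, explicit_max_pages: Optional[int], plan_page_limit: int) -> str:
    """Sketch of the pause-vs-finish rule.

    explicit_max_pages: the value you typed into Max Pages, or None if left blank.
    plan_page_limit:    your workspace plan's hard cap.
    """
    if explicit_max_pages is not None and pages_crawled >= explicit_max_pages:
        return "finished"   # you asked for exactly this many pages
    if explicit_max_pages is None and pages_crawled >= plan_page_limit:
        return "paused"     # upgrade or buy more credits to resume
    return "crawling"

print(crawl_status(25, explicit_max_pages=25, plan_page_limit=25))    # finished
print(crawl_status(25, explicit_max_pages=None, plan_page_limit=25))  # paused
```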



Paused vs Finished sitemaps

Why do these differences matter?


Paused sitemaps can be resumed and will continue crawling where you left off, but once your limit or the plan's limit is reached and the sitemap stops, it's done.

Finishing the crawl process is a requirement for using other pro features like Editing and Visual Comparison. If you are exploring, it makes sense to work within the plan limits, let us pause the crawl, and then decide whether to upgrade & resume or to stop. If you know ahead of time what you want, e.g. if you are scheduling a visual comparison, it's better to specify the limits.

Should you specify a Max-Pages?
If you are looking to do a full exploratory audit of a site, we recommend that you have no max-pages configured. This allows you to decide whether to pause, stop, or upgrade for more crawling capacity.

If you already know what URLs you need, for example, if you are planning a visual comparison, it's advisable to specify the max-pages limits beforehand.

Note: When resuming paused crawls, you also have the ability to change the Max-Page/Depth limits.


Changing your plan

Upgrading your plan will change the plan's limits instantly.
Downgrading keeps the old limits until the end of your 30-day billing cycle.

Once you upgrade and resume a paused crawl, you will have the opportunity to configure the crawl's limits again. Stopped crawls do not have this option.


Use-Case: Let's say that you crawl, under a Free plan, a site that has:

300 level-1 pages
25 level-2 pages
5 level-3 pages


Let’s see what happens depending on the settings.


Max Pages = 10, Max Depth = 1: The sitemap will be finished with 10 pages, all level-1.


Max Pages = empty, Max Depth = 1: The sitemap will be finished with 25 pages, all level-1


Max Pages = 25, Max Depth = 1: Sitemap will be finished with 25 pages, all level-1


Max Pages = 25, Max Depth = empty: The sitemap will be finished with 25 pages. Levels depend on the site structure.


Max Pages = empty, Max Depth = empty: The sitemap will pause due to the plan's max-page limit of 25 pages, with a mix of level-1 and level-2 pages. The number of pages at each level depends on the site structure.


Now let's say that you crawl, under a Free plan, a website with:

20 level-1 pages  
2 level-2 pages
3 level-3 pages


Max Pages = 22, Max Depth = empty: The sitemap will be finished with 22 pages (level-1 and 2).


Max Pages = 25, Max Depth = empty: The sitemap will be paused at 22 pages (level-1 and 2) due to the plan's depth limit. This configuration shows that we pause when the only way to fulfill the max-pages setting is by crawling deeper.


Max Pages = 20, Max Depth = 1: The sitemap will be finished with 20 pages, all level-1.


Max Pages = empty, Max Depth = empty: The sitemap will pause with 22 pages (level-1 and 2), because the plan's page limit of 25 cannot be reached without crawling deeper than the plan's depth limit.


Best practices:

  • 🔥 If you are not sure what to do, just leave Max Depth empty!
    This will ensure that the crawl will auto-pause if we find more pages on deeper levels, and allow you to upgrade to fully complete the crawl.
