Webmaster Central Blog
Official news on crawling and indexing sites for the Google index
To infinity and beyond? No!
Tuesday, August 05, 2008
When Googlebot crawls the web, it often finds what we call an "infinite space". These are very large numbers of links that usually provide little or no new content for Googlebot to index. If this happens on your site, crawling those URLs may use unnecessary bandwidth, and could result in Googlebot failing to completely index the real content on your site.
Recently, we started notifying site owners when we discover this problem on their web sites. Like most messages we send, you'll find them in Webmaster Tools in the Message Center. You'll probably want to know right away if Googlebot has this problem - or other problems - crawling your sites. So verify your site with Webmaster Tools, and check the Message Center every now and then.
Examples of an infinite space
The classic example of an "infinite space" is a calendar with a "Next Month" link. It may be possible to keep following those "Next Month" links forever! Of course, that's not what you want Googlebot to do. Googlebot is smart enough to figure out some of those on its own, but there are a lot of ways to create an infinite space and we may not detect all of them.
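To make the calendar case concrete, here is a minimal Python sketch of the link chain a crawler would see. The `/calendar?year=...&month=...` URL pattern and the function name `next_month_urls` are assumptions for illustration only; they are not taken from any real site or from Googlebot itself.

```python
from itertools import islice


def next_month_urls(year, month):
    """Yield the URL behind each successive "Next Month" link, without end."""
    while True:
        month += 1
        if month > 12:
            year, month = year + 1, 1
        # Hypothetical URL pattern for a calendar page.
        yield f"/calendar?year={year}&month={month:02d}"


# The generator never runs out, so a crawler that follows every link
# never finishes. Here we only look at the first three links.
first_three = list(islice(next_month_urls(2008, 11), 3))
print(first_three)
```

Every one of these URLs is valid and returns a page, which is why a crawler cannot tell from the links alone where the useful content ends.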
Another common scenario is a website that lets users filter a set of search results in many ways. A shopping site might allow for finding clothing items by filtering on category, price, color, brand, style, etc. The number of possible combinations of filters can grow exponentially. This can produce thousands of URLs, all finding some subset of the items sold. This may be convenient for your users, but is not so helpful for Googlebot, which just wants to find everything - once!
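A rough back-of-the-envelope calculation shows how quickly filter combinations add up. The facet names below come from the example above, but the number of values per facet is invented for illustration, and the sketch assumes each filter is either unset or set to exactly one value.

```python
from math import prod

# Hypothetical number of values offered by each filter.
facets = {"category": 12, "price": 6, "color": 10, "brand": 40, "style": 8}


def count_filter_urls(facets):
    """Count distinct filter combinations, including the unfiltered page."""
    # Each filter contributes (values + 1) choices: one per value, plus "not set".
    return prod(n + 1 for n in facets.values())


print(count_filter_urls(facets))
```

With these made-up numbers, five filters already yield several hundred thousand distinct URLs. If the site also accepts the same parameters in different orders, the count grows further.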
Correcting infinite space issues
Our Webmaster Tools Help article describes more ways infinite spaces can arise, and provides recommendations on how to avoid the problem. One fix is to eliminate whole categories of dynamically generated links using your robots.txt file. The Help Center has lots of information on how to use robots.txt. If you do that, don't forget to verify that Googlebot can find all your content some other way. Another option is to block those problematic links with a "nofollow" link attribute. If you'd like more information on "nofollow" links, check out the Webmaster Help Center.
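As a sketch of the robots.txt approach, the Python snippet below checks a small set of rules locally using the standard library's `urllib.robotparser`. The `/calendar` and `/search` paths and the `www.example.com` URLs are hypothetical. The parser does simple prefix matching and may not mirror Googlebot's own robots.txt handling exactly, so treat this as a sanity check rather than a guarantee.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules blocking two dynamically generated URL spaces.
ROBOTS_TXT = """\
User-agent: *
Disallow: /calendar
Disallow: /search
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_crawlable(url, agent="Googlebot"):
    """Return True if the rules above allow `agent` to fetch `url`."""
    return parser.can_fetch(agent, url)


# Dynamically generated pages are blocked...
print(is_crawlable("https://www.example.com/calendar?year=2009&month=01"))
# ...while ordinary content pages stay reachable.
print(is_crawlable("https://www.example.com/products/shirts"))
```

A check like this can also be run over a list of your real content URLs to confirm that none of them are blocked by accident.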
Written by Torrey Hoffman, Webmaster Tools team