Webmaster level: intermediate-advanced
Submitting sitemaps can be an important part of optimizing websites. Sitemaps enable search engines to discover all pages on a site and to download them quickly when they change. This blog post explains which fields in sitemaps are important, when to use XML sitemaps and RSS/Atom feeds, and how to optimize them for Google.
Sitemaps can be in XML sitemap, RSS, or Atom formats. The important difference between these formats is that XML sitemaps describe the whole set of URLs within a site, while RSS/Atom feeds describe recent changes. This has important implications:
For optimal crawling, we recommend using both XML sitemaps and RSS/Atom feeds. XML sitemaps will give Google information about all of the pages on your site. RSS/Atom feeds will provide all updates on your site, helping Google to keep your content fresher in its index. Note that submitting sitemaps or feeds does not guarantee the indexing of those URLs.
Example of an XML sitemap:
<?xml version="1.0" encoding="utf-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://example.com/mypage</loc>
    <lastmod>2011-06-27T19:34:00+01:00</lastmod>
    <!-- optional additional tags -->
  </url>
  <url>
    ...
  </url>
</urlset>
Example of an RSS feed:
<?xml version="1.0" encoding="utf-8"?>
<rss>
  <channel>
    <!-- other tags -->
    <item>
      <!-- other tags -->
      <link>http://example.com/mypage</link>
      <pubDate>Mon, 27 Jun 2011 19:34:00 +0100</pubDate>
    </item>
    <item>
      ...
    </item>
  </channel>
</rss>
Example of an Atom feed:
<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <!-- other tags -->
  <entry>
    <link href="http://example.com/mypage" />
    <updated>2011-06-27T19:34:00+01:00</updated>
    <!-- other tags -->
  </entry>
  <entry>
    ...
  </entry>
</feed>
In the examples above, “other tags” refers to both the optional and the required tags defined by the respective standards. We recommend that you specify the required tags for Atom/RSS, as they will help your content appear on other properties that might use these feeds, in addition to Google Search.
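If your site is generated by a script, an XML sitemap like the one above can be produced with a few lines of code. The following is a minimal sketch in Python, not a definitive implementation: the function name `build_sitemap` and the `(url, lastmod)` input shape are assumptions made for illustration, and only the `<loc>` and `<lastmod>` tags are written.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Return sitemap XML as bytes.

    `pages` is an iterable of (url, lastmod) pairs, where lastmod is an
    already formatted W3C datetime string (hypothetical input shape).
    """
    # Serialize the sitemap namespace as the default namespace (no prefix).
    ET.register_namespace("", SITEMAP_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for url, lastmod in pages:
        entry = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(entry, f"{{{SITEMAP_NS}}}loc").text = url
        ET.SubElement(entry, f"{{{SITEMAP_NS}}}lastmod").text = lastmod
    # xml_declaration requires Python 3.8 or later.
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)


if __name__ == "__main__":
    xml_bytes = build_sitemap(
        [("http://example.com/mypage", "2011-06-27T19:34:00+01:00")]
    )
    print(xml_bytes.decode("utf-8"))
```

Using an XML library rather than string concatenation means that characters such as `&` in URLs are escaped automatically.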
XML sitemaps and RSS/Atom feeds are, at their core, lists of URLs with metadata attached to them. The two most important pieces of information for Google are the URL itself and its last modification time:
URLs in XML sitemaps and RSS/Atom feeds should adhere to the following guidelines:
Specify a last modification time for each URL in an XML sitemap and RSS/Atom feed. The last modification time should be the last time the content of the page changed meaningfully. If a change is meant to be visible in the search results, then the last modification time should be the time of this change.
XML sitemaps use <lastmod>
RSS uses <pubDate>
Atom uses <updated>
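The three tags do not share a date format: the sitemap and Atom examples above use an ISO 8601 style timestamp with a UTC offset, while the RSS example uses an RFC 822 style date. As a small sketch (the helper names `sitemap_timestamp` and `rss_timestamp` are made up for this illustration), both strings can be derived from one timezone-aware `datetime` in Python:

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def _require_aware(dt):
    # A timestamp without a UTC offset is ambiguous, so reject naive datetimes.
    if dt.tzinfo is None or dt.utcoffset() is None:
        raise ValueError("last modification time must be timezone-aware")
    return dt


def sitemap_timestamp(dt):
    """Format for <lastmod> in XML sitemaps and <updated> in Atom feeds."""
    return _require_aware(dt).isoformat(timespec="seconds")


def rss_timestamp(dt):
    """Format for <pubDate> in RSS feeds."""
    return format_datetime(_require_aware(dt))


if __name__ == "__main__":
    # Same moment as in the examples above: 27 June 2011, 19:34 at UTC+1.
    modified = datetime(2011, 6, 27, 19, 34, tzinfo=timezone(timedelta(hours=1)))
    print(sitemap_timestamp(modified))  # 2011-06-27T19:34:00+01:00
    print(rss_timestamp(modified))      # Mon, 27 Jun 2011 19:34:00 +0100
```

Keeping a single stored timestamp per page and formatting it at output time helps the sitemap and the feed stay consistent with each other.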
Be sure to set or update the last modification time correctly:
XML sitemaps should contain URLs of all pages on your site. They are often large and update infrequently. Follow these guidelines:
RSS/Atom feeds should convey recent updates of your site. They are usually small and updated frequently. For these feeds, we recommend:
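To illustrate what a small feed of recent updates might look like in practice, here is a hedged Python sketch that picks the most recently modified pages and writes them as an Atom feed. The function name `build_atom_feed`, the `(url, title, updated)` input shape, and the default `limit` of 20 entries are assumptions chosen for this example, not recommendations from this post.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

ATOM_NS = "http://www.w3.org/2005/Atom"


def _child(parent, name, text=None, **attrs):
    """Append an Atom element with optional text and attributes."""
    el = ET.SubElement(parent, f"{{{ATOM_NS}}}{name}", attrs)
    el.text = text
    return el


def build_atom_feed(feed_url, title, author, pages, limit=20):
    """Return an Atom feed (bytes) listing the most recently updated pages.

    `pages` is an iterable of (url, page_title, updated) tuples where
    `updated` is a timezone-aware datetime (hypothetical input shape).
    """
    ET.register_namespace("", ATOM_NS)
    # Newest first, truncated so the feed stays small.
    recent = sorted(pages, key=lambda p: p[2], reverse=True)[:limit]

    feed = ET.Element(f"{{{ATOM_NS}}}feed")
    _child(feed, "id", feed_url)
    _child(feed, "title", title)
    _child(_child(feed, "author"), "name", author)
    newest = recent[0][2] if recent else datetime.now(timezone.utc)
    _child(feed, "updated", newest.isoformat(timespec="seconds"))

    for url, page_title, updated in recent:
        entry = _child(feed, "entry")
        _child(entry, "id", url)
        _child(entry, "title", page_title)
        _child(entry, "link", href=url)
        _child(entry, "updated", updated.isoformat(timespec="seconds"))
    return ET.tostring(feed, encoding="utf-8", xml_declaration=True)
```

The sketch writes the `<id>`, `<title>`, `<updated>`, and `<author>` elements because the Atom standard requires them; a real feed would typically also carry summaries or content for each entry.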
Generating both XML sitemaps and Atom/RSS feeds is a great way to optimize crawling of a site for Google and other search engines. The key information in these files is the canonical URL and the last modification time of each page within the website. Setting these properly, and notifying Google and other search engines through sitemap pings and PubSubHubbub, will allow your website to be crawled optimally and represented accordingly in search results.
If you have any questions, feel free to post them here, or to join other webmasters in the webmaster help forum section on sitemaps.