Changeset 266

Timestamp:
03/03/2007 03:31:29 AM (6 years ago)
Author:
GooDiffMonitor
Message:

Modified files:

  • /google/books.google.com/webmasters/guidelines.html
  • /google/books.google.com/webmasters/faq.html

GooDiffMonitor run finished @ 2007-03-03 03:31:26.122772

Files:

  • google/books.google.com/webmasters/faq.html

    r265 r266  
    1 ** G  o  o  g  l  e     **   
    2 ** Error ** 
     1[ ](/) 
     2**  Google Information for Webmasters ** 
     3 
     4[ Home ](/)    
     5   
     6[ About Google ](/about.html)    
     7   
     8** Webmaster Info **    
     9   [ ** FAQ ** ](index.html)    
     10   [ Guidelines ](guidelines.html)    
     11   [ Facts & Fiction ](facts.html)    
     12   [ SEOs ](seo.html)    
     13   [ Googlebot ](bot.html)    
     14   [ Feedfetcher ](../feedfetcher.html)    
     15   [ Removals ](remove.html)    
     16 
     17 
     18_ Find on this site:  _    
     19   
     20 
     21 
    3 22
    4 23
    5 > #  Not Found  
    6 >  
    7 > The requested URL ` /webmasters/faq.html ` was not found on this server.  
     24** Advanced Questions **    
     25   
     26** General Questions ** 
     27 
     28  1. [ How often will Google crawl my site? ](#crawl) 
     29  2. [ How can I migrate my site to a new IP address? ](#migrate) 
     30  3. [ Why is my site labeled "Supplemental"? ](#label) 
     31  4. [ I'd like my site to return for pages from a specific country. ](#country) 
     32 
     33** Results Prefetching Questions ** 
     34 
     35  1. [ What is results prefetching, and how does it impact my site? ](#prefetching) 
     36  2. [ Can I distinguish between prefetch requests to my web server from normal requests? ](#prefetchheaders) 
     37  3. [ I want to block/ignore prefetch requests. What should I do? ](#prefetchblock) 
     38 
     39**  General Questions ** 
     40 
     41** 1. How often will Google crawl my site? ** 
     42 
     43Google's spiders regularly crawl the web to rebuild our index. Crawls are based on many factors such as PageRank, links to a page, and crawling constraints such as the number of parameters in a URL. Any number of factors can affect the crawl frequency of individual sites.  
     44 
     45Our crawl process is algorithmic; computer programs determine which sites to crawl, how often, and how many pages to fetch from each site. For tips on maintaining a crawler-friendly website, please visit our [ Webmaster Guidelines ](guidelines.html) .  
     46 
     47** 2. How can I migrate my site to a new IP address? ** 
     48 
     49We recommend migrating a site to a new IP address with the following steps:  
     50 
     51  1. Bring a copy of your site up at the new IP address.  
     52  2. Update your nameserver to point to the new IP address.  
     53  3. Once you see search engine spiders fetch pages from the new IP address (typically within 24-48 hours), it's safe to take down the copy of your site at the old IP address.  
     54 
     55** 3. Why is my site labeled "Supplemental"? ** 
     56 
     57Supplemental sites are part of Google's auxiliary index. We're able to place fewer restraints on sites that we crawl for this supplemental index than we do on sites that are crawled for our main index. For example, the number of parameters in a URL might exclude a site from being crawled for inclusion in our main index; however, it could still be crawled and added to our supplemental index.  
     58 
     59The index in which a site is included is completely automated; there's no way for you to select or change the index in which your site appears. Please be assured that the index in which a site is included does not affect its PageRank.  
     60 
     61** 4. I'd like my site to return for pages from a specific country. ** 
     62 
     63While all sites in our index return for searches restricted to "the web," we draw on a relevant subset of sites for each country restrict. Our crawlers may identify the country for a site by factors such as the physical location at which the site is hosted, the site's IP address, the WHOIS information for a domain, and its top-level domain.  
     64 
     65That said, your site's top-level domain doesn't need to match the country domain for which you'd like it to return. It's also important to keep in mind that our crawlers don't index duplicate content, so creating identical sites at several domains will likely not result in their returning for many country restricts. If you do create duplicate domains, we suggest using a robots.txt file to block our crawler from accessing all but your preferred one.  
     66 
     67**  Results Prefetching Questions ** 
     68 
     69** 1. What is "results prefetching," and how does it impact my site? ** 
     70 
     71On some searches, Google uses a special tag supported by Firefox and Mozilla to instruct the browser to download the top search result before the user clicks on the result. When the user clicks on the top result, the destination page will load faster than before. This tag is only inserted when it is likely that the user will click on the first link.  
     72 
     73For example, when a Firefox user searches for [ [ stanford ](/search?q=stanford) ], Google includes the following tag in the results HTML:  
     74 
     75` <link rel="prefetch" href="http://www.stanford.edu/"> ` 
     76 
     77The official [ Mozilla Link Prefetching FAQ ](http://www.mozilla.org/projects/netlib/Link_Prefetching_FAQ.html) describes the behavior of this tag in detail.  
     78 
     79Prefetching may impact your site because the prefetch request will happen whether or not the user clicks on the result, so it may result in additional traffic to your web server. Google only inserts this tag when there is a high likelihood that the user will click on the top result, but clearly this heuristic is not right 100% of the time.  
     80 
     81** 2. Can I distinguish prefetch requests from normal requests? ** 
     82 
     83Yes, as described in the [ Mozilla Link Prefetching FAQ ](http://www.mozilla.org/projects/netlib/Link_Prefetching_FAQ.html#As_a_server_admin_can_I_distinguish) , prefetch requests include the additional HTTP header  
     84 
     85` X-moz: prefetch ` 
     86 
     87** 3. I want to block/ignore prefetch requests. What should I do? ** 
     88 
     89To block or ignore prefetch requests (from Google and other web sites), you should configure your web server to return a 404 HTTP response code for requests that contain the "  ` X-moz: prefetch ` " header.  
     90 
     91    
     92 
     93(c)2007 Google - [ Home ](/) - [ About Google ](/about.html) - [ We're Hiring ](/jobs/) - [ Site Map ](/sitemap.html) 
     94 
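
The FAQ text added in r266 suggests returning a 404 for requests that carry the `X-moz: prefetch` header. The following is a minimal sketch of that idea as a WSGI wrapper, not the configuration Google or Mozilla prescribes. The name `block_prefetch` is hypothetical, and the sketch assumes a WSGI-compliant server, which exposes the `X-moz` request header as the `HTTP_X_MOZ` environ key.

```python
def block_prefetch(app):
    """Wrap a WSGI app so that prefetch requests get a 404 (hypothetical helper)."""

    def middleware(environ, start_response):
        # WSGI exposes the request header "X-moz" as "HTTP_X_MOZ".
        header_value = environ.get("HTTP_X_MOZ", "")
        if header_value.strip().lower() == "prefetch":
            # Refuse the prefetch without invoking the wrapped application.
            start_response("404 Not Found", [("Content-Type", "text/plain")])
            return [b"Not Found"]
        # Ordinary requests pass through unchanged.
        return app(environ, start_response)

    return middleware
```

Wrapping an existing application would look like `application = block_prefetch(application)`. The same check could instead be expressed in the web server's own configuration, which keeps prefetch requests from reaching application code at all.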
  • google/books.google.com/webmasters/guidelines.html

    r265 r266  
    1 ** G  o  o  g  l  e     **   
    2 ** Error ** 
     1[ ](/) 
     2**  Google Information for Webmasters ** 
     3 
     4[ Home ](/)    
     5   
     6[ About Google ](/about.html)    
     7   
     8** Webmaster Info **    
     9   [ FAQ ](index.html)    
     10   ** Guidelines  **    
     11   [ Facts & Fiction ](facts.html)    
     12   [ SEOs ](seo.html)    
     13   [ Googlebot ](bot.html)    
     14   [ Feedfetcher ](../feedfetcher.html)    
     15   [ Removals ](remove.html)    
     16 
     17 
     18_ Find on this site:  _    
     19   
     20 
     21 
    3 22
    4 23
    5 > #  Not Found  
    6 >  
    7 > The requested URL ` /webmasters/guidelines.html ` was not found on this server.  
     24** Webmaster Guidelines ** 
     25 
     26Following these guidelines will help Google find, index, and rank your site. Even if you choose not to implement any of these suggestions, we strongly encourage you to pay very close attention to the "Quality Guidelines," which outline some of the illicit practices that may lead to a site being removed entirely from the Google index. Once a site has been removed, it will no longer show up in results on Google.com or on any of Google's partner sites.  
     27 
     28** Design and Content Guidelines: ** 
     29 
     30  * Make a site with a clear hierarchy and text links. Every page should be reachable from at least one static text link.  
     31  * Offer a site map to your users with links that point to the important parts of your site. If the site map is larger than 100 or so links, you may want to break the site map into separate pages.  
     32  * Create a useful, information-rich site, and write pages that clearly and accurately describe your content.  
     33  * Think about the words users would type to find your pages, and make sure that your site actually includes those words within it.  
     34  * Try to use text instead of images to display important names, content, or links. The Google crawler doesn't recognize text contained in images.  
     35  * Make sure that your TITLE and ALT tags are descriptive and accurate.  
     36  * Check for broken links and correct HTML.  
     37  * If you decide to use dynamic pages (i.e., the URL contains a "?" character), be aware that not every search engine spider crawls dynamic pages as well as static pages. It helps to keep the parameters short and the number of them few.  
     38  * Keep the links on a given page to a reasonable number (fewer than 100).  
     39 
     40** Technical Guidelines: ** 
     41 
     42  * Use a text browser such as Lynx to examine your site, because most search engine spiders see your site much as Lynx would. If fancy features such as JavaScript, cookies, session IDs, frames, DHTML, or Flash keep you from seeing all of your site in a text browser, then search engine spiders may have trouble crawling your site.  
     43  * Allow search bots to crawl your sites without session IDs or arguments that track their path through the site. These techniques are useful for tracking individual user behavior, but the access pattern of bots is entirely different. Using these techniques may result in incomplete indexing of your site, as bots may not be able to eliminate URLs that look different but actually point to the same page.  
     44  * Make sure your web server supports the If-Modified-Since HTTP header. This feature allows your web server to tell Google whether your content has changed since we last crawled your site. Supporting this feature saves you bandwidth and overhead.  
     45  * Make use of the robots.txt file on your web server. This file tells crawlers which directories can or cannot be crawled. Make sure it's current for your site so that you don't accidentally block the Googlebot crawler. Visit [ http://www.robotstxt.org/wc/faq.html ](http://www.robotstxt.org/wc/faq.html) to learn how to instruct robots when they visit your site.  
     46  * If your company buys a content management system, make sure that the system can export your content so that search engine spiders can crawl your site.  
     47  * Don't use "&id=" as a parameter in your URLs, as we don't include these pages in our index.  
     48 
     49** When your site is ready: ** 
     50 
     51  * Have other relevant sites link to yours.  
     52  * Submit it to Google at [ http://www.google.com/addurl.html ](http://www.google.com/addurl/?continue=/addurl) .  
     53  * Submit a sitemap as part of our [ Google Sitemaps (Beta) ](https://www.google.com/webmasters/sitemaps/login?source=gsm&subID=us-et-gdlnsbeta) project. [ Google Sitemaps ](https://www.google.com/webmasters/sitemaps/login?source=gsm&subID=us-et-gdlns) uses your sitemap to learn about the structure of your site and to increase our coverage of your webpages.  
     54  * Make sure all the sites that should know about your pages are aware your site is online.  
     55  * Submit your site to relevant directories such as the Open Directory Project and Yahoo!, as well as to other industry-specific expert sites.  
     56 
     57* * * 
     58 
     59** Quality Guidelines - Basic principles: ** 
     60 
     61  * Make pages for users, not for search engines. Don't deceive your users or present different content to search engines than you display to users, which is commonly referred to as "cloaking."  
     62  * Avoid tricks intended to improve search engine rankings. A good rule of thumb is whether you'd feel comfortable explaining what you've done to a website that competes with you. Another useful test is to ask, "Does this help my users? Would I do this if search engines didn't exist?"  
     63  * Don't participate in link schemes designed to increase your site's ranking or PageRank. In particular, avoid links to web spammers or "bad neighborhoods" on the web, as your own ranking may be affected adversely by those links.  
     64  * Don't use unauthorized computer programs to submit pages, check rankings, etc. Such programs consume computing resources and violate our [ Terms of Service ](/terms_of_service.html) . Google does not recommend the use of products such as WebPosition Gold(tm) that send automatic or programmatic queries to Google.  
     65 
     66** Quality Guidelines - Specific recommendations: ** 
     67 
     68  * Avoid hidden text or hidden links.  
     69  * Don't employ cloaking or sneaky redirects.  
     70  * Don't send automated queries to Google.  
     71  * Don't load pages with irrelevant words.  
     72  * Don't create pages that install viruses, trojans, or other [ badware ](http://www.stopbadware.org/) .  
     73  * Don't create multiple pages, subdomains, or domains with substantially duplicate content.  
     74  * Avoid "doorway" pages created just for search engines, or other "cookie cutter" approaches such as affiliate programs with little or no original content.  
     75 
     76These quality guidelines cover the most common forms of deceptive or manipulative behavior, but Google may respond negatively to other misleading practices not listed here (e.g. tricking users by registering misspellings of well-known websites). It's not safe to assume that just because a specific deceptive technique isn't included on this page, Google approves of it. Webmasters who spend their energies upholding the spirit of the basic principles listed above will provide a much better user experience and subsequently enjoy better ranking than those who spend their time looking for loopholes they can exploit.  
     77 
     78If you believe that another site is abusing Google's quality guidelines, please report that site at [ http://www.google.com/contact/spamreport.html ](http://www.google.com/contact/spamreport.html) . Google prefers developing scalable and automated solutions to problems, so we attempt to minimize hand-to-hand spam fighting. The spam reports we receive are used to create scalable algorithms that recognize and block future spam attempts.  
     79 
     80    
     81 
     82(c)2007 Google - [ Home ](/) - [ About Google ](/about.html) - [ We're Hiring ](/jobs) - [ Site Map ](/sitemap.html) 
     83
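
The Technical Guidelines added in r266 ask that a web server support the If-Modified-Since HTTP header. The following is a rough sketch of the decision a server makes for such a request, assuming the resource's last modification time is known. The function name `conditional_status` is hypothetical, and real servers handle further cases (for example ETags) that are left out here.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def conditional_status(if_modified_since: Optional[str], last_modified: datetime) -> int:
    """Return 304 if the resource is unchanged since the given date, else 200.

    `last_modified` is assumed to be a timezone-aware datetime.
    """
    if not if_modified_since:
        return 200  # No conditional header: send the full response.
    try:
        since = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        return 200  # Unparseable date: ignore the header.
    if since.tzinfo is None:
        # Treat dates without usable zone information as UTC.
        since = since.replace(tzinfo=timezone.utc)
    # HTTP dates have one-second resolution, so drop microseconds first.
    if last_modified.replace(microsecond=0) <= since:
        return 304  # Not Modified: the crawler can reuse its copy.
    return 200
```

A 304 response carries no body, which is where the bandwidth saving mentioned in the guideline comes from.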