|
#1
|
|||
|
|||
|
Why won't Google do a "deep dive" of my site?
Hello, and thanks in advance for reading my post.
I work on an e-commerce site in which the links on the homepage allow visitors to shop by category (flowmeters, controllers, etc) or shop by manufacturer (Fuji, Partlow, etc). In addition to these links, my homepage also has links to Support, About Us, Security, Shopping Cart, etc. The Google spiders have come around a couple of times, and the pages that have been indexed (in addition to my homepage) are the very high level pages that link directly off my homepage - the landing pages for Flowmeters, Partlow, About Us, etc. The spiders haven't gone any deeper than these landing pages to the individual product pages, which is what I want them to index! Any suggestions??? Thanks again. |
|
#2
|
|||
|
|||
|
Are the lower pages actually spiderable? You should make sure they are.
Do the lower pages' URLs contain too many parameters? Do they contain session IDs, such as "id="? Google generally won't crawl pages that include an id parameter. Google's crawling depends a lot on PageRank. The higher the PR, the more often, and the deeper, a site gets crawled. That's a good reason to get IBLs (inbound links) and build the PageRank of the site. You could try using a sitemap that is linked to from the homepage. You could also create and submit a Google Sitemap. |
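For the Google Sitemap suggestion, here is a minimal sketch in Python of generating a sitemap file that lists the deep product pages. It assumes the standard sitemap XML format (`urlset` / `url` / `loc`). The `build_sitemap` helper and the example.com URLs are hypothetical, for illustration only, and are not taken from the site in question.

```python
from xml.etree import ElementTree as ET

# Namespace used by the standard sitemap protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_sitemap(urls):
    """Return sitemap XML (as a string) with one <url><loc> entry per URL."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")

# Hypothetical product URLs, for illustration only.
product_urls = [
    "http://www.example.com/flowmeters/model-100",
    "http://www.example.com/controllers/model-200",
]
sitemap_xml = build_sitemap(product_urls)
print(sitemap_xml)
```

In practice the URL list would come from the product database, so every product page is listed even if no crawlable link path reaches it yet.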
|
#3
|
|||
|
|||
|
Thanks PhilC! Here's some more information about my site... Any more suggestions?
My product pages (deepest pages) do contain "id=", but this hasn't prevented Google from spidering my other e-commerce site which is similarly structured... The PageRank of my homepages is 1/10, but the product pages are a 0. I do have 8 or so inbound links to my homepage that Yahoo recognizes, but Google doesn't return any of these links when I enter "link:www.instrumart.com" into the search bar... Quote:
|
|
#4
|
|||
|
|||
|
There are some pages in Google's index that contain "id=" in the URLs, but Matt Cutts recently said that they don't index such pages, so maybe they are getting firmer about it. I'd change the URLs if it were me.
If Yahoo! shows the IBLs, the chances are that Google also has them even if it doesn't show them. And if Google hasn't got them now, it will have them soon enough. Even so, you should always be doing what you can to build up the IBLs. The more PageRank a site has, the better it gets spidered by Google. Also, the link text from IBLs is the most powerful ranking factor of them all, so you not only need IBLs for PageRank, you also need the links to have the right link text. For example, a link to your site that has "click here" as the link text won't do much for your rankings, but the same link with "New York hotels" as the link text will help the site's ranking for 'New York hotels'. |
|
#5
|
|||
|
|||
|
Recently came across a site with problems that also used a secondary search by manufacturer, and in the product pages this list formed the majority of the textual content as seen in the text-only version of the cache.
You might check it's not the case with your site... Last edited by glengara : 05-15-2006 at 03:19 PM. |
|
#6
|
|||
|
|||
|
How new is the site in question? You mention that the spider has visited a couple times, which makes it sound like a relatively new site.
So in addition to the other replies, I'd just add that Google isn't crawling as voraciously as it has in the past. And Matt Cutts recently mentioned that under the new BigDaddy infrastructure, G has a different crawl priority than it used to have. Something to keep in mind.... but the other replies are also helpful and not to be ignored in favor of my "this is just how it is these days" post. |
|
#7
|
|||
|
|||
|
G'Day Atlanta 404,
The above advice is great. Be sure you have plenty of on-topic links to your site, not only to the homepage but also deep into your products; be sure to get a nice spread. Another thing might be to look at the 'uniqueness' of your pages:
- do they have a significant amount of individual content per page?
- do you have a unique meta description on each page?
- do you have a logical internal navigation that themes your site? - see: http://www.webmasterworld.com/forum10003/3060.htm
This Big Daddy confusion seems to have hit some sites really hard; I am sure the above could alleviate most of the indexing issues currently experienced. Also, if your site is new, keep on keeping on: build so many links around the site that SEs cannot ignore you (be sure to vary your anchor text). Cheers, Ben |
|
#8
|
|||
|
|||
|
Quote:
To describe the situation in pre-BigDaddy terms, because I'm not sure exactly what's happening with the new infrastructure... Googlebot will spend only a limited amount of time spidering a given site, and the amount of time it will spend is directly influenced by PageRank. Within that time, Google will crawl only so wide or so deep into your site's structure. Spider-friendly urls might help things by speeding up the crawl. External inbounds going to your landing pages may also help the spiders go deeper into the site by providing deeper entry points. Though unique meta descriptions shouldn't have anything to do with crawling, substantial similarity among pages, usually similarity in page content, will keep them from ranking and may get them dropped. I don't think meta descriptions would do that, but templated pages without much unique content, or very low PR pages with unique content but with identical titles, eg, often will go supplemental. Depending on the size and structure of your site, it may well be that you're going to need a PR of 5 or 6 or 7 to get all your pages indexed and ranking. |
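To check whether templated pages really do share identical titles and descriptions, something like the following sketch could be run over a sample of fetched pages. It uses only Python's standard `html.parser`. The `HeadExtractor` and `find_duplicates` names and the sample pages are hypothetical, and how the HTML is fetched is left out.

```python
from collections import defaultdict
from html.parser import HTMLParser

class HeadExtractor(HTMLParser):
    """Collect the <title> text and the meta description of one page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.description = ""
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            attr = dict(attrs)
            if (attr.get("name") or "").lower() == "description":
                self.description = (attr.get("content") or "").strip()

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data

def find_duplicates(pages):
    """Group URLs that share the same (title, description) pair."""
    groups = defaultdict(list)
    for url, html in pages.items():
        parser = HeadExtractor()
        parser.feed(html)
        groups[(parser.title.strip(), parser.description)].append(url)
    return {key: urls for key, urls in groups.items() if len(urls) > 1}

# Hypothetical pages, for illustration only.
generic = ("<html><head><title>Acme Store</title>"
           "<meta name='description' content='Welcome to our store'></head></html>")
pages = {
    "/product?id=1": generic,
    "/product?id=2": generic,
    "/product?id=3": ("<title>Model 300 flowmeter</title>"
                      "<meta name='description' content='Specs for the Model 300'>"),
}
duplicates = find_duplicates(pages)
print(duplicates)
```

Any group with more than one URL is a candidate for a unique, database-generated title and description. Whether that changes crawling or indexing is exactly what is being debated in this thread.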
|
#9
|
|||
|
|||
|
Quote:
How many large sites with bunches of product pages and funky URLs actually have unique meta descriptions? Bugger all (wow, this could be what Big_D went after...). It's a clear signal that someone can use a database (wow) --> so what! Do these pages deserve to be in an index alongside other pages that are clearly telling the SE what they are about (or even in the index at all)? I think not. Do some tests, Robert_Charlton, and you WILL find that not only will a unique meta description pull some pages out of the supplemental index, but they are also seen as a signal of quality (how could they not be). More effort = more ranking. (tedster noted this at WMW when the Big_D update began {long time back now..} and it makes total sense.) See also a great post about them by Ian McAnerin: http://forums.searchenginewatch.com/...threadid=11444 If Google finds some nice unique content with all the right signals it will crawl substantially deeper; why would it not? You're forgetting to tell the SE what your pages are about, and this will certainly help if they are similar and the bot cannot work it out for itself. Food for thought, Ben |
|
#10
|
|||
|
|||
|
Quote:
As far as testing goes, I think I inadvertently just ran a test. I've got a page on a client site that last week moved up to #5 on Google for a two word phrase with 70-million competing pages... reasonably competitive... and I realized that I'd never tuned the meta-description for this page. It's got the same generic description that I use on the unoptimized pages in the site, which doesn't happen to contain this particular phrase. For this page, Google had been pulling the snippet from the first paragraph. So I tuned the meta description a few days ago, not for ranking purposes but to present a more attractive description for that particular search. My point is that the meta description doesn't seem to have affected the ranking much, but I'll let you know if tuning it gets us to #1. PS - The question of the original poster is essentially about crawling and indexing, though... and here I don't think the meta description affects crawling at all, nor do I think it's a large enough factor to put a page in supplemental. I do agree with you that it's important for selling the page to the searcher. Last edited by Robert_Charlton : 05-16-2006 at 01:33 AM. |
|
#11
|
|||
|
|||
|
We ran a public test a few months ago, and it was found that the Description tag isn't used at all for rankings in Google, but it is used by Yahoo!. I'd run a test a year earlier, and found the same thing.
|
|
#12
|
|||
|
|||
|
I'd agree with Wilksys' comment, don't know about ranking, but from what I've recently read from a number of experienced posters a generic D tag on dynamic sites with low link juice may well hinder spidering/indexing.
|
|
#13
|
|||
|
|||
|
The key here is to be proactive about developing your store. Don't sit back and hope G will index a whole list of similar titles, descriptions and content. You may experience short lived success with a good listing but personally I would focus on the bigger picture and what can be indexed as a whole.
As stated earlier in the discussion, you need to make sure the store is locatable by the spiders. For starters, introduce search-safe URLs and ditch the id='s. A good site linking structure is important for crawling the complete site. PhilC - Thanks for the update btw regarding Matt Cutts; this is something I had guessed may have happened to sharpen up results in the SERPs. This can only be a good thing I feel, do you agree? I am currently holding good positions for a store, for product pages all scripted with unique META info and page content. Without the Description tag I feel this would not be possible; imo the pages' relevancy would decrease. In some cases the description tag is shown in the SERPs; in others it is not, and Google pulls text from the page. Final thoughts - cover all bases, make sure all your pages are valued by G, and don't be lazy. |
|
#14
|
|||
|
|||
|
Quote:
Quote:
|
|
#15
|
|||
|
|||
|
I could well see where with "funky" Urls G would need some enticement to dig deeper, and with generic titles and D tags, it may well require higher than "normal" PR.....
|
|
#16
|
|||
|
|||
|
Quote:
My reference on effort and rankings was about crawlability, not ranking ; ) One leads to the other. Also, while I am certainly no SE engineer myself, I have done a lot of thinking and I am sure the copy taken back to the plex after a crawl is assessed, and the next time the site is crawled it's crawled according to its merits. Hence the better the signals the better the crawl, makes sense no?? |
|
#17
|
||||
|
||||
|
Quote:
When I optimize a database driven catalog site such as described in the original post, with "funky urls" (if I understand what you mean by these), I generally have the programmers apply mod_rewrite to fix the urls, and come up with a database-driven approach to generate optimized titles, descriptions, and content... all together. It would be a waste of everybody's time and resources to just optimize the description. If I'm working on a static site, I'd take care of titles and descriptions at the same time. I do have a site coming up that, in its archives, has hundreds of more or less identical titles, all of which are returned only with the dupe filter off, even though the pages have original content... and I have proposed a series of step tests to determine how much we need to change things to bring relevant content in these pages to user attention. I've got to tell you though, that it's never occurred to me to change only the descriptions, without first changing titles or internal nav links or page headings. That simply doesn't make sense. Quote:
Quote:
I mentioned that BigDaddy might be changing things. I've seen hints from Matt's postings that there are various bots from Google that are now "co-operating" with each other, so the above might make sense. Quote:
And again, returning to the original question, and to several earlier posts on this thread, there are some other well-known factors that need to be taken care of first. Of course, descriptions ought to be fixed. I don't think they're the source of the crawling problem, though. |
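As an illustration of the "fix the URLs" step, here is a small Python sketch that turns a category, product name and numeric id into a path-style URL with no query string. The `slugify` and `friendly_url` helpers and the sample product are hypothetical. The mod_rewrite rule that would map such a path back to the real script is not shown and would depend on the site's setup.

```python
import re

def slugify(text):
    """Lowercase `text` and collapse runs of non-alphanumerics into hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

def friendly_url(category, product_name, product_id):
    """Build a path-style URL; the trailing id keeps the path unique."""
    return f"/{slugify(category)}/{slugify(product_name)}-{product_id}"

# Hypothetical product, for illustration only.
url = friendly_url("Flowmeters", "Model 100 Flow Meter", 42)
print(url)
```

Keeping the numeric id at the end of the path is one possible design choice: the rewrite layer can recover the database key without a lookup by name, while the URL itself carries no "id=" parameter.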
|
#18
|
|||
|
|||
|
Thanks for all of the great posts!
I've checked my site's log files, and Google has been at the very deepest level pages of my site... But for some reason, Google has chosen not to index them. What do you think??? |
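A log check like the one described can be sketched in Python as below. It assumes the server writes the common Apache "combined" log format and identifies Googlebot purely by the user-agent string, which can be spoofed, so treat the counts as a rough indication only. The `googlebot_hits` helper, the paths and the sample log lines are hypothetical.

```python
import re
from collections import Counter

# Matches the request, status, size, referrer and user-agent fields
# of a "combined" format access log line.
LOG_LINE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)

def googlebot_hits(lines):
    """Count requests per path whose user agent mentions Googlebot."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if match and "Googlebot" in match.group("agent"):
            hits[match.group("path")] += 1
    return hits

# Hypothetical log lines, for illustration only.
sample_log = [
    '66.249.66.1 - - [15/May/2006:10:00:00 +0000] "GET /product.asp?id=17 HTTP/1.1" '
    '200 5120 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '10.0.0.5 - - [15/May/2006:10:00:05 +0000] "GET /product.asp?id=17 HTTP/1.1" '
    '200 5120 "-" "Mozilla/4.0 (compatible; MSIE 6.0)"',
    '66.249.66.1 - - [15/May/2006:10:01:00 +0000] "GET /flowmeters.asp HTTP/1.1" '
    '200 2048 "-" "Googlebot/2.1 (+http://www.google.com/bot.html)"',
]
hits = googlebot_hits(sample_log)
for path, count in hits.most_common():
    print(count, path)
```

Comparing the crawled paths against the pages that actually appear in the index shows which pages were fetched but not indexed.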
|
#19
|
|||
|
|||
|
Matt Cutts gave us some new information a few days ago, which concerns this topic. With the Big Daddy update, Google now...
1) intentionally crawls more pages than they will index, so pages will be crawled that won't get into the index.
2) has new criteria about how deep and how often to crawl a site.
It's the reason why many sites are having their pages dumped from the index, and it's the reason why many sites won't get all their pages into the index, even though they may be crawled. You can read what Matt has to say, and the subsequent posts, which include some angry ones from me, at http://www.mattcutts.com/blog/indexi...line/#comments |
|
#20
|
|||
|
|||
|
Read your posts Phil, and couldn't quite see what the excitement was all about, hasn't PR/links always determined site indexing?
Though it's been known for a long time in our circles, it's only relatively recently that G has suggested gaining links in their guidelines; maybe MC was simply reiterating it for a wider audience? |
|
|