Symptoms of Exiting the Sandbox


wheelsoffire
07-06-2005, 11:21 PM
I have a site that has been up since August of 2004. I am getting about 70 visitors a day. Over 90 percent of that traffic comes from MSN. I am always watching my referrers to see if Google will start to refer traffic.

Recently, I've been seeing one or two referrals from Google, but they are always from keyword phrases in quotes. To use AdWords lingo, I'm getting only exact matches. No broad matches.

Is this anything to get excited about? Is it a symptom of getting released from the sandbox?

Thanks

Relevancy
07-07-2005, 12:17 AM
August 2004?? You should be out of the sandbox now. Usually takes about 6-8 months.

Do you have any links pointing to you?
Do you regularly edit the site and add fresh content?
Are there duplicate mirror domains?
Do you at least rank somewhere for your domain name with spaces?

Marcia
07-07-2005, 12:27 AM
There are sites that date back to well before the beginning of the "sandbox era" that have exhibited the same symptoms. That's an indication that the sites are running into filters - which some believe is all the sandbox is, just a set of filters.

wheelsoffire
07-07-2005, 12:49 AM
August 2004?? You should be out of the sandbox now. Usually takes about 6-8 months.

Do you have any links pointing to you?
MSN = 2138 , Y! = 1470, Google = 23

In reality I have links from about 40 unique domains. I have some pretty good ones: one .edu, and I just recently got some good one-way links (a PR 6 and two PR 5s). Google has yet to acknowledge these.

Do you regularly edit the site and add fresh content?
Pretty regularly, but I should probably add content more often.
Are there duplicate mirror domains?
NO
Do you at least rank somewhere for your domain name with spaces?
NO

Relevancy
07-07-2005, 12:58 AM
What about the domain name with no spaces?
Do you have duplicate content at all?
Do you have varied anchor text on the links pointing to you?
Have you tried Google Sitemaps (https://www.google.com/webmasters/sitemaps/login) yet? Make sure there is something new to be seen.
Are your pages in the supplemental index? Use site:www.domain.com in Google to see.

wheelsoffire
07-07-2005, 12:59 AM
There are sites that date back to well before the beginning of the "sandbox era" that have exhibited the same symptoms. That's an indication that the sites are running into filters - which some believe is all the sandbox is, just a set of filters.

I agree. I'm sure my site is filtered because of a specific incident. I lost my best link back in November, when the site (a newspaper) was redesigned.

I'm not sure about rankings because I didn't really keep track back then, but I know my site used to show over 80 for a backlink check in Google. After I lost the link, I went to showing 5 backlinks in Google.

I recently reacquired the lost link. It's a PR 5, one-way, site-wide link. Lots of pages, and growing every day, because they use one template for every page.

When I did, I had a huge increase in traffic from MSN. I still showed no change in Google for about a month, until now, when I'm noticing the referrals for quoted searches. I also went up to 23 results for a backlink check in Google.

Relevancy
07-07-2005, 01:05 AM
Another question:
Do you get nav links, where one domain puts your link on every page in, say, a side nav? Or are your links specific to one page and not purchased?

Read above too for second set of questions

wheelsoffire
07-07-2005, 01:06 AM
What about the domain name with no spaces?
Do you have duplicate content at all?
Do you have varied anchor text on the links pointing to you?
Have you tried Google Sitemaps (https://www.google.com/webmasters/sitemaps/login) yet? Make sure there is something new to be seen.
Are your pages in the supplemental index? Use site:www.domain.com in Google to see.


No, I don't come up for my domain with spaces in Google.

Yes, anchor text is varied.

Yes, I did Sitemaps and my whole site is indexed, whereas it wasn't before that.

When I do site:www.mydomain.com, every single page is indexed.

One other thing I forgot to mention. After I lost the major link, I did a redesign. I had no clue what I was doing, and I changed all of my page extensions from php to html. I lost PageRank on 8 or so pages, and was only able to keep the PageRank of my home page (PR 4).

But that was about 4 months ago.

wheelsoffire
07-07-2005, 01:11 AM
Another question:
Do you get nav links, where one domain puts your link on every page in, say, a side nav? Or are your links specific to one page and not purchased?

Read above too for second set of questions

I have "nav links" or site-wide links on at least 3 sites. I have about 15 to 20 on links pages. I really make sure that they are relevant and not in the sandbox. I quit doing reciprocal links.

Relevancy
07-07-2005, 01:12 AM
You will lose PR if you change the file name. It is just like making a new page. You have to 301 redirect the old URLs to the new URLs to transfer PR.
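As a rough illustration of the old-to-new mapping described above, here is a minimal Python sketch. It assumes the only change was the file extension (.php to .html, as in this thread); the function names and the example URL are hypothetical, and a real site would normally configure the redirect in its web server rather than in application code.

```python
from typing import Dict, Optional, Tuple
from urllib.parse import urlsplit, urlunsplit


def redirect_target(old_url: str) -> Optional[str]:
    """Return the .html URL replacing a renamed .php URL, or None if no redirect applies."""
    parts = urlsplit(old_url)
    if not parts.path.endswith(".php"):
        return None
    new_path = parts.path[: -len(".php")] + ".html"
    # Keep scheme, host, query string and fragment unchanged.
    return urlunsplit((parts.scheme, parts.netloc, new_path, parts.query, parts.fragment))


def build_response(old_url: str) -> Tuple[int, Dict[str, str]]:
    """Return (status, headers): a permanent redirect for old .php URLs, 200 otherwise."""
    target = redirect_target(old_url)
    if target is None:
        return 200, {}
    # 301 tells crawlers the move is permanent.
    return 301, {"Location": target}


print(build_response("http://www.example.com/tires.php"))
```

The point of the 301 status (rather than a temporary redirect) is that it signals the old URL has been permanently replaced by the new one.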

Were your pages in the supplemental index when you did the site: query? If so, it will say "supplemental results" in green next to the URL.

Can you PM the site to me to look at?

Relevancy
07-07-2005, 01:16 AM
Site-wide links are a clear sign of link buying. I would steer clear of them. That might be a reason for being penalized. I had a site dropped when Google cracked down on links a while back.

Relevancy
07-07-2005, 01:20 AM
Stick to free directories and one-way link trading. Meaning, if you have 2 sites, give them a link from a different site you have and not from the site that is getting the link.

wheelsoffire
07-07-2005, 01:22 AM
Site-wide links are a clear sign of link buying. I would steer clear of them. That might be a reason for being penalized. I had a site dropped when Google cracked down on links a while back.

I'm not sure about that. My traffic seriously went from 20-30 visits a day to 80-100 visits per day after I got a PR 5 site-wide link. I've never bought a link, by the way.

wheelsoffire
07-07-2005, 01:25 AM
Stick to free directories and one-way link trading. Meaning, if you have 2 sites, give them a link from a different site you have and not from the site that is getting the link.

I've been hearing bad things about free directories lately, but who knows.

I like that idea for getting one-way links though. I just don't have any other sites with any decent PageRank yet.

Relevancy
07-07-2005, 01:44 AM
Don't worry about PR so much. Just get links from related sites. A PR 1 will turn into a 2, 3, 4, whatever, soon enough. Try to get in-body text links rather than links-page links.

PR-friendly free directories are a good way to build gradual link popularity. I run this site and it is a good way to get a link on a related page. Plus the link is prominent on the page. http://www.topicdirectory.com/addurl.html#topic

You can find other directories that offer good linking features such as anchor text control, a limited number of links per page, passing PR, etc.

wheelsoffire
07-07-2005, 01:52 AM
Don't worry about PR so much. Just get links from related sites. A PR 1 will turn into a 2, 3, 4, whatever, soon enough. Try to get in-body text links rather than links-page links.

PR-friendly free directories are a good way to build gradual link popularity. I run this site and it is a good way to get a link on a related page. Plus the link is prominent on the page. http://www.topicdirectory.com/addurl.html#topic

You can find other directories that offer good linking features such as anchor text control, a limited number of links per page, passing PR, etc.

Thanks, I'll check it out.

I haven't looked at my backlinks lately. It looks like my site is getting picked up by the spammer - scraper sites. 3 of them so far. Nothing I can do about that.

Relevancy
07-07-2005, 01:58 AM
So after all is said and done, it might just be that you are still in the aging delay. :) Maybe non-.com sites take longer to get out than .com sites do. An 11-month aging delay is a little excessive, but not outrageous.

Also try linking out to related authority sites. Outbound linking is making a comeback. If you find a non-competing site that ranks well, link to it as a resource for your site. Google loves this now. And not just on your links page.

wheelsoffire
07-07-2005, 02:03 AM
So after all is said and done, it might just be that you are still in the aging delay. :) Maybe non-.com sites take longer to get out than .com sites do. An 11-month aging delay is a little excessive, but not outrageous.

Also try linking out to related authority sites. Outbound linking is making a comeback. If you find a non-competing site that ranks well, link to it as a resource for your site. Google loves this now. And not just on your links page.

That's good advice. I will definitely do it.

Also, about the age delay: I think it's still affecting me too. I think I prolonged it by (1) losing my only real one-way link at the time, and (2) changing from .php to .html. Hopefully things will turn around soon.

Relevancy
07-07-2005, 02:08 AM
Good luck.

I have a few sites in the aging delay now too. So I know how it is to wait.

It is about relevancy and gradual growth. Optimization is not important if you design clean and grow resourcefully.

Marcia
07-07-2005, 02:19 AM
I'm sure my site is filtered because of a specific incident. I lost my best link back in November, when the site (a newspaper) was redesigned.

Losing a link isn't a filter. A complete redesign is a different story - and again, that wouldn't be what we usually look at as cause for filtering; a lot of sites have experienced this with a major redesign.

Optimization is not important if you design clean and grow resourcefully.

I'm not sure I'm clear on what that's implying. There are plenty of clean, resourcefully rich sites that languish in the hundreds without benefit of search engine traffic their whole lives until they're properly optimized.

Relevancy
07-07-2005, 02:26 AM
I'm not sure I'm clear on what that's implying. There are plenty of clean, resourcefully rich sites that languish in the hundreds without benefit of search engine traffic their whole lives until they're properly optimized.

What I meant by 'clean' is well-presented content and topic clarity. I guess you can consider it optimization to edit the title tag and add h1 tags etc., but to me that is just common clean site design and page clarification. Link development is marketing. Natural link growth comes with resource-rich information, so buying links or trading for links is, to me, just not needed unless the link is used as a resource and not just stuck on a useless links page. I guess I call optimization 'page clarification'.

wheelsoffire
07-07-2005, 02:27 AM
Losing a link isn't a filter. A complete redesign is a different story - and again, that wouldn't be what we usually look at as cause for filtering; a lot of sites have experienced this with a major redesign.

I'm not sure I'm clear on what that's implying. There are plenty of clean, resourcefully rich sites that languish in the hundreds without benefit of search engine traffic their whole lives until they're properly optimized.


My theory about losing the link was that Google gave me some kind of negative points for it. I wish I knew back then what I know now.

Marcia
07-07-2005, 06:18 AM
I've been seeing one or two referrals from Google, but they are always from keyword phrases in quotes

I've been seeing the same thing with a couple of sites, and for some of those phrases my site is way down in the SERPs - it's just odd, but nothing to worry about. One site in question is just starting to get some Google traffic for the very first time ever - it's an older site and it's now coming up for some good searches, too, though not too many yet. Some things just changed; time to wait it out some more.

There are NO negative points and no penalty involved with losing links. It's just losing some links, which happens to everyone - it's a normal, everyday fact of life for sites that links can come and go. As long as there are no irregularities with your behavior, there's no problem.

wheelsoffire, back up a little bit, please - and go back to your first post. Your site is NOT new enough to be sandboxed. If the timing of a loss of traffic coincides with a total redesign of your site - it is NOT any of those other factors, it is the redesign.

You're not alone, lots of people all over are posting about the same thing happening with a major redesign and/or site restructuring. It just takes a bit of time for Google to sort out the details and catch up with a site that's had a major makeover.

It is not a backlinks issue, or any of these other things, or any sudden problems. It's just a matter of time 'til they catch up. No, you are NOT in any aging delay - yours is an entirely different situation. Please, just go on as normal and wait it out.

DaveN
07-07-2005, 06:39 AM
Marcia is right, I'm afraid. You need to sit it out and let Google do its job. You could request a removal of all the old pages and get some new links :)

DaveN

AussieWebmaster
07-07-2005, 09:58 AM
Site-wide links are a clear sign of link buying. I would steer clear of them. That might be a reason for being penalized. I had a site dropped when Google cracked down on links a while back.

Let's try "are generally a sign of link buys"... If you offer a tool that a site wants to link to in its nav, then you are not buying, at least not in Google terms.

AussieWebmaster
07-07-2005, 09:59 AM
Have you looked at the links that you do have?
Are there a bunch of them that come from the same C Block?

wheelsoffire
07-07-2005, 10:04 AM
Marcia is right, I'm afraid. You need to sit it out and let Google do its job. You could request a removal of all the old pages and get some new links :)

DaveN

Now that you mention it, Marcia, you are right. I didn't know enough back then to realize that all the bad things happened when I did the redesign and changed the extensions of all my pages.

Thanks for those encouraging words. I have no problem with waiting, but my original question is....

If anyone has experienced this recently: is the fact that I am starting to see referrals for these quoted / exact phrase searches a sign or symptom of coming out of whatever penalty or filter I got when I did the redesign?

wheelsoffire
07-07-2005, 10:22 AM
Have you looked at the links that you do have?
Are there a bunch of them that come from the same C Block?

Got this using this tool: www .seologs.com/link-analysis-tool/backlinks.html
----------------------------------------------------------------------
1 Unique Educational Domains (*.edu) with 1 Unique C Block Addresses

35 Unique Commercial Domains (*.com, *.net, etc) with 29 Unique C Block Addresses
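For readers unfamiliar with the term, a "C block" here means the first three octets of an IPv4 address, so two hosts share a C block when their addresses differ only in the last octet. A minimal Python sketch of how such a count could be made is below. The domains and IP addresses are made-up documentation examples, not the poster's data, and this is not how the tool mentioned above is known to work.

```python
from collections import defaultdict
from typing import Dict, List


def c_block(ip: str) -> str:
    """Return the first three octets of an IPv4 address, e.g. '192.0.2' for '192.0.2.17'."""
    octets = ip.split(".")
    if len(octets) != 4:
        raise ValueError(f"not an IPv4 address: {ip!r}")
    return ".".join(octets[:3])


def group_by_c_block(domain_ips: Dict[str, str]) -> Dict[str, List[str]]:
    """Group linking domains by the C block of the IP address each resolves to."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for domain, ip in domain_ips.items():
        groups[c_block(ip)].append(domain)
    return dict(groups)


# Hypothetical backlink data (reserved documentation IP ranges).
backlinks = {
    "blog-a.example": "192.0.2.10",
    "blog-b.example": "192.0.2.10",
    "blog-c.example": "192.0.2.10",
    "news.example": "198.51.100.7",
    "school.example": "203.0.113.5",
}

groups = group_by_c_block(backlinks)
shared = {block: names for block, names in groups.items() if len(names) > 1}
print(len(backlinks), "domains in", len(groups), "C blocks")
print("Blocks with more than one linking domain:", shared)
```

Fewer unique C blocks than unique domains simply means some linking domains sit on the same or neighbouring IP addresses, as with the three blogs in this made-up data.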

AussieWebmaster
07-07-2005, 11:02 AM
Got this using this tool: www .seologs.com/link-analysis-tool/backlinks.html
----------------------------------------------------------------------
1 Unique Educational Domains (*.edu) with 1 Unique C Block Addresses

35 Unique Commercial Domains (*.com, *.net, etc) with 29 Unique C Block Addresses

So you have at least 6 link partners coming from the same C Block. That is a large percentage, which could weigh as spam.

wheelsoffire
07-07-2005, 12:12 PM
So you have at least 6 link partners coming from the same C Block. That is a large percentage, which could weigh as spam.

There are 6 sites in 2 IP addresses.

Taking a closer look at the backlinks I see that I have 3 links from blogger sites. They all have the same IP.

There are 3 other related sites that have the same IP.

These are all 1 way links. Not really partner sites. Is there any way that they could really hurt?

Robert_Charlton
07-07-2005, 02:55 PM
These are all 1 way links. Not really partner sites. Is there any way that they could really hurt?

wheelsoffire - Very quick top of my head opinion... From what I'm following, your site has roughly 1000 to 2000 inbounds, from 30 unique IPs, and several of these unique IP links are from blogs. The rest are run of site links.

While I don't know all the details, I'm guessing that, based on this linking pattern, Google is not seeing your site as a good quality site, and that you've probably raised some flags.

If you raise enough flags with Google these days, you are liable to have ranking difficulties. The sitewide redesign might have been enough to do it. Depends on what you changed, how fast, etc.

That you're worried about one link as your one good link, also signals that your site is perhaps walking a fine line. While page html or design templating shouldn't be a problem, if the content of your pages is too similar, that might be a problem as well.

My guess is that if you create some good, solid, unique content... and get a variety of good links from good sites... it's likely that you will start to move back up.

Robert_Charlton
07-07-2005, 03:03 PM
PS - And yes, of course the site needs to be optimized and some of the links need to have good anchor text. The thrust of my post above is that even if the site is optimized, if it has a suspicious linking pattern, it's not liable to do very well.

AussieWebmaster
07-07-2005, 04:22 PM
PS - And yes, of course the site needs to be optimized and some of the links need to have good anchor text. The thrust of my post above is that even if the site is optimized, if it has a suspicious linking pattern, it's not liable to do very well.
I have to agree.

wheelsoffire
07-07-2005, 09:28 PM
As for the "one good link": that was when my site was 1-2 months old. It was a well-established site, a very old and trustworthy link. It really was my only link at the time. I have regained the link and several more good one-way IBLs.


If you raise enough flags with Google these days, you are liable to have ranking difficulties. The sitewide redesign might have been enough to do it. Depends on what you changed, how fast, etc.

I think you may have hit the nail on the head here. I read this Very Very Good Article (http://www.seo-scoop.com/direct_link.cfm?thepost=418) today. After coming back and reading your reply, I have to assume that this was probably what happened to me.

Robert_Charlton
07-08-2005, 12:57 AM
...I read this Very Very Good Article (http://www.seo-scoop.com/direct_link.cfm?thepost=418)...

wof - Thanks for that link. It is a very good article, and confirms other comments I've heard that the engines are using statistical analysis.

Whether you have just one good link or several, you've got to get more. When I see that a client is dependent on just a few good links, I make getting more links a very high priority. To use a physical analogy... you want to build on a very broad, stable foundation, one that won't tip over easily if one supporting link is pulled.

SEO1
07-08-2005, 07:55 AM
Relevancy said:

Site-wide links are a clear sign of link buying. I would steer clear of them. That might be a reason for being penalized. I had a site dropped when Google cracked down on links a while back.

This is where the "sandbox" myth comes about.

I have never seen a client's site sandboxed, largely because I don't build many links, as I feel they are virtually useless in getting front page results. They have their place in my eyes, but they are not as important as they seem to be to others.

As was mentioned in the quote above, Google cracked down on links a while ago and hasn't let up.

Amazingly, webmasters will go on link campaigns to build links to their sites and never realize that they are shooting themselves in the foot.

I spent a bit of time over the last couple of years looking at domains registered in 1995 and 1996, focusing on the "authority" sites in the forward index, and over the course of internet time most of these "grandfathered" sites built about 100 links per year.

On a monthly basis that equals about 8 links per month.

Google is not stupid and has known this fact for quite some time. It is a data mining robot which stores data about your web site in its database. It uses this data for comparison to give your site a score and to strengthen the algorithm.

So now along comes green webmaster "zeewhatanayyo" who builds 100 links to his new web site in a month...

Don't you think Google can see this and then apply a filter to keep the page stuffed down in the SERPs?
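The arithmetic above can be checked in a few lines of Python. This sketch only reproduces the poster's numbers (about 100 links per year for older sites, 100 links in one month for the new site); the function name is hypothetical, and nothing here is a known Google threshold or filter.

```python
def links_per_month(links_per_year: float) -> float:
    """Convert a yearly link-acquisition rate to a monthly one."""
    return links_per_year / 12


# Poster's estimate for long-established sites: about 100 new links per year.
baseline = links_per_month(100)

# The new webmaster in the example gains 100 links in a single month.
new_site_rate = 100

# How many times faster than the baseline the new site is acquiring links.
ratio = new_site_rate / baseline

print(f"baseline: {baseline:.1f} links/month")
print(f"new site: {new_site_rate} links/month, {ratio:.0f}x the baseline")
```

On these figures the baseline is a little over 8 links a month, so the new site is gaining links roughly twelve times faster than the older sites did.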

The other issue is that most people never build traffic to their web sites. With 10,000 websites in existence in any category, what makes anyone think Google will send users to a site nobody visits when there are currently sites receiving 1,000s of unique visitors daily?

Many people want to make the virtual world business model into something that the business would not be in the real world.

Would you recommend a brand new restaurant you have never eaten at to a good friend?

Would you recommend that a brand new medical school graduate perform neurosurgery on your spouse, love interest, or significant other?

No, most of us wouldn't, so why do people feel Google will be any different?

The algorithm consists of 100+ parts to the script. Those who solve the most parts will have their sites listed on the front pages of Google. Those who don't will forever sit in forums discussing the conspiracy against them.

Peace

PhilC
07-08-2005, 08:24 AM
Sorry if this is going a bit off-topic, but I can't help responding to it, and, besides, it needs a response.

The other issue is most people never built traffic to their web sites .....with 10,000 websites in existence in any category what makes anyone think Google will send users to a site nobody visits when there are currently sites receiving 1,000s of unique visitors daily ??

(1) How would Google know about the quantities of traffic to websites? They have a toolbar that is used by a very small percentage of people, so, if they did track people, it wouldn't be much use as a ranking factor - and there'd be uproar if it was ever discovered. They have their serps, and they do gather information about clicks, but that could only show that those already at the top get the traffic, and so they stay more firmly at the top, and those that are not at the top can never get there because they don't get clicks. That's a very bad basis for a ranking factor, and no serious engine is that stupid - not since it failed with DirectHit, anyway. Also, it isn't information about a site's traffic. It's only information about traffic from the particular search engine.

(2) Traffic does not equal relevancy, and search engines seek to display relevant results at the top, and not those that somehow manage to arrange significant traffic.

There are all sorts of other reasons why traffic would be a very bad ranking factor (for instance, thousands of people to one area of a site doesn't mean that the site's other areas and pages merit ranking boosts, and yet the site does get lots of traffic), but I don't think there's any need to list them all.

PhilC
07-08-2005, 08:51 AM
I think you may have hit the nail on the head here. I read this Very Very Good Article (http://www.seo-scoop.com/direct_link.cfm?thepost=418) today.

Excellent find! And excellent information. Even though it's a seriously flawed concept, it could account for many things that we see.

SEO1
07-08-2005, 09:06 AM
Hi Phil

Googlebot crawls the server and, I would think, keeps a rolling hit count. Hits don't equate to traffic, but dividing total hits by the average number of hits generated when a user visits one page gives a good indication of traffic levels.

If they were using the toolbar to count visitors to sites, and toolbar users are only a small percentage as you say, then by assigning a number to that percentage, say 20%, and multiplying by 5 (20 x 5 = 100), they could get enough information to develop an average guesstimate number to serve their purpose.
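The two back-of-the-envelope estimates in this post can be written out as a short Python sketch. It only illustrates the arithmetic being proposed; whether a search engine has access to hit counts at all is disputed later in the thread. The function names and all figures (1,200 hits, 12 hits per page view, 40 sampled visitors) are made up for illustration, and the 20% share is the poster's own "say 20%".

```python
def estimate_page_views(total_hits: int, hits_per_page_view: float) -> float:
    """Estimate page views by dividing raw hits by the average hits one page view generates."""
    if hits_per_page_view <= 0:
        raise ValueError("hits_per_page_view must be positive")
    return total_hits / hits_per_page_view


def extrapolate_visitors(sampled_visitors: int, sample_share: float) -> float:
    """Scale a sample up to the whole population; a 20% sample is multiplied by 1 / 0.20 = 5."""
    if not 0 < sample_share <= 1:
        raise ValueError("sample_share must be in (0, 1]")
    return sampled_visitors / sample_share


# Hypothetical numbers: 1,200 hits, with each page view causing about 12 requests
# (the page itself plus images, stylesheets and so on).
print(estimate_page_views(1200, 12))

# Hypothetical numbers: 40 visitors seen in a sample assumed to cover 20% of users.
print(extrapolate_visitors(40, 0.20))
```

Both estimates depend entirely on the assumed averages, so they give a rough order of magnitude at best.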


There are all sorts of other reasons why traffic would be a very bad ranking factor (for instance, thousands of people to one area of a site doesn't mean that the site's other areas and pages merit ranking boosts, and yet the site does get lots of traffic), but I don't think there's any need to list them all.

1,000s of visitors to one area of a website would only increase the rankings for those pages not the entire site.

If you look at SERP results, the URL at the bottom of the description is the page that is ranked for the matching user keyword query, not the page you land on when you click the link at the top of the description, so the assertion that volume to one part equals improved rankings for the other areas is a bit off.

Google has been using historical data analysis for some time, and in March of this year filed a patent surrounding its use and its implementation in the Google algo.

http://appft1.uspto.gov/netacgi/nph-Parser?Sect1=PTO2&Sect2=HITOFF&p=1&u=%2Fnetahtml%2FPTO%2Fsearch-bool.html&r=1&f=G&l=50&co1=AND&d=PG01&s1=20050071741&OS=20050071741&RS=20050071741

If you click that link and then scroll down to this section you can read more on the importance of traffic:

[0087] Traffic

Or to make it easier I'll post the summary they supply:

[0091] In summary, search engine 125 may generate (or alter) a score associated with a document based, at least in part, on information relating to traffic associated with the document over time.

Seems pretty clearly spelled out to me that they will use traffic to score pages, have done so in the past, and will continue to do so in the future.

Again this is a 3rd party offering a free service, once you are in their establishment they have rights to protect themselves just as any offline business has the right to protect itself (security, employee background checks video surveillance) by wishing to have your site indexed by Google you give them the permission to protect themselves.

Another issue to consider is that Google is now a publicly held company whose shares are trading at close to $300.00 per share. They now have a huge responsibility to protect those shareholders as well as increase shareholder value.

Just ranking any site because it feels like doing so would not offer much in the realm of corporate responsibility or consumer confidence.

They have their serps, and they do gather information about clicks, but that could only show that those already at the top get the traffic, and so they stay more firmly at the top, and those that are not at the top can never get there because they don't get clicks.

If that were true, I and the owner of this site, and others as well, would no longer be in the business of helping websites get to the front pages of search engines such as Google.

Again traffic is just 1 part of the 100 parts to the algo...those who solve the most parts will be ranked on the front pages...those who don't will sit in the forums wondering why.

One other thought: for those of you who are trying to build a business from the traffic sent by Google's results pages, you might want to buy the Sunday newspaper in your city for the employment section... you'll need it soon.

Clint

PhilC
07-08-2005, 09:36 AM
Hi Clint,

First off, that patent application is no indication at all of what Google uses in the algo, or of what they may use in the future. It is nothing more than Google applying for everything they can think of that may or may not be useful to a search engine in the future. It prevents competitors from getting patents on those ideas.

Seems pretty clearly spelled out to me that they will use traffic to score pages, have done so in the past, and will continue to do so in the future.

That's not what the application says. You need to read it again.

Googlebot crawls the server and I would think keeps a rolling hit count. Hits don't equate to traffic but by dividing total hits by average number of hits generated when a user visits one page, give a good indication of traffic levels.

Search engine spiders do not crawl servers. They crawl websites, and they can only do it from links they find on webpages, plus the standard robots.txt, which they always request in case it exists. They have no access to 'hits' data of any kind unless a website makes its logfiles available by linking to them from a webpage, and I know of no websites that do that.
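The point about link-based discovery can be illustrated with a toy crawler in Python. This is a sketch over an in-memory link graph, not a description of how Googlebot is implemented; the page names and the log file path are hypothetical, and the robots.txt request mentioned above is left out for brevity.

```python
from collections import deque
from typing import Dict, List, Set


def crawl(start: str, links: Dict[str, List[str]]) -> Set[str]:
    """Breadth-first crawl: a URL is discovered only if some already-visited page links to it."""
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


# Hypothetical site: each page maps to the URLs it links to.
site_links = {
    "/index.html": ["/tires.html", "/about.html"],
    "/tires.html": ["/index.html"],
    "/about.html": [],
}

# Files that exist on the server, including a log file that no page links to.
files_on_server = {"/index.html", "/tires.html", "/about.html", "/logs/access.log"}

reached = crawl("/index.html", site_links)
unreached = files_on_server - reached
print("crawled:", sorted(reached))
print("never requested:", sorted(unreached))
```

Because no page links to the log file, the crawl never requests it, which is the sense in which a spider "crawls websites" rather than everything stored on the server.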

They could gather data about which pages each toolbar user visits, but (a) there would be uproar if they were found to be doing it, and (b) the amount of data they would need to store, even from the small percentage of people who have the toolbar, would be horrendous, because they would not only need to keep a score for each visit to each page, and keep the score recent so that changes in traffic levels would be reflected in the score, but they would also need to time each visit and store that data as well, because a visit is not the same as a vote. Many pages are visited, and many of them are exited very quickly because they are not suitable for one reason or another.

Again this is a 3rd party offering a free service, once you are in their establishment they have rights to protect themselves just as any offline business has the right to protect itself (security, employee background checks video surveillance) by wishing to have your site indexed by Google you give them the permission to protect themselves.

Another issue to consider is Google is now a publicly held company whose shares are trading at close to $300.00 per share. They now have a huge responsibility to protect those shareholders as well as increase shareholder value.

By just ranking any site because it feels like doing so, would not offer much in the realm of corporate responsibility or consumer confidence.

No argument from me, but those things have nothing to do with using traffic as a ranking factor.

If that were true, I and the owner of this site, and others as well, would no longer be in the business of helping websites get to the front pages of search engines such as Google.

Quite right. So we are agreed that Google can't successfully use click-throughs from their serps to determine any traffic relevancy for ranking purposes.

Again traffic is just 1 part of the 100 parts to the algo...

Sorry, but that's just an opinion, and one without any evidence to support it as far as I can tell.

PhilC
07-08-2005, 09:57 AM
Sorry - I missed these bits:-

1,000s of visitors to one area of a website would only increase the rankings for those pages not the entire site.

Your post said "site", and I responded to that. But it doesn't make any difference as the other things still apply.

If you look at SERP results the URL at the bottom of the description is what page is ranked for the user keyword query that matches, not the page you land on when you click the Link at the top of the description, so the assertion that volume to one part equals improved rankings for the other areas, is a bit off.

You've lost me there. In the serps, the 'printed' URL is the same as the URL that the Title links to, except when Google is updating something, when they use internal URLs instead. Have I misunderstood what you said?

SEO1
07-08-2005, 10:17 AM
Phil

Search engine spiders do not crawl servers. They crawl websites,

Ummm, let's see... where would those webpages be placed in order for the spiders to crawl them?

For a further bit of information, I worked with several large retailers in the mid 1980s to develop POS exception reports which pointed to possible acts of internal theft being committed against these corporations.

We used a data mining script (read: robot, spider) which ran on ASP400 servers to cull the averages of each transaction in several retail department store chains of 2,000+ stores over the course of a few years.

I assure you that if we could store the massive amounts of data we needed to fine-tune these reports, so that only those with a high percentage of accuracy of theft indicators were presented after culling millions of transactions daily, Google most certainly has enough room in their server and distributed network to garner the bits of information they need.

It's also why Google has a forward index and general index. The forward index is used for comparison, which is used to measure new sites against the authority sites.

The sites in the forward index are those that are most likely to be returned in the results, which will match the user's search query most accurately.

Clint

PhilC
07-08-2005, 10:54 AM
Ummm let's see....the websites would be stored on servers.

You see, you don't write exactly what you mean, so it's hardly surprising that there is confusion. You said that Googlebot crawls servers and that it has access to 'hits'. From that, most people would understand that Googlebot is able to crawl the servers themselves, and reach the logfiles where links don't take it, because logfiles are the only places where the 'hits' that you described can be found. But search engine spiders can't do that. They could make educated guesses as to where the logfiles could be found, and request them, but has anyone ever seen a spider requesting a logfile that wasn't linked to? No. It's not surprising, because (a) they just don't request files for which they have no links (except the robots.txt file), and (b) the size that many logfiles get to would make it prohibitive. How are they going to handle logfiles that are many gigabytes long? Using a site's 'hits' is a complete non-starter. You also used the word "site" when you meant "page". It all makes the discussion a bit more difficult.
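
To illustrate point (a), here is a toy sketch in Python of a link-following crawl. It is not Googlebot's actual behaviour; the `crawl` function, the in-memory `site` dictionary and every path in it are made up for the example. The only assumption it models is the one stated above: apart from robots.txt, a spider requests only URLs it has found as links.

```python
from collections import deque

def crawl(pages, start):
    """Breadth-first crawl over a toy site.

    `pages` maps each URL to the list of URLs it links to (hypothetical data).
    Returns the URLs the spider would request, in order.
    """
    requested = ["/robots.txt"]  # the one file requested without a link to it
    seen = {start}
    queue = deque([start])
    while queue:
        url = queue.popleft()
        requested.append(url)
        # Only links found on a fetched page are ever added to the queue.
        for link in pages.get(url, []):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return requested

# Hypothetical site: the logfile exists on the server, but no page links to it.
site = {
    "/": ["/about.html", "/products.html"],
    "/about.html": ["/"],
    "/products.html": ["/about.html"],
    "/logs/access.log": [],
}

requested = crawl(site, "/")
```

In this model the logfile never shows up in `requested`, because nothing points the spider at it.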

I'm sorry, but what you did in the 80s has no bearing on whether or not Google uses traffic as a ranking factor. By comparison to what Google would need to do, your work in the 80s was very very tiny. Even if Google were able to do it, which I fully believe they can't, there is nothing to show that they did it in the past, or that they do it now, or that they will do it in the future - nothing at all.

It's also why Google has a forward index and general index. The forward index is used for comparison, which is used to measure new sites against the authority sites.
The forward index isn't for that purpose at all. I can't imagine where you got that from. Apart from that, it has nothing to do with Google using traffic as a ranking factor.

SEO1
07-08-2005, 11:13 AM
Hi Phil

You've lost me there. In the serps, the 'printed' URL is the same as the URL that the Title links to, except when Google is updating something, when they use internal URLs instead. Have I misunderstood what you said?

No I misspoke.. not enough caffeine I guess..

My only relevancy to the 80s and now is that I do understand very well how datamining robots work.

As you say the googlebot follows links and there are no links to log files..

However I think they have other methods and then there is the Google Sitemaps implementation which welcomes them into your access logs.

Access logs

Locate the following section:

<!-- ** MODIFY or DELETE **
"accesslog" nodes tell the script to scan Apache-style webserver
log files to extract URLs on your site.

Required attributes:
path - path to the file

Optional attributes:
encoding - encoding of the file if not US-ASCII
-->
<accesslog
path="/etc/httpd/logs/access.log" encoding="UTF-8" />
<accesslog
path="/etc/httpd/logs/access.log.0" encoding="UTF-8" />
<accesslog
path="/etc/httpd/logs/access.log.1.gz" encoding="UTF-8" />

This section gives three examples. You should replace these entries and include an entry for each log file. Ensure that path is the complete path and filename on your web server. If the log files are not encoded as US-ASCII or UTF-8, then use the optional encoding attribute to specify the encoding.

The Sitemap Generator assigns priority to URLs it finds in the logs based on how often each URL is accessed. For instance, a URL that has been accessed 100 times will be given a higher priority than a URL that has been accessed twice. The actual priority assignment is relative and depends on each URL as compared to other URLs in the site.
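
As a rough illustration of the idea in the paragraph above, here is a minimal Python sketch that counts how often each URL appears in Apache-style log lines and turns the counts into relative priorities. This is not the Sitemap Generator's actual code: the function names, the sample log lines and the scaling rule (each URL's count divided by the busiest URL's count) are all assumptions made for the example.

```python
import re
from collections import Counter

# Pulls the requested path out of an Apache-style request field,
# e.g. "GET /index.html HTTP/1.1"
REQUEST_RE = re.compile(r'"[A-Z]+ (\S+) HTTP/[\d.]+"')

def count_hits(log_lines):
    """Count how many times each URL path appears in the log lines."""
    hits = Counter()
    for line in log_lines:
        match = REQUEST_RE.search(line)
        if match:
            hits[match.group(1)] += 1
    return hits

def relative_priority(hits):
    """Scale each URL's hit count against the most-accessed URL.

    Hypothetical rule: the busiest URL gets 1.0, everything else a fraction of it.
    """
    if not hits:
        return {}
    top = max(hits.values())
    return {url: round(count / top, 2) for url, count in hits.items()}

# Made-up sample log: /index.html is hit four times, /about.html once.
sample_log = [
    '10.0.0.1 - - [08/Jul/2005:10:00:00 +0000] "GET /index.html HTTP/1.1" 200 1043',
    '10.0.0.2 - - [08/Jul/2005:10:00:05 +0000] "GET /index.html HTTP/1.1" 200 1043',
    '10.0.0.3 - - [08/Jul/2005:10:00:09 +0000] "GET /about.html HTTP/1.1" 200 512',
    '10.0.0.1 - - [08/Jul/2005:10:01:00 +0000] "GET /index.html HTTP/1.1" 200 1043',
    '10.0.0.4 - - [08/Jul/2005:10:02:00 +0000] "GET /index.html HTTP/1.1" 200 1043',
]

priorities = relative_priority(count_hits(sample_log))
```

Note that everything here runs on the webmaster's own machine against the webmaster's own logs; the output is just a per-URL number relative to the other URLs on the same site.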


Can't imagine what else they might be able to determine using this method.

But now we are getting way off the topic..

I get out of Googles Sandbox by logging out of my adwords client center :cool:

Peace

AussieWebmaster
07-08-2005, 11:22 AM
I get out of Googles Sandbox by logging out of my adwords client center :cool:

Peace
There must be some oblique jest in there but I just don't see it.... maybe I have not had enough caffeine yet.

SEO1
07-08-2005, 11:30 AM
Hi Aussie

The Google Sandbox is a tool in Google's AdWords program.

So to get out of the sandbox I log out.

It's the only google sandbox I have seen.


Clint

PhilC
07-08-2005, 11:32 AM
Hi Clint,

The only method that I know of, or can even conceive of, where Google can get to the logfiles is their Sitemaps system, as you said. But that's only a few weeks old, and, of the comparatively tiny number of websites that are using it, only some of them provide the logfile urls, so traffic evaluation from that source could be a possible (but unlikely, imo) future ranking consideration, but it can't be one yet.

And you are right - we've detracted from the topic for too long so...

Sandbox? What sandbox? ;)

SEO1
07-08-2005, 11:43 AM
Phil

Well I see we are on the same page about the damn catbox.....

Aussie - I didn't see that you were in charge of Adwords ... disregard my last post...

I'm Done

Clint

PhilC
07-08-2005, 11:53 AM
er...sorry....just one last little bit...

Google's Sitemaps doesn't give Google the logfile urls - does it? The only thing that sees them is the script that is run on the individual website. I don't know about prioritising urls because of their counts in the logfiles, because I haven't used the system yet, but it would be a long time before such data could realistically be used as a ranking factor.

I'm done too.

AussieWebmaster
07-08-2005, 01:19 PM
Phil

Well I see we are on the same page about the damn catbox.....

Aussie - I didn't see that you were in charge of Adwords ... disregard my last post...

I'm Done

Clint
Hey don't leave... you are offering alternate viewpoints... theoretically they may not be supported by known info... but that does not mean they could not be right... the algorithm has many elements that the public is not privy to, maybe you have one that needs testing.

AussieWebmaster
07-08-2005, 01:23 PM
Hi Aussie

The Google Sandbox is a tool in Google's AdWords program.

So to get out of the sandbox I log out.

It's the only google sandbox I have seen.


Clint
Okay... thought that was the reference, just wanted to make sure.... clever way of negating the belief in the other sandbox!