Website ignored by Google


webecho
09-14-2005, 06:19 AM
This is my first post here and I'm desperately hoping someone can help.
The 'article' below is a little long-winded, but it needs all the details so I don't have to keep saying "thanks, but I've done it already".
Summary: a validating website, with no cheats, using sitemaps and reasonably well linked to, is being completely ignored by Google. I really need some help/advice.

Thanks

Webecho
-------------------------------------------------------------------------

Seaeco.com is an accessible website with validating XHTML and CSS and was designed without the use of frames. It has quality, keyword rich content, quick loading pages and conforms to all web standards save for some of the AAA accessibility guidelines.

Seaeco.com has been submitted to Google, MSN, Yahoo and of course DMOZ.org. The website has incoming links (set up by me) from about 8 websites: 4 added before it went 'live', the rest done at 2 a day over a week or two. There are also links to its test site from CSS forums where I asked for others' opinions about the design and coding. It contains no broken links, and it has a submitted Google XML sitemap as well as a more readable XHTML 'visual' sitemap for real visitors.

Every page has a keyword-rich title and a 'descriptive' description, and I haven't overdone the META keywords by repeating things too often or using too many keywords (from what I understand, Google pretty much ignores this particular META tag anyway). It has been present on the net ready for indexing in this format for nearly 3 months .......


This situation has been confusing and irritating me (and my client) for a while. I understand that Google will only index the website approx every three to four weeks, but judging by the three other websites i have uploaded or redesigned during this period of time, the only conclusion i can come to is it is being 'ignored' for some reason.
Did i make a mistake?

I may have made, what i believe, are a couple of errors submitting this website.

1. While submitting the other websites to DMOZ.org, i got carried away and added seaeco.com before it was actually there, i don't know why - over enthusiastic i guess (and probably tired too!).
2. I have submitted the website to Google a couple of times as well as submitting the seaeco.com/sitemap.xml through Google sitemaps - did i overdo it?
3. When the website was originally posted it contained incoming links from the other websites i have designed, firstly on their attractions pages (all the websites are tourism based in a small area and are related) and secondly on the 'about this website' page, a page which i include in all websites i have designed, promoting my services as a web designer and linking to other 'examples' of my work. Did i make a mistake by having incoming links to seaeco.com before it was actually there?

The other websites i have designed and uploaded during this period have been indexed at least once, with some being indexed three times, they are all of the same 'format' i.e. they are all validating XHTML, CSS accessible, quick loading with relevant content and with incoming links from a variety of websites - all worthwhile incoming links.

I haven't succumbed to the temptation of 'link farms' or even linking from websites with unrelated subjects. I haven't 'cheated', there are none of the old SEO 'tricks' like hidden text coloured to match the background, no stupid keywords etc - i've done it all right! (i think)

------------------------------------------------------------------------

link to website ignored by google (http://members.westnet.com.au/freakmansion/articles/websiteignoredbygoogle.htm)

Rob
09-14-2005, 02:23 PM
how new is this domain?

I ask because when I try a WHOIS lookup it doesn't show an IP address to resolve to, yet I do see that it does resolve.

Also when I WHOIS the IP I don't see the site listed.

If it's not a new domain then perhaps it is a hosting issue?

webecho
09-14-2005, 09:58 PM
Hi Rob
It's been up and about for around 3 months.

I see your point re the IP address, but Yahoo and MSN have found it and rank it quite well for my main keywords "whale watching dunsborough" and "sailing dunsborough".

I may be wrong but I'm assuming that if the others can find it then Google should be able to :confused:

How would i check that it is hosted correctly on the server?



Webecho

Rob
09-15-2005, 11:47 AM
Yes, you are right: if the other engines are finding it then it's not likely a DNS issue. Is Googlebot at least crawling the site? Do you see it in the logs?

Also, did you register this domain as new? Or did you purchase it from someone? I ask because sometimes Google will "hold" a site that was previously registered to ensure it isn't spamming, and that the content is similar to what it was before.

If it's a newly registered domain that too could be the issue - I've seen sites with new domains take months to get included.

I think though the first thing to check is that it is being crawled by Googlebot.

webecho
09-15-2005, 02:07 PM
Hi Rob
Yes it was registered brand new, i have checked wayback machine and there is no record of it previously.

Would i need to contact my host to get the server logs?

Unless it's just one of those waiting games, what's confusing me is that the other sites i have put up have all been found quite quickly and all link to this one (and each other).
I did leave the test site up for a few weeks with identical content, while i was adding some finishing touches so whether that would have an effect i don't know, but the test site has been gone for at least 6 - 8 weeks so......

I'll contact my host and see if i can get the server logs for this site and hope that gives me a clue or two

Thanks Rob, i'll post any results i get and if you think of anything else in the meantime, please let me know


Thanks

Webecho

webecho
09-15-2005, 09:44 PM
Hi Rob
I got this result from my site stats

crawl-66-249-66-42.googlebot.com

Machine Name or RDNS : crawl-66-249-66-42.googlebot.com
Machine IP : 66.249.66.42
Total Visits : 16

I guess that means Google has crawled it 16 times

also this

crawl-66-249-66-212.googlebot.com

Machine Name or RDNS : crawl-66-249-66-212.googlebot.com
Machine IP : 66.249.66.212
Total Visits : 10
Average : n/a

still google just from different IP

So it looks like they have crawled it - i guess there must be a problem somewhere else - would you agree?
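
A rough sketch, not from the thread, of one way to sanity-check stats entries like the two above: it tests whether a reverse-DNS name has the crawl-a-b-c-d.googlebot.com shape and whether the address embedded in the name matches the reported IP. The function name and the stats list are made up for illustration, and this is only a string check; a proper verification would also do a forward DNS lookup on the name and confirm it returns the same IP.

```python
import re

# Shape of the reverse-DNS names shown in the stats above.
CRAWL_HOST = re.compile(
    r"^crawl-(\d+)-(\d+)-(\d+)-(\d+)\.googlebot\.com$", re.IGNORECASE
)


def looks_like_googlebot(rdns_name: str, ip: str) -> bool:
    """Return True if the name is crawl-a-b-c-d.googlebot.com and a.b.c.d == ip."""
    match = CRAWL_HOST.match(rdns_name.strip())
    if match is None:
        return False
    return ".".join(match.groups()) == ip.strip()


# (reverse-DNS name, IP, total visits) as reported in the post above.
stats = [
    ("crawl-66-249-66-42.googlebot.com", "66.249.66.42", 16),
    ("crawl-66-249-66-212.googlebot.com", "66.249.66.212", 10),
]

# Add up visits only for entries that pass the string check.
total_visits = sum(
    visits for name, ip, visits in stats if looks_like_googlebot(name, ip)
)
print(total_visits)  # 26
```

Both entries pass, so the stats suggest 26 visits in total across the two crawler addresses. That says nothing about which pages were requested.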


webecho

Rob
09-16-2005, 12:22 PM
hmmm,

It didn't seem to visit too many times.

Can you tell how deeply it's getting into the site? Or is it requesting the same pages over and over?

Rob
09-16-2005, 12:30 PM
Normally Gbot crawls many more pages many more times per day.

webecho
09-17-2005, 02:38 AM
No Rob
I can't see how deep it's going. I found that info by looking at my livestats, and it can't do it on a page-by-page basis (at least I don't think so). Yeah, just checked again and it doesn't say.

Ironically, I'll go and google for something that might tell me


Cheers Rob

Rob
09-21-2005, 01:09 PM
Hi

Sorry for the delay in responding - I'm out of the office this week and couldn't remember my login :)

If you could get access to your server logs, I'd recommend using a program like AWstats to check and see what Googlebot is requesting.

Since there still don't appear to be any pages indexed, I'm wondering if there's a spidering issue?
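
If a full stats package isn't available, something like the following sketch could answer the "what is Googlebot requesting" question straight from the raw logs. It assumes the server writes the common Apache "combined" log format; the sample lines, sizes and the non-Google IP are invented for illustration and are not taken from the seaeco.com logs. Note that the user-agent string can be faked, so this is a first look rather than proof.

```python
import re
from collections import Counter

# One line of Apache "combined" log format (assumed; check the host's config).
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def googlebot_requests(lines):
    """Count (path, status) pairs for requests whose user agent mentions Googlebot."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match and "googlebot" in match.group("agent").lower():
            counts[(match.group("path"), match.group("status"))] += 1
    return counts


GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

# Hypothetical sample lines, for illustration only.
SAMPLE_LINES = [
    f'66.249.66.42 - - [15/Sep/2005:10:00:00 +0800] "GET / HTTP/1.1" 200 5120 "-" "{GOOGLEBOT_UA}"',
    f'66.249.66.42 - - [15/Sep/2005:10:00:05 +0800] "GET /whalewatching.htm HTTP/1.1" 200 7300 "-" "{GOOGLEBOT_UA}"',
    '203.0.113.9 - - [15/Sep/2005:10:01:00 +0800] "GET /cruise.htm HTTP/1.1" 200 6900 "-" "Mozilla/4.0 (compatible; MSIE 6.0)"',
    f'66.249.66.212 - - [16/Sep/2005:02:30:00 +0800] "GET / HTTP/1.1" 200 5120 "-" "{GOOGLEBOT_UA}"',
]

hits = googlebot_requests(SAMPLE_LINES)
for (path, status), count in sorted(hits.items()):
    print(path, status, count)
```

A crawler that only ever requests "/" and never the inner pages would point to a spidering problem; a spread of paths with 200 responses would point elsewhere.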

Thanks
Rob

webecho
09-21-2005, 02:09 PM
I have found out that the intermediate site that was set up while the DNS numbers were registering is still online. As far as i can work out, any seaeco.com address is pointing to the temporary one not the real one!

I've been on to my server guys and they are sorting it out. I guess I won't know until Google does another crawl.

Unfortunately, in all the messing about with the two sites (deleting, reloading, etc.), all the logs are gone, so I will have to start afresh. PITA, but nothing I can do about it.

I'll let you know how i got on, thanks for all the time you've put into this one mate ...much appreciated eh! :)


Webecho

Rob
09-21-2005, 06:36 PM
No Worries!

Let me know how it goes?

Thanks

webecho
09-21-2005, 11:22 PM
I actually had an email back from Google. Not just the auto-reply pointing me to the Google guidelines page, but one saying that my enquiry was being passed on to the Google engineers for them to look into it!

It's going to be very interesting to see what they have to say...

webecho
09-28-2005, 08:40 AM
I have posted an update to this topic on my new blog
http://www.webechodesigns.com/blog/archive/2005_09_01_archive.html

hopefully we'll get some results


Freddy

webecho
10-23-2005, 01:30 PM
This is a more concise rewrite of an article originally posted here: http://webechodesigns.com/blog/index.htm. Since the original article I have updated it twice; this is intended to include all the information.

Website URL: www.seaeco.com (http://www.seaeco.com)
Problem : Never indexed by Google

www.seaeco.com (http://www.seaeco.com) was originally posted in March 2005. Since that time it has slowly been indexed by all the major search engines EXCEPT Google.


To date (23 October 2005) WebCEO shows it as ranked by 7 major search engines:
MSN, Yahoo, All the Web, Alta Vista, Jayde and Webcrawler
and
All the Web, Alta Vista, Yahoo and MSN report 48, 41, 48 and 1 incoming links respectively, all to the default or index page. They also show between 5 and 15 incoming links to all the other pages on the website.


Designing and submitting
seaeco.com (http://seaeco.com) was created using validating CSS and XHTML and provides original content about Whale Watching in Dunsborough (http://www.seaeco.com/whalewatching.htm), Western Australia, and Yacht cruises in Dunsborough (http://www.seaeco.com/cruise.htm).
It has been submitted by hand to all the major Search Engines as well as approximately 15 web directories including DMOZ and Yahoo.
The titles, headings and image alt text all contain the relevant keywords. There is a reasonable keyword density in the three main pages relating to whale watching and sailing in Dunsborough, with each page using the optimal keywords for its content.
I have not employed ANY Black Hat SEO techniques and have designed this site no differently from a couple of others I created around the same time. The other websites are indexed and ranking well so far.

Seeking advice
While struggling with this over the past few months I have sought a great deal of advice in search engine forums, as I continue to do. I have received many helpful comments and ideas, many of which I have put into practice, all to no avail. I have contacted Google on several occasions regarding seaeco.com and, after a fair wait each time, received two replies: the first stating that the Google engineers were 'looking into it', and the most recent stating that they could confirm that seaeco.com (http://seaeco.com) was NOT being penalized or banned for any reason. Although the last reply was initially comforting, it has left me in a position where, as far as I (and many other experienced eyes who have looked at it) can see, there is absolutely no reason for seaeco.com not to be indexed. I think maybe I would have preferred to find out I was banned or penalized for a specific reason, which I could at least amend and then beg to be let back in.

What have i done so far?
Apart from checking to ensure I had employed no Black Hat SEO techniques, I have researched keywords and optimised the pages as much as possible. I have searched for relevant websites and personally approached them in order to obtain links to seaeco.com, most of which have been successful (remember, just ask nicely!). I have submitted to the major directories with some success, though not DMOZ unfortunately (I think the backlog is a few years by now). All the incoming link text that I have been able to specify uses the relevant keywords; some links are unfortunately stuck with "visit their website" and other vague link text.

I have submitted the Google sitemap (as i do with all my sites), i have been onto the site to update it and am currently in the process of setting up a blog (yes i know i should have done that straight away but i'm pretty new to the game), which will be updated every few days detailing whale sightings in Dunsborough.

What mistakes did i make?
These are mostly conjecture, i did explore the possibility of other mistakes but have slowly discounted them through research into the do's and don'ts of Google optimisation and submission (and lots of helpful advice).

The only one that seems to have hung around is this:
When seaeco.com was being designed, i was also designing three other websites, all of my websites have so far linked to each other, each one has been tourism and Dunsborough related so i figured this was a fair call and have been given no reason to think otherwise.
I did post the other websites before seaeco.com went live. This means there is the possibility that Google spidered the other websites during the week or so when the links pointed nowhere. It has been suggested to me that this may have caused Google to think they were dead links.
However, from further reading and advice, I gather that even if this was the case, on Google's next indexing it would try again, so in theory it should then come across some worthwhile content at the other end of the link and index it as normal.

Other oddities
When searching on Google for "seaeco.com", it often brings up a listing for seaeco.com.au, a dead link which doesn't exist.
Some SEO guys have suggested that there may have been (or still be) a problem with resolving the DNS addresses to the website, which I haven't discounted, but as I have another 4 websites on the same server with exactly the same IP that are not suffering, I am inclined not to follow this path any further. I did contact the host provider, who checked and assured me everything was OK. I am not hugely knowledgeable on the server side of things yet, so haven't been able to 'really' check for myself.
According to my server logs, Googlebot visits seaeco.com, not as often as it visits my other sites but it's been there ...so why did it ignore seaeco.com (http://seaeco.com)?


So that's where it's at ....

The intention of this article is to bring it to the attention of as many people as possible in the hope that someone will either spot the problem or can offer me another path to explore, or, by spreading this article as widely as possible with numerous links to the original site, hope that Google will someday spider it.

Any advice that any of you may have would be fully tried and tested ...fingers crossed!

Robert_Charlton
10-23-2005, 06:13 PM
webecho - Please forgive rushed reply. I've only skimmed your voluminous posts... and I vaguely remember you'd posted elsewhere with similar questions.

Briefly... you say your site is only 3 months old. Google is currently taking anywhere from 6 months to a year to rank new domains. Google indexes sites before it ranks them, and submitting to Google is generally pointless. Google indexes sites that have inbound links from other sites already in its index.

Beyond that, interlinking sites of similar subject matter will often cause Google to drop all but one of them out of the index, or at least not to rank them.

I recommend that you get links to each of your sites from sources completely independent of each other. If all of the sites would plausibly all get links from the same places and nowhere else, then you might consider combining all of your sites into one.

CSS compliance, etc, have little to do with whether the engines will rank or index you.

Robert_Charlton
10-23-2005, 06:24 PM
PS to the above, as I see your question is about spidering: seaeco.com and www.seaeco.com are not seen as the same site by Google.

Your dropping links here to both of those domain variants is only going to confuse the problem, not help it.

Until the engines offer some sort of domain management tools, you need to decide on one version for your site, www or no-www, and set up a 301 permanent server side redirect, in .htaccess, using mod_rewrite, directing all variants to your default version.

Re domain management, read Danny's blog on the subject...
http://blog.searchenginewatch.com/blog/051021-092129

I'm guessing on all of the above. It could be a robots.txt, I don't know and am not about to wade through all the detail. Hope this is helpful.
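
For the robots.txt guess, a quick offline check along these lines might help. It is only a sketch using Python's standard urllib.robotparser; the two rule sets below are made-up examples, not the actual seaeco.com robots.txt, which isn't shown anywhere in this thread.

```python
from urllib.robotparser import RobotFileParser


def googlebot_allowed(robots_lines, url):
    """Return True if the given robots.txt lines let Googlebot fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_lines)
    return parser.can_fetch("Googlebot", url)


# Hypothetical rule sets for illustration.
BLOCKING = ["User-agent: *", "Disallow: /"]  # shuts every crawler out
OPEN = ["User-agent: *", "Disallow:"]        # empty Disallow allows everything

PAGE = "http://www.seaeco.com/whalewatching.htm"

print(googlebot_allowed(BLOCKING, PAGE))  # False
print(googlebot_allowed(OPEN, PAGE))      # True
```

Pasting the site's real robots.txt lines into the list would show whether Googlebot is being turned away before it ever reads a page.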

webecho
10-28-2005, 11:35 PM
Until the engines offer some sort of domain management tools, you need to decide on one version for your site, www or no-www, and set up a 301 permanent server side redirect, in .htaccess, using mod_rewrite, directing all variants to your default version.
Hope this is helpful.

Certainly was helpful, i uploaded the .htaccess yesterday and lo and behold seaeco.com appears on Google this morning!

Although i am very happy seaeco.com has been found, i must admit to kicking myself for not trying this earlier.

RewriteEngine On
RewriteCond %{HTTP_HOST} ^seaeco\.com$ [NC]
RewriteRule ^(.*)$ http://www.seaeco.com/$1 [R=301,L]

That is all it took! 4 little lines of code and seaeco.com is now visible!
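
For anyone wondering what those rules actually do, here is a small Python model of their effect. It is a sketch of the logic only, not how Apache implements it, and the function name is invented. The condition matches the bare host case-insensitively (the [NC] flag), and the rule sends a 301 to the same path on www.seaeco.com. In an .htaccess file the path is matched without its leading slash, which is why the slash is stripped before being appended. Query strings are ignored here to keep it short.

```python
import re

# Mirrors: RewriteCond %{HTTP_HOST} ^seaeco\.com$ [NC]
BARE_HOST = re.compile(r"^seaeco\.com$", re.IGNORECASE)


def redirect_for(host, path):
    """Return (status, location) if the rule would redirect, else None."""
    if BARE_HOST.match(host):
        # Mirrors: RewriteRule ^(.*)$ http://www.seaeco.com/$1 [R=301,L]
        return 301, "http://www.seaeco.com/" + path.lstrip("/")
    return None


print(redirect_for("seaeco.com", "/whalewatching.htm"))
print(redirect_for("www.seaeco.com", "/cruise.htm"))
```

The first call yields a 301 to the www address; the second returns None because the www host doesn't match the condition, so there is no redirect loop.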


Thank you to everyone that's looked at the site for me and tried to help