Help required


kina
08-21-2006, 11:49 PM
I have been learning from posts here over the last few months and have gradually improved my position in the search engines - reaching the first page for a few search terms. But not for the most important ones.

My best result on Yahoo was position 2 for 'Badaling Great Wall' - but that was 2 days ago. Now that page is not in the index at all! Other sub-pages are still in the index, however. The Yahoo search team sent an auto reply saying the site is not banned. Some old 404'd pages are still in the index too - aaargh! But not that page.

Anyway, while I still have a little hair left, maybe it's time to throw open a plea for advice on what might be wrong with my site structure, keyword density etc.

If anyone has the time to take a look at this puzzle it will be greatly appreciated.

On all engines, 'Temple of Heaven' does badly, even though my main page for this is good - or so it seems to me !

My website is in my profile.

Regards,
Steve.

kina
08-22-2006, 12:09 AM
That one page disappearing from the index (well, several actually, but not all by any means) is so surprising - I have not heard of such a thing before. Especially when some pages I removed and 404'd long ago are still in the index!

Anyone have an idea what could be the reason ?

Brian M
08-22-2006, 02:58 PM
Hi Steve,

A "site:" command returns some very odd results, and a very quick look reveals that you have numerous home pages in your root directory, but none of these automatically re-direct robots to the actual home page that you have listed in your profile. There may be others, but I can already see:

index.html
default.html
index.jsp
default.jsp

Before looking any closer, I would recommend that you decide upon one structure for your site and remove all the other confusing pages that might be returned to a robot. An "index" or "default" page is what most browsers (or robots) look for when they check a directory or subdirectory, so having multiple examples like the above can cause extremely sporadic results. This can also cause very strange results over time since duplicate pages are eventually indexed by the robots.

Clean these things up, save your remaining hair, and then be patient...

Brian M

kina
08-23-2006, 12:35 AM
Thanks Brian,

I can't see the other home pages you mention. I tried to load them in a browser and they seem not to be there, so I don't think that is a problem.

However I will explore this further ... Thanks for the pointer.

Steve

kina
08-23-2006, 12:47 AM
On further investigation of Yahoo's index, it seems that since about last Friday Yahoo has been using an older index. A number of new pages were showing last week but not this week, and some old pages that seemed to have been removed are now back.

I assume this wouldn't have happened only to my pages, so has anyone else noticed this?

Could this be because of a problem in the new index and a return to an older one in the meantime, or because there are multiple indexes, or ... ?

Steve

Marcia
08-23-2006, 12:52 AM
"I tried to load them in a browser and they seem not to be there, so i don't think that is a problem."

I think there is a problem, because they are there. Try typing in the domain name with these after

index.html
index.jsp
default.html
default.jsp

They're all there. Clear browser cache and take another look.

kina
08-24-2006, 12:10 AM
I have checked by browser and also looked with my FTP program - only index.html is there, as it should be. As far as I can tell none of the others are there. At least, not in the root. I trust what you say, but why can't I see those files? Even the FTP listing does not show them.

Brian M
08-25-2006, 11:18 AM
The pages that we see are from a custom 404 page, but it is not clear what that page is because it says, "Site has been updated" so both Marcia and I were fooled into thinking that those pages existed. It might be more useful to visitors if that page said, "Page Not found" but that's a minor point because the page itself puts out a 404 server header code, which is excellent and should not confuse the robots.
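A browser alone cannot tell a real page from a custom 404 page, because both display content; only the server's status code differs. The following is a minimal Python sketch of one way to check that code for each candidate home page. The helper names (`status_of`, `describe`, `report`) and the example domain are hypothetical and not taken from the site discussed here.

```python
import urllib.error
import urllib.request

# File names mentioned earlier in this thread as possible home pages.
CANDIDATES = ["index.html", "default.html", "index.jsp", "default.jsp"]


def status_of(url):
    """Return the HTTP status code for url without raising on 4xx/5xx."""
    # HEAD asks for headers only; a few servers reject it (405),
    # in which case a normal GET request would be needed instead.
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code


def describe(status):
    """Turn a status code into a short plain-language verdict."""
    if status == 200:
        return "exists (served with 200)"
    if status in (404, 410):
        return "missing (custom error page or not, robots see it as gone)"
    return "unexpected status %d" % status


def report(base_url, fetch=status_of):
    """Map each candidate file name to a verdict for the given site."""
    base = base_url.rstrip("/") + "/"
    return {name: describe(fetch(base + name)) for name in CANDIDATES}


if __name__ == "__main__":
    # Placeholder domain; replace with the site being checked.
    for name, verdict in report("http://www.example.com").items():
        print(name, "->", verdict)
```

A page that looks normal in the browser but is reported as "missing" here is being produced by the custom 404 handler.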

However, when a visitor (or robot) requests the domain without a page, the index.html page is delivered (as it should) and this page should tell people what the site is all about.

Unfortunately, there is also an index2.html page in the site, which is what the search engines are picking up and displaying as the home page. For example, Google displays this title and snippet:

"The Java IDE
A Java IDE includes code editing highlight, code completion, debug and compile capabilities. Also built-in API help, framework code generator. [Shareware]
www.yoursite.com/ - 13k - Cached - Similar pages"

I'm not sure how this could happen because the cache shows the existing page (and I have only seen this before when cloaking software was used), but I would delete (or 404) the index2.html page and leave the index.html page in place. I would also recommend a "site map" link on the index page so both visitors and robots can find what is in your site.

In addition, there are a lot of pages devoted to JAVA and OOP in your site, so it is not clear what your site is about, and that can work against you in the natural SERPs. For example, if I search for "Beijing JAVA" your site comes up very high in the SERPs (page one in Google and MSN), but I'm not sure that is your goal. If not, I would delete (or 404) all of those pages and leave the Beijing pages in place. The robots have found those pages and will continue to index them unless they put out a 404.

This confusion can be the cause of sinking SERPs, and as more of these pages are added to the index over time, they dilute your site and your original SERPs will sink. It is easy to fault the search engines, but the cause is usually within the site and it just takes time for it to become apparent.

One more thing: in the index.html page there is an additional carriage return and line feed character in the meta description tag. Some robots will exit the tag as soon as they find "invalid" characters like these and look for other elements in the code to use, so you should remove these extra characters:

"<meta name="description"(CR)(LF)
content="See more than 30 places..."
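For anyone who wants to check several pages at once, here is a small Python sketch that finds meta tags split across lines and joins each onto one line. It is an illustration only: the function name `collapse_meta_line_breaks` is made up for this example, and it uses a simple regular expression rather than a full HTML parser. Line breaks between attributes are normally permitted in HTML, so this is a precaution against less tolerant robots rather than a fix for a definite error.

```python
import re

# Matches a whole <meta ...> tag, including any line breaks inside it.
META_TAG = re.compile(r"<meta\b[^>]*>", re.IGNORECASE)

# Matches a run of CR/LF characters plus any whitespace around it.
LINE_BREAK = re.compile(r"\s*[\r\n]+\s*")


def collapse_meta_line_breaks(html):
    """Return html with every meta tag rewritten onto a single line."""
    def fix(match):
        # Replace each line break inside the tag with one space.
        return LINE_BREAK.sub(" ", match.group(0))
    return META_TAG.sub(fix, html)


if __name__ == "__main__":
    sample = '<meta name="description"\r\ncontent="See more than 30 places...">'
    print(collapse_meta_line_breaks(sample))
```

Text outside meta tags is left untouched, so the rest of the page keeps its original layout.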

Sorry for any confusion and the delay in my response. I wish I had more time to look at your site, but other customers are begging for my time...

By the way, your site has very beautiful images so it should be found.

Brian M

kina
08-29-2006, 02:15 PM
Thanks Brian,

I need to move the Java stuff to its own host but can't afford to do so at the moment.

Your point about CRs / LFs in the description is a great one - I will amend appropriately as I update.

Steve