Alt Attributes Appearing as Anchor Text in Text-only Cache
Marcia
06-29-2004, 04:43 AM
This is sure how it's looking right now, in cases where the graphics are being used as anchors for links. They look exactly like regular text links, you can't tell them apart. Assuming the possibility that this is the case and could be how it's being interpreted from Google's end, it raises a few issues - including a couple that have been getting a lot of attention and stirring a bit of controversy over the past few months without having been firmly resolved.
First, people over time have questioned whether graphics with keywords in the alt attribute have as much scoring value as ordinary text links using keywords in the anchor text. Without trying to come to a definitive conclusion, they sure look the same in the text-only cached version of pages; there's no way to differentiate which is which without looking at the page with the graphics.
So then that gets into a few other things: keyword density on the page, number of occurrences of the identical phrase on the pages, and the occurrences of identical anchor text of links, including the alt attribute in graphics, between pages within the site - most particularly if it's heading graphics at the top of the page linking back to the homepage.
I've seen a number of cases over the past few months where when the heading graphic or logo links back to the index page sitewide, all with the same keyword phrase in the alt attribute, that's all that showed in Google's description snippet for the pages - like it got stopped short at that point. And when the next element is a bit of text or H1 that's identical sitewide, it's those two that have shown in the snippet and that's it.
The serious factor is something there's been some debate about. Some believe that all identical anchor text is fine, while others believe that if the identical phrase goes over a certain percentage of total links it can trip a penalty or filter. I'm one of those that believes it can happen, even just within the site itself without regard to inbound links, though inbounds also could possibly make it even worse if there's a problem. I've had it happen to me; that was one of the factors I identified as causing problems with the site.
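To put a number on "percentage of total links" for a single page, here is a minimal Python sketch. It assumes the alt text of a linked graphic counts as that link's anchor text, which is only what the text-only cache suggests, not something Google has confirmed. The class and function names and the sample markup are made up for illustration, and no particular threshold is implied.

```python
from collections import Counter
from html.parser import HTMLParser


class AnchorTextCollector(HTMLParser):
    """Collects one anchor-text string per <a href>, counting linked-image alt text."""

    def __init__(self):
        super().__init__()
        self.anchors = []     # finished anchor texts, one per link
        self._current = None  # fragments of the link being read, or None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and "href" in attrs:
            self._current = []
        elif tag == "img" and self._current is not None:
            # Assumption: alt text of a linked image acts as anchor text.
            self._current.append(attrs.get("alt") or "")

    def handle_data(self, data):
        if self._current is not None:
            self._current.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._current is not None:
            # Collapse whitespace and lowercase so identical phrases compare equal.
            text = " ".join(" ".join(self._current).split()).lower()
            self.anchors.append(text)
            self._current = None


def anchor_text_shares(html):
    """Return {anchor text: fraction of all links on the page using it}."""
    collector = AnchorTextCollector()
    collector.feed(html)
    collector.close()
    counts = Counter(collector.anchors)
    total = sum(counts.values())
    return {text: n / total for text, n in counts.items()} if total else {}


# Hypothetical page: logo link plus a text link share the same phrase.
page = (
    '<a href="/"><img src="logo.gif" alt="Blue Widgets"></a>'
    '<a href="/">Blue Widgets</a>'
    '<a href="/about.html">About us</a>'
    '<a href="/contact.html">Contact</a>'
)
shares = anchor_text_shares(page)
print(shares["blue widgets"])  # 0.5
```

Run across every page of a site, the same tally would show how much of the internal anchor text is one repeated phrase once the logo alt text is included.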
Then there's on-page KWD and number of phrase occurrences. While for Yahoo a certain ideal density and number of occurrences is fine, some believe that for Google it's necessary to whittle down the number of occurrences of an identical phrase on a page to avoid problems.
I don't know who has been counting linked graphic alt attributes in the number of occurrences along with what's visible on the page, like when looking at the cached page with highlighting on, but the picture sure looks different to the naked eye seeing the text-only cache.
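A rough way to see that difference is to count the phrase twice, once in the visible text only and once with image alt text folded in. The Python sketch below is just that comparison. The names and the sample page are hypothetical, it does plain substring matching (so it would also match inside longer words), and it makes no attempt to skip script or style content.

```python
from html.parser import HTMLParser


class PhraseTextExtractor(HTMLParser):
    """Gathers page text, optionally including img alt attributes."""

    def __init__(self, include_alt):
        super().__init__()
        self.include_alt = include_alt
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "img" and self.include_alt:
            alt = dict(attrs).get("alt")
            if alt:
                self.chunks.append(alt)

    def handle_data(self, data):
        self.chunks.append(data)


def count_phrase(html, phrase, include_alt):
    """Count non-overlapping, case-insensitive occurrences of phrase."""
    extractor = PhraseTextExtractor(include_alt)
    extractor.feed(html)
    extractor.close()
    text = " ".join(" ".join(extractor.chunks).lower().split())
    return text.count(" ".join(phrase.lower().split()))


# Hypothetical page: logo link, heading, one sentence, and a graphical rule.
page = (
    '<a href="/"><img src="logo.gif" alt="Blue Widgets"></a>'
    '<h1>Blue Widgets</h1>'
    '<p>We sell blue widgets.</p>'
    '<img src="rule.gif" alt="blue widgets">'
)
print(count_phrase(page, "blue widgets", include_alt=False))  # 2
print(count_phrase(page, "blue widgets", include_alt=True))   # 4
```

On this made-up page the phrase count doubles once the alt text is included, which is the kind of gap the text-only cache makes visible.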
I have a feeling some people may have gone over the top and run into problems because of some of these factors.
Has anyone else been looking at this text-only cache now being shown and had any observations about it?
dannysullivan
06-29-2004, 07:29 AM
Marcia, do you recall when the text only cache option was added? I certainly don't remember seeing it before today, when I went looking after your great post.
For those that aren't clear, do a search, and Google generally gives you the ability to see exactly what it spidered by selecting the Cached link underneath a page's listing.
This cached link has typically brought up a copy of the page, then pulled any images for the page off the page's own web site. That can be annoying if the images have since been deleted or links changed.
Here's an example (http://66.102.11.104/search?q=cache:XBvJ7pr5NuMJ:www.cnn.com/+cnn&hl=en) of the CNN home page, as shown through the cache. Now look at the top of the page. The third line says:
This cached page may reference images which are no longer available. Click here for the cached text only.
Click on that bolded part (it's a link at Google, but the forum software kept messing it up here), and you'll now see the page without images. Then to the point Marcia's making, any images on the page that had ALT text associated with them will be replaced with the ALT text.
For example, the "Powered by Yahoo search" graphic to the right of the CNN search box becomes simply "Powered by" in the text-only cache, as that's all the ALT text that was associated with the logo.
Down at the bottom, you can see what Marcia's talking about. The Fortune logo is also a link to www.cnn.com/fortune. In the text-only cache, the logo is replaced with the logo's alt text and turned into a link, so it looks like this, Fortune: (http://www.cnn.com/fortune/)
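The substitution described above can be imitated with a few lines of Python. This is only a sketch of the visible behavior (images swapped for their alt text, linked images ending up as link text), not Google's actual code. The image file names and the /world/ link are invented for the example.

```python
from html.parser import HTMLParser


class TextOnlyRenderer(HTMLParser):
    """Rough imitation of a text-only view: images become their alt text."""

    def __init__(self):
        super().__init__()
        self.parts = []        # rendered text fragments, in page order
        self.link_href = None  # href of the <a> we are inside, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and "href" in attrs:
            self.link_href = attrs["href"]
        elif tag == "img":
            alt = (attrs.get("alt") or "").strip()
            if alt:  # images without alt text simply disappear
                self._emit(alt)

    def handle_endtag(self, tag):
        if tag == "a":
            self.link_href = None

    def handle_data(self, data):
        text = data.strip()
        if text:
            self._emit(text)

    def _emit(self, text):
        # Inside a link, alt text and ordinary text come out looking the same.
        if self.link_href:
            self.parts.append(f"{text} ({self.link_href})")
        else:
            self.parts.append(text)


def render_text_only(html):
    renderer = TextOnlyRenderer()
    renderer.feed(html)
    renderer.close()
    return " ".join(renderer.parts)


sample = (
    '<img src="yahoo.gif" alt="Powered by">'
    '<a href="http://www.cnn.com/fortune/"><img src="fortune.gif" alt="Fortune:"></a>'
    '<a href="/world/">World news</a>'
)
print(render_text_only(sample))
# Powered by Fortune: (http://www.cnn.com/fortune/) World news (/world/)
```

In the output, the linked Fortune logo and the plain text link are rendered the same way, which is why the two can't be told apart in the text-only view.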
It's been some time since I've gone back and looked at the state of ALT text indexing. This has been primarily because I never advise people to worry about it much for search engine purposes. If a page is text light, adding a bunch of ALT text hasn't seemed to make up for the lack of HTML text. I liken it to running a race with your leg in a cast. You can do it, but the other runners will probably beat you.
I last had in my notes that Google only indexed visible text, so not ALT text. Did a few searches and found a WebmasterWorld.com discussion on ALT text support being dropped (http://www.webmasterworld.com/forum3/8859.htm) last year and another one suggesting that ALT text might be indexed only if it was within an image that was also a link: Alt text inside a link counts (http://www.webmasterworld.com/forum3/14178.htm).
I did some checking today with that CNN page to confirm this. This search, site:www.cnn.com (http://www.google.com/search?q=site%3Awww.cnn.com+%22America+Votes+2004. +Complete+election+coverage%22) "America Votes 2004. Complete election coverage" brings up a description of the CNN home page like this:
Museum re-creates dinosaur breath. • 'Fahrenheit 9/11' breaks records | Video • America Votes 2004. Complete election coverage Complete coverage. ...
www.cnn.com/ - 60k - 27 Jun 2004 - Cached - Similar pages
The fact the part in bold appears shows that Google has indexed the copy. That bold part is also only in the ALT text of a graphic showing at the bottom of the More Top Stories area of the CNN home page, on the right hand side, near the top.
Now when I do this, site:www.cnn.com (http://www.google.com/search?hl=en&lr=&ie=UTF-8&q=site%3Awww.cnn.com+%22powered+by+Yahoo%21%22) "powered by Yahoo!", I get no matches. If any ALT text was being indexed, then the home page would have shown up. But this particular ALT text only appears in a graphic, not a graphical link.
So, it looks like Google's been indexing ALT text as anchor text for over a year, and the text-only cache makes this much easier to now spot.
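For anyone who wants to repeat the check on their own pages, the two searches above follow one pattern: a site: restriction plus the ALT text as a quoted phrase. A small Python sketch that builds such a URL is below. The function name is made up, and the URL format is simply copied from the links in this post, so it may not match what Google accepts at any other time.

```python
from urllib.parse import quote_plus


def site_phrase_query(site, phrase):
    """Build a search URL for an exact phrase restricted to one site."""
    query = f'site:{site} "{phrase}"'
    # quote_plus turns spaces into '+' and escapes ':', '"' and '!'.
    return "http://www.google.com/search?q=" + quote_plus(query)


print(site_phrase_query("www.cnn.com", "powered by Yahoo!"))
# http://www.google.com/search?q=site%3Awww.cnn.com+%22powered+by+Yahoo%21%22
```

Pick a phrase that appears only in the ALT text of a linked graphic, then one that appears only in the ALT text of an unlinked graphic, and compare whether the page comes back.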
rustybrick
06-29-2004, 09:57 AM
Very nice find guys. I also have never seen this text only cache version, and I tend to look at Google's cache often.
Marcia - you brought up excellent points. Do most SEOs have contingency plans? Just in case Google does start ranking based on alternative text or other areas?
semsai
06-29-2004, 10:55 AM
Alt-text is a staple for helping visually impaired visitors (and disabled image browsers) "see" the site. If one has been practicing good, basic, fundamental site design, then alt-text has always been a part of a SEO's tool kit -- regardless of how/what/when Google gives weight to it.
Of course, it is nice to see that employing best practices in site design pays off in Google. That's never a bad thing either :)
David Wallace
06-29-2004, 12:39 PM
>>The serious factor is something there's been some debate about. Some believe that all identical anchor text is fine, while others believe that if the identical phrase goes over a certain percentage of total links it can trip a penalty or filter. I'm one of those that believes it can happen, even just within the site itself without regard to inbound links, though inbounds also could possibly make it even worse if there's a problem. I've had it happen to me; that was one of the factors I identified as causing problems with the site.
This could potentially be a bad thing for sites that use include files for headers, footers, etc. If someone uses an include file for their header and links back to the home page through the logo, using alt attribute text, then this is going to show up the same way on every page, not because one is purposely doing it but because each page is pulling the same include file. Surely Google wouldn't penalize a site such as this, because it is quite a common practice.
On external links, if they were to penalize for duplicate alt attribute text, what would stop one's competitors from linking to your site through a graphic on multiple pages using the same alt attribute text?
It would be interesting to see what GoogleGuy has to say about this. Anyone know if this has been discussed on WMW and if GoogleGuy has had any comments on the matter?
rustybrick
06-29-2004, 01:24 PM
Regarding the internal links (i.e. company logo that links back to the same page with same alt text), I have a solution that works well on dynamic sites.
On some of my dynamic sites, when I do an allinurl:www.domain.com site:www.domain.com most of the results have that "In order to show you the most relevant results, we have omitted some entries very similar to the # already displayed. If you like, you can repeat the search with the omitted results included." message.
The reason some of my sites had this was because the content closest to the top of the page was the alt tag of the logo that linked back to the homepage. All the pages had the same header with same text at the top.
What I did was change the alternative text to read the same as the page title for each unique page. Now the alt text is different at the top and no more messages like "In order to show you the most relevant results, we have omitted some entries very similar to the # already displayed. If you like, you can repeat the search with the omitted results included."
Also, I do not have the same alternative text with a link back to the same page on every page. Not that I feel having the same text linking to the same page over and over again is a bad thing. I only did this because I hated getting "In order to show you the most relevant results, we have omitted some entries very similar to the # already displayed. If you like, you can repeat the search with the omitted results included." :D
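On a dynamic site, the change described above comes down to passing the page title into the shared header instead of hard-coding the alt text. Here is a minimal Python sketch of the idea. The function name, logo path, and home URL are placeholders, not the actual setup on the sites mentioned.

```python
from html import escape


def render_header(page_title, logo_src="/images/logo.gif", home_url="/"):
    """Build a shared header whose logo alt text is the page's own title."""
    alt = escape(page_title, quote=True)  # keep quotes and & from breaking the attribute
    return (
        f'<a href="{escape(home_url, quote=True)}">'
        f'<img src="{escape(logo_src, quote=True)}" alt="{alt}"></a>'
    )


print(render_header("Blue Widgets & Accessories"))
# <a href="/"><img src="/images/logo.gif" alt="Blue Widgets &amp; Accessories"></a>
```

Every page still pulls the same header code, but the first text on each page now differs from page to page.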
Daria_Goetsch
06-29-2004, 01:45 PM
>>The serious factor is something there's been some debate about. Some believe that all identical anchor text is fine, while others believe that if the identical phrase goes over a certain percentage of total links it can trip a penalty or filter. I'm one of those that believes it can happen, even just within the site itself without regard to inbound links, though inbounds also could possibly make it even worse if there's a problem. I've had it happen to me; that was one of the factors I identified as causing problems with the site.
That's really interesting, Marcia, good find. Could cause numerous problems sitewide in Google.
detlev
06-29-2004, 10:27 PM
Hello everyone,
It has long been debated whether the alt text of images is indexed. It seems that it is, and that it is seemingly counted when found in a link. It is important to describe the image properly, as many people surf with images off. I am using a PDA and surfing with images off right now. And that is only one instance of browsing without images on. Please be mindful when marking up your alts. Keywords can belong if they are descriptive of the destination page when used in links. That is good SEO and usability too.
*cheers*
-detlev
AussieWebmaster
06-30-2004, 11:31 AM
All the factors mentioned were good. I like the feature because now you can combine it with the highlight tool in the Google toolbar and get a better picture of your competition... see which ones are using CSS to rework <h1> tags etc. .... how packed their alt tags are.... I will be playing with this for a couple of days... much easier than doing a spider simulator...
Dodger
06-30-2004, 05:05 PM
>>This could potentially be a bad thing for sites that use include files for headers, footers, etc. If someone uses an include file for their header and link back to home page through logo, using alt attribute text, then this is going to show up the same way in every page, not because one is purposely doing it but each page is pulling the same include file. Surely Google wouldn't penalize a site such as this because it is a quite common practice.
I look at it like this: if the Alt attribute is not used in a spammy way, there should be no problems. Two things. One, this is not any different than having a text link that does the same thing by pointing back to your home page or any other page for that matter. Header include files normally have repetitive navigation bars also ... and this is another form of navigation. Some navigation menus are rollover images which also use the Alt text.
Second, if the Alt attribute is used within the W3C guidelines for what it is supposed to be used for (alternative text for non-graphical browsers), then I would think Google will take that into account. Of special note would be the rollover navigation menus that will display the Alt text in lieu of images in text-only browsers.
>>On external links, if they were to penalize for duplicate alt attribute text, what would stop one's competitors from linking to your site through a graphic on multiple pages using the same alt attribute text?
Links can never hurt you. I think this is one of those manual "we have to check it out" types of things that Google will have to perform to ascertain whether the site is doing it for their benefit or to screw somebody else. You can do the same thing with text links also if that were the case.
>>It would be interesting to see what GoogleGuy has to say about this. Anyone know if this has been discussed on WMW and if GoogleGuy has had any comments on the matter?
As far as I know, GG has been pretty much silent lately. Possibly because of the IPO. He did recently come out because of the mouseover event firing a redirect script. The pages were being dropped from the Index. But this had to do with a specific SEO company, and it appeared that things needed to be rectified.
Marcia
07-02-2004, 05:53 PM
>>recall when the text only cache option was added?
Early this week for the first time I noticed it. I've always checked the cache a lot, especially since being so conscious of the number of times exact phrases are used on pages - it's easier to spot with the highlighting; but I'd never noticed this until earlier this week.
It's been in cases where the graphic is a link so far, but I found 3 instances yesterday afternoon of the alt text showing up without the graphics being used to link. It looks like ordinary text in a regular Times font.
One is a site I found at the new MSN beta that's got a JS redirect to a PPC engine and mostly garbled gibberish text on the pages. Checking for the URL at Google, the alt text of the one graphic on the page is showing in the text-only cache and it isn't linked. The site isn't in the regular index, it's all Supplemental Results.
The other two are top-ranking sites for a particular search that are mostly graphical in content on the homepage - visibly, anyway. On one in particular, the alt text for every graphic is showing, with or without being a link; even the graphical "hr's" on the page, which were apparently put there to get more keywords in without interfering with the look of the page. It's keyword-stuffed to the hilt in not only alt text but other elements as well. Some of the page text that isn't graphical is not really visible and consists of links with keyword anchor text. Apparently it works just fine. ;)
This may have been there, but I've never noticed it before - with the full URL of the page in the cache so that it can easily be copied and pasted.
>>>To link to or bookmark this page, use the following url:
I'm not sure what the implications are for scoring with regard to alt text, but it's sure been made convenient for people to spot these things and even easier for some to report what they're catching.
Robert_Charlton
07-06-2004, 02:03 AM
Hi Marcia - Thanks for (emailing) the url of the page you describe. What a foul smelling page. Whew.... :eek:
For the non-linked top graphics, as you describe, the cache is showing the alt text. I've always thought that Google indexed alt text... I think I've seen it in serps... but I never thought they weighted it very heavily.
Where there are graphic links, yes, I am seeing the alt text returned as anchor text in the text cache. This for me could be major news.
I didn't see link title attributes, though, returned as anchor text. There was at least one on the page.
It's a hard page to sort out, however, and I may well have missed it, because there's so much hidden text and so many hidden links.
I found it interesting that the hidden text remains hidden in the text cache... I trust that Google can sort it out.
For me the question would be how much are these alt anchors weighted. Are they weighted as regular anchor text, which would be significant, or are they weighted as regular alt text is (which I feel is so small that it's essentially useless)?
Marcia
07-10-2004, 10:19 PM
There's a very easy-to-see example in this thread here, because it's a very simple, straightforward page:
http://forums.searchenginewatch.com/forum/showthread.php?t=451
If you look at the text cache for the second site the member mentions (not his own), you can easily see the non-linked alt attributes appearing as regular text, though none of those words are highlighted.
It's interesting to note, but at this point it looks like it can be assumed that the value is for usability purposes.
Bojon
07-11-2004, 05:27 AM
When our banner ads are clicked on, they first go to our ad agency, where they pick up a tracking cookie before they are taken to our website.
A person clicking on the banner has no idea this is happening. They click on the banner and arrive at our website.
If we sold Blue Widgets and we had “Click here for Blue Widgets” in the alt tag of the banner… How would Google credit the anchor text?
Would our ad agency start ranking well for Blue Widgets or is Google smart enough to know that WE are the ones that should benefit from the anchor text in the link?
Thanks, :confused:
Marcia
07-14-2004, 06:39 PM
Bojon, that's a completely different issue and what happens can depend on how that redirection from the banner is being done. It really does warrant starting another discussion for the question; there could be some technical issues involved. It's more about who's getting the benefit of the inbound anchor text the way you're asking it, unless I'm interpreting it incorrectly.