Search Engine Watch
Old 01-06-2005   #1
fathom
Member
 
Join Date: Jun 2004
Location: Nova Scotia, Canada
Posts: 475
Title Attribute Control Test

Title Attribute

Premise: Google states that 100 variables are considered for ordered rankings. If true, is the title attribute one of them, or is it merely for usability? Let's find out.

Hypothesis

The title attribute [title=""] is known to offer usability value for humans viewing a web page. To date, little is known that suggests the title attribute helps ranked results. This test series intends to provide observations beyond theory or speculation in order to have a reference for conclusions.

Test Series

Six independent tests will be conducted, each using a set of two control pages. One page of the set will have title attributes in all 'point-to' links towards it; the second page will have no title attributes in any 'point-to' links towards it.

Controlled Variables

All pages will use Acirehp as the targeted keyword [Spherica without the leading 'S', spelled backwards] and, where applicable, 's' for the plural [Acirehps]. Neither currently returns any results.

To establish a quality baseline, all set pages will have Acirehp as the title element and 6 times in the body text, each occurrence at the exact same character position from the top left. All link anchors to the page sets will use Acirehp as the anchor.

Page body text [and source code] will have the exact same number of characters, with only the general body text being unique, to avoid duplicate content penalties.
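
A quick way to sanity-check these constraints before publishing a control set is to compare the two page bodies programmatically. The sketch below is only an illustration: the function names and the two sample strings are made up, and it assumes the body text has already been extracted from each page.

```python
import re

KEYWORD = "Acirehp"  # the invented test term from the post


def keyword_offsets(text, keyword=KEYWORD):
    """Return the character offsets at which the keyword starts."""
    return [m.start() for m in re.finditer(re.escape(keyword), text)]


def check_pair(body_a, body_b, keyword=KEYWORD):
    """Check equal length, identical keyword offsets, and non-identical text."""
    return {
        "same_length": len(body_a) == len(body_b),
        "same_offsets": keyword_offsets(body_a, keyword) == keyword_offsets(body_b, keyword),
        "unique_text": body_a != body_b,
    }


# Hypothetical miniature bodies; the real pages are far longer.
page_a = "Acirehp is tested here and ends with Acirehp"
page_b = "Acirehp is probed here and ends with Acirehp"
print(check_pair(page_a, page_b))
```

Any pair that fails one of the three checks would introduce an uncontrolled difference between the two pages of a set.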

Each page set will be on different domains, different hosts, IPs, and each at different general levels of authority in order to predetermine which set will [should] rank above the others of lesser overall authority.

Both pages of a set will have main navigational links towards them [site wide] to provide a fair quality control margin without negative bias.


Series #1

Link Anchors & Attributes identical text

Series #2

Link Anchors as singular word / Attributes as its plural [testing stemming value]

Series #3

Image Anchors & Attributes identical text

Series #4

Image Anchors as singular word / Attributes as its plural [testing stemming value]. As with Series #1, where the text anchor does not have 's', the alt="" attribute does not have it either.

Series #5

Identical to #1 however, in order to remove the chance that 'link position' may bias the 'point-to' links with a page top prominence inducing a false observation - Series #5 reverses the link order.

Series #6

Identical to #3 however, in order to remove the chance that 'link position' may bias the 'point-to' links with a page top prominence or left on page prominence inducing a false observation - Series #6 reverses the link order.

Test Duration: While fresh results will appear within 48 hours, I would expect that 6 months [approximately 2 PR updates] will be needed to make informed observations.

Pre-Test Conclusions:

Series #2 and #4 are believed to be the most promising: out of 12 ranked results, noting terms Acirehp and Acirehps no results are returned by Google, and 'Acirehps' will be in no other element or attribute but the title attribute. As the title attribute is a support tool, 'if used in ordered rankings' it is likely weighted only as a supplement to other variables, meaning that if it is used independently of visible elements to 'spam' results, a nil is returned.

However, it is the author's belief that the title attribute weighs in on relevancy issues. When used in conjunction with other elements (other variables), a positive, enhanced return can be observed.

Major consideration: most often, when a singular occurrence is tested for observation, a nil is returned and noted as 'inconclusive', not because there is no effect but because the level of effect is so small that detectable observations are difficult to make in real-time, real-world results.

This is no different than attempting to observe the effect of a single low quality link [PR1 - PR3] from pages that have 20+ links. There 'IS' an effect; we simply cannot easily see it.

However, if you have 500 such links, observations are noticeable. So it is with this control test: links to the control set pages number between 100 and 500 and are in the range of PR4 - PR6.

Experiment Specifics

To avoid content duplication, each page will contain consecutive text copy from an unpublished article, 'THE EXECUTIVE MONKEY: Considerations for Burnout in the Workplace' by Kelly McCullough, to facilitate unique page content at a 1500 character level per body text per page. Within the text copy Acirehp will reside, being the first word of copy and the last, plus four additional times.

One additional consideration: in all likelihood, ranked results of the control set pages will appear as a listing and an indent for the query Acirehp. If the title attribute has no optimization value, which page becomes the listing should be random.

The most interesting part of this control will be whether the stem query Acirehps returns any results. As it stands, if title attributes have any optimization value, 2 results should return.

NOTE: the domain names will not be linked to, to avoid tainting results. They will appear soon enough.

This thread [and one other] also use the keywords, so they should add additional results for the query without negatively affecting the return.

Last edited by fathom : 01-07-2005 at 12:22 AM.
Old 01-07-2005   #2
orion
Join Date: Jun 2004
Posts: 1,044
Excellent

Excellent procedure for an initial test. I encourage this type of experiment.

One thing to add could be a comparison component, free from noise, as follows:

1. a control set
2. a training set

Then apply the exact procedure to 1 and 2. The training set should be conducted using TREC sample data (check the NIST organization). The control set could be a series of documents and the training set could be a series of terms or passages. Combinations of these are also possible, if done properly. Still, your exp so far is a good start and you are thinking right.

Cheers

Orion
Old 01-07-2005   #3
qwerty
Proudly Title-Free
Join Date: Jun 2004
Location: Somerville MA US
Posts: 134
I'll be watching. As I wrote in the other thread, I don't believe title attributes get indexed, but I'm certainly open to seeing some empirical evidence.
__________________
Bob Gladstein
Raise My Rank
Old 01-07-2005   #4
Mikkel deMib Svendsen
Join Date: Jun 2004
Location: Copenhagen, Denmark
Posts: 1,576
Very good work! I am looking forward to the results ...

Quote:
noting terms Acirehp and Acirehps no results are returned by Google
Well, except at least one, that will soon show up, to "clutter" the results: This page
Old 01-07-2005   #5
greenleaves
Member
Join Date: Jun 2004
Location: San Jose Costa Rica
Posts: 51
Excellent!

Great to see this experiment. I'm gonna be waiting to see the results. I personally think the title IS considered by the SEs, but as pointed out, I think it will only be a very small effect.
Old 01-07-2005   #6
fathom
Member
Join Date: Jun 2004
Location: Nova Scotia, Canada
Posts: 475
Quote:
Originally Posted by orion
Excellent procedure for an initial test. I encourage this type of experiment.

One thing to add could be a comparison component, free from noise, as follows:

1. a control set
2. a training set

Then apply the exact procedure to 1 and 2. The training set should be conducted using TREC sample data (check the NIST organization). The control set could be a series of documents and the training set could be a series of terms or passages. Combinations of these are also possible, if done properly. Still, your exp so far is a good start and you are thinking right.

Cheers

Orion
Good points there orion.

I believe this specific attribute is worthy, since it is a bona fide usability best practice, which makes the potential for SEO far more appealing.
Old 01-07-2005   #7
fathom
Member
Join Date: Jun 2004
Location: Nova Scotia, Canada
Posts: 475
Quote:
Originally Posted by Mikkel deMib Svendsen
Well, except at least one, that will soon show up, to "clutter" the results: This page
Had thought of that... if, for acirehps, this page ranks below either control set, well, that will truly open the debate up to what is actually occurring!
Old 01-08-2005   #8
GoLinks
Professional Weboholic
Join Date: Dec 2004
Location: Israel
Posts: 15
Very interesting experiment, I'll be watching for progression,
would you monitor Google only or other SE as well?
Old 01-08-2005   #9
fathom
Member
Join Date: Jun 2004
Location: Nova Scotia, Canada
Posts: 475
Well it's pretty easy to monitor all, Google will likely show tomorrow early AM, MSN not too far behind.
Old 01-08-2005   #10
orion
Join Date: Jun 2004
Posts: 1,044
IDF

One thing that occurs to me you could include in the experiment is a method for separating out the effect of term uniqueness, which, depending upon results, should convince others of the merits of your exp.

Term uniqueness shows in SERPS via the IDF term

IDF=log(D/d) = inverse document frequency

D=database size
d=documents containing the term

For uncommon terms, IDF is high since d is very low (few documents use the term). For very uncommon and invented terms, this effect is far too pervasive and too strong, introducing bias.

This causes almost any document containing the term to show in the top results when the term is queried. The effect should occur regardless of anything you do to a document; hence causing spurious ranking results.
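
To make the size of this effect concrete, here is a small sketch of the IDF formula above. The index size and document counts are invented purely for illustration, and base-10 logarithms are an assumption, since the post does not state a base.

```python
import math


def idf(total_docs, docs_with_term):
    """Inverse document frequency, IDF = log(D/d), using base-10 logs."""
    return math.log10(total_docs / docs_with_term)


D = 8_000_000_000  # hypothetical index size, chosen only for illustration

examples = [
    ("common term", 50_000_000),
    ("rare term", 5_000),
    ("invented term", 2),
]
for label, d in examples:
    print(f"{label:14s} d={d:>11,d}  IDF={idf(D, d):.2f}")
```

The invented term, found in only a couple of documents, gets by far the largest IDF, which is why nearly any page containing it can surface for that query.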

You may want to devise controls to remove this effect.

Still, your work could inspire others to do more experimentation.

Orion

Last edited by orion : 01-08-2005 at 01:26 PM. Reason: typos
Old 01-08-2005   #11
orion
Join Date: Jun 2004
Posts: 1,044
Some thoughts

The effect of term uniqueness on inverse document frequency (IDF) is not limited to single terms. It can also be induced with very unique phrases (which probably few will search for or care about, anyway). It could be induced with special delimiters as well.

If you elect to use phrases, you may want to avoid such phrases as well; otherwise this effect will mask the effect/observable you intend to measure. Overall, the described uniqueness effect is easy to spot since IDF is very high and d (number of search results) is very small.

This is one of many reasons of why absolute ranking results are often questionable. To me, being #5 out of 40,000 is not the same as being #8 out of 7,000,000.
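
One simple way to put the two positions on a common scale is to divide the rank by the size of the result set. This is only a sketch of that idea; the helper name is made up, and treating rank divided by result count as a comparable measure is an assumption rather than an established metric.

```python
def relative_position(rank, total_results):
    """Fraction of the result set that sits at or above this rank."""
    return rank / total_results


a = relative_position(5, 40_000)
b = relative_position(8, 7_000_000)
print(f"#5 of 40,000    -> top {a:.6%} of results")
print(f"#8 of 7,000,000 -> top {b:.6%} of results")
print(f"ratio between the two: {a / b:.0f}x")
```

On this scale the #8 position in the larger result set is the more selective of the two, even though its absolute rank is lower.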


About your hypothesis

Quote:
Hypothesis

The title attribute [title=""] is known to offer usability value for humans viewing a web page. To date, little is known that suggests the title attribute helps ranked results. This test series intends to provide observations beyond theory or speculation in order to have a reference for conclusions.
Note. I’m not trying to open a textbook now. My intention is to encourage, not discourage, experimentation among seos/sems. I do believe your experiment should inspire others. More initiatives in all fields of search engine experimentation are needed and your exp is a positive step in the right direction. I honestly hope others follow in your steps.

Having taught at 2- and 4- year colleges I know how hard it feels at times when experiments are conducted in a science/computer lab setting. (In our case -as IRs/seos/sems-, the lab setting could be taken for the entire Web).

You may want to reword the above hypothesis, as hypotheses are formulated based on initial observables, not designed to provide observations. You may also want to write a null hypothesis (H0).

Null hypotheses (H0) are normally formulated and then proved or disproved. H0 should be an assertive statement based on observables. For instance, a null hypothesis is something like this:

H0 = "There is no correlation between X and Y".

Note that H0 is stated as a negative. Then the statistical analysis of the experiment should prove or disprove H0 at a given confidence level. If neither, then one may want to think about using bigger sample data, more replicates, reducing the confidence level or about reformulating a different experiment altogether.

For example, if I want to test whether there is a correlation between the title attribute of a given tag(s) and SERPS for several replicates and at a given confidence level (e.g., at a 90%-95% level; see t-test tables), I would state something like this:

H0 = "There is no correlation between the title attribute and SERPS"

Then, if and only if the results disprove H0, I would probably have a good case; i.e. that there is a correlation between title attributes and SERPS (or SERPS improvement). Then I’m in a position of making other stat tests, this time for inference and prediction.


Orion

Last edited by orion : 01-08-2005 at 09:33 PM.
Old 01-08-2005   #12
fathom
Member
Join Date: Jun 2004
Location: Nova Scotia, Canada
Posts: 475
Google started indexing but most of the control pages have not been included as yet.

I would think that once they are, most of the 'link from' pages will be dropped from results.

Interestingly enough, both pages of the control set on spherica are indexed, with no results appearing for 's' from the 'phrase.html' page.

But let's wait and see. I would expect to see 'only' the most relevant pages from a domain to appear in either case.
Old 01-08-2005   #13
Dave Hawley
Please remove heart from sleeve before replying
Join Date: Nov 2004
Location: Australia
Posts: 573
Isn't this the sort of thing SEO professionals would do all the time? In other words, surely the answer is already known?
Old 01-09-2005   #14
fathom
Member
Join Date: Jun 2004
Location: Nova Scotia, Canada
Posts: 475
Hey 'brand does work'!

Over 200 queries for acirehp @ spherica in the last hour!
Old 01-10-2005   #15
fathom
Member
Join Date: Jun 2004
Location: Nova Scotia, Canada
Posts: 475
Notes:

4 of the control sets appear as title="" as listing, no attribute as indent.

1 control set showing reverse [interesting that the no attribute page didn't snippet the listing description as the other 9 pages did - but still early]. This is text links denoting stemming with 's' in attribute (Series #2).

1 control set not showing control pages as yet (Series #5).

No control results showing for stemming value.
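
One hedged way to read these early counts against the 'random placement' expectation is an exact binomial (sign) test. The sketch below is an illustration only and not part of the original test plan: it assumes each visible control set is an independent 50/50 draw if the title attribute has no effect, and the helper name binomial_tail is made up for the example.

```python
from math import comb


def binomial_tail(successes, trials, p=0.5):
    """P(X >= successes) for X ~ Binomial(trials, p)."""
    return sum(
        comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# Counts taken from the notes above: 5 sets are visible so far,
# and 4 of them show the title="" page as the listing.
visible_sets = 5
title_page_listed_first = 4

one_sided = binomial_tail(title_page_listed_first, visible_sets)
# Doubling the tail is valid here because the p = 0.5 case is symmetric.
two_sided = min(1.0, 2 * one_sided)
print(f"one-sided p = {one_sided:.4f}, two-sided p = {two_sided:.4f}")
```

With 4 of 5 sets the one-sided p-value is 0.1875, so under these assumptions the early counts alone would not reject a no-effect hypothesis at a 90% confidence level.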
Old 01-12-2005   #16
andrewgoodman
Join Date: Jun 2004
Location: Toronto
Posts: 637
Not so fast

I'm no scientist, but:

Given that SEO 101 would likely involve such tests, especially where the #1 truth-in-labeling attribute, the title tag, is concerned... wouldn't you think that a search engine company would by now anticipate this, and make it more difficult to run this kind of experiment? By making the independent impact of the title tag per se a non issue (a red herring), and substituting a different sort of multivariate test which *involves* the title tag?

Eg. :

- Does the title tag appear to significantly match the rest of the findable text content on the page? If so, allow it to carry x weight. If not, give it (less) weight.

- Does the title tag appear to significantly match any of the anchor or surrounding text of incoming links? If so, pay attention to text in tag. If not, give it less weight.

Something like that.

I am saying there are probably ways to make this more complicated, and more difficult to run controlled experiments, by testing for relationships, not just for static qualities.

And another possible bit of fun might be a commerciality filter:

- On a scale, how commercial does the text in the title tag (or for that matter, the rest of the page) appear to be? Are there strong matches with popular, expensive AdWords keywords? If those words are worth more than $2/click, rank the page lower.

I realize I'm not making a whole lot of sense, but throw this out there for the sake of argument: what flaws might there be in a simple experimental design, if one assumes that Google is already assuming you know how to run simple experiments and adds complexity to prevent this?
Old 01-12-2005   #17
andrewgoodman
Join Date: Jun 2004
Location: Toronto
Posts: 637
Moreover...

Again, completely hypothetical, but let's say Google includes in Algo du Jour a requirement that pages meet fourteen tests of authenticity:

- Title is confirmed in meaning by at least two other elements on page

- Due to recent spam, keyword density on query should not exceed x (extremely high number)

- At least two of the links pointing to the site on which this page is hosted are PR 5 or higher

- no instance of deceptive practice Q

- no instance of deceptive practice W

- no mention anywhere on site of blacklisted company from list maintained by Google

- page does not mention anything to do with reciprocal linking or other farming related info

- page has been in the index for at least three months

- page has not received more than one spam report

- and several other things

If the fourteen (hypothetical) tests are not satisfied, the page is considered a potential crap page. Non-crap pages rank highest for query, and all crap pages are ranked with a significant degree of randomness introduced to foil optimizers.

If no non-crap pages exist (i.e. all pages on that query fail to meet at least some tests), the least crappy pages do rank better in general, but on the whole, a significant degree of randomness is introduced to foil optimizers.

Well, that might be too complicated for Google, but if I were them, that's the kind of thing I'd be trying to do, since everyone and his uncle would be trying to reverse-engineer me to find out simple stuff like how important a title tag is. I would want to make it nearly impossible to determine the independent impact of any element at any given point in time.
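
Purely to illustrate the hypothetical above, and not as a description of anything Google is known to do, the idea can be sketched as a two-tier sort in which pages that fail a check have their scores blurred by random noise. Every name, score, and noise level here is invented for the example.

```python
import random


def rank_pages(pages, noise=0.5, seed=None):
    """Pages passing every check rank first by score; the rest are
    ordered by score plus random noise."""
    rng = random.Random(seed)
    passing = [p for p in pages if p["passes_all_tests"]]
    failing = [p for p in pages if not p["passes_all_tests"]]
    passing.sort(key=lambda p: p["score"], reverse=True)
    failing.sort(key=lambda p: p["score"] + rng.uniform(-noise, noise), reverse=True)
    return passing + failing


# Hypothetical pages with made-up relevance scores.
pages = [
    {"url": "a", "score": 0.9, "passes_all_tests": False},
    {"url": "b", "score": 0.4, "passes_all_tests": True},
    {"url": "c", "score": 0.8, "passes_all_tests": False},
    {"url": "d", "score": 0.6, "passes_all_tests": True},
]
ranked = rank_pages(pages, seed=1)
print([p["url"] for p in ranked])
```

Under a scheme like this, two failing pages that differ in a single element could swap places from one run to the next, which is exactly what would make a one-variable experiment hard to read.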
Old 01-12-2005   #18
fathom
Member
Join Date: Jun 2004
Location: Nova Scotia, Canada
Posts: 475
Quote:
Originally Posted by andrewgoodman
I'm no scientist, but:

Given that SEO 101 would likely involve such tests, especially where the #1 truth-in-labeling attribute, the title tag, is concerned...
Well I guess we should first address the 'slang' use of TITLE TAG

<META NAME="DC.Title" CONTENT="This is a title tag">
<meta name="Title" content="This is a title tag">
<meta http-equiv="Title" content="This is a title tag">

<title>This is a title element</title>

<a title="this is a title attribute inside of a link element" href="">Anchor Text</a>

I know you know this andrewgoodman and so do many others but we often slang the 'title' and new members will be confused.
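
For anyone who wants to see the distinction mechanically, the sketch below uses Python's standard html.parser module to separate the three kinds of 'title' in the markup shown above. The TitleFinder class and its attribute names are made up for this example, and the rule for matching meta names is a simplifying assumption.

```python
from html.parser import HTMLParser


class TitleFinder(HTMLParser):
    """Collect the three things loosely called a 'title' in an HTML page."""

    def __init__(self):
        super().__init__()
        self.meta_titles = []       # <meta name="Title" ...> style tags
        self.title_element = ""     # text inside <title>...</title>
        self.title_attributes = []  # title="" attributes on <a> elements
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            name = (attrs.get("name") or attrs.get("http-equiv") or "").lower()
            if name.endswith("title"):
                self.meta_titles.append(attrs.get("content", ""))
        elif tag == "a" and "title" in attrs:
            self.title_attributes.append(attrs["title"])

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title_element += data


page = """
<meta name="Title" content="This is a title tag">
<title>This is a title element</title>
<a title="this is a title attribute inside of a link element" href="">Anchor Text</a>
"""

finder = TitleFinder()
finder.feed(page)
finder.close()
print("meta:", finder.meta_titles)
print("element:", finder.title_element)
print("attribute:", finder.title_attributes)
```

Only the last of the three, the title attribute on the link, is what this control test is about.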

Quote:
wouldn't you think that a search engine company would by now anticipate this, and make it more difficult to run this kind of experiment? By making the independent impact of the title tag per se a non issue (a red herring), and substituting a different sort of multivariate test which *involves* the title tag?

Eg. :

- Does the title tag appear to significantly match the rest of the findable text content on the page? If so, allow it to carry x weight. If not, give it (less) weight.
Well sure, it is possible, but at the same time what makes this any different from searching and adapting to 'limited competitive terms' and adding content pages that capture a few once-only queries?

I guess my question back on this: 'if I told no one, and monitored results internally only', could Google [or any search engine] automatically determine the difference between 'acirehp' as a test of some type and a new product/service established as a brand?

Quote:
- Does the title tag appear to significantly match any of the anchor or surrounding text of incoming links? If so, pay attention to text in tag. If not, give it less weight.

Something like that.

I am saying there are probably ways to make this more complicated, and more difficult to run controlled experiments, by testing for relationships, not just for static qualities.

And another possible bit of fun might be a commerciality filter:

- On a scale, how commercial does the text in the title tag (or for that matter, the rest of the page) appear to be? Are there strong matches with popular, expensive AdWords keywords? If those words are worth more than $2/click, rank the page lower.

I realize I'm not making a whole lot of sense, but throw this out there for the sake of argument: what flaws might there be in a simple experimental design, if one assumes that Google is already assuming you know how to run simple experiments and adds complexity to prevent this?
Interestingly enough, I come from a complementing environment that helps.

In submarines, when you couldn't actually see what was coming at you, you relied on compounding evidence:

1. if it talks like a whale

2. responds like a whale

3. moves like a whale

4. and pings like a whale

It usually is a whale...

But was wrong a few times!!!

Last edited by fathom : 01-12-2005 at 03:43 AM.
Old 01-12-2005   #19
Dave Hawley
Please remove heart from sleeve before replying
Join Date: Nov 2004
Location: Australia
Posts: 573
Quote:
I realize I'm not making a whole lot of sense...
On the contrary, your post makes the most sense, to me at least.

I'm still astounded that, with all the SEO professionals here, the answer is not already known. Kind of makes me lose faith in the whole industry.
Old 01-12-2005   #20
fathom
Member
Join Date: Jun 2004
Location: Nova Scotia, Canada
Posts: 475
Quote:
Originally Posted by andrewgoodman
Well, that might be too complicated for Google, but if I were them, that's the kind of thing I'd be trying to do, since everyone and his uncle would be trying to reverse-engineer me to find out simple stuff like how important a title tag is. I would want to make it nearly impossible to determine the independent impact of any element at any given point in time.
I agree with a lot of this, and it is one reason why I forced relevancy through many different 'known variables' via title element, on-page text, file name, links to page, etc., and didn't rely on the title attribute on the page itself.