Referral ID strings and referrer info.


Black_Knight
01-17-2005, 08:12 PM
Referral tracking is becoming an ever harder game with the growing obsession about (and misidentification of) spyware. A growing number of applications and plug-ins strip referrer info from HTTP headers sent by browsers, making the HTTP referrer less accurate by the day.

Using JavaScript to read the last entry in the browser history for the session would work, but of course, even this is sometimes blocked, and JavaScript is more often turned off than ever. It's less reliable than HTTP referrer info.

The search engines make it harder still. I mean that if you were to create dedicated landing pages to track source, you'd be creating duplicate content and splitting link popularity, and even the addition of a referrerID query string variable creates the same issues. Then on top of that lot, the spider goes and grabs one of those URLs and sends its traffic as some specific referrer it isn't.

Naturally, there are common workarounds of all kinds, but all have limitations.

The 'safest' method is naturally to block all urls that contain referrer info in the URL with the robots exclusion protocol. But that then discounts the links entirely which is incorrect for all parties, including the search engine which wants to accurately assess links.

The 'riskiest' method is to use server-side redirection for spiders, but that is technically cloaking and therefore against the stated guidelines (thus the 'risk'), even though it is done partly in the search engine's own interest: to provide the exact same content but without an erroneous referral ID in the URL.

In-between options used to include a link whose HREF attribute pointed to the URL without referrer identification, combined with a JavaScript 'onClick' event that gave JavaScript-enabled browsers a URL that included the referrer ID. However, as spiders gain the ability to index JavaScript-based navigation, they could follow both versions (and get confused about the meaning), and spiders are very far from being the only user-agents that will have JavaScript disabled.

So what other solutions are available?

Do we need to sort out our own solution and then leave it to the search engines to catch up when they can?
e.g. We might all decide that we'd use a refID=somevalue; query string parameter, which the engines can then look at to see if they'd teach the spider to automatically strip out that one variable, or perhaps even change the value - refID=Google; or refID=Yahoo for instance.
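
To make the idea concrete, here is a minimal sketch of what "stripping out that one variable" could look like on the spider's side. It is only an illustration under stated assumptions: the function name strip_ref_id and the example.com URLs are hypothetical, and nothing here reflects how any search engine actually processes URLs.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode


def strip_ref_id(url, param="refID"):
    """Return url with every `param` query pair removed; other pairs keep their order."""
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    kept = [(key, value) for key, value in pairs if key != param]
    return urlunsplit(parts._replace(query=urlencode(kept)))


# Hypothetical URLs: the tracking parameter goes, the content parameter stays.
print(strip_ref_id("http://www.example.com/?refID=Google"))
print(strip_ref_id("http://www.example.com/page.php?article=121&refID=110"))
```

Because only the agreed parameter is removed, query strings that select real content (such as article=121 above) are left alone.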

Do we need to be petitioning the W3C to create a solution?
e.g. The creation of a new attribute for the anchor tag that need not be specified but, when it is, has to send the base domain name to the server that the link URL is requested from.

What other solutions are there? Is the search engine marketing industry solid enough to start making demands and requesting specifications to make our work better and more efficient and effective?

Nacho
01-18-2005, 03:03 PM
Let's get this thread rolling . . . excellent post Ammon!

Robert_Charlton
01-18-2005, 04:27 PM
Excellent and important post. I think these concluding paragraphs point a direction that organizations like SMA might well consider....

Do we need to be petitioning the W3C to create a solution?
e.g. The creation of a new attribute for the anchor tag that need not be specified but, when it is, has to send the base domain name to the server that the link URL is requested from.

What other solutions are there? Is the search engine marketing industry solid enough to start making demands and requesting specifications to make our work better and more efficient and effective?

One specific makes me think that this might be a politically contentious issue....

...and splitting link popularity...

It's not clear that webmasters, affiliates, and search engines would all want the same resolution. I agree that it is a problem that needs to be addressed, and the anchor tag attribute seems to be a clean way to do it.

Black_Knight
01-18-2005, 06:02 PM
Thank you for your responses, gentlemen. I was beginning to wonder if I were the only one worried about the decreasing accuracy and value of referral info. :)

politically contentious issue
Of course.

Just as whether to include a feature like layers (invented by Netscape) into HTML is a political issue yet the W3C has to find what is best all-round, even though Microsoft are also contributors to the decision.

Search engine marketing can't be an 'us and them' issue between search engines and internet marketers because we're both living in the same street and gain more from being good neighbours than from being feuding neighbours.

The more we can rely on tracking, the more advertisers will be confident in spending on search marketing. That matters as much to the search engines as to us marketers, if not more so.

We can accept that we need to find a balance that doesn't ruin the value of link analysis, because we marketers need good SERPs just as much as the engines themselves. We are all equally interested in keeping customers happy to use search. Those SERPs are the pools we fish from, and as vital to our survival as to the engines. If customers lose faith in Google, that hurts all of us with time and effort invested in Google - both those within Google and those without.

I think our industry can be mature enough to work this through together. Sure, there are fly-by-night spammers thinking only of short-term, quick-turnaround, small-scale money. Just as there are still constant inventions of poor quality search services that hope to grab advertiser money. However, there are serious players in plenty on both sides, who share a vision of the importance and long-term value of search.

We must not go down the route of harming our shared long-term interests just because not all share our vision. What we need instead is to commit to moving forwards in a way that prevents abuse, even if it does help the idiots too.

We shouldn't be accepting the blocking of referrer information without a fight. To do so is to declare that it isn't worth a fight to us.

The first steps could be so simple.

We could simply agree a 'standard' for passing a referrer value, such as proposed with the refID=value query string parameter. Then the engines can look for themselves as to whether it would improve the link analysis, and thereby the SERPs, to have the spiders strip out such a value automatically.

It would take time that way, but the first step is as simple as agreeing now, here, that we will follow this procedure until a better alternative can be found.

In the meantime, we could look to starting a dialog with the search engineers to see if they can help with better ideas. Explain to us their issues, and have them see ours.

Basically, I think it is time. I also think that referrer information is an issue on which we share so much common ground that it would be an ideal ground-breaker. I share your belief that the role of organizations like the SMA groups that are springing up will include exactly these kinds of issues. That's just one of the many reasons I am committed to SMA UK.

AussieWebmaster
01-18-2005, 11:56 PM
Great discussion...
I think the tracking companies would need to be involved in the process also, since the codes they use vary.

FundingPost
01-20-2005, 11:17 AM
This has been a major discussion in my company for over a year - Link popularity VS knowing where our users are coming from. Right now we are opting for tracking. It's important! Where are we spending our money? Which partners are performing? Which keywords are converting to sales?

We use the query string parameter refer=. We have great tracking, but it makes it tough to improve our link popularity. Our main URL's PageRank gets split over 1000 different refer tags.

Put your mouse over any link and look at the bottom status bar to see how this works:
http://www.Fundingpost.com/index.asp?refer=forums.searchenginewatch.com

I would have to agree with Black_Knight - I like the idea of standardizing this to one common tag. It would take us 30 minutes to change this to a decided standard.

AussieWebmaster
01-20-2005, 12:54 PM
Between tracking codes and dynamic pages there is a good area where a set of rules or methods could be established so they do not hinder the engines and make it easier for us to roll these out without the problems.

2ndSite
01-21-2005, 02:59 PM
Standardization: Good. No doubt about it. The beauty of the web from a marketer's perspective is the way one can track marketing efforts through analytics etc. There is no reason to disallow this tracking or impede its progress.

Link Popularity Passing: Questionable
It opens a can of worms. Affiliate programs and link popularity are separate in my books. I can see horrendous affiliate network spam and totally unrelated banner placements by affiliates if this were the case. Not to mention it's hard to track a reward for an affiliate if there is no conversion factor. And let's face it, for many campaigns banner impressions don't really cut it at the end of the day. If you want to pay for link pop, pay, or do the old-fashioned thing and build it with blood, sweat and tears. If you want affiliates, recruit/grow a network and try to harvest it.

Clearly separate in my books.

mcskoufis
01-21-2005, 03:03 PM
Personally I think that it is practically impossible for the Internet marketing community to enter into such an agreement with the search engines. The most important barrier perhaps is that each engine uses its own algorithms and ranking "mechanisms", and therefore getting into such an agreement would involve releasing some of their trade secrets.

Also, if a SERP spammer gets to know this information, he would probably have an easy time getting his p... enlargement website within the top results of unrelated searches.

There's probably a hundred reasons why this "agreement" can't work.

Generally I have been using the net since the old graphicless web back in 1989. Lycos and Yahoo were my first "top destinations", even though there were mainly educational institution websites available and pr0n. I think that my searches produced much better results at the time than they do these days.

I say all this because I want to express my complaint about the poor results I am getting from Google and the rest of the engines. My point is that unless the search engines improve the way they interpret a web site's content, they will continue to get their index spammed.

Also, for all those people who use legitimate means of website promotion and do provide quality, unique content, it is a real pity to rank lower in the results than spammers because they don't post to blogs, don't issue press releases, don't have the time to find links, etc.

Personally, I have contacted Google several times to report a spammer having more than 15 duplicated pages in the top 30 results. I have also told them that "hey, I need to modify my keywords more than 10 times to find the results I need". However, it doesn't seem that my feedback has helped (or has been considered by) Google.

In conclusion, if search technology doesn't get a real improvement, it doesn't matter who is using the best algorithm or has the biggest index; there will always be ways to cheat them. It is similar to email spamming. IT JUST NEVER STOPS!

I think that is something the Internet marketing community can do very little about. It is up to the search engines to provide the innovation they evangelise about. If they look after their searchers more than their advertisers, we might be able to see some light. I find it promising that there is more and more competition between the bigger players in this industry.

Pavlos Skoufis

george
01-21-2005, 04:12 PM
I know it is not perfect, but my opinion is that this is a better way of referring:

www.domain.com/landing.php?=refid.

It seems that the s/e will give less weight to the page landing.php than if it had no ?=refid on the end, but the ? seems to act as a stop, so all pages regardless of the refid are accepted as being the same page.

busboy
01-25-2005, 04:30 PM
I read above of how one guy said that tracking information is more important than search engine rank. He mentioned how his main page's rank is split over 1000 times among all of the referral URLS. So with that in mind, what is the best option for me right now:

http://www.domain.com?ID=956

or..

http://www.domain.com/tracker.php?ID=956

It seems that if I HAVE to divide the rank for a page for over a 1000 different referrals, then maybe it's best to have tracker.php be divided instead of index.php

Does this make sense?

Thanks.

Black_Knight
01-25-2005, 07:58 PM
It seems that if I HAVE to divide the rank for a page for over a 1000 different referrals, then maybe it's best to have tracker.php be divided instead of index.php

Does this make sense?
That's probably an excellent question to raise here because the honest answer is "No, it doesn't make sense."

You see, PageRank is solely and entirely about links. Nothing else. Having links that all go to the same content divided between different URLs means that all those different URLs are each getting some PageRank, instead of the one true piece of content they all point to getting it all.

The problem is with the very purpose of a URL in the first place. URL is an acronym for Uniform Resource Locator. Where it goes wrong is that many of us don't have a uni-form (single form) for links, but rather have Multi-form Resource Locators. This is only becoming more true as dynamic sites become larger, more widespread, etc.

If those links are pointing to tracker.php then they are not pointing to your root-domain index at all, so now rather than get a small percentage of all the links, it would get none. It would be all the different versions of tracker.php URLs that got PageRank directly from the links.
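
As a rough illustration of that point, the sketch below simply counts inbound links per distinct URL string. It is not a PageRank calculation, and the link list and example.com URLs are made up for the example; the only assumption carried over from the post is that each distinct URL is credited separately.

```python
from collections import Counter

# Hypothetical inbound links found on other sites, one entry per link.
inbound_links = [
    "http://www.example.com/",
    "http://www.example.com/tracker.php?ID=956",
    "http://www.example.com/tracker.php?ID=957",
    "http://www.example.com/tracker.php?ID=958",
]

# Each distinct URL string is credited on its own.
links_per_url = Counter(inbound_links)

for url, count in links_per_url.items():
    print(count, url)
```

Three of the four links credit tracker.php variants, one each, and none of those three credit the root index.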

That help, or do you need a bit more explanation to properly grasp the way this works?

busboy
01-26-2005, 03:51 PM
Yes, what you said makes sense. So like the original author of this thread posted, what are we to do? It's too bad google doesn't just strip everything out of the URL after a question mark. So this:

domain.com?IDtracker=100&action=viewUser

Would turn into this:

domain.com

But then if that happened, certain pages would never be indexed, like this one:

domain.com?article=121
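
A small sketch of why the blanket rule fails, using the two example URLs above (strip_query is a hypothetical helper name):

```python
def strip_query(url):
    """Drop the '?' and everything after it."""
    return url.split("?", 1)[0]


urls = [
    "domain.com?IDtracker=100&action=viewUser",  # tracking only
    "domain.com?article=121",                    # selects real content
]

# Both URLs collapse to the same string, so the article page is lost.
collapsed = {strip_query(u) for u in urls}
print(collapsed)
```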

Well, I guess I'll have to keep doing what I'm doing. To make www.oil-testimonials.com really take off, I MUST rely on the use of referrals like this:

http://www.oil-testimonials.com?source=110

Thanks guys!

Black_Knight
01-26-2005, 08:11 PM
You've grasped the problem just fine there now busboy.

The immediate take-away from my opening post would suggest that you change the way you identify those referrals/sources to use refID
e.g. http://www.example.com/?refID=110

Right now, it is neither better nor worse to call it refID= than to call it source=, but my idea is simply to start a standard. If we all start to do it this way, then in time it becomes a de facto standard, and then the search engines can decide for themselves whether ignoring the refID= parameter in a URL query string improves their end analysis of links or not.

We'll have done our part in making it possible for the engines to better understand multi-form resource locators. Whether they decide to follow up or not will be their decision, but at least we'll have proposed a possible solution and done something toward attaining resolution.

AussieWebmaster
01-27-2005, 12:27 AM
BK I agree with the method but it also has to be adopted by the tracking programs so that it is totally uniform.

busboy
01-27-2005, 12:46 PM
Are there any google employees that hang out here? Perhaps we can ask one of them about this. Maybe Google is already working on a solution to this problem?

Thanks.

AussieWebmaster
01-27-2005, 12:57 PM
There are a few Google people who visit the site semi-regularly, but I don't think it is high on their short-term list.
The discussion has been mentioned in some other posts here, and with the adoption of the blog nofollow tag there is hope that moves in this direction will eventually be implemented.

figment88
01-27-2005, 05:56 PM
If you use tracking codes on the url, can't you just 301 mod_rewrite them to preserve pr on one page?

Then if you keep a rewrite log, you will have all your referral information.

I don't do this, so I'm not sure if it would work, but it seems like it should.
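
For what it's worth, the logic of that idea can be sketched as follows. This is plain Python standing in for the server, not mod_rewrite configuration, and the function name, the refID parameter name, the in-memory log, and the example.com URLs are all assumptions made for illustration.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

referral_log = []  # stand-in for the rewrite log mentioned above


def redirect_tracking_url(url, param="refID"):
    """Return (status, location): 301 to the URL without `param`, or 200 if it is absent."""
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    codes = [value for key, value in pairs if key == param]
    if not codes:
        return 200, url
    referral_log.extend(codes)  # record the tracking code before it disappears
    kept = [(key, value) for key, value in pairs if key != param]
    return 301, urlunsplit(parts._replace(query=urlencode(kept)))


print(redirect_tracking_url("http://www.example.com/?refID=tracker"))
print(referral_log)
```

The tracking value is captured at the moment of the redirect, so the referral data survives even though the URL that ends up being served no longer carries it.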

AussieWebmaster
01-28-2005, 11:55 AM
The problem is generally not with how they appear on your site... the url is grabbed at the other sites linking to you....

figment88
01-28-2005, 12:29 PM
the url is grabbed at the other sites linking to you....

Yeah but the 301 tells the SE's that the rewritten page should be used instead of the GET page for indexing.

That way, they should know that
http://www.somesite.com/?ref=tracker

is really the same as
http://www.somesite.com/

AussieWebmaster
01-28-2005, 02:41 PM
Exactly... there has been discussion about working with the engines to put this in place, similar to the recent nofollow tag.

merlin78
02-02-2005, 12:57 AM
So with that in mind, what is the best option for me right now:

http://www.domain.com?ID=956

or..

http://www.domain.com/tracker.php?ID=956

It seems that if I HAVE to divide the rank for a page for over a 1000 different referrals, then maybe it's best to have tracker.php be divided instead of index.php


I recently removed some tracker links (from external sites) because the site I work for was not getting indexed properly in Google. One month the homepage would index fine, the next it was out of the SERPS completely. I believed this to be a problem with duplication of content (as Google saw the homepage and tracking url (default.cfm?id=1) as two different pages). Since I removed the tracker link so far I have had no problems. So I'm all for a standard for tracking referrers as this issue has already caused quite some chaos for me.

AussieWebmaster
02-02-2005, 01:42 AM
If the tracking coded links are the only ones you have then it would present problems but generally you don't rely on them for your inbound links.

merlin78
02-02-2005, 01:50 AM
Yes AussieWebmaster, the tracking coded link was pretty much the only one we had (other links just pointed to the homepage with no tracking id attached to the URL). It was basically set up to see how many referrals we would get from a particular site. In the end it turned out to be a big problem, and I guess I learnt the hard way not to use them for inbound links. Thanks for your advice.

AussieWebmaster
02-02-2005, 12:15 PM
If you are looking for who is sending you traffic at the domain level, you can use your tracking program and search by domain without tracking codes, if it has that option - many do.
If not, look at your log files and use a log analyser to check referrers.
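
If no log analyser is to hand, a rough sketch of the same check might look like this. It assumes the server writes the common "combined" log format, where the referrer is the second-to-last quoted field; the sample lines, addresses and domain names are invented for the example.

```python
import re
from collections import Counter
from urllib.parse import urlsplit

# Combined log format ends with: "request" status bytes "referer" "user-agent"
LINE_RE = re.compile(r'"[^"]*" \d{3} \S+ "(?P<referer>[^"]*)" "[^"]*"$')


def referrer_domains(lines):
    """Count hits per referring host; lines without a usable referrer are skipped."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.search(line.strip())
        if not match:
            continue
        host = urlsplit(match.group("referer")).hostname
        if host:  # "-" (no referrer sent) has no hostname
            counts[host] += 1
    return counts


sample = [
    '192.0.2.1 - - [02/Feb/2005:12:15:00 +0000] "GET / HTTP/1.1" 200 5120 "http://forums.example.org/thread?id=7" "Mozilla/4.0"',
    '192.0.2.2 - - [02/Feb/2005:12:16:00 +0000] "GET / HTTP/1.1" 200 5120 "http://forums.example.org/other" "Mozilla/4.0"',
    '192.0.2.3 - - [02/Feb/2005:12:17:00 +0000] "GET / HTTP/1.1" 200 5120 "-" "Mozilla/4.0"',
]
print(referrer_domains(sample))
```

This only sees visitors whose browsers still send a referrer header, which is the limitation the thread started with.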

Black_Knight
02-02-2005, 02:55 PM
Okay, I've taken a little time, made a few calls, and have some progress and new ideas that should be included in this thread.

Firstly, several tracking solutions currently use a whole raft of variables for tracking referral information. Some have a source variable for the domain or partner that sent the click. Then a separate keyword or campaign variable. Then perhaps another variable to denote the price of the item (for ROI tracking), etc.

This means that many of these URLs may contain three or four different data-pair parameters even before anything needed for a dynamic site. Obviously, it makes sense to try to reduce this, and create a single, structured format for the variables involved.

So, I'm extending what I said before about refID=value so that all related referral/conversion variables can be included in just one parameter.

I propose using a double hyphen ('--') to combine several variables into a single parameter string.

example:

?refID=partner--campaign--price;

which means
?source=overture&keywd=buy_widgets&saleprice=50;
becomes
?refID=overture--buy_widgets--50;
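
A minimal sketch of building and reading such a parameter, assuming the three fields appear in the order partner, campaign, price as in the example above. The function names and the example.com URL are hypothetical, and the trailing ';' shown in the examples is treated as optional.

```python
from urllib.parse import urlsplit, parse_qs

FIELDS = ("partner", "campaign", "price")  # assumed field order


def encode_ref_id(partner, campaign, price):
    """Join the three values into one refID pair using '--' as the separator."""
    return "refID=" + "--".join([partner, campaign, str(price)])


def decode_ref_id(url):
    """Split the refID value of url back into named fields (values stay strings)."""
    query = urlsplit(url).query.rstrip(";")  # tolerate the trailing ';'
    value = parse_qs(query)["refID"][0]
    return dict(zip(FIELDS, value.split("--")))


print(encode_ref_id("overture", "buy_widgets", 50))
print(decode_ref_id("http://www.example.com/?refID=overture--buy_widgets--50;"))
```

One limitation of this sketch: a value that itself contains '--' would be split in the wrong place, so the separator would need to be reserved or escaped.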

The obvious benefits of reducing the number of data-pair variables and creating simpler URLs are hopefully appreciated by all.

The side effect (for good or ill depending on perspective) is that it becomes somewhat easier for people to switch from one tracking solution to another if they wish, which may of course limit the support for this idea from some of the tracking solutions providers (though this has not yet happened, thankfully). The ones most confident that if switching becomes easier they will gain more custom will more likely support the standard, while tracking solution providers that really like having customers locked into a proprietary URL system may be very resistant to losing that lock-in.

However, the first two companies I have spoken to about this matter have been very supportive, so a big, sincere thank you to ClickTracks (http://www.clicktracks.com/) and to NedStat (http://www.nedstat.com/) for seeing the value here, and helping to move this discussion forwards. A thanks too to all the others who've not been able to get back to me on this yet, but who've still expressed a genuine and generous desire to move forward.