|
#1
|
||||
|
||||
|
Googlebot called me on my Timpani Live Help
I just got a call (request for chat) from the Googlebot to my live help system (we use Timpani). This seems very interesting, because it indicates that the crawler was not only able to read the JavaScript but also execute the command to "call."
Does this seem out of the ordinary to anyone? I wonder if we will get "bonus points" for actually answering? By the way the Googlebot is a little rude...didn't even respond to my greeting. |
|
#2
|
||||
|
||||
|
This is most likely someone changing their user agent to Googlebot in their browser.
|
|
#3
|
||||
|
||||
|
Damn, it's gone now, but the system also shows me the IP and registration, and the reg info showed Mountain View, CA, so I think it was legit. I should have printed the chat info.
Last edited by Chris Boggs : 03-15-2006 at 12:49 PM. |
|
#4
|
|||
|
|||
|
I find this really interesting. I've always argued that bots wouldn't read JavaScript - rather than bots /couldn't/ read JavaScript. The danger is that they'll start to loop, or order things, or try and talk to Chris.
However, JavaScript isn't going away. AJAX grows in popularity. It must be tempting for search engines to try to have their cake and eat it: that is, to run JavaScript, but only in an information-discovering way. It sounds like this really was Googlebot too. |
|
#5
|
||||
|
||||
|
I will analyze our log files and see if G was in fact crawling at the time, unless that can be spoofed in the same manner?
|
|
#6
|
|||
|
|||
|
The user agent can be spoofed easily. The IP address is not so easy to spoof. If you spot Googlebot in your logs, run a tracert on the IP and see if you return to Google Inc.'s network.
If you do, then I'll believe it's Googlebot. I'll be interested to know its exact user agent too. We've seen Googlebot/test hit .js files back in early 2005/late 2004, as I recall. |
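A minimal sketch of the IP check described above, with one swap stated plainly: instead of tracert it uses a reverse DNS lookup followed by a forward lookup, which can be scripted. The function names `isGoogleHostname` and `verifyGooglebot` are hypothetical, and the assumption that genuine crawler hosts end in googlebot.com or google.com is mine, not something stated in this thread.

```javascript
// Sketch only: confirm a claimed Googlebot hit with reverse + forward DNS.
// Assumption: genuine crawler hostnames end in googlebot.com or google.com.
const dns = require('dns').promises;

// True when the hostname falls under one of the assumed Google crawl domains.
function isGoogleHostname(hostname) {
  return /(^|\.)(googlebot|google)\.com$/i.test(hostname);
}

// The resolver is injectable so the logic can be exercised without a network.
async function verifyGooglebot(ip, resolver = dns) {
  let hostnames;
  try {
    hostnames = await resolver.reverse(ip); // PTR lookup: IP -> hostnames
  } catch (err) {
    return false; // no reverse record, so nothing to confirm
  }
  for (const hostname of hostnames) {
    if (!isGoogleHostname(hostname)) continue;
    try {
      // Forward lookup: the hostname must map back to the same IP.
      const addresses = await resolver.resolve4(hostname);
      if (addresses.includes(ip)) return true;
    } catch (err) {
      // Forward lookup failed for this hostname; try the next one.
    }
  }
  return false;
}
```

Called as `verifyGooglebot('66.249.64.44').then(console.log)`, this would print true only if both lookups agree. A spoofed user agent alone cannot pass it, because the check never looks at the user agent.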
|
#7
|
||||
|
||||
|
Browser: Googlebot/2.1 (+http://www.google.com/bot.html)
Host address: crawl-66-249-64-44.googlebot.com
Host IP: 66.249.64.44
Country: United States
City: Mountain View
Organization: GOOGLE
World Region: California
Postal Code: 94043
Time Zone: America/Los_Angeles
ISP: GOOGLE
Connection Type: Unknown
(note: I checked...it's legit.)
Last edited by Chris Boggs : 03-15-2006 at 01:19 PM. |
|
#8
|
||||
|
||||
|
Hmm, can you show the code used on your site?
|
|
#9
|
||||
|
||||
|
this should be it.
Quote:
|
|
#10
|
|||
|
|||
|
This doesn't surprise me. I've seen some very interesting crawling coming from the Mozilla Googlebot.
For one, I've seen it submit forms - actually fill them out and hit the submit button. It's also trying to request pages that don't exist, but it's doing so intelligently. I.e., if I have a site with pages named page1.html, page2.html and page3.html, it's also requesting page4.html, page5.html and so on, even if they don't exist. It's also a greedy bot - it's requesting as many as 8 times the pages in a day that the old Googlebot did. I figure, based on the fact that it's built on a modern browser engine, we've only just begun to see what it will be able to do. Just think of all the things your normal browser can handle (CSS, JS, Flash, movies, etc.), and I think this is what the new bot can now, or soon will be able to, handle. |
|
#11
|
|||
|
|||
|
I found out 3 weeks ago that Google is looking at on-page JavaScript on many of my clients' sites, seeing what it thinks are URLs, and trying to go to those pages.
For example, one client uses HitBox, and we have various categories on the site. Categories are defined in the HitBox JavaScript by Code:
var _mlc="/CategoryName"; |
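To illustrate how a crawler could end up treating a value like that as a URL, here is a rough sketch of a text-only scan that pulls path-like string literals out of JavaScript source. This is a guess at the general idea, not Googlebot's actual logic, and `findPathLikeStrings` is a hypothetical helper name.

```javascript
// Sketch only: pull path-like string literals out of JavaScript source.
// Hypothetical helper; it does not reflect how Googlebot really works.
function findPathLikeStrings(jsSource) {
  // A quoted string that starts with "/" and has no spaces or quotes inside.
  const pattern = /(["'])(\/[^"'\s]+)\1/g;
  const found = [];
  let match;
  while ((match = pattern.exec(jsSource)) !== null) {
    found.push(match[2]); // keep the path without its surrounding quotes
  }
  return found;
}

const hitboxSnippet = 'var _mlc="/CategoryName";';
console.log(findPathLikeStrings(hitboxSnippet)); // [ '/CategoryName' ]
```

A scan like this has no idea that `_mlc` is an analytics category rather than a page, which would explain requests for pages that don't exist.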
|
#12
|
|||
|
|||
|
In Chris' example code we can see that there is, in fact, a fully qualified URL in the standard HTML for Googlebot to follow. It depends on whether that would trigger the LiveChat session or not.
That said, I've seen other evidence this week that Googlebot is grabbing more URL data from JavaScript (or rather, evidence that suggests it, as nothing is scientific in this industry), and Mike's example above supports that speculation further. So, as an SEO, your clients have just spent money to change their navigation from JavaScript to non-JavaScript. It cost them, and they're less happy with the way their site looks now. How do you handle this? At this point in time I like to remind myself that there's no evidence, not even this sort of speculative evidence, that there are any promotional benefits to be had from JavaScript-based navigation, and so, right now, it's business as usual. |
|
#13
|
||||
|
||||
|
Right, I would try adding a rel="nofollow" to the URL.
|
|
#14
|
|||
|
|||
|
Yeah. That would be an interesting study.
There are some signs to suggest that spiders do follow "nofollow" links (after all, the concept is to link condom the anchor and have it not count for PageRank), but it's hard to tell whether a spider found a page via some other method (a PageRank request via the toolbar). The LiveChat talk box would be as good a 'closed lab' as any other public page. In fact, there's no way you can add "rel='nofollow'" to a JavaScript variable. I.e., Mike can't take his var _mlc="/CategoryName"; and mark it 'nofollow'. |
|
#15
|
|||
|
|||
|
I agree -- I used the HitBox thing as an example when speaking to another client -- bottom line is that even though Google is looking for links in JavaScript, it doesn't mean it's time to announce "ok, JavaScript for everybody!" -- it just means they're starting, that it's not 100% accurate yet, and that we don't know what the other engines are doing yet, etc.
I think it's gonna be a couple of years before we can tell all our clients to go nuts with JavaScript menus. |
|
#16
|
|||
|
|||
|
Googlebot and msnbot visit Timpani Live help almost on a daily basis at my workplace too.
I tend to "Refuse chat" when I see their hostname displayed. I sent a support request to Timpani regarding this problem, and they never replied..... |
|
#17
|
||||
|
||||
|
OK, thanks everyone. It looks like the answer to my original question of "Does this seem out of the ordinary to anyone?" is mostly a "no." I was told by a developer that this seemed very strange. I am not a developer. According to some here and elsewhere, though, this involves a clear HTML path for the Googlebot, even though it sits within a JavaScript portion of the code? Can someone "dummy it down" for me?
|
|
#18
|
|||
|
|||
|
Quote:
Search engines can follow this. This isn't JavaScript, this is a straight link. Just because there's JavaScript nearby does not stop the spider finding this bit of the code. Now, if we had said something like:
document.write('<a href=\'http://server.iad.liveperson.net/hc/74320687/?cmd=file&file=visitorWantsToChat&site=74320687&byhref=1&imageUrl=http://www.g3group.com/assets/images/chat\'>');
Then that would have been a different story, as JavaScript is actually being used to write that address (it's not in the actual code). I don't think anyone here would be terribly surprised to see that Googlebot is smart enough to extract this fully formed URL from the JavaScript. It seems that the bot is increasingly likely to do so these days (we speculate). |
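As a rough illustration of the point in this post, the sketch below scans raw page text for absolute URLs without executing anything. It finds the same address whether it sits in a plain anchor or inside a document.write() string. The helper name `findAbsoluteUrls` and the example.com address are placeholders of mine, and nothing here is known to match what Googlebot does.

```javascript
// Sketch only: a text scan that never runs the script still sees the URL.
// Hypothetical helper; example.com is a placeholder address.
function findAbsoluteUrls(source) {
  // Stop at whitespace, quotes, backslashes, or angle brackets.
  const pattern = /https?:\/\/[^\s"'\\<>]+/g;
  return source.match(pattern) || [];
}

// The same link, once as static HTML and once written out by JavaScript.
const staticLink = '<a href="http://www.example.com/chat">Chat</a>';
const scriptedLink =
  "document.write('<a href=\\'http://www.example.com/chat\\'>');";

console.log(findAbsoluteUrls(staticLink));   // [ 'http://www.example.com/chat' ]
console.log(findAbsoluteUrls(scriptedLink)); // [ 'http://www.example.com/chat' ]
```

The design choice to match on text alone is what makes the two cases indistinguishable to such a scanner; a URL assembled from pieces at run time would not be found this way.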
|
#19
|
|||
|
|||
|
I'd slap a nofollow on any link that you don't want Googlebot to follow, just to be safe.
|
|
#20
|
|||
|
|||
|
Quote:
|