Search Engine Watch

Old 11-09-2006   #1
Heineken
Member
 
Join Date: Oct 2006
Posts: 7
Heineken is on a distinguished road
PDF files are not indexed?

I often see Google indexing the content of PDF files.

My site has PR 4, and the pages that contain the links to the PDF files are indexed too. The PDF files are in the same directory as the pages that link to them.

Why is the content of the PDF files not indexed in Google?
Old 11-09-2006   #2
mcanerin
 
Join Date: Jun 2004
Location: Calgary, Alberta, Canada
Posts: 1,564
mcanerin has a reputation beyond repute
The most common reason for this is that people create PDFs from scans or some other image. Search engines don't index the content in pictures or scans.

Ian
__________________
International SEO
Old 11-09-2006   #3
Heineken
Member
 
Join Date: Oct 2006
Posts: 7
Heineken is on a distinguished road
The document only contains text.
Old 11-09-2006   #4
mcanerin
 
Join Date: Jun 2004
Location: Calgary, Alberta, Canada
Posts: 1,564
mcanerin has a reputation beyond repute
How was the text generated? Which PDF maker was used? What was the process?

This information will help me help you.

Ian
__________________
International SEO
Old 11-09-2006   #5
Heineken
Member
 
Join Date: Oct 2006
Posts: 7
Heineken is on a distinguished road
It's created by htmldoc 1.8.23.

In the PDF reader it is possible to select text and copy and paste it into WordPad.
So I assume that the document is not based on images.
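
The copy-and-paste check can also be scripted. What follows is a rough Python sketch, not a real PDF parser. It assumes the page content sits in uncompressed or Flate-compressed streams (an assumption about the file, not something confirmed for htmldoc output), and the function name pdf_has_text_operators is made up for illustration. It looks for the Tj and TJ operators that PDF uses to draw text; a PDF built only from scans normally has none.

```python
import re
import zlib


def pdf_has_text_operators(data):
    """Heuristic: True if any stream in the raw PDF bytes draws text."""
    # Walk over every "stream ... endstream" body in the file.
    for match in re.finditer(rb"stream\r?\n(.*?)endstream", data, re.DOTALL):
        body = match.group(1)
        try:
            # Assumption: compressed streams use FlateDecode (zlib).
            body = zlib.decompressobj().decompress(body)
        except zlib.error:
            pass  # Not zlib data, so scan the stream bytes as they are.
        # "(...) Tj" and "[...] TJ" are the PDF text-showing operators.
        if re.search(rb"\)\s*Tj|\]\s*TJ", body):
            return True
    return False
```

To try it on a file, read the bytes first, for example pdf_has_text_operators(open("manual.pdf", "rb").read()), where manual.pdf is a placeholder name. A False result only means this heuristic found no text operators; it does not prove the file is a scan.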
Old 11-09-2006   #6
Heineken
Member
 
Join Date: Oct 2006
Posts: 7
Heineken is on a distinguished road
Sorry. My PDF files ARE indexed now. I made a mistake when searching in Google.
Everything is fine.
Old 11-09-2006   #7
Heineken
Member
 
Join Date: Oct 2006
Posts: 7
Heineken is on a distinguished road
But I have a question:

If the PDF documents are stored in a directory ABOVE the pages that link to them, can this be a problem?
Old 11-09-2006   #8
mcanerin
 
Join Date: Jun 2004
Location: Calgary, Alberta, Canada
Posts: 1,564
mcanerin has a reputation beyond repute
Quote:
In the PDF reader it is possible to select text and copy and paste it into WordPad.
So I assume that the document is not based on images.
For the record, that's an excellent answer and way to check - thanks. I know that it's not necessary now, but as someone who used to work a tech help line for a while, I always appreciate clear, helpful descriptions of the problem (you sound like you may have done some tech help stuff, yourself!)

As for your second question, search engines look at websites purely in terms of how the pages link to each other, not the physical or virtual directory structure.

A page 4 folders deep but linked directly from the home page is considered 1 click away. A file in the root that is linked only from a page 3 clicks away is treated as 4 clicks away, not as being in the root.
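
The click-distance idea can be sketched with a breadth-first search over the link graph. This is only an illustration in Python: the function name click_depths and all the paths are hypothetical, and it says nothing about how any particular search engine actually measures distance.

```python
from collections import deque


def click_depths(links, home):
    """Return the minimum number of clicks from `home` to each reachable page.

    `links` maps a page URL to the list of URLs it links to.
    """
    depths = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:  # first visit is the shortest path
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


# Hypothetical site: one deep page linked from home, one root file buried in links.
site = {
    "/": ["/a/b/c/d/guide.html", "/p1.html"],
    "/p1.html": ["/p2.html"],
    "/p2.html": ["/p3.html"],
    "/p3.html": ["/file.pdf"],
}
depths = click_depths(site, "/")
```

Here depths["/a/b/c/d/guide.html"] is 1 even though the page is 4 folders deep, and depths["/file.pdf"] is 4 even though the file sits in the root.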

With redirects and website moves and virtual directories and all that stuff, it's the only way a search engine can do it.

So the answer to that question is that it doesn't matter where it is, as long as it's on the same (or related) site, and linked properly.

One hint, though. If you can, try to have at least one link in those PDFs out to some other page on your site; it can be very helpful.

Be sure to check to make sure the link really is a link. You can see a discussion of this issue here:

http://forums.searchenginewatch.com/...ead.php?t=8100
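
One rough way to check this from a script is to look for URI actions in the raw PDF bytes. This Python sketch assumes the link annotations are stored uncompressed and that the URLs contain no parentheses; the function name pdf_link_targets is made up for illustration. A URL that is only typed into the page text will not show up here, because it has no /URI entry behind it.

```python
import re


def pdf_link_targets(data):
    """Return the URLs of URI actions found in raw (uncompressed) PDF bytes."""
    # A clickable web link is stored as a URI action: /URI (http://...)
    found = re.findall(rb"/URI\s*\(([^)]*)\)", data)
    return [url.decode("latin-1") for url in found]
```

An empty list means the sketch found no clickable web links; it can miss links in PDFs that compress their objects.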

Cheers,

Ian
__________________
International SEO
Old 11-10-2006   #9
Heineken
Member
 
Join Date: Oct 2006
Posts: 7
Heineken is on a distinguished road
Quote:
Originally Posted by mcanerin
One hint, though. If you can, try to have at least one link in those PDFs out to some other page on your site; it can be very helpful.
In what way?
In my case the PDF file is indexed (without a link to my page in the PDF). Will it still help me to place a link in the text?
Old 11-16-2006   #10
mcanerin
 
Join Date: Jun 2004
Location: Calgary, Alberta, Canada
Posts: 1,564
mcanerin has a reputation beyond repute
In general, search engines don't like dead ends.

When a spider goes to a page, it makes a list of all the links on that page and then chooses one, at random. When it lands on the next page, it doesn't go back, it just chooses one link on that page, then continues on.

So what happens if there are no links on the page it lands on?

Generally, it will skip to the next website on its list, rather than continuing to index yours. Most SEOs consider sending spiders away to be a bad thing.
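
To see which pages on a site are dead ends in this sense, one option is to list every page that has no outgoing links. This is a minimal Python sketch over a hand-built link map; the function name dead_ends and the paths are hypothetical.

```python
def dead_ends(links):
    """Return the pages that have no outgoing links, sorted by URL.

    `links` maps a page URL to the list of URLs it links to. Pages that only
    appear as link targets are treated as having no outgoing links.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    return sorted(page for page in pages if not links.get(page))


# Hypothetical site where the PDF links nowhere.
site_links = {
    "/": ["/about.html", "/manual.pdf"],
    "/about.html": ["/"],
    "/manual.pdf": [],
}
no_exit = dead_ends(site_links)
```

With this map, no_exit contains only "/manual.pdf"; adding a link from the PDF back to any page on the site removes it from the list.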

Ian
__________________
International SEO