Search Engine Watch
Old 05-25-2005   #1
Everyman
Member
 
Join Date: Jun 2004
Posts: 133
SEW Should Support The AAUP's position On Google Print

Danny's piece in the blog about the concerns of the Association of American University Presses is confusing. He seems to downplay the AAUP position on the grounds that Google has long been treating the rights of webmasters with the same disregard for copyright law that they now propose for the world's printed material.

I agree that the cache copy is an issue. But I disagree when Danny implies that the AAUP's case is less important because of this. I read their letter to Google (PDF, 256K file), and my reaction is that I'm grateful that the AAUP has more common sense, self-respect, and legal savvy than webmasters -- and a few self-interested libraries -- have been able to muster.

Danny says one of AAUP's arguments is bad. I say it's a good argument from AAUP's perspective. The entire process of converting a book to a raster scan, and converting the raster scan to OCR for indexing purposes, involves a bit more originality and effort than stashing a cache copy of a web page. For this reason, Google will try to argue that their digitization is proprietary. They would never try to argue that their cache copy of a web page is proprietary. If Google wins this argument, the implications are enormous. Why do you think half of the nations in Europe are getting very nervous over this?

The AAUP is anticipating Google's position, drawing a worst-case scenario, and exploring the implications. If the AAUP wins on this, it doesn't mean that webmasters lose. It will mean that Google is constrained in ways that it doesn't feel constrained now, and that can only be good news for webmasters.

It's about time that someone pointed to Google and said that they have no clothes. If it takes an organization of 125 nonprofit scholarly publishers to do what webmasters have failed to do ever since that first Google cache copy showed up years ago, then I say we should support them. I'm not sure whether Danny does or not, because I find his position confusing.
Old 05-26-2005   #2
mcanerin
 
Join Date: Jun 2004
Location: Calgary, Alberta, Canada
Posts: 1,564
I just read the blog piece, above post and the letter, and I think someone is confused. I know I am, but I suspect I'm not the only one.

I also looked at Google's description of the topic, since I think the AAUP has a misunderstanding of parts of the concept. I thought it was strange when they started asking questions like "how many digital copies do the libraries get?"

Read this:

http://print.google.com/googleprint/library.html

OK, so here is how it works. Google Print for libraries is the concept of making traditional paper-bound books available via a search engine, by digitizing them and adding them to Google's index.

Google gets more content, and libraries get indexable digital copies of their collection. Everyone wins. But there is a "but" involved here.

If the work in question is not copyrighted, then the search within Google will link to a complete scan of the book. If the work is copyrighted, then Google will display a text snippet from the book that will hopefully give the searcher enough information to decide whether the book is on-topic or not.

First and foremost, the works that the AAUP members are referring to that do NOT have a copyright are not an issue:

Quote:
The entire process of converting a book to a raster scan, and converting the raster scan to OCR for indexing purposes, involves a bit more originality and effort than stashing a cache copy of a web page. For this reason, Google will try to argue that their digitization is proprietary.
If Google tries that, they will fail. This is one area of copyright law that is very settled - if a work is not copyrighted, it's not copyrighted, period. You gain absolutely NO rights at all if you go through the trouble of digitizing it or compiling it:

http://www.constitution.org/1ll/court/fed/bridgman.html

Corel (a software company) scanned images from transparencies taken by the Art Gallery of public domain art and sold them in a clip art disk as public domain. At least one of the images had been registered with the copyright office by the Art Gallery.

The Art Gallery sued. The Art Gallery lost. The Art Gallery appealed, along with a whole bunch of amicus briefs from museums and copyright professors. The Art Gallery lost again, under both US and UK law.

So Google has no more of a copyright (or ANY right) to the material, regardless of how they digitize or store it, than the library does. A key point here is that the AAUP is attempting to also cover public domain works in this letter and they are absolutely wrong in it.

I understand that many libraries make money off of licensing out their old books (which are not copyrighted), but the fact that they are making money off of things that do not belong to them does not give them a monopoly on it. The library may own the physical book, but NOT the uncopyrighted work within it.

The fact that they are a non-profit does not give them any rights on works that are not copyrighted, digital or not. This part of the AAUP's argument is completely off-base. Making money off of something does not give you rights over it.

However, they are correct in that copyrighted materials should not be treated in a cavalier manner. This is a whole different area.

IMO, as long as the full-text scans are only for the purposes of indexing and are stored in a database format, I would argue that is fair use, though there is room for counterargument, which I'll get to in a moment.

When the results are displayed to humans, however, care must be taken. If the full text is displayed, it's a clear copyright violation.

My personal rule of thumb for "how much is fair use" is, very simply, the minimum necessary to understand the information and its relevance. When relying on fair use, you want to focus on using the LEAST amount possible to convey the message, not wonder about "how much" you can use. There is no clear amount, since it depends on the source document, and some authors take longer to get to the point than others.

There is a huge difference between a Haiku and a Tolstoy work, for example.

The AAUP does have a legitimate concern as to the size of the snippet displayed.

I think their concerns about digitization, in and of itself, are not realistic. Fair use includes minimum necessary changes that allow the work to be understood by the reader, whether that reader is a computerized indexer, a blind person listening to an audio book or reading a braille version, or someone listening to the work in another language. The copyright exists and continues through the translations (you don't lose it if someone translates the work, and the translator gains NO rights at all for their efforts).

Now I would be MUCH happier (and, I suspect, so would the AAUP) if Google indexed only short, representative versions of the copyrighted works, rather than the full text version. I would be concerned that an automated series of searches would result in the display (and capture) of the whole document unless great care was taken in the text snippet selection, and the screen captures that Google has on their website seem to me to be something I could use to compile a complete work through a series of progressive searches. This is bad.

It's the equivalent of photocopying an entire book (NOT fair use) and justifying it by only letting someone look at one page at a time (one page would probably be fair use). Fine. But then if you hand that same person the entire stack of photocopies to look through (or the ability to do so easily), you are back into copyright violation area, IMO.

In short, there is no issue regarding works that are public domain, but the issue of full scans of a copyrighted work is serious, if the technology allows that work to be displayed. The copyright owner owns those scans, NOT Google. It's "fair use" not "acquired ownership". I think Google is confusing the two.

The text snippet system should be changed to make large-scale reproduction of a work's contents impossible, at the very least.

[EDIT] I just went back to the example page in question and noticed that I'm apparently half asleep. Google shows 3 different versions of results, depending on whether the work is public domain, publisher contributed, or a general copyrighted work.

The examples shown for the general copyrighted works make it clear that it would be almost impossible to piece together a whole work. This removes my major objection and, honestly, should remove the AAUP's as well.[/EDIT]

Ian
__________________
International SEO

Last edited by mcanerin : 05-26-2005 at 01:22 AM.
Old 05-26-2005   #3
dannysullivan
Editor, SearchEngineLand.com (Info, Great Columns & Daily Recap Of Search News!)
 
Join Date: May 2004
Location: Search Engine Land
Posts: 2,085
Daniel, I'm not downplaying the concerns they are raising. If anything, I'm making the specter worse for Google and the other search engines.

The AAUP has come in with this statement that no one has ever done wide-scale copying like Google is just starting to do with books. Not so. Search engines already index the full text of many web pages (up to 101K, 500K and so on, depending on the search engine) without permission.

Did they get permission from you before indexing your site? Did they get permission from me? I certainly didn't get the letter. So what exactly makes what they are doing on the web somehow so acceptable to the AAUP, while it's a different issue with books?

What I dislike is the suggestion that somehow copyright is different for scholarly works, libraries or just print material. You don't become a second class citizen just because you publish on the web. This is the second time I've heard someone in that community try to make out they are somehow different. The first was this:
Quote:
Copyright laws are written for companies like Time Warner and Disney instead of research libraries like Harvard. [These laws are] not aimed at us.
That's what a former Harvard University Library director said, as covered here. I pointed out that copyright laws are written to protect the rights of publishers and authors, not for Harvard's needs, Disney's needs or Google's needs.

So actually, we might be kind of on the same page. I'm saying the publishers of all sorts have rights, and that this type of widespread copying has happened without seriously being challenged legally until now.

Of course, I also wrote that if it were challenged, I suspect a court might rule that opt-out is a fair way to declare you don't want to grant "reprint" permission to the search engines. But I don't know that. I'm not a legal expert. We really won't know until we get a court case.

I do know that the AAUP comes across as not knowing what they are talking about, with that letter. If they don't have a fundamental understanding of how search engines have worked, they aren't going to do so well in battling against things they don't like. I'm sure they'll learn. My article is part of that education process.

Oh, and the worst case is that search engines really could lose. If a court ruled copying material for indexing and display of snippets was illegal without prior permission, that very much could be taken to apply to web indexing. And given the current freak out over better company governance, you could see a search engine decide that they better drop everything and start again specifically getting permission from people. It's a worst case, not likely in my view, but it is possible.

The reality, of course, is that webmasters don't complain about being in Google because they get plenty of value out of it. The vast majority of complaints you hear are when they are NOT in Google. Heck, I've heard more complaints from people who feel Google is obligated to list them than I've ever heard from someone who felt Google violated their copyright by indexing them.

The AAUP is in a different situation. They don't see value. They sell physical objects and fear this is going to stop those sales, undercutting their non-profit mission to:

Quote:
Help the advancement of knowledge by making the results of scholarly research known through their publications.
The cynical part of me wants to say if you want to advance knowledge, then don't complain about something that will help more people find your works than if you sell a relatively small number of books that sit on shelves and probably are consulted by only a few.

But the letter addresses this. They depend on these sales to fund the peer review they say sets the "gold standard" for these works. OK. And even more OK that no one wants their business model eroded.

So now we've got a group with a vested interest to block widespread indexing for the first time in the 10 years that we've had it. It remains to be seen what's going to happen. Maybe Google will prove value to the publishers and they'll all be happy. Maybe they'll be blocked. Maybe we'll end up with a radio-like solution, where Google pays a flat license fee to various publishers for the right to broadcast snippets. Maybe we'll have different rulings for different types of indexing. Maybe book publishers will realize that the revolution hitting other publishing industries (movies, TV, music) is coming to them as well. If something's digital, it can be easily copied. Do they need to establish more to prevent this? Would that even work? Or does web publishing have any lessons for them to learn?

As for the Europe point you raised, be fair. That's not about copyright. That's France in particular worried that Google will create a US-centric archive of digital works. To combat this, the EU looks to be funding their own project. I haven't heard word one about any publishers raising copyright issues with that. I may have missed it -- if so, let me know. The issues are of course valid. But don't suggest Europe is nervous over copyright issues. They've been nervous about wanting to ensure that the Google Print team (half of which is apparently French) doesn't create McPrint.

There are definitely a lot of issues with Google Print we agree on. I'd certainly have liked to see the entire thing done in a more coordinated fashion. We still don't know exactly how this data is stored and if it can be easily used by others.

But to throw the question back at you, what do you think should be done? You want to support the AAUP. On what points? Google can't copy anything? Google can copy things only that have expired copyrights according to the laws of the countries it operates in? Google can copy only with permission? And if so, is that applicable to the web? And if so, what exactly do you feel webmasters have been "failing" to do? They should have sued over cached copies because opt-out wasn't enough? Robots.txt not enough?

Last edited by dannysullivan : 05-26-2005 at 07:04 AM.
Old 05-26-2005   #4
Everyman
Member
 
Join Date: Jun 2004
Posts: 133
I believe that Google should not be able to copy anything in print from a library. The exception would be if they get permission on a per-title basis from the copyright holder.

As for public domain material, I firmly believe that if Google is left unchecked, in ten years we will see the Google watermark on every page of Shakespeare, and Google will show anti-depression pill ads alongside of Hamlet's soliloquy. No, you don't see this now in Google's print scans. No, you won't see it from day one after Shakespeare is indexed. But within ten years, the Google juggernaut will make it happen. Surely I'm not the only one who has figured out how Google behaves by now. Who's going to sue Google in ten years? Shakespeare? I don't think so. He isn't worth nine billion like Larry and Sergey are each worth, and besides, he's dead.

This whole discussion isn't so much about copyright, but rather about making Google accountable. The AAUP letter makes it clear that Google has been secretive and devious with their entire approach to this library project. The libraries involved had to sign nondisclosure agreements, and we don't even know what rights the libraries will have to do what they please with their own copies of Google's files. Could the University of Michigan, for example, after several million books are digitized, give all the files to Yahoo if they wanted to? What's in the fine print in the UM-Google agreement? I believe the public has a right to know. Google disagrees with me. That should tell you something.

For copyrighted material, this is an issue of meeting the payroll for the nonprofits who make up the AAUP, and their concerns are completely valid. For me, as well as for Europe, it's a simple matter of protecting the public domain. Libraries should hold off on making deals with Google until all the issues have been aired, the language that protects the public interest is air-tight, and until Google has a track record of showing respect for the public sector.
Old 05-26-2005   #5
dannysullivan
Editor, SearchEngineLand.com (Info, Great Columns & Daily Recap Of Search News!)
 
Join Date: May 2004
Location: Search Engine Land
Posts: 2,085
So is it only Google that can't copy stuff from a library? The reason being if so, because they are so big and potentially powerful?

As for Shakespeare suing, I'm pretty sure he doesn't hold a copyright on his works anymore. Who exactly would you ask in this case? This page:
http://www.cityshowcase.co.uk/index....D=knZlveuB 2v

and a few others I found say he's fair game:

Quote:
Unsurprisingly, all of William Shakespeare's works are now out of copyright. However, your question does raise an interesting point in so far as there is a separate literary copyright in the published editions of the various works of Shakespeare which arises through the skill and labour of the publishers in editing and formatting these works. This copyright does not, however, extend to the protection of individual sentences which are clearly recognised as having been written by William Shakespeare and therefore you do not require clearance. By way of interest, to infringe the publisher's copyright, you would need to borrow from a published edition of "Romeo & Juliet" to a far greater extent than the mere use of one famous sentence from the Work.
It is interesting that the scanning of the book might violate copyright, because it has been formatted in a particular way. But the words themselves aren't copyrighted.

Further, my understanding of copyright isn't that you get to protect your work forever and ever. Copyright, from what I know in the US, is designed to protect your work so you can benefit for a relatively short period of time (what is it, 30-100 years?). After that, it goes into the public domain.

If a work really is in the public domain, it seems absurd to single out Google as the only one who can't copy it, because you fear they'll put ads on it. If you've ever watched a performance of Shakespeare on TV and there were ads during breaks, same thing. By this logic, commercial television had better not show anything now in the public domain.

But then you turn around and say it's not about copyright, it's about accountability. OK, I've got some of the same concerns you have about what's happening with this material. But I'm going to be angry that Google got some library to stupidly agree to an NDA? Hey, where's some of the librarian community anger at the libraries who have bought into this? And why does the public have some right to know about the business plans of Google?

I need a little more reason here in terms of the public interest. Google is hardly preventing you from going to a library. The issues that have also been raised haven't been that "oh no, the public is going to lose" but instead that the publishers are worried. What's the public interest that's being violated here?

Again on Europe, public domain has NOT been raised as an issue. The European project looks like it wants to do a lot of what Google wants to do. You want to restrict Google but then it's OK for Europe to copy works?

I feel like your last part is really the best argument. Libraries should hold off on making deals with Google or anyone until they have all these types of issues answered.
Old 05-26-2005   #6
hardball
Member
 
Join Date: Oct 2004
Posts: 83
Webmasters stand in line to get greased with click fraud without a peep. What makes you think they would get militant about library books?

Google owns you, get over it.
Old 05-26-2005   #7
Everyman
Member
 
Join Date: Jun 2004
Posts: 133
Quote:
So is it only Google that can't copy stuff from a library? The reason being if so, because they are so big and potentially powerful?
Now we've reached the crux of the issue. The answer to your question is Yes, size matters. Plus Google is irresponsible, and insensitive to the public sector.

Exactly right. Size matters. In antitrust, size matters. And in "fair use," the quantity of material involved matters:

Section 107 of title 17, United States Code:

"In determining whether the use made of a work in any particular case is a fair use the factors to be considered shall include --

1) the purpose and character of the use, including whether such use is of a commercial nature or is for nonprofit educational purposes;

2) the nature of the copyrighted work;

3) the amount and substantiality of the portion used in relation to the copyrighted work as a whole; and

4) the effect of the use upon the potential market for or value of the copyrighted work."

That's for copyrighted material and fair use. Size matters.

For public domain material, size matters too, because this is only common sense. Google is talking about indexing a substantial portion of the world's public domain printed material. Google has the money to do it. Google intends to monetize this investment somehow, because they have never done anything without planning to monetize it somehow.

I wouldn't worry if Jupitermedia was doing it, because I don't think Jupitermedia has the resources to pull it off. I also think that libraries might think twice before signing away everyone's rights to Jupitermedia, simply because Jupitermedia isn't as "cool" as Google. Jupitermedia doesn't worry me much, but Google does.

Yes, size matters. Of course it does. How could anyone think otherwise?

Quote:
Hey, where's some of the librarian community anger at the libraries who have bought into this?
I really don't understand why there hasn't been more interest at the American Library Association. I've tried to alert them to this issue on privacy grounds, but then if you go to ALA's website you end up with a tracking cookie that expires in 2037, despite the fact that generally speaking, the ALA is very protective of the privacy of library patrons. I'll keep at it.

I think librarians, particularly in academia, have felt like dinosaurs in recent years. People are still coming to the library, but they all line up at the computers so that they can do Google searches. Then Google comes along and puts libraries back into the picture. Librarians feel so flattered that they forget that they're supposed to look out for the public interest. A librarian who gets "Google juice" feels like he has a future. A librarian without Google juice starts wondering if he chose the wrong career path. It's no different from website rankings.

And that's because size matters.
Old 05-26-2005   #8
dannysullivan
Editor, SearchEngineLand.com (Info, Great Columns & Daily Recap Of Search News!)
 
Join Date: May 2004
Location: Search Engine Land
Posts: 2,085
So Amazon's big, and Amazon's been digitizing books longer than Google. Is it OK for them to do it? And you think they don't want to monetize it more than Google? And if you restrict Google but allow Amazon, that's not restricting trade?

The EU is pretty big. Why is the project they want to do OK? Is it because some of my tax dollars are going to fund it?

And who exactly do you think will end up creating a digital library? Someone small? By its nature, such a project seems like it's going to involve some fairly large entities.

You've said:

Quote:
Google is talking about indexing a substantial portion of the world's public domain printed material.
Well, if it's public domain material, by definition it's in the domain for anyone in the public -- including Google -- to do what they want with it. Honestly, it's amazing -- you want to rule out this one company from doing things with unrestricted material?

In the end, you've gone from saying we should support the AAUP's concerns over copyright to just, "this is bad, because Google's big, and I don't trust them."

You need something more than this. You need to list some reason why Google is breaking some type of law if you want the government to step in, somehow.

If you want non-governmental agencies to step in, then you need something other than broad claims that you dislike Google. You are much better when you list issues such as that we don't know how other people may be able to use this promised material, or that you think it might have an impact on research, and so on. But you would be far better still if you applied these concerns to projects across the board. Amazon doesn't deserve a free ride any more than Google does.
Old 05-26-2005   #9
Everyman
Member
 
Join Date: Jun 2004
Posts: 133
As far as I know, Amazon has never claimed the right to digitize copyrighted material for which it has not received permission from the copyright holder.

Google is claiming the right to digitize copyrighted material simply because a library that they have cut a secret deal with happens to own a copy. The only question for Google is how much they can get away with displaying to a particular user at a particular point in time. This is a considerable difference from what Amazon is doing. It's the start of mission creep, and it will end badly.

I don't think the European Union will be running ads on their copies of anything they digitize. You don't like your personal tax dollars supporting the EU, but I don't like Jupitermedia acting as a mouthpiece for Google.

Maybe I should start worrying about Jupitermedia.
Old 05-26-2005   #10
dannysullivan
Editor, SearchEngineLand.com (Info, Great Columns & Daily Recap Of Search News!)
 
Join Date: May 2004
Location: Search Engine Land
Posts: 2,085
Daniel, I'm not being a mouthpiece. Please, go look at my blog today and tell me I was a mouthpiece for Google. I've been plenty critical of them.

I'm asking you questions to support the views you are presenting. Answer the questions or don't, that's up to you. But you started out saying that this was a big copyright issue, then you backed up into it being that you just don't like Google. Are you a mouthpiece for Microsoft then? They don't like Google, either.

Hmm, what do we know about the Amazon project? Wired had a very nice piece about it in 2003: http://wired-vig.wired.com/news/prin...,60948,00.html

Quote:
The copyrights to these titles are spread among countless owners. How was it possible to create a publicly accessible database from material whose ownership is so tangled? Amazon's solution is audacious: The company simply denies it has built an electronic library at all. "This is not an ebook project!"
Where were you in 2003 worrying about Amazon being too big? Amazon, by the way, has been dinged in a settlement over privacy issues. Google never has. Google gets accused of lots of things on the privacy front by you, but they've not had any actual cases where they've been proved to have done something wrong. Amazon has.

I know, you haven't raised privacy and Google yet. But you've dropped that specter out there that Google's not in the public interest, and I'd assume part of that would be your privacy concerns. Amazon deserves as much of your attention, as well.

By the way, go back to that article because it talks about the Million Book Project:

Quote:
Kahle is happy to sidestep the problem of digitizing commercially successful books. He has no wish to antagonize the publishing industry. What he hates is that the Million Book Project cannot legally digitize countless books that aren't generating money for anybody. US libraries hold about 30 million unique volumes. No one knows how many of those books continue to be protected by copyright or are available from commercial publishers. Still, Kahle says, "they can't be digitized because the copyrights can't be cleared, and the copyrights can't be cleared because it's too much work to identify the copyright holders. Some people call them abandonware. I call them orphans."
It goes on to say Kahle's going after public domain books. So again, why can the Internet Archive do this but Google can't?

You want some better arguments, I'll be happy to give them to you. At least one is that if Google really thinks libraries are giving it copyright protection, that's just foolish. Libraries don't control copyright -- publishers do. And Google investors have a right to know exactly where the company stands on these types of things, because if these secret agreements are based on that, then they might find Google declines rapidly in value when sued. How's that for a good reason, rather than a kneejerk "they're too big"?

As for the EU, you actually have no idea what they might do. Ads? Probably not. But I don't see you campaigning to stop ads on TV, print, etc. I'm sorry you don't like Google putting ads in Shakespeare potentially, but others have. Why does Google get to be so special?

You didn't like Google doing this because of their size. The EU is considerably bigger than Google. Why is the EU going to be OK but Google isn't?

It's especially odd that you'd be happy with that, because most of your criticism of Google to date has centered on "what if some government agency got their data."

OK, so now the EU gets the data directly. Hmm, now you go search on their massive digital library in 20 years time. I'm sure they won't log you. I'm sure there will be plenty of privacy protection enshrined. Or maybe we'll be told that, and there won't be.

Don't back away into slinging mouthpiece accusations when I'm taking the time and giving you the respect to ask you real questions and understand your point of view. Explain to me why Google is so unique that they should be opposed on this. I'm asking you seriously to try and understand where you are coming from. So far, you've come a long way from what you originally said. You've yet to even answer what it was you thought webmasters were supposed to be doing 10 years ago.

Better, explain to me the general concerns you really have. Then we have a starting place to know what should happen for anyone who wants to embark on this type of project.
Old 05-26-2005   #11
randfish
Member
 
Join Date: Sep 2004
Location: Seattle, WA
Posts: 436
Danny,

I think you've made some excellent points. My only issue on this subject would be if Google takes on this project and no one else does. I think that Amazon, Yahoo!, MSN & IAC should all be thinking along the same lines because the one thing this could do is push Google's lead in the Internet eyeball marketshare even further, which I'm against.

I'd like to see the web remain diverse and not become centralized - centralized resources and thinking don't promote the kind of creativity and open discourse that make the Internet such an incredible resource. I think the only argument against Google doing this is to say "please, don't let them be the only ones."
Old 05-26-2005   #12
mcanerin
 
 
Join Date: Jun 2004
Location: Calgary, Alberta, Canada
Posts: 1,564
Rand, I think the answer to that would be clear if we look at the contract surrounding the digital copies that Google is giving back to the Libraries in question. Which, of course, I don't have.

If those libraries can turn around and make the digital copies available for indexing to Yahoo, MSN, Amazon, etc without restriction, then I would say that Google is performing a legitimate public service.

If there is some type of a restriction, then not. I think this would be an excellent acid test for intent.

Ian
__________________
International SEO
Old 05-26-2005   #13
dannysullivan
Editor, SearchEngineLand.com (Info, Great Columns & Daily Recap Of Search News!)
 
Join Date: May 2004
Location: Search Engine Land
Posts: 2,085
Google's definitely not the only one doing this. There are a number of other projects underway. See More Full Text Books from Gary on our blog about a number of other projects. Gary's like the dude for knowing these types of alternative projects. He also did an update on the Internet Archive's project, which involves 10 libraries: More About The Internet Archive's Library Digitization Project. Gary also blogged about some problems getting more details from Google over here: Google's Scholarly and Digitization Initiatives.

I actually share some of Everyman's concerns on the Google project, and we've covered some of this in a past forum thread, Something fishy with Google library project. My chief concern right now is that I dislike Google diving in and doing something on its own rather than seeing if it could work with some existing projects and perhaps help drive some standardization of the digitization.

But going back to the "big" issue, the Internet Archive Million Book Project sounds great when you see they are a non-profit group designed to build an internet library. But it has strong connections with Alexa, which is owned by Amazon. So the "size" issue I think still lurks.

I at least feel more comfortable with it. And Google is a big player. It would be great in my view if perhaps we saw an "operational pause" or summit, where Google could help lead a meeting with existing groups on the plan to do digital books. Sure, it might delay things for a year. But it would be much better if we could get as much consensus as possible.

Perhaps then Google or any company could use the data for various activities. What I don't know is if Google or any other company would think that such an effort would result in no way to make money. And that's an important issue. Someone's got to pay for it. If it's not a private company, then perhaps we end up going the EU route. But then again, it's easy for the EU to rally initially. Will taxpayers really fund the spending involved over the long term?

There are lots and lots of questions raised. Google, as is often the case, gets to be the lightning rod. But I'm interested in overall solutions, not trying to restrict one particular company. That doesn't solve the overall problem.
Old 05-26-2005   #14
dannysullivan
Editor, SearchEngineLand.com (Info, Great Columns & Daily Recap Of Search News!)
 
Join Date: May 2004
Location: Search Engine Land
Posts: 2,085
Ian, the libraries supposedly have copies to do with what they want. But here's the key thing. In what format? Do they have copies in some proprietary format that will work only with Google software and machines? That's something addressed in the earlier forum thread and is a real concern.
Old 05-26-2005   #15
randfish
Member
 
Join Date: Sep 2004
Location: Seattle, WA
Posts: 436
Quote:
Originally Posted by mcanerin
If there is some type of a restriction, then not. I think this would be an excellent acid test for intent.
Exactly, Ian - thanks for that reply. It is an issue of size, but also of openness - this is content that is currently not easy to get over the web. If Google makes it easy, but only easy through their system, that seems at least a little "evil".
Old 05-26-2005   #16
mcanerin
 
 
Join Date: Jun 2004
Location: Calgary, Alberta, Canada
Posts: 1,564
Danny, may I suggest that it's time for a summit, or at least a roundtable discussion, on this issue?

It's a legitimate search issue, and although it may not be of interest to purely commercial webmasters, I believe it's a valuable concept and an important forward step in search.

As long as it's done properly, respectfully, and openly.

Ian
__________________
International SEO
Old 05-26-2005   #17
Everyman
Member
 
Join Date: Jun 2004
Posts: 133
Danny, I apologize for that "mouthpiece" comment. It was unfair of me.

I have filed a request with the University of Michigan freedom of information office asking for copies of all UM agreements with Google. Everyone else should too -- the more the merrier.

If they deny the request on the grounds that "(c) The University has entered into an authorized agreement to keep the information confidential," ask them for the name of the person at UM who authorized the nondisclosure agreement. Then you can appeal in writing to the President of the University. That would be Mary Sue Coleman, President, University of Michigan, presoff _ AT _ umich.edu

I don't think it will work, because Google is more powerful than the University of Michigan. But it's worth a try.
Old 05-27-2005   #18
dannysullivan
Editor, SearchEngineLand.com (Info, Great Columns & Daily Recap Of Search News!)
 
Join Date: May 2004
Location: Search Engine Land
Posts: 2,085
Thanks, Daniel -- and very interested to hear what comes out of the requests. Certainly if Google wanted better PR for this, they'd just be more forthcoming and drop these NDAs themselves.

The summit's a good idea. I'm not sure if I can pull it together for San Jose, but maybe the Chicago show. But maybe San Jose. I'll give it some thought.
Old 05-27-2005   #19
garyp
 
Join Date: Jun 2004
Posts: 265
Representatives from Google, various Google Library members, and others will be on a panel at the American Library Association conference in Chicago next month.

Also, as a legendary librarian (I hope to be one some day) said on ResourceShelf this week, a book can have editions in copyright while the full text is in the public domain.
http://www.resourceshelf.com/archive...95415675411626

So, the question is, will Google also provide a direct link to the full text available free while also linking to an edition still having a valid copyright?

As I pointed out this morning, sure, Google Print can be a research tool, but it is also a tool to sell books. It's one thing to read a few pages online, but it's something else to be able to easily print, annotate, and share material. This is not Google Print's purpose.

Remember, with most Google Library material (the copyrighted stuff) you'll only be able to view a small amount online. Also, what you can view (full text) will also be determined by where you are in the world.

Yes, Google is the "big guy," but why the focus on only one company?

As Danny mentioned earlier in this thread, plenty of projects and companies are providing complete ONLINE full text access to new books (still in copyright) for free via a local library. Services vary by library.

For example, have a San Francisco library card?
http://www.sfpl.org/sfplonline/dbcategories.htm

You have access to several thousand full-text, printable, annotatable books online from many publishers. This includes the full text of many O'Reilly, McGraw-Hill, and SAMS technology books.

Btw, it's not just books. Most libraries also offer free full text access to thousands of new magazines and newspapers online. 24x7x365 access from home or office. For example, the SF Public Library offers a searchable database (free) of every page and ad ever published in the NY Times back to Vol. 1, No. 1 in the 1850s.


Quote:
I think the library profession, particularly in academia, have felt like dinosaurs in recent years. People are still coming to the library, but they all line up at the computers so that they can do Google searches. Then Google comes along and puts libraries back into the picture. Librarians feel so flattered that they forgot that they're supposed to look out for the public interest. A librarian who gets "Google juice" feels like he has a future. A librarian without Google juice starts wondering if he chose the wrong career path. It's no different from website rankings.
As a librarian who once worked in academia, I would argue that my profession has once again done a poor job of marketing ourselves and what we have to offer, and of letting people know that the world of the library and librarian extends beyond the four walls of the building. Said another way, there is more than Google out there.

In other words, people (students, faculty, etc.) can't use what they don't know about. Yes, some librarians have sipped the Google Juice, but not all of us. Also, there are some interesting stats showing that in the business world people are using the web less and turning to specialized databases and librarians.

Information Seeking Behavior
Source: Outsell Now
Knowledge Workers: The Thrill Is Gone
"(P)eople who use the Internet in their jobs are starting to tire of going directly to the open Web. Just 67 percent say they go to the open Web for the information they need for the job, compared to 79 percent in 2001. They are increasingly more likely to rely on corporate intranets, colleagues, libraries, and other intermediaries." From survey: HotTopics: 2001 vs. 2005: Research Study Reveals Dramatic Changes Among Information Consumers, available for purchase.
From: http://www.resourceshelf.com/2005/05...sites-for.html


As the main Google database grows larger, many people will realize that they can get a better result, from an authoritative source, in less time by using specialty databases and tools. That said, librarians can also play a role in user education to make engines like Google, Yahoo, and Ask Jeeves (they are sure doing some great stuff) even more powerful resources.

Last edited by garyp : 05-27-2005 at 10:30 AM.
Old 05-27-2005   #20
ipfresh10
Newbie
 
Join Date: May 2005
Posts: 3
authoritative source

As the main Google database grows larger, many people will realize that they can get a better result, from an authoritative source, in less time by using specialty databases and tools. That said, librarians can also play a role in user education to make engines like Google, Yahoo, and Ask Jeeves (they are sure doing some great stuff) even more powerful resources.

very nice comment!

danny