Thursday, September 29, 2005
Google Print
I have been trying to stay on theme so far this week, but there are so many interesting things going on that I have to deviate today. In particular on my mind tonight: the litigation between Google Print and the Authors Guild.
For those of you who haven't followed this one, the question is whether Google can build a searchable index that allows queries against a database of books that Google itself scans in. Google has already begun the scanning, and basically Google wants to scan books without the permission of the relevant copyright holders. Some authors have objected, arguing that unauthorized scanning is a copyright violation. Google appears ready to counter by saying that scanning for the purposes of the index is fair use.
Jack Balkin, Tim O'Reilly, Larry Lessig and many others have publicly weighed in on Google's side. But, for three reasons, I am not so sure.
First, to champion the fair use argument, Google needs to explain why it has to build its system without permission, rather than simply asking for permission from authors, copyright clearing houses, and the like. Jack Balkin's post suggests that Jack would happily give permission, and I believe him. But the question here is why other authors should be forced to do as Jack does. There are arguments along that line to be sure (insert your favorite story about coordination and transaction costs here) but the simple argument ("this is good for authors!") is not enough.
Second, none of the comments I have seen speak to the core concern I have: security. If Google makes a giant database with full scanned copies of millions of books, that database will be a target for hackers who want to grab those books for P2P and other forms of unauthorized distribution. Should Google be allowed to impose that risk on unwilling copyright holders? At a minimum, I would think that any fair use claim should be contingent on some minimum level of security. That is, if you are going to take my stuff without my permission, you have an obligation to protect it.
Third, I wonder if the many people supporting Google mean to support Google specifically or this sort of unauthorized use more generally. That is, I might myself be in favor of allowing the Google project. I like Google and could likely be convinced to trust Google to handle this project well. But legal rules rarely work that narrowly; and once this cat gets out of the bag, I fear the implications might be significant, and bad, and not yet fully thought through. For instance, if Google is allowed to do this, would I be allowed to make my own search engine where I scan in all books about Harry Potter and then index the content based on its legal themes? And if there are thousands of similar indexes out there, doesn't that at some point pose too great a security risk for copyright holders, who now will have to police and monitor and interact with a huge number of would-be unauthorized users?
Yes, yes ... I do hear the intuition that is motivating Jack, Larry, Tim, and Google, and I applaud it. I, too, want great indexes of books, and I, too, think books need some sort of powerful search engine if they are to keep up with the very searchable world of online databases, websites, and the like. But I think this fair use argument needs to be handled with extreme caution, and that there really are some serious reasons to pause before giving Google an unfettered right to scan.
Posted by Doug Lichtman on September 29, 2005 at 07:05 PM
TrackBack
Listed below are links to weblogs that reference Google Print:
» Reintermediation vs. Disinermediation of IP from Werblog
Over at Prawfsblawg, Doug Lichtman raises a good point about the Google Print litigation. We might be comfortable with Google scanning libraries of books into a database, he says, because they have incentives to behave responsibly with respect to intel... [Read More]
Tracked on Sep 30, 2005 9:49:59 AM
» GOOGLE PRINT ~ KELO from Begging To Differ
Some initial thoughts† on Google Print. Prof. Lessig makes a sweeping claim†† about the Google Print project: Google’s use is fair use. It would be in any case, but the total disaster of a property system that the Copyright Office has... [Read More]
Tracked on Oct 21, 2005 11:46:22 AM
Comments
The tide has turned. The copyright industry, through its hubris, disdain for the public, and one-way exploitation of the public domain, has significantly decreased any respect for copyright the public may have ever had. Millions of Americans and their children are in violation of federal law and technically owe trillions in statutory damages to copyright holders for doing something that only requires the movement of their mice. They will never pay.
That Google will begin to index centuries of printed literature and knowledge should be applauded by all far and wide. Google changed the face of the Internet without taking anything from the public. The way they index and cache the web is arguably indistinguishable (for the purposes of copyright) from what they are doing with Google Print. If the law fails to accommodate the modality of their latest project, then it doesn't account for their web search product either and should be changed.
The digital world is fundamentally different. The sooner we recognize this as a society, the better.
Posted by: bill | Sep 29, 2005 11:14:23 PM
Bill -
Larry Lessig wrote something very similar on his blog, but I'm not sure either of you is right. Can we really not distinguish Google's core business from this? For one thing, the normal Google search engine accesses and points to materials that have already been made willingly available online. My concern about security, then, would not apply. For another, most of what Google points to is being made available for free to the public. Again, that's not true for many books, especially books whose authors would deny permission if given that chance.
Overall, I am with you on the intuition that search engines are great and bringing books online would have real value. It's the details I want to explore and resolve.
Posted by: Doug Lichtman | Sep 30, 2005 8:24:11 AM
How do you feel about Google's caching of pages (which is opt-out and wholesale)? Google Images (i.e. thumbnailing)?
Also, why should the fact that what Google indexes and caches on the Web has been placed there for free access by the copyright holder matter? Is there something in copyright law that says I can't enforce a copyright because I don't charge for access?
Don't forget that authors do have the opportunity to not have their books indexed. All they have to do is write to Google and say "don't index my book!" Except, of course, they can't, because the authors aren't often the copyright holder, so it's up to the publisher to do so. But for some reason, they'd both rather sue to stop the whole thing rather than go to Google and ask to be excluded.
Posted by: bill | Sep 30, 2005 9:48:21 AM
Bill -
Those are good comments back, but I think there are answers. Specifically:
Yes, it does matter whether the original copyright holder is selling his work or giving it away for free. The fair use test makes that relevant (see section 107 of the Copyright Act, factor 4) as would our "common sense" policy intuitions about the harms and risks at stake.
Similarly, your point about opt-out is important (and one that has been stressed by Google) but again too quick. How would opt-out work in practice? If there are thousands of entities making their own unauthorized indexes, do I as a copyright holder now have to spend my time finding and opting out of each one? At some point, that would impose a huge burden and in essence render my copyright protection useless.
The latter point is why my original post asks whether advocates here are just thinking about Google, or are thinking about a broader legal rule. Sure, if opt-out is just about Google, then maybe opt-out would work. But once this becomes "fair use," then there would be hundreds of sites and technologies trying to claim the exception, and at that point opt-out looks untenable.
Thanks for the continued dialogue, by the way. This is helpful.
Posted by: Doug Lichtman | Sep 30, 2005 10:00:00 AM
First, I'd like to recommend to you Bill Patry's change of heart on the subject (and excellent commentary on the factors of fair use): http://williampatry.blogspot.com/2005/09/google-revisited.html.
You make a good point that there may be a slippery slope here into expanding fair use. I say, "Great! I hope so!" If others want to try to compete with Google to index all the world's printed matter, more power to them and the better for society as a whole. If the advent of inexpensive mass digitization has changed the balance toward the public domain, I think it's about time. The pendulum has swung too far toward the interests of the copyright industry, and it's time that the public began to reclaim some of its side of the copyright bargain. I'm a bit gratified, in fact, to see this happen primarily through competition rather than through government action (though I'd be happy to have Congress roll back most of its term extensions).
You ask about broader legal rules as though they are of primary importance. What's more important is what path benefits society most. After we answer that, then we can craft the law to achieve that end. Who defends the public domain?
To answer you specifically: I'm not just saying that this is good because Google is doing it. Mass indexing is, in my opinion, good for society at large, and if copyright law is insufficient to make it possible, then it needs to change. In my opinion, opt-in is at least as untenable as opt-out. Based on the briefs in Eldred, it's clear that out-of-print and potentially orphaned printed works dominate among those works not in the public domain. If indexers have to contact every copyright owner to carry out indexing, what should be done with those works that are no longer commercially exploited, or whose copyright holder is unknown, uncontactable, or unknowable? Does copying of out-of-print but still copyright-protected works fall inside or outside of your free-to-the-world view of fair use (i.e. factor 4)?
Posted by: bill | Sep 30, 2005 10:53:36 AM
I think the Google Print debacle highlights the significant redistributive effects of copyrights.
I find it hard not to believe that an indexing program such as Google Print would have far-reaching benefits for society. Aggregating and making searchable millions of books would be the greatest advancement to scholarship since the Great Library of Alexandria.
In our world with copyright laws, the value of which is increasingly challenged by open-source development like wikis, we just have a situation where the publishing companies are seeking to maintain their monopoly over the use of these books, for their own benefit.
In other words, this small group of corporations with copyrights maintains its monopoly at the expense of society's advancement.
To support such a decision, in my mind, completely lacks sense. Why should government, through its laws, overly subsidize a small group of corporations to the detriment of the rest of society? If such a situation exists, the laws should be changed.
Posted by: Aaron Wright | Sep 30, 2005 11:25:43 AM
Doug, your second point is really interesting. Taken a step further, there would seem to be an argument there that the amount of security provided by the user should be an additional factor in the traditional fair use analysis when applied to digital uses. I'm not sure where I come down on that in the end, but it's an interesting idea. One initial problem that occurs to me is that it would be hard to permanently resolve disputes; every time the security mechanisms changed there would be the potential for re-litigation. But on the plus side it does seem to address one of the key problems concerning digital uses.
Posted by: Bruce | Oct 1, 2005 11:25:20 AM
[First, to champion the fair use argument, Google needs to explain why it has to build its system without permission, rather than simply asking for permission from authors, copyright clearing houses, and the like.]
Hmm... there are scores and scores of books out there, and trying to get written permission for all of them is just the kind of task that would kill the project in the planning stage.
[Second, none of the comments I have seen speak to the core concern I have: security. If Google makes a giant database with full scanned copies of millions of books, that database will be a target for hackers who want to grab those books for P2P and other forms of unauthorized distribution. ]
Just because there is a *hypothetical* security concern isn't enough to deny fair use rights to Google.
Security of databases is pretty tight, unlike the security of financial transactions, where information is transmitted through public networks. Talk about security and the concerns around it can be very 'geeky', but in reality it isn't much of a scare. Also, the format in which Google will store all the scanned information will be a proprietary format that isn't simple text. How about encrypting the text with the strongest encryption available, one that would take a thousand computers a thousand years to crack? Now, that should settle your concern.
[But legal rules rarely work that narrowly; and once this cat gets out of the bag, I fear the implications might be significant, and bad,]
This line of creating fear was used against every known piece of technology, right from the Internet, when many argued that the e-mail system would help anti-nationals, terrorist elements, etc. to communicate incognito.
If we keep fearing the slippery slope, which pretty much can be just imaginary, will we ever take a first step?
Instead of cocooning, we must brave it. It's a large world of printed matter. Unmapped. Potential. Indexing it would only have good implications, instead of letting it all rot.
Maybe arguments like yours centuries ago prevented us from enjoying the thousands of works created then... why should we deny the pleasure to future generations?
Posted by: Jagan Mohan | Oct 2, 2005 4:34:12 AM
Prof. Lichtman,
How do you think Yahoo's recently announced Open Content Alliance affects arguments about Google? Yahoo's book-scanning project is looking to make a profit, yet not rely on unauthorized reproduction of copyrighted works. Certainly it would answer your security concerns, as any content owners could contract with Yahoo to provide and protect their protected works in this database. Also, copyright holders could further require insurance for damages caused by unauthorized access.
One interesting result is that if we assume these book-scanning projects are beneficial to copyright holders (and further that Google's activities are not restricted any time soon), then we could see whether copyright holders "opt out" of Google's project and choose instead to provide a similar service through Yahoo. By moving to Yahoo, content owners could likely share in the advertising related to these projects and negotiate over security risks. On the other hand, it may not be beneficial to the public if the content owners could contract with Yahoo to tailor the search results to certain ends. For example, Yahoo might be required to only link consumers to places where they can buy the book at full price. (This might not actually be a bad thing for the public though, as this likely creates more incentives to place copyrighted works in these types of databases.)
Also, one essential difference I see with Yahoo's database, in comparison to what Google is offering, is that it will likely not include orphaned works. The ability to access orphaned works, however, is a separate issue from that of fair use. So while Google's end result might be to include them, it doesn't answer, I think, whether the means Google used to do so should be legal.
Posted by: Cory Hojka | Oct 4, 2005 3:43:12 AM



