Google has already scanned 10 million books in its bid to digitise the contents of the world's major libraries, but a copyright battle now threatens the project, with Amazon and Microsoft joining authors and publishers opposed to the scheme.
William Skidelsky in The Observer, Sunday 30 August 2009
In recent years the world's most venerable libraries have played host to some incongruous visitors. In dusty nooks and far-flung stacks, teams of workers dispatched by Google have been beavering away to make digital copies of books. So far, Google has scanned more than 10 million titles from libraries in America and Europe – including half a million volumes held by the Bodleian in Oxford. The exact method it uses is unclear; the company does not allow outsiders to observe the process.
Why is Google undertaking such a venture, so seemingly out of kilter with its snazzy, hi-tech image? Why is it even interested in all those out-of-print library books, most of which have been gathering dust on forgotten shelves for decades? The company claims its motives are essentially public-spirited. Its overall mission, after all, is to "organise the world's information", so it would be odd if that information did not include books. Like the Ancient Egyptians who attempted to build a library at Alexandria containing all the known world's scrolls, Google executives talk of constructing a universal online archive, a treasure trove of knowledge that will be freely available – or at least freely searchable – for all.
The company likes to present itself as having lofty, utopian aspirations. "This really isn't about making money" is a mantra. "We are doing this for the good of society." As Santiago de la Mora, head of Google Books for Europe, puts it: "By making it possible to search the millions of books that exist today, we hope to expand the frontiers of human knowledge."
Dan Clancy, the chief architect of Google Books, offers an analogy with the invention of the Gutenberg press – Google's book project, he says, will have a similar democratising effect. He talks of people in far-flung parts being able to access knowledge as never before, of search queries leading them to the one, long out-of-print book they need.
And he does seem genuine in his conviction that this is primarily a philanthropic exercise. "Google's core business is search and find, so obviously what helps improve Google's search engine is good for Google," he says. "But we have never built a spreadsheet outlining the financial benefits of this, and I have never had to justify the amount I am spending to the company's founders."
It is easy, talking to Clancy and his colleagues, to be swept along by their missionary zeal. But Google's book-scanning project is proving controversial. Several opponents have recently emerged, ranging from rival tech giants such as Microsoft and Amazon to small bodies representing authors and publishers across the world. In broad terms, these opponents have levelled two sets of criticisms at Google.
First, they have questioned whether the primary responsibility for digitally archiving the world's books should be allowed to fall to a commercial company. In a recent essay in the New York Review of Books, Robert Darnton, the head of Harvard University's library, argued that because such books are a common resource – the possession of us all – only public, not-for-profit bodies should be given the power to control them.
The second, related criticism is that Google's scanning of books is actually illegal. This allegation has led to Google becoming mired in a legal battle whose scope and complexity make the Jarndyce and Jarndyce case in Bleak House look straightforward.
At its centre, however, is one simple issue: that of copyright. The inconvenient fact about most books, to which Google has arguably paid insufficient attention, is that they are protected by copyright. Copyright laws differ from country to country, but in general protection extends for the duration of an author's life and for a substantial period afterwards, thus allowing the author's heirs to benefit. (In Britain and America, this post-death period is 70 years.) This means, of course, that almost all of the books published in the 20th century are still under copyright – and the last century saw more books published than all previous centuries combined. Of the roughly 40 million books in US libraries, for example, an estimated 32 million are in copyright. Of these, some 27 million are out of print.