Donald Brown
Search Me
‘Of course the company founded by Sergey Brin and Larry Page in 1998 - now reckoned to be the world’s most powerful brand - does not offer any substitute for the originators of content nor does it allow this to touch its corporate conscience. That is probably because one detects in Google something that is delinquent and sociopathic, perhaps the character of a nightmarish 11-year-old.’
— Henry Porter, “Google is Just an Amoral Menace,” The Observer, 5 April 2009
Porter’s article, which I found because two Facebook friends linked to it, resonated tellingly with me after I attended a symposium, ‘Library 2.0,’ held at Yale Law School on Saturday, April 4.
After an intro that featured much ‘lifted’ film content, a bright, buzzword-laden welcome that urged us to Tweet and Blog and upload photographs from our cellphones, etc., and a paper by Josh Greenberg of the New York Public Library that celebrated the outreach potential of blogs, we finally got to a presentation by Michael Zimmer of the University of Wisconsin-Milwaukee. It offered a few caveats to the collective zeitgeist of online über alles with the notion, picked up from Neil Postman, of technology as always offering a Faustian bargain.
Given the need for the internet in contemporary communications, we might think Zimmer was simply playing devil’s advocate or was a Luddite at heart, a throwback to the ancient days before we all went online. Not so: what Zimmer was really cautioning us about was all the unexamined consequences of our lemming-like acceptance of internet interaction. As librarians have at times had to stand up for civil liberties, like the right to privacy about one’s intellectual inquiries and sources of information, Zimmer had reason to wonder if ‘Library 2.0’ (the library as modeled on Google, essentially) will continue to provide a ‘safe harbor for anonymous inquiry.’ The question is not simply ‘who owns the content’ of what we post, but who owns the documentation, who gets to data-mine, and so forth. Ted Striphas, of Indiana University, extended this ‘Big Brother is Watching’ paranoia to Amazon’s Kindle system, which relays its users’ annotations, bookmarks, notes, and highlights back to ‘the mothership.’
In the course of the day, there were several references to ‘the Death Star’: the four huge publishing conglomerates that now exist where twice that many major publishers existed a decade before. But the real ‘Death Star’ emerged when the topic of Google’s digitization plans for out-of-print books was on the table in the day’s last panel. Already we had heard, in an excellent presentation by John Palfrey of Harvard Law School, how 100% of a focus group of what he called ‘digital natives’ (those hitting 13-22 since the major internet wave of the late ’90s) used Google to search for information and all went to the Wikipedia entry on the subject first. Though Palfrey didn’t elaborate on this at the time, the point became clear in the Google discussion when Frank Pasquale, Visiting Professor at Yale Law School, spoke of the possible consequences of putting all our searches for information in the hands of ‘proprietary black box algorithms subject to manipulation.’ Wikipedia is always the first or second entry in any Google search. The first ten are apparently all anyone looks at. Everything that gets buried by the algorithm is as good as not there. This is not how research is conducted.
Then there’s the question of all those out-of-print books. Obviously it would be to the public good to have them searchable and accessible online, if only because anything not online or available through Kindle (in other words, anything not part of the Death Star of Google and Amazon) falls into the ‘here be monsters’ of off-the-map ignorance. Already Jonathan Band, a lawyer, had told us that ‘fair use’ was becoming more conducive to technological and creative appropriation, and Denise Covey of Carnegie Mellon University Libraries and Ann Wolpert of MIT Libraries had spoken about faculties pursuing an open access policy in which anything they publish can be searched and referenced online: a blow to academic publishers, but a victory for the notion that research on the internet should not be hampered by commercial considerations.
In other words, the notion of open access to all information via the internet, of complete ‘transparency’ of provider and user, was more or less the mantra of the day. But the Faustian bargain came to seem, finally, to be not with the technology itself but with giants such as Google and Amazon, the Big Brothers playing Mephistopheles: they offer us the interconnected, easy-access world of our dreams, but it is a world where we sacrifice something of our own intellectual curiosity, restlessness, and desire to see outside or beyond that black box algorithm that makes things so easily manageable for us.
Think about how Wolpert pointed out that what made the MIT professors move for Open Access was their realization that, in the world of electronic text, libraries only ‘lease’ access to online work, rather than owning it like all those printed copies they store in perpetuity. If something happens to the provider or to the lease, all that material is no longer available. And now the publishing world seems poised to turn over all electronic control of out-of-print materials to Google to broker for us, and to disseminate to us according to its lights. As Brewster Kahle, co-founder of the Internet Archive, urged us to consider, there are alternatives. But as Ann Okerson, of Yale Libraries, said at the end of the final panel with a kind of ‘fait accompli’ finality: if Google accomplishes this digitization, the students and users of libraries at Yale will simply want access to it, and her job will be to work with it, not fight it.
‘But it was all right, everything was all right, the struggle was finished. He had won the victory over himself. He loved Big Brother.’
— George Orwell, 1984
Comments
5 Responses to “Search Me”
Just a few details since this is an area in which I am expert (yes, expert, if you can believe it).
1. Libraries don’t lease all of their online content. Some of it is in fact purchased through what is called a “digital archive” arrangement. This arrangement typically affects collections of historical books (not from Google but other major library vendors) and certain collections of journals. The problem is not lack of ownership; it’s the ability to host the data on their own.
2. As for Google, all is not as awful as it seems. At present, Google, I believe, turns over to a library digital copies of the books sourced from that particular library. (E.g., Harvard gets digital copies of its books alone; Stanford gets theirs, etc.) The bigger problem is what happens when the one source of information for accessing those books goes bust or ceases offering that service. Now all the libraries, who hopefully got their digital copies of their books, have to figure out a new way to get all that content up on the Web and strung together.
3. There is no counterpoint regarding the problems of privacy. There really is a threat in that regard, although we should probably also note that datamining is, in a sense, what every scholar already does. The question is not the activity but the nature of the data and the purposes of the mining. I’m actually grateful when Amazon recommends books to me of related interest based on the buying habits of previous customers. There is no way in a pre-Internet world to have gotten that type of information.
Thanks, that’s helpful, but re: leasing, that was from a librarian who I believe was making the point in the context of budget-cutting shifts in how things are being handled or are likely to be handled. Of course this was more an issue at state university libraries, but even the private institutions represented at the conference seemed impacted by decisions to go digital as much as possible (no longer acquiring print journals), so it was questionable what a given library actually ‘houses’ or ‘controls’ in the digital universe.
I tried to make the point that the concern was: “If something happens to the provider or to the lease, all that material is no longer available.” As you say, ‘goes bust or ceases offering that service.’ That indeed seemed to be the concern: being at the mercy of such out-sourced service providers.
And I should’ve been clearer: the concerns about privacy were raised only by a few participants; most seemed happy with the interconnectedness of information. But there was no mistaking a certain ‘threatened by Google’ tone by the end of the day, though again that was only from particular participants who seemed to feel, though maybe not as strongly as Henry Porter, that Google, as a commercial entity, would not be as accountable to the scholarly community as universities would be, were they in charge of the digitization project.
The most telling phrase of the day for me was the one about the ‘black box algorithm,’ since I’ve had to deal with student papers in which the search methods are a bit too ‘googled.’
The question of the black box algorithm is an interesting one. It bears comparison with other “sorting” mechanisms we use for information. For example, the traditional online public access catalog (OPAC) in a library, based largely on the old card catalog system, resorts to the flattening effect of alphanumeric listings (of authors, titles, subjects). Google (and other search engines) have added a new set of tools to how we access information. It’s true that Google’s search algorithm is a trade secret. It also happens to change regularly in an effort to stay ahead of Google spammers and Google bombers, “black hat” SEO, etc., which is one of the reasons it sits in a black box.
Still, the ability of users to change variables in that algorithm, above and beyond tweaking the advanced search functions, would be a welcome addition, were such a thing possible.
I’m with you on the “black box algorithm” image: spooky and certainly nefarious-sounding. But perhaps necessary and possibly manageable from the user-side in future versions.
That would be interesting indeed: a do-it-yourself Google. Of course, the whole vexed question of ‘search terms,’ even back in the days of those huge print LOC indexes, is now compounded by how you ‘key in’ the terms of choice and whether you get everything that would be relevant. Teaching search strategies becomes more essential, especially with so many lazy ‘google searchers’ out there.