|
Once upon a somewhere. Georeferencing books using
toponyms identified in online book reviews |
Alexander Mackie www.mappit.net |
Stories
are at the heart of this project. Stories of adventures and afternoon tea,
dastardly deeds and love affairs, stories of the everyday and the fantastical
that bring the past to life and shape the future. Location can help readers
discover these stories and better understand both the stories and the places
they inhabit.

Recently there have been attempts to geolocate stories and books, mostly
works of fiction. This allows the discovery of stories about, or set in,
places using a map. A good example is Placing Literature
(http://www.placingliterature.com/map), a crowd-sourced database of mapped books.
There is an argument that, when it comes to works of fiction, "...setting them in
real-world locations gives a sense of realism to the novels and helps make that
connection between a piece of art and the physical world" (Williams, 2013).

In addition to this humanistic argument, there
are also commercial applications of georeferenced
books in promoting book sales. If a book retailer can offer locally relevant
books in recommendations, or allow readers to search for books about their
holiday destination, then they can improve sales.

Some features of books make georeferencing them challenging. Unlike a business
such as a supermarket, which has physical premises at a discrete location, a
book might be about multiple locations or fictional locations, or have no
location at all. A location might be specific down to a house on a street, or
vaguely defined as somewhere within a particular country or galaxy.

The main drawback of existing solutions is the sparsity of mapped books. A map simply isn't very
engaging when there are hardly any books in the database. This dissertation
examined automated ways of generating this data, specifically using the
Unlock Text geotagger to identify place names in online book reviews. This
has the potential to solve the issue of data sparsity. Reviews for 72,000
books, comprising 80 million words, were scraped and processed.
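
The step from geotagged reviews to a searchable map can be sketched roughly as follows. This is a minimal illustration in Python, not the dissertation's actual pipeline: it assumes the geotagger output has already been reduced to simple (book, toponym) pairs, and the record layout, the function name `build_place_index`, the `min_mentions` threshold, and the sample data are all hypothetical.

```python
from collections import defaultdict


def build_place_index(mentions, min_mentions=1):
    """Map each toponym to the set of books whose reviews mention it.

    `mentions` is an iterable of (book_id, toponym) pairs, one pair per
    place name found in a review (an assumed, simplified record layout).
    A book is linked to a toponym only if the toponym occurs at least
    `min_mentions` times across that book's reviews.
    """
    # Count how often each toponym appears in the reviews of each book.
    counts = defaultdict(int)
    for book_id, toponym in mentions:
        counts[(toponym, book_id)] += 1

    # Keep only the links that meet the threshold.
    index = defaultdict(set)
    for (toponym, book_id), n in counts.items():
        if n >= min_mentions:
            index[toponym].add(book_id)
    return dict(index)


# Made-up sample records, for illustration only.
SAMPLE = [
    ("book-a", "Riga"),
    ("book-a", "Riga"),
    ("book-b", "Riga"),
    ("book-b", "Paris"),
]

if __name__ == "__main__":
    print(build_place_index(SAMPLE))
    print(build_place_index(SAMPLE, min_mentions=2))
```

Raising `min_mentions` is one simple way to trade coverage for fewer spurious links; whether a threshold of this kind was used in the project is not stated in the text.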
Example of data for one book: "The Northern Crusades" by Eric Christiansen.

A detailed evaluation of the accuracy of the data was carried out. On average,
approximately 60% of the books linked to a given toponym using this technique
were correct. An error rate of around 40% would be
unacceptable in a book-searching application.
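
The accuracy figure reads as a per-toponym average: for each place name, the share of linked books judged correct, averaged over all place names. A minimal sketch of that calculation is given below, assuming manual correct/incorrect judgements are available for each link. The function name, the data layout, and the example labels are invented for illustration and are not the dissertation's data.

```python
def mean_precision_per_toponym(judgements):
    """Average, over toponyms, of the fraction of linked books judged correct.

    `judgements` maps each toponym to a list of booleans, one per linked
    book, where True means a human judged the link to be correct.
    """
    per_toponym = [
        sum(flags) / len(flags)
        for flags in judgements.values()
        if flags  # skip toponyms with no linked books
    ]
    if not per_toponym:
        raise ValueError("no judged links")
    return sum(per_toponym) / len(per_toponym)


# Invented labels for illustration only.
EXAMPLE = {
    "toponym-1": [True, True, False, False],  # 2 of 4 links correct
    "toponym-2": [True, False],               # 1 of 2 links correct
    "toponym-3": [True, True, True, False],   # 3 of 4 links correct
}

if __name__ == "__main__":
    print(f"{mean_precision_per_toponym(EXAMPLE):.2f}")
```

Averaging per toponym weights every place name equally, so a rarely mentioned village counts as much as a capital city. Pooling all links before dividing would instead weight the result towards frequently mentioned places.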
|
Further reading: Grover, C., Tobin, R., Byrne, K., Woollard, M., Reid, J., Dunn, S. & Ball, J. (2010)
Use of the Edinburgh geoparser for georeferencing
digitized historical collections. Philosophical Transactions of the Royal Society A: Mathematical,
Physical & Engineering Sciences,
368, 3875-3889. |