I should really have done this yesterday when I put the list up in the first place, but the results were so much more fun than the method. Anyway, here we are.
The table shows the 12 lists that were included, their country of origin, publication date, coverage and brief methodology. You can also click on any of the lists to see it at the source!
| List | Country | Published | Coverage | Method |
|---|---|---|---|---|
| BBC Big Read | UK | 2003 | All time | Poll |
| TIME 100 | USA | 2005 | 1923-2005 | Critics’ selection |
| World Book Day poll | UK | 2007 | All time | Poll |
| Norwegian Book Club | Norway/International | 2002 | All time | Survey of 100 authors |
| Observer critics | UK | 2003 | 1600 onward | Critics’ selection |
| Modern Library | USA | 1998 | 20th Century | Critics’ selection |
| Telegraph | UK | 2009 | All time | Critics’ selection |
| Le Monde | France | 1999 | 20th Century | Poll based on critics’ selections |
| Librarians | USA | 1998 | All time | Survey of librarians |
| Radcliffe Library | USA | 1998 | 20th Century | Critics’ selection |
| German Big Read | Germany | 2004 | All time | Poll based on critics’ selections |
| New York Public Library | USA | 1995 | 20th Century | Critics’ selection |
For more lists, I can strongly recommend this page by Robert Teeter, which compiles a great many lists of both Western and Eastern classics.
Now, as I mentioned the other day, some of these lists gave a ranking from 1 to 100. In those cases, the top book got 100 points, the second got 99, and so on all the way down to 1 point for book number 100. For unranked lists, every book appearing was awarded 50 points. Based on this, every book included in any of the lists was given a total score from across all the lists.
I then checked, for each book, how many of the lists it was eligible for. For example, a book published in 1950 was eligible for all 12, whereas a book published in 1850 was only eligible for the 6 “all time” lists, plus the Observer’s 1600 onward one. Books published after 2000 also had limited opportunity to be included, as many of the lists were compiled in the latter half of the 90s. Therefore, each book’s total score was divided by the number of lists it could have featured on. For example, Pride and Prejudice’s total score of 523 was divided by 7 to give it 75 overall, while The Great Gatsby’s apparently superior score of 698 ended up at 58 as it was divided across all 12 lists.
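For anyone who wants to check the arithmetic, here is a minimal sketch of the scoring described above. It is not the spreadsheet I actually used: the function names are made up for illustration, the coverage dates come from the table, and treating “20th Century” as starting in 1900 and each list’s publication year as its cut-off are assumptions.

```python
import math

# (name, earliest publication year covered, year the list was published).
# Dates and coverage are taken from the table above. Reading "20th Century"
# as starting in 1900, and using the list's own year as the latest eligible
# publication year, are assumptions made for this sketch.
LISTS = [
    ("BBC Big Read", -math.inf, 2003),
    ("TIME 100", 1923, 2005),
    ("World Book Day poll", -math.inf, 2007),
    ("Norwegian Book Club", -math.inf, 2002),
    ("Observer critics", 1600, 2003),
    ("Modern Library", 1900, 1998),
    ("Telegraph", -math.inf, 2009),
    ("Le Monde", 1900, 1999),
    ("Librarians", -math.inf, 1998),
    ("Radcliffe Library", 1900, 1998),
    ("German Big Read", -math.inf, 2004),
    ("New York Public Library", 1900, 1995),
]


def rank_points(position=None):
    """Points for one appearance: 101 - position on a ranked list,
    or a flat 50 when the list is unranked (position is None)."""
    return 50 if position is None else 101 - position


def eligible_lists(pub_year):
    """Count the lists that could have included a book published in pub_year."""
    return sum(1 for _, first, last in LISTS if first <= pub_year <= last)


def adjusted_score(total_points, pub_year):
    """Total points divided by the number of lists the book was eligible for."""
    return total_points / eligible_lists(pub_year)


# Totals quoted in the post; 1813 and 1925 are the books' publication years.
print(round(adjusted_score(523, 1813)))  # Pride and Prejudice
print(round(adjusted_score(698, 1925)))  # The Great Gatsby
```

With those assumptions, the eligibility counts come out at 7 for an 1850 book and 12 for a 1950 book, which matches the examples in the text.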
I think the two main issues with the method are the selection of lists and the way the points were awarded. If anything, I should have collected a larger number of lists from a greater variety of sources. If you follow the Robert Teeter link above you’ll see that there are a great many lists to choose from, and you could argue that my selection is fairly arbitrary. Roughly speaking, I wanted to include a mixture of academic and popular selections, so I didn’t select lists like the St John’s College one, but you could argue it both ways.
Regarding the points, I think I was too mean to the books at the bottom of the ranked lists. According to the methodology, being considered the hundredth best book of all time (1 point) is almost equivalent to being out of the running completely (0 points). To illustrate why this is a problem, take Midnight’s Children versus Slaughterhouse-Five. As late 20th century books, both were eligible for all 12 lists; Midnight’s Children appeared on 8 and Slaughterhouse-Five on 4. You’d have thought Midnight’s Children would win by knockout, but two very positive scores from American lists gave Slaughterhouse-Five a decent score for 71st place overall, while Midnight’s Children ended up in 100th place on three separate lists and faded to 128th place. If list rankings had been ignored and points awarded purely for presence on lists, Midnight’s Children would have been in 21st place overall. This is probably an injustice; a future edition of the list may give greater weight to list presence.
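To make the trade-off concrete, here is a rough sketch of two alternatives. Neither is something I have actually adopted: `presence_share` simply ignores rankings, and `rescaled_points` is a hypothetical tweak in which the 100th place on a ranked list earns a floor value (50 here, chosen arbitrarily to match the unranked lists) rather than 1 point.

```python
def presence_share(appearances, eligible):
    """Fraction of eligible lists a book appears on, ignoring rankings."""
    return appearances / eligible


def rescaled_points(position, floor=50):
    """Hypothetical alternative to 101 - position: 1st place still earns 100,
    but 100th place earns `floor` instead of 1, with a straight line between."""
    return floor + (100 - floor) * (100 - position) / 99


# Appearance counts quoted in the post; both books were eligible for 12 lists.
print(presence_share(8, 12))  # Midnight's Children
print(presence_share(4, 12))  # Slaughterhouse-Five

# Under the current scheme 100th place earns 1 point; rescaled it earns the floor.
print(rescaled_points(100))
print(rescaled_points(1))
```

On presence alone, Midnight’s Children appears on twice as large a share of its eligible lists as Slaughterhouse-Five, which is the gap the current points scheme manages to hide.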
All this suggests plenty of scope for refinement. In fact, I’ve already started fiddling with the system and found one plausible way to give Midnight’s Children a higher rank than Slaughterhouse-Five. For the moment, though, I’m going to concentrate on the data mining, which is more interesting than all the compiling and classifying work that went into building the dataset. Coming next: international face-off.