Hi. This is my list of books.

I am excited like a kid at Christmas about this because a) I am a geek who likes numbers and spreadsheets and things, and b) I kinda like books too. Lists of books draw me in, but what has drawn me in even more over the past couple of weeks has been attempting to make book list-building somehow quantitative. A few weeks ago I came across TIME’s 100 novels (according to their critics, the best novels published between 1923, when TIME was first published, and 2005) and it occurred to me that I’m always seeing these sorts of lists and shouldn’t there be some way of combining them? Yes, it turns out, there is – or, there are – many ways of combining lists of books. There are also many lists, which is why this exercise expanded from a few hours on a quiet day in the Christmas holidays to taking most of my spare time for about 12 days.

Briefly, the how: I tracked down 14 “greatest books” lists, decided to drop two of them and combined the other 12. Between them, the lists contained 749 books. Each book got a score from each list it appeared on: for instance, if the list ranked books from 1 to 100, then the top book got 100 points, the second book 99, etc. If, like TIME, the list was unranked, then each book appearing got 50 points. Adding up the points from all the lists gave each book a total score; however, this was not sufficient to do a ranking, because not all the books qualified to be on all the lists (e.g. TIME ignores everything pre-1923). Each book’s final score is therefore the average score it received from the lists that it was qualified to be on. Which is pretty fair, right? I will do a proper methodology post shortly, with a discussion of all the lists; for the moment, know that there was a mixture of critic- and public-chosen lists and they came mostly from the USA and UK, but also with contributions from France and Germany.
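For the spreadsheet-averse, here is roughly what that scoring looks like as a Python sketch. It is not the actual spreadsheet, just my description above turned into code, and it leans on a few assumptions: a list's eligibility is modelled as a simple range of publication years, a book that was eligible for a list but didn't make it gets 0 points from that list, and ranked lists are treated as 100-book lists. The names (`BookList`, `final_scores`) and the two tiny example lists are made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional

UNRANKED_POINTS = 50  # flat score for every book on an unranked list
TOP_POINTS = 100      # assumption: ranked lists run from 1 to 100


@dataclass
class BookList:
    name: str
    books: list[str]                # titles, best first if the list is ranked
    ranked: bool = True
    min_year: Optional[int] = None  # earliest eligible publication year, if any
    max_year: Optional[int] = None  # latest eligible publication year, if any

    def is_eligible(self, year: int) -> bool:
        """Could a book published in `year` have appeared on this list?"""
        if self.min_year is not None and year < self.min_year:
            return False
        if self.max_year is not None and year > self.max_year:
            return False
        return True

    def points(self, title: str) -> int:
        """Points this list awards to `title` (0 if absent: an assumption)."""
        if title not in self.books:
            return 0
        if not self.ranked:
            return UNRANKED_POINTS
        # Rank 1 gets 100 points, rank 2 gets 99, and so on.
        return TOP_POINTS - self.books.index(title)


def final_scores(lists: list[BookList], pub_years: dict[str, int]) -> dict[str, float]:
    """Average each book's points over the lists it was eligible for."""
    scores = {}
    for title, year in pub_years.items():
        eligible = [bl for bl in lists if bl.is_eligible(year)]
        if not eligible:
            continue  # no list could have included this book
        total = sum(bl.points(title) for bl in eligible)
        scores[title] = total / len(eligible)
    return scores


# Made-up miniature data: one ranked all-time list (only its top three shown)
# and one unranked list restricted to 1923-2005, like TIME's.
lists = [
    BookList("ranked all-time list",
             ["Pride and Prejudice", "Nineteen Eighty-Four", "The Great Gatsby"]),
    BookList("unranked modern list",
             ["Nineteen Eighty-Four", "The Great Gatsby"],
             ranked=False, min_year=1923, max_year=2005),
]
pub_years = {
    "Pride and Prejudice": 1813,
    "Nineteen Eighty-Four": 1949,
    "The Great Gatsby": 1925,
}

scores = final_scores(lists, pub_years)
for title, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{score:6.1f}  {title}")
```

In this toy data Pride and Prejudice averages 100.0 because the only list it was eligible for put it first, while the two 20th-century books get pulled down to 74.5 and 74.0 by the flat 50 points from the unranked list. Averaging only over eligible lists means older books aren't penalised for lists that could never have included them, but it also means a book judged by fewer lists can ride a single high ranking.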

The top 100 is here and the top 10 is, uh, here:

1. Pride and Prejudice, by Jane Austen

2. Wuthering Heights, by Emily Brontë

3. Jane Eyre, by Charlotte Brontë (wow! Go the Brontë sisters!)

4. Nineteen Eighty-Four, by George Orwell

5. The Great Gatsby, by F. Scott Fitzgerald (top-ranked American!)

6. David Copperfield, by Charles Dickens (Victorians making a very strong showing in the top ten; could be due to the American lists mostly focusing on the 20th century)

7. The Catcher in the Rye, by JD Salinger

8. The Grapes of Wrath, by John Steinbeck

9. Ulysses, by James Joyce (mostly picked by critics – it was #1 in the Modern Library list)

10. Anna Karenina, by Leo Tolstoy (top-ranked non-English-language. I was surprised to find Anna Karenina ranked higher than War and Peace; I should probably read it)

It’s my bedtime now, but just a couple more notes about this.

Firstly, I am aware that Newsweek did a very similar exercise a couple of years ago: they called it the Meta-List. Their list is different from this one because, well, it was built on top of a different collection of lists. Unfortunately, their full methodology seems to have fallen off the internet so I can’t replicate/add to it, but this is an ongoing project, so maybe some day.

Secondly, this is, uh, an ongoing project. I would be interested to hear of good lists that I could or should add to my collection to make the overall list more comprehensive. As I said, there should be a proper methodology post soon, where I’ll detail the 12 lists that have already gone in.

Thirdly, lists, eh? Isn’t it all a load of willy waving? Doesn’t attempting to rank works of art completely miss the point? Aren’t you taking all the joy out of it? Not really, I don’t think. If you only choose what to read by looking at lists and you think that reading the books at the top of the lists is the best way to be smart about books then yes, that is kind of dead. But no one actually does that, do they? Lots of people have common likes and dislikes in books, and making lists is an interesting way to compare and contrast and discuss book-liking and why people do it.

It can also be a revealing exercise: making this list has spotlighted my own rampant gender bias in reading. Of the three books in the top ten that I haven’t read, two are by women and the other has a woman’s name for a title. It doesn’t get any better as you go further down the list: after the Brontës I’ll need to get around to reading some Virginia Woolf.

The other reason this exercise gets me excited is the data mining possibilities. I now have the most awesome book spreadsheet you can imagine (can you imagine how awesome a book spreadsheet can be? No? That’s a terrible shame) and there’s all sorts of goodness in the data. Want to know which country has produced the best novels? Which decade saw the most spectacular burst of great literature? There’s a very real danger that I will be producing graphs of these things in the next few days. This is good.