Step 2: Google Books Ngram vs. Bookworm Ngram

[Figures: Google Books Ngram chart; Bookworm Ngram chart]

I entered the words “war” and “peace” into both the Google Books and Bookworm Ngram viewers. Predictably, both charts showed their biggest bulge from 1910 to 1930, which makes sense considering World War I took place from 1914 to 1918. Interestingly, the Bookworm graph also depicted a second peak between 1860 and 1870, whereas the Google graph showed barely a blip. Given that the American Civil War took place from 1861 to 1865, it was probably the catalyst for those discussions. This makes me think that the Google Ngram corpus covers more global texts, whereas the Bookworm corpus draws primarily on American authors.

Human nature being what it is, “war” was always mentioned more often than “peace,” whose frequency remained pretty consistent throughout the 19th and early 20th centuries.

Step 3: Lexos Topics

[Figure: Lexos topics output]

Topic 5 was an interesting one, as it contained mostly monetary terms. “Banks, currency, specie, notes, money, treasury, gold, and circulation” are all among the top words, but so are words like “constitution, times, fidelity, and metals.” While one could explain this away by saying any State of the Union address is going to address the economy, it’s notable that the word “economy” did not show up on the list at all. Because of the words “metals, gold, and specie,” I’m inclined to assume that much of this discussion came from historical debates about which standard to base the US dollar on. The dollar has an interesting history, which I can’t fully delve into here, but in general this list probably draws significantly from the shifts between the gold standard, the silver standard, and finally the “fiat” standard of today, in which the US dollar is not backed by any physical asset.


Step 4: The Ideal Bookshelf

The website describes what this interactive graph is: a mapping of how various contributors’ imaginary bookshelves overlap and interconnect. Once you have read the website’s description, the graph makes a lot more sense. By itself, the graph is very confusing and not at all self-explanatory. Moving the dots around does intuitively demonstrate how they are all related, although so many names flash around that it’s hard to get a feel for what any of it means. The only real thing the moving feature adds is a sense of which dots are most central to the dataset and which are more isolated, with less pull on the core.

I think the interactive graph is a great idea, although in practice it was kind of confusing. I can’t imagine anyone actually using it to gain information: its main use is to visualize the network from a distant, zoomed-out perspective and maybe create intrigue through playing with it. While it is visually enticing and the interactive feature is entertaining, practicality calls for something else. If someone were trying to find specific information, an interactive, easily sortable table would convey it more efficiently.


Step 5:

Pie Chart

This chart shows the percentages of the top 3 countries. I had to hover the mouse over the orange slice to get it to show up, because the top 2 take up so much of the chart. The UK had over half of all the authors with 51.6%, and the US came in second with 11.9% of the authors. The next closest country was France, which had only 5.2% of the authors in the collection. This is clearly an uneven distribution, but it makes sense considering that the UK has a very long history of publishing works. Many smaller countries don’t have as big a population, so they won’t have as many authors. Although the US is much younger than European countries like France and Italy, it was born into a more modern era in which the written word was more popular, and thus more people were likely to want to be authors.
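
As a quick sanity check on how lopsided that distribution is, here is a minimal Python sketch that adds up the three shares quoted above and works out what is left for every other country combined. The three percentages come from the chart; the variable names are just my own, and the "everyone else" figure is simply whatever remains out of 100%, not a number the chart reports directly.

```python
# Shares of authors read off the pie chart (percent of the collection).
shares = {"UK": 51.6, "US": 11.9, "France": 5.2}

# Combined share of the top three countries.
top_three = round(sum(shares.values()), 1)

# Whatever is left over belongs to all other countries together
# (derived here by subtraction, not read from the chart).
remainder = round(100 - top_three, 1)

# How many times larger the UK's share is than the US's.
uk_to_us = round(shares["UK"] / shares["US"], 1)

print(f"Top three combined: {top_three}%")
print(f"All other countries: {remainder}%")
print(f"UK share is about {uk_to_us}x the US share")
```

So the top three countries account for a little over two-thirds of the authors, and the UK alone has more than four times the US share.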