For step two of this lab, I picked two words and analyzed data about how frequently they appear in works of literature. The two words I chose, love and peace, are very common words and ideas that seem thematic across many works and genres of literature. I chose them because I knew there would be plenty of data to work with for the information I wanted to analyze. The two tools we used were Bookworm and Google's N-grams. The first graph was made with Bookworm, and it was a little different than I was expecting. Since the chart was set from 1800-1920, I wasn't necessarily expecting the overall trends to be negative for both words. This stood out to me about the N-gram tool as well. There are some obvious differences between the two charts even though they relay almost exactly the same information. For example, the N-gram chart has vertical and horizontal gridlines running through its entirety, while the Bookworm graph does not. Even though this is a small difference, I think it makes the line movements look more drastic. In addition, the N-gram tool has a y-axis built on percentages, displayed in numeric form. This tool definitely gives a more reliable and exact description of the data being displayed than the other. I think these differences are important because researchers will easily interpret the data differently if it seems more exact (N-gram) or more roundabout (Bookworm).
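The y-axis both tools plot is essentially a relative frequency: how often a word appears out of all the words published in a given year. A minimal sketch of that calculation, using a tiny hypothetical corpus grouped by year (the texts and years are invented, standing in for the Bookworm/N-gram data):

```python
from collections import Counter

def relative_frequency(texts, word):
    """Fraction of all tokens in `texts` that match `word` (case-insensitive)."""
    counts = Counter()
    total = 0
    for text in texts:
        tokens = text.lower().split()
        counts.update(tokens)
        total += len(tokens)
    return counts[word.lower()] / total if total else 0.0

# Hypothetical mini-corpus grouped by year (not the real data set).
corpus_by_year = {
    1800: ["love conquers all", "peace and love endure"],
    1900: ["the treaty brought peace", "war and peace"],
}

# One point per year, like a single line on the N-gram chart.
trend = {year: relative_frequency(texts, "love")
         for year, texts in corpus_by_year.items()}
```

Plotting `trend` year by year gives the kind of line both tools display; the real tools just do this over millions of books.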
For the third part of the lab, we used our corpus of State of the Union addresses to visualize topics that run throughout the written works. We used the Multicloud feature of the Lexos tool to create these graphs for our analysis.
I think this tool is really unique compared to some of the other tools because of how it is formed into a word cloud. Each one of these clouds represents a separate topic in the corpus of works we uploaded. For example, one of the clouds has the word “Mexico” highlighted, while another has the word “government” highlighted. These two words are bigger than the rest because they are the words used most often in their topics. In topic number 13, the word “Mexico” is used most often, and “government” is used most often in topic number 2. I think this type of topic modeling definitely fits within the digital humanities because it allows researchers to clearly find topical information about a corpus of texts without having to search through the texts themselves.
The network graph we used for part 4 is very interesting. While the interactive graph is definitely intriguing and very attention-grabbing, it does not seem to be very clear to me. It seems as though this graph charts all of the works of literature that different people in a group had on their personal literary “bookshelves.” While some of the nodes are densely connected with many other nodes, there are many outliers that connect with only one other node or none at all. When you hover over each node, I’m not sure what the information that appears means. It seems to be names, so I’m guessing each node represents a person. While I think this kind of visualization could open up new ground for analyzing data, I think this data set may be too intricate and complicated for the typical analysis process. I think this kind of graph definitely allows for humanistic interpretation because it so clearly connects some of the works without including the context of the outliers’ statuses.
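The hub-versus-outlier pattern I noticed is what network analysis calls node degree: the number of edges touching a node. A small sketch, with invented reader names standing in for whatever the real nodes represent:

```python
def degrees(edges, nodes):
    """Number of connections per node; isolated nodes get degree 0."""
    deg = {node: 0 for node in nodes}
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    return deg

# Hypothetical bookshelf network: readers linked when they share a book.
nodes = ["Alice", "Bob", "Carol", "Dan"]
edges = [("Alice", "Bob"), ("Alice", "Carol")]

deg = degrees(edges, nodes)
hubs = [n for n, d in deg.items() if d >= 2]       # densely connected nodes
outliers = [n for n, d in deg.items() if d == 0]   # nodes with no connections
```

Sorting nodes by degree is one quick way to separate the actively connected centers of the graph from the outliers that make the visualization hard to read.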
The last section of lab 7 we researched and worked with was Google Fusion Tables. This program creates charts from data you can upload. The information we uploaded was a data set from the UK’s Tate art museums. Through this program, we are able to see all of the different countries involved in the data; specifically, the number of artists per country. The graph allows us to easily see the different cultures involved in this art. Google Fusion Tables definitely allows for easy access to data to interpret. For example, if we wanted to use this program for our final project, we could upload our literary corpus and find topics or word frequencies throughout the works. You could use this program to easily interpret using humanistic methods, as you are able to access topical information very effortlessly.
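The artists-per-country chart boils down to grouping rows of a spreadsheet and counting distinct artists in each group. A minimal sketch with made-up rows and assumed column names (`artist`, `country`) rather than the real Tate schema:

```python
import csv
import io
from collections import Counter

def artists_per_country(csv_text):
    """Count distinct artists per country in a CSV with 'artist' and 'country' columns."""
    seen = set()
    counts = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        key = (row["artist"], row["country"])
        if key not in seen:          # count each artist once per country
            seen.add(key)
            counts[row["country"]] += 1
    return counts

# Hypothetical sample rows (not the actual Tate data).
sample = """artist,country
Turner,United Kingdom
Blake,United Kingdom
Monet,France
Turner,United Kingdom
"""
counts = artists_per_country(sample)
```

This is the same aggregation the chart performs; the upload step just hands the program the full CSV instead of a small string.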