EMD – Lab 7
N-Grams:
In these two screenshots, I captured the graphs produced by the Google and Bookworm N-gram tools. Both show the frequency of the words “gay” and “homosexual” in books published between 1800 and 1920. Two main differences stood out to me. First, the Google viewer’s Y-axis is labeled as a percentage, carried out to the hundred-thousandths place, while the Bookworm viewer’s Y-axis counts by fives and specifies that it represents “words per million.” This difference could matter to people like me who, when reading a graph like this, would much rather visualize how many times the words appear in the corpus than know what percentage of the corpus they make up. Second, hovering over one of the lines in the Google graph shows not only the percentage of the corpus that word makes up but also the percentage for the other word. Bookworm gives frequency information only for the line you are currently hovering over. For a person who wants to compare the use of the two words in one specific year, the Google viewer is therefore much easier to use, because you do not have to go back and forth between lines as you would in Bookworm.
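The two Y-axes measure the same kind of thing in different units, so a reading from one graph can be translated into the other. Here is a minimal sketch in Python, assuming both tools report frequency relative to the total number of words for a given year. The function names and the sample value are made up for illustration and are not taken from either tool or from the screenshots.

```python
# Sketch: converting between the two Y-axis units (percent vs. words per million).
# Assumption: both viewers report a word's share of all words in the corpus.

WORDS_PER_MILLION_PER_PERCENT = 10_000  # 1% of 1,000,000 words is 10,000 words


def percent_to_per_million(percent):
    """Convert a share of the corpus (in percent) to occurrences per million words."""
    return percent * WORDS_PER_MILLION_PER_PERCENT


def per_million_to_percent(per_million):
    """Convert occurrences per million words to a share of the corpus (in percent)."""
    return per_million / WORDS_PER_MILLION_PER_PERCENT


# Hypothetical reading, not a value from the actual graphs.
example_percent = 0.0005
converted = percent_to_per_million(example_percent)
print(f"{example_percent}% of the corpus is about {converted:.1f} words per million")
```

Seen this way, the choice between the two viewers is a choice of presentation rather than of data: a tiny percentage and a small “words per million” count can describe the same frequency.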
Word Clouds:
This is a screenshot of six of the word clouds that were generated with Lexos, visualizing the most frequent words in the State of the Union corpus that I uploaded to the generator. The advantage of this kind of visualization is that you can see very quickly what the “hot” words were in each of the 24 topics. The disadvantage is that you are basically just looking at a bunch of words without any context. Because of this, the visualization opens up many new questions about the data, such as what function these words serve in their original context. In other words, word clouds are “humanistic” graphical displays: although they point us toward linguistic trends, they need deeper interpretation by an observer.
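The loss of context becomes clearer when you look at what a word cloud is built from: a table of word counts and nothing else. The following is a minimal sketch in Python of that counting step. It is not how Lexos itself works internally; the function name `top_words`, the simple tokenizing rule, and the sample sentence are all assumptions for illustration.

```python
# Sketch: the word counts that sit behind a word cloud.
# Assumption: words are lowercase runs of letters and apostrophes; a real tool
# may tokenize differently and usually filters out common "stop words".
import re
from collections import Counter


def top_words(text, n=10):
    """Return the n most frequent words in text as (word, count) pairs."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words).most_common(n)


# Hypothetical sample text, not a quotation from the actual corpus.
sample = "The state of the union is strong because the people of the union are strong."
for word, count in top_words(sample, 5):
    print(word, count)
```

Once the text has been reduced to pairs of words and counts, the sentence each word came from is gone, which is why the cloud cannot say how a word was being used.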
Network Graph:
This visualization looks very cool, but for quite some time that was about all I was able to gather from it. I was confused about the purpose of the nodes, but once I understood the concept of the project, the graph became a lot easier to navigate. “Intuitive,” in the context of this lab, means that I could look at this graph and know exactly what each node, edge, and attribute represents without needing it explained. I was completely lost without the introduction, so it definitely was not intuitive for me. It did not surprise me, then, to find that I could get a lot more information by scrolling further down the page to the other graphs that represent the same information. The network graph opened up too many new questions for me (most of them basic, like what the colors represent), which is probably why I would not have used it to visualize this information. Because this visualization lists basically all of the information needed to understand the project, some work would have to be done to make it more “humanistic.” One option would be to somehow make the perspective random, so that an interpretive body would have to be present in order to understand what the graph is representing.
Google Fusions:
This screenshot shows a pie chart made in Google Fusions, using the “Tate_artists_percountry” document, which is based on information from the UK’s Tate Art Gallery. Using a different color for each country, the pie chart shows the percentage of artists whose work is featured in the gallery who came from that country. Google Fusions looks like a really cool tool. It allows us to visualize data with multiple graphs instead of just one, so we have a much better chance of answering the questions that tend to pop up when we use any of the other data visualization tools. Of course, we probably still will not be able to answer them all, but we will be a lot closer to a broad understanding of the data that we choose to visualize. In our final class project, we could use Google Fusions to show different trends in the same data, trends we might not have noticed without the ability to switch quickly back and forth between visualizations. Google Fusions seems like a tool that was designed to foreground humanistic, interpretive principles. Because it displays only the data, with no context or other information, it requires the involvement of an interpretive body: Fusions cannot tell us everything about the data, so without interpretation the visualization is of little use.