Lab 3 Report
English 8120 – Dr. Thomas
Honestly, I had the most difficulty with exercises 3.1-3.4. Least importantly, I had to puzzle out the question for 3.4, as the question was cut off. I figured it out by looking at the previous question, and then I had to consult the solutions sheet. The solutions sheet was useful, but not if you’d missed a step anywhere along the way. But I digress. I spent the better part of six hours working on the tasks outlined in this lab. Exercise 3.2 in particular was a conundrum. After two hours and copious swearing, I did make use of the solution for that problem. The primary issue is that the help file found under “?unique” was far less than useful at my current level of R experience. I tried every permutation suggested by the R help file, as well as by outside sources (most of which repeated the help file verbatim). The solution (“unique(c(names(sorted.moby.rel.freqs.t” etc.) appeared in no discernible form in any of the help files. After consulting the solution, it seemed clear why the answer was what it was, but the process was extraordinarily frustrating. Pre-packaged formulae, as it were, would expedite the process greatly and, as a result, make such text processing much more relevant to scholarship, especially under any sort of time constraints. The downside, though, would be a lack of flexibility within pre-packaged vectors and arrays. A plug-and-play option would be wonderful, however.
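To illustrate why the solution takes the shape it does, here is a minimal sketch of the same pattern on made-up data. It is not the solution from the sheet (which is only quoted in part above), and the names moby.words, austen.words, moby.freqs, austen.freqs, and combined.words are assumptions chosen for illustration. The idea is that names() pulls the words out of each frequency table, c() joins the two sets of words into one vector, and unique() drops the repeats.

```r
# Toy stand-ins for two tokenized texts (hypothetical data, not the lab corpus).
moby.words   <- c("the", "whale", "the", "sea", "him", "the", "whale")
austen.words <- c("the", "her", "him", "the", "sister", "him")

# table() counts each word; dividing by the token count and multiplying
# by 100 gives relative frequencies as percentages, sorted high to low.
moby.freqs   <- sort(100 * table(moby.words) / length(moby.words),
                     decreasing = TRUE)
austen.freqs <- sort(100 * table(austen.words) / length(austen.words),
                     decreasing = TRUE)

# names() extracts the words, c() concatenates both sets of words,
# and unique() keeps only the first occurrence of each one.
combined.words <- unique(c(names(moby.freqs), names(austen.freqs)))
print(combined.words)
```

Words that appear in both texts, such as "the" and "him" here, show up only once in combined.words, which is the behavior that the “?unique” help file describes but does not demonstrate with names() and c() together.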
The idea of “length, frequency, and location” in a project like MONK, discussed in Clement’s article, could certainly be executed by a knowledgeable practitioner of R. The data I was able to generate (eventually) using R, particularly the top words and the dispersion plots, would be very telling to someone studying what constitutes a genre, or to someone arguing for the exclusion of a particular text or range of texts from an overall genre. For instance, I found it interesting that “him” occurred with a higher percentage frequency in Austen’s text than in Melville’s, though given the subject of each story, and their genres, it should perhaps not have been all that surprising. Austen’s is a Regency-era romance, while Melville’s is clearly targeting a more masculine audience, or at least one more concerned with the “man versus nature” motif than with a comedy of errors or, more simply, a “love story.” In that vein, one might be able to extrapolate a variety of less obvious motifs based on word frequencies, or percentages relative to a single text or an entire corpus. However, no one is abandoning close reading methods (Take that, Moretti!) just yet. One possible limitation of distant reading, as opposed to close reading, is the difficulty of recognizing and applying a cultural studies angle to what is not there. By that I mean, if one were reading, say, The Count of Monte Cristo, little could be gleaned from the absence of homosexual characters other than that they are not there. With close reading, one might be able to discern typical symbols or signifiers of homosexuality within the text. The critical missing element is human interpretation writ small. Or perhaps “written minutely” would be more accurate. Digital tools, especially those like R, seem to be best suited to macroanalysis only, which largely diminishes the potential for cultural and critical lenses.
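The percentage-frequency comparison for “him” mentioned above can be sketched roughly as follows. The two sentences are invented stand-ins, not passages from Austen or Melville, and tokenize() and rel.freq() are hypothetical helper names introduced only for this example. The numbers it produces say nothing about the actual novels.

```r
# Split a string into lowercase word tokens (hypothetical helper).
tokenize <- function(text) {
  words <- unlist(strsplit(tolower(text), "\\W+"))
  words[words != ""]  # drop any empty strings left by the split
}

# Percentage of tokens in `words` that equal `target` (hypothetical helper).
rel.freq <- function(words, target) {
  100 * sum(words == target) / length(words)
}

# Invented sample sentences standing in for the two texts.
text.a <- tokenize("She saw him, and told him so.")
text.b <- tokenize("The whale took him under the sea again.")

him.a <- rel.freq(text.a, "him")
him.b <- rel.freq(text.b, "him")

# Compare the two relative frequencies.
print(c(a = him.a, b = him.b))
```

Working in percentages rather than raw counts is what makes the comparison fair when the two texts differ greatly in length.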