Lab 3: Text Analysis with R
This lab asks you to begin experimenting with two approaches to digital text analysis: using popular ready-made tools, and using the R programming language.
In the Required Readings portion of our Course Readings folder, you will find chapters 1-4 of Matthew Jockers’ Text Analysis with R for Students of Literature (2014). You will be working through these chapters for much of this lab. Before class on Monday, Feb 8, please read and work through all of Ch. 1. Please download and install both R and RStudio; like Jockers, I highly recommend that you work through the tutorial using RStudio.
You will also find the solutions to the practice exercises Jockers asks you to complete in our Required Readings folder. I have given you the solutions because, if you get stuck on a problem, it is sometimes helpful to see the solution and to then work backwards to try to understand it. However, try to work through the practice exercises without referring to the solutions as much as possible.
NOTE: The link that Jockers provides on page 5 is now incorrect. Here’s the correct link: http://www.matthewjockers.net/text-analysis-with-r-for-students-of-literature/. You can download the materials using the link titled “Download the Textbook Materials.”
Create a folder on your desktop (or wherever you store work for this class) called “YourName-Lab3.”
Work through chapters 2, 3, and part of 4 of Jockers’ Text Analysis with R for Students of Literature (pgs 11-35; stop at section 4.3), relying on the solutions as little as possible. Save the following things to your Lab 3 folder as souvenirs:
- The graph you produce in Practice 2.1 (pg 22). You can get this image by exporting your graph in RStudio, or just by taking a screenshot of it.
- The graph of the top ten words in Moby Dick (pgs 27-28).
- The graph of the top ten words in Sense and Sensibility (Practice 3.1, pg 28).
- The code you use in Practice 3.2 (pg 28) and the results (screenshot).
- The line of code you write for Practice 3.3 (pg 28; screenshot).
- The line of code you write for Practice 3.4 (pg 28; screenshot).
- Dispersion plots for both “whale” and “ahab” (section 4.1, pgs 29-31).
- A screenshot of the results from step 3 on page 35.
Finish reading the rest of Chapter 4, but you do not have to follow along with the instructions unless you want to. Arguably, I’ve stopped you (at section 4.3) just before you start to learn about the interesting stuff. Try to understand the logic of what’s happening in the rest of Chapter 4, and of what Jockers is asking you to do. Try to complete the chapter and the practice exercises if you’re up for a challenge. If you complete the chapter, save evidence that you completed the Ch 4 practice exercises (or tried them) to your Lab 3 folder.
Upload your Lab 3 folder to our class Box drive (in the Lab 3 folder).
Create a post on our course site where you will write a report explaining and reflecting on what you did in Lab 3 (categorize it under “Lab 3”). Some questions you might consider as you compose this portion of your report include:
- Briefly explain any difficulties you had working through chapters 2, 3, and the first part of 4 (and, if you tried it, the rest of Chapter 4). What was difficult to understand or complete, and why? If you got stuck and consulted the solutions, when did you do this, and why? Did consulting the solutions help you to understand the logic behind what Jockers was asking you to do?
- We have only a small amount of time in this lab to devote to basic text analysis, so it might be helpful to get a better sense of what one can do with ready-made digital text analysis tools such as Voyant Tools (see the Voyant documentation) and Lexos (read or skim In the Margins, the Lexos manual; you might start with “The Lexos Workflow”). How might using these ready-made tools compare to using R to perform digital text analysis?
- What does digital text analysis allow us to see/do/know that close reading (and/or other more “traditional” methods in literary studies) does not? (Here you might think more about Clement’s article.)
- What does close reading (and/or other more “traditional” methods in literary studies) allow us to see/do/know that digital text analysis does not?
You do not need to answer all of these questions in your report; focus on one or two. You do not need to have a central argument (although it’s fine if you have one). You should connect your reflections to course readings where appropriate.
Shoot for 500-750 words overall.