For Lab 5b, I produced a topic model using the mini-corpus I created for Lab 3, which contains eight texts: “The Talking Horse and Other Tales” by F. Anstey, “A House Boat on the Styx” by John Bangs, “A Christmas Garland” by Max Beerbohm, “Miss Mapp” by Edward Benson, “Fables for the Frivolous” by Guy Carryl, “Mr. Dooley in Peace and in War” by Finley Dunne, “The Lighter Side of School Life” by Ian Hay, and, last but not least, “Around the World with Josiah Allen’s Wife” by Marietta Holley. I chose to model 20 topics because I wanted a range broad enough to give me an accurate and comprehensive, yet fairly concise, overview of my corpus. I felt that selecting more than 20 would oversaturate my topic model, while selecting fewer than 20 would not provide as much detail as I wanted.
Overall, topic modeling was not a difficult process, just a meticulous one, as we have said of other labs and processes. As a result, I did encounter a couple of challenges, primarily when entering the data-import commands in Mallet. I’ll admit that the problems I had were not due to technological or tool issues, but arose when my attention to detail fell short of what the lab demanded. We talked in class about how the vast majority of errors made when topic modeling with Mallet are typos, and sure enough, I made one or two mistakes when typing in the commands. But even after I corrected my spacing and made sure the commands I entered into Mallet were identical to those written out for us on the hand-out, Mallet still did not do what I wanted it to. That led me, with the help of Professor Thomas, to figure out that my mallet folder needed to be placed in my user folder on my computer, alongside my other files. Once I did that, downloaded the full Java Development Kit (I had only downloaded part of it), and re-entered the commands, I was able to successfully create a topic model of my mini-corpus.
Topic modeling is an innovative and important tool for researchers in literary studies, for a number of reasons. It can save time for researchers who want to quickly gain an understanding of a large volume of texts without having to pore over them individually. Through topic modeling, researchers are given lists of words that recur or are used most often, and from those lists they can draw conclusions about a single text or a series of texts as a whole, and relate what they learn to other works or studies. For example, by using topic modeling to study texts from different periods, researchers can pick up on the words that have been used most often over time, and then make connections or develop theories about why those patterns emerged. Topic modeling is a useful method of seeing patterns that may or may not be important, but that have the potential to guide researchers to new knowledge in literary studies.