Lab #4 Laurie Epps

Topic modeling shows real promise for analyzing the components of literature online.  Using MALLET was a challenge, but it was much easier than using R. I chose only two pieces from my corpus because I did not want to be overly ambitious in my MALLET exercises.  As anticipated, two was plenty, because I got stumped on the command line.

My problem in bringing in the corpus was that I did not originally put the two texts into one file in the MALLET program.  Instead, I loaded each one as its own file, and that was the mistake.  Thanks to the assistant, it was straightened out quickly, and I understood exactly why the corpus needed to be placed where it was and how to point MALLET directly to that file. Franco Moretti had his own trial and error, as I explored in my methodological analysis. This has been my experience with all the labs thus far: you try and try again until you get the results you are searching for.
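The fix described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it assumes the "one file" corresponds to a single input folder handed to MALLET's import-dir command, that MALLET is launched as bin/mallet, and the file names (anthology_one.txt, anthology_two.txt, corpus.mallet) are hypothetical placeholders, not the texts actually used in the lab. The script only builds the command; it does not run MALLET.

```python
import shutil
import tempfile
from pathlib import Path


def gather_corpus(text_files, corpus_dir):
    """Copy every text file into one folder so MALLET sees a single corpus."""
    corpus_dir = Path(corpus_dir)
    corpus_dir.mkdir(parents=True, exist_ok=True)
    for src in text_files:
        shutil.copy(src, corpus_dir / Path(src).name)
    return corpus_dir


def import_command(corpus_dir, output_file, mallet="bin/mallet"):
    """Build (but do not run) a MALLET import-dir command for that folder.

    The launcher path "bin/mallet" is an assumption about the install.
    """
    return [
        mallet, "import-dir",
        "--input", str(corpus_dir),
        "--output", str(output_file),
        "--keep-sequence",
        "--remove-stopwords",
    ]


# Demo with two placeholder files standing in for the two anthologies.
workdir = Path(tempfile.mkdtemp())
sources = []
for name in ("anthology_one.txt", "anthology_two.txt"):
    path = workdir / name
    path.write_text("placeholder text", encoding="utf-8")
    sources.append(path)

corpus = gather_corpus(sources, workdir / "corpus")
command = import_command(corpus, workdir / "corpus.mallet")
print(" ".join(command))
```

The point of the sketch is the single --input path: both texts sit in one folder, and MALLET is referred to that folder once rather than to each text separately.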

The two pieces I included in the MALLET file were both anthologies of children’s literature.  The words that came back were interesting, and many were recognizable as belonging to children’s literature of the period in which the anthologies were printed.  There were terms like knight, bon bon, kindly, butterfly, girls, lessons, etc., but many were classified into topics that were not very similar to one another.  I later learned that with a larger corpus and better divisions, my results would have been more pronounced. Thinking about the genre of children’s literature, topic modeling might be helpful in identifying texts that refer to specifics such as parables, girl role models, or treehouses.  For example, elementary school teachers might find it useful to search a children’s corpus for science projects that use sunlight and mirrors or that involve the human mind. This would give them a starting point of classroom material that is subject-specific and that also ties in to similar topics. The possibilities for topic modeling are endless, but the issue lies in having the right material from which to build the data.

The Moretti pieces that we read this week were especially helpful in tying the lab and the material together for better understanding.  As I said in class, and as Moretti tells his readers, it is all in the process.  Topic modeling shows promise, but all of this is still in the beginning stages: the trials of seeing what works and what doesn’t. Moretti’s struggle in his own trials made my lab frustrations seem small but legitimate. Because the digital humanities as a whole are still evolving, we are likely to see many changes and adjustments to the technology so that it better fits the needs of those who use it.