I used my corpus of science fiction texts from Lab 3 for this experiment.

Screen Shot 2016-03-01 at 7.46.33 PM

This is the list of topics given by MALLET from my corpus. Two of the topics contain words associated with Project Gutenberg and its copyright notices. I consider those outliers, since they have little to do with the actual stories. Interestingly, though, words associated with men and masculine titles show up in multiple topics.
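One way to keep the Gutenberg and copyright words out of the topics would be to cut the license text from each file before importing the corpus into MALLET. Here is a minimal sketch in Python. It assumes each file marks the story with `*** START OF ... PROJECT GUTENBERG EBOOK` and `*** END OF ... PROJECT GUTENBERG EBOOK` lines, which many Project Gutenberg plain-text files do, though the exact wording varies between files. The function name `strip_gutenberg_boilerplate` is my own and not part of any library.

```python
import re

# Assumed marker wording; real files vary, so check a few by hand first.
START_RE = re.compile(
    r"^\*\*\*\s*START OF (?:THE|THIS) PROJECT GUTENBERG EBOOK.*$",
    re.IGNORECASE | re.MULTILINE,
)
END_RE = re.compile(
    r"^\*\*\*\s*END OF (?:THE|THIS) PROJECT GUTENBERG EBOOK.*$",
    re.IGNORECASE | re.MULTILINE,
)


def strip_gutenberg_boilerplate(text: str) -> str:
    """Return the text between the start and end markers.

    If a marker is missing, that side of the text is left untouched.
    """
    start = START_RE.search(text)
    end = END_RE.search(text)
    begin = start.end() if start else 0
    # Only trust the end marker if it comes after the start marker.
    finish = end.start() if end and end.start() >= begin else len(text)
    return text[begin:finish].strip()
```

Running every file through this before building the MALLET input should leave only the story text, although any license wording that sits outside the assumed markers would still get through.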

Screen Shot 2016-03-01 at 7.53.23 PM

In several of the topics, a single work dominates the rest, particularly the longer stories like Rudyard Kipling’s _With the Night Mail_ and Ayn Rand’s _Anthem_.

Screen Shot 2016-03-01 at 7.55.25 PM

This isn’t true of every topic, though. Some have a much more even split.
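To put a number on "dominates" versus "even split", each document's share of a topic can be computed directly. This is a rough sketch rather than anything MALLET produces itself. It assumes you already have each document's topic proportions and token count in Python dictionaries. The names `doc_topics`, `doc_lengths`, and `dominant_document` are hypothetical, and the numbers in the usage example are invented for illustration.

```python
def dominant_document(doc_topics, doc_lengths, topic):
    """Return (document, share) for the document contributing most to a topic.

    doc_topics:  {document name: [proportion of topic 0, topic 1, ...]}
    doc_lengths: {document name: token count}
    A document's weight is its topic proportion times its length, so a long
    story can dominate a topic even at a moderate proportion.
    """
    weights = {
        doc: props[topic] * doc_lengths[doc]
        for doc, props in doc_topics.items()
    }
    total = sum(weights.values())
    if total == 0:
        return None, 0.0
    top = max(weights, key=weights.get)
    return top, weights[top] / total


# Invented example values, not results from my corpus.
doc_topics = {"doc_a": [0.8, 0.2], "doc_b": [0.5, 0.5]}
doc_lengths = {"doc_a": 1000, "doc_b": 200}
name, share = dominant_document(doc_topics, doc_lengths, topic=0)
print(name, round(share, 2))
```

A share close to 1 means one work owns the topic, while a share near one divided by the number of documents means the topic is spread evenly.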

Screen Shot 2016-03-01 at 8.00.01 PM

I tried fewer topics this time, ten instead of fifteen.

Screen Shot 2016-03-01 at 8.03.04 PM

The ten topics look very similar to the fifteen I had before. The two Gutenberg/copyright-related topics are still there, and many words related to men are still present.

Screen Shot 2016-03-01 at 8.05.55 PM

And I get similar results with five topics as well.
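"Similar" here is my impression from reading the word lists. A simple way to check it would be to measure how many top words the topics from two runs share. The sketch below uses Jaccard similarity (shared words divided by all distinct words) and, for each topic in one run, finds the closest topic in the other. It assumes each run is a non-empty list of topics and each topic is a list of its top words. The function names and the example words are hypothetical, not taken from my actual output.

```python
def jaccard(words_a, words_b):
    """Jaccard similarity between two word lists, from 0.0 to 1.0."""
    a, b = set(words_a), set(words_b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def best_matches(run_a, run_b):
    """For each topic in run_a, find the most similar topic in run_b.

    Returns a list of (index in run_a, index in run_b, similarity).
    Assumes run_b is non-empty.
    """
    matches = []
    for i, words_a in enumerate(run_a):
        scores = [jaccard(words_a, words_b) for words_b in run_b]
        j = max(range(len(scores)), key=scores.__getitem__)
        matches.append((i, j, scores[j]))
    return matches


# Made-up word lists standing in for two runs with different topic counts.
run_fifteen = [["man", "sir", "captain"], ["project", "gutenberg", "copyright"]]
run_ten = [["gutenberg", "project", "license"], ["man", "sir", "mr"]]
for i, j, score in best_matches(run_fifteen, run_ten):
    print(f"topic {i} best matches topic {j} (similarity {score:.2f})")
```

If most topics in the smaller run have a high-scoring partner in the larger one, that would back up the impression that the topics survive the change in topic count.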