Topic Modeling Tool – lab 5a
Starting with the Topic Modeling Tool, I first selected my Fantasy-plaintext folder from lab 3 as the input and set the output directory to my desktop so I wouldn’t need to go searching for the folders later in my mess of an inbox.
I set the number of topics to 15 and made sure that “remove stop words” was checked under the “Advanced” settings. I also changed the number of topic words printed to 20, as we did in class. Then I hit “Learn Topics” to run the tool and write the output files to my computer.
Continuing from there, I opened the newly created folder on my desktop labeled “output_csv.” This is where the DocsInTopics, Topics_Words, and TopicsInDocs files are located, and they were easy to access.
All three files open in Excel, and I chose to look through the DocsInTopics file first. This file lists the works I chose for my plain-text corpus in lab 3. The sheet ranks my eight plain-text documents within each topic, which shows which documents are most important for which topic.
The next file I looked through was the Topics_Words file. It lists the 15 topics that the Topic Modeling Tool generated, along with the top words for each one.
The last file is called TopicsInDocs, and it shows the topics in each document as percentages. This gives a detailed account of how relevant each topic is to each document. The file also ranks the topics by weight.
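A file of topic weights like this can also be read with a short script instead of Excel. The sketch below is only an illustration of picking the heaviest topic for each document. It assumes a simplified layout with one row per document and topic, and the column names `filename`, `topic`, and `weight` are hypothetical. The sample rows are made up and are not taken from my output, so the real file's columns would need to be checked before adapting it.

```python
import csv
import io


def top_topic_per_document(csv_text):
    """Return {document: (topic, weight)} for the heaviest topic in each document.

    Assumes hypothetical columns named filename, topic, and weight.
    """
    best = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        doc = row["filename"]
        weight = float(row["weight"])
        # Keep this row only if it beats the best weight seen so far for the document.
        if doc not in best or weight > best[doc][1]:
            best[doc] = (row["topic"], weight)
    return best


# Made-up sample data for illustration only.
sample = """filename,topic,weight
book_a.txt,3,0.62
book_a.txt,7,0.21
book_b.txt,7,0.55
book_b.txt,1,0.30
"""

for doc, (topic, weight) in top_topic_per_document(sample).items():
    print(f"{doc}: topic {topic} ({weight:.0%})")
```

Reading the weights this way gives the same information as sorting the sheet in Excel, just without scrolling through every row by hand.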
The next folder, called “output_HTML,” contained the Docs and Topics folders along with the malletgui.css and all_topics.html files. I clicked on the Docs folder first.
Within the Docs folder, I found nine separate files labeled 1-9 Doc. I was actually a bit surprised by this because I thought I had eight items in my corpus, not nine. I might have made an error somewhere when loading my files into the Topic Modeling Tool. Either way, each file in this folder corresponds to an item in my corpus, and when I clicked on one, it brought up the top-ranked topics in that document. Here is a screenshot of one of the items:
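One way to check a mismatch like eight versus nine would be to count the files in the corpus folder and in the Docs folder and compare the two numbers. The sketch below is a minimal example of that idea. It assumes the corpus files end in `.txt` and the Docs files end in `.html`, and the folder paths in the usage comment are hypothetical placeholders rather than my actual paths.

```python
from pathlib import Path


def count_files(folder, suffix):
    """Count the files directly inside `folder` whose names end with `suffix`."""
    return sum(
        1
        for p in Path(folder).iterdir()
        if p.is_file() and p.name.lower().endswith(suffix)
    )


def compare_counts(corpus_dir, docs_dir):
    """Return (corpus count, Docs count, whether the two counts match)."""
    n_corpus = count_files(corpus_dir, ".txt")   # assumed corpus file type
    n_docs = count_files(docs_dir, ".html")      # assumed output file type
    return n_corpus, n_docs, n_corpus == n_docs


# Example usage with hypothetical paths:
# print(compare_counts("Desktop/Fantasy-plaintext", "Desktop/output_html/Docs"))
```

If the counts differ, listing the names in the corpus folder would show whether an extra or duplicate file was picked up along with the texts.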
I then opened the Topics folder from the previous folder and selected the first topic. See screenshot: this shows a ranking from 2-10 for the items in my corpus under that topic.
I clicked on “all topics” and a list of my 15 topics showed up.
I couldn’t open the malletgui.css file because I got a message that my free trial had ended. Those are the steps I took to complete lab 5a.