Data Mining & Quantifying Literature - Alexa Prewitt

After reading about data mining and quantifying literature, I've learned how they intertwine, what the significance of each is, and how they play a role in digital humanities. The textbook defines data mining this way: "Data mining is an automated analysis that looks for patterns and extracts meaningful information in digital files." I would define it as a tool that can pull specific information from a piece of text. Usually the information being pulled follows a specific pattern or shares common features. Being able to extract quantitative data from literature is extremely helpful and lets us analyze a piece of literature much more easily.

Data mining tools such as Voyant Tools can show us a text's vocabulary density, readability index, and average words per sentence. It also has features like Bubblelines and Collocates. Voyant Tools allows you to analyze the more obscure aspects of a text that most readers would normally overlook. When students are asked to analyze a text, they are usually expected to concentrate on things like plot, setting, characters, point of view, figurative language, and style. Voyant Tools lets you see the text on a much broader scale and look at less conventional aspects. For example, you can see the most frequently used words, which prompts you to think about why certain words are used more than others and what that tells you about the text.
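To make these measurements concrete, here is a minimal Python sketch of how word frequency, vocabulary density (unique words divided by total words), and average words per sentence could be computed by hand. This is not how Voyant Tools is implemented; the function names (`tokenize`, `split_sentences`, `text_stats`) and the sample sentence are made up for illustration, and the word and sentence splitting is deliberately rough.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and return its word tokens (letters and apostrophes only)."""
    return re.findall(r"[a-z']+", text.lower())

def split_sentences(text):
    """Split after ., !, or ? followed by whitespace. A rough heuristic, not a real parser."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]

def text_stats(text, top_n=5):
    """Return basic quantitative measures for a piece of text."""
    words = tokenize(text)
    sentences = split_sentences(text)
    counts = Counter(words)
    return {
        "total_words": len(words),
        "unique_words": len(counts),
        # Assumed definition: unique word forms divided by total words.
        "vocabulary_density": len(counts) / len(words) if words else 0.0,
        "avg_words_per_sentence": len(words) / len(sentences) if sentences else 0.0,
        "most_frequent": counts.most_common(top_n),
    }

# Hypothetical sample text, not taken from any novel.
sample = "The sea was calm. The ship moved slowly over the sea. Was it calm?"
stats = text_stats(sample, top_n=3)
for name, value in stats.items():
    print(name, value)
```

A real analysis would probably also filter out very common words such as "the" before counting, so that the most frequent words say more about the text itself.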

It's possible that specific authors or genres tend to use a fixed set of vocabulary, and Voyant Tools makes it easier to identify these types of trends. Traditional research is what most people practice and naturally do when given a text to analyze. Incorporating Voyant Tools and data mining into our literary analyses can help us approach a text from new angles. It can also help us uncover many things quickly without requiring much work from the user. I don't think we should totally replace traditional literary research with data mining but a good mix of both is favorable.
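One rough way to check whether two texts draw on a similar set of vocabulary is to compare their most frequent words. The sketch below measures the overlap between the top words of two texts as shared words divided by all words in either list. The functions `top_words` and `vocabulary_overlap`, the optional stopword set, and the sample strings are all assumptions made for illustration; this is not a Voyant Tools feature.

```python
import re
from collections import Counter

def top_words(text, n=10, stopwords=frozenset()):
    """Return the set of the n most frequent words in the text, skipping stopwords."""
    words = [w for w in re.findall(r"[a-z']+", text.lower()) if w not in stopwords]
    return {word for word, _ in Counter(words).most_common(n)}

def vocabulary_overlap(text_a, text_b, n=10, stopwords=frozenset()):
    """Share of top words that two texts have in common (0.0 = none, 1.0 = identical)."""
    a = top_words(text_a, n, stopwords)
    b = top_words(text_b, n, stopwords)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

# Hypothetical word lists standing in for two full texts.
text_a = "dear dear quite thought felt"
text_b = "dear quite quite sea ship"
overlap = vocabulary_overlap(text_a, text_b)
print(round(overlap, 2))
```

A higher score only says that the same words are common in both texts. Explaining why still takes the traditional close reading described above.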

When I used Voyant Tools to analyze my chosen text, "The Voyage Out," I was able to uncover a lot about Virginia Woolf as an author. Woolf uses a lot of words per sentence, and her books tend to be fairly lengthy. In my opinion, this shows that she writes with complexity and is extremely detailed. When reading her novels, it is important to pay close attention because she includes a lot of important details in each sentence.

Comments

  1. I liked your interpretation of digital analysis because it is straight to the point and you clearly break down the purpose of using tools like Voyant. I also agreed when you said, "I don't think we should totally replace traditional literary research with data mining but a good mix of both is favorable," because Voyant Tools only shows the frequency of words and the patterns of those words in various ways, without the context for why these patterns are the way they are. By combining a little traditional research with digital analysis, we can find the most accurate explanations for our authors' writing styles, the patterns in the language they use, and why they use them.

    In my novel, "Mansfield Park," I found through Voyant that Jane Austen emphasizes the theme of morality in her work and that she favors manner words such as "quite" and "dear" as well as expressive words like "thought" and "felt." This makes sense because Mansfield Park is about a young lady developing feelings for her cousin in a new and healthy environment after growing up with a poor family.

    1. That's an interesting observation about Austen's language! Very proper.

  2. Sentence length is definitely something to consider with Woolf.
