Data Mining and Quantifying Literature
Information visualization uses digital tools and techniques to visually present historical records, complex datasets, textual information, and cultural artifacts in ways that aid our understanding, analysis, and interpretation of that data. In our DH class, we are using Voyant to extract data through distant reading and close reading. Based on my understanding from the reading, data mining is the process of discovering trends, patterns, and insights in large data sets. Voyant has helped me extract some crucial information not only about my novel but also about the author himself. The novel I am studying is The Lost World, by Sir Arthur Conan Doyle. It is a sci-fi novel that tells the story of an expedition to a plateau in the Amazon basin of South America where prehistoric animals still exist. When I first looked at this novel, I had no idea what it was about, but as I spent more time with the text through Voyant, I gained a better understanding of the narrative. The word ‘man’ is the most frequently used word in the chapter I am exploring, which suggests that the novel is dominated by male characters. This is very interesting because, when talking to my group, we noticed that Doyle uses the word ‘man’ often across his novels. Another word that is used frequently is ‘love’. I found that this word was used mainly when the main character describes his love interest, Gladys. This is also interesting because, by exploring the quantitative data for the chapter, I learned a lot about the main character’s feelings and gained a deeper understanding of the narrative. When extracting quantitative data from literature, it is important to recognize the author’s trends, because they can tell us a lot about who the author is and about what life was like when the work was published.
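As a rough illustration of the kind of counting behind these observations, here is a minimal Python sketch of a word-frequency tally. It is not how Voyant works internally; the `top_words` function, the tiny `STOPWORDS` list, and the sample sentence are all made up for this example, and the sample is not a quotation from The Lost World.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list for illustration only;
# real text-analysis tools use much longer lists.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "was",
             "i", "it", "that", "he", "his", "my", "for", "with"}

def top_words(text, n=5, stopwords=STOPWORDS):
    """Return the n most frequent words in text, ignoring case and stopwords."""
    # Lowercase the text and keep runs of letters (and straight apostrophes).
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w not in stopwords)
    return counts.most_common(n)

# Invented sample sentence, not taken from the novel.
sample = "The man spoke of love. A man of science, the man said, must love truth."
print(top_words(sample, n=2))  # [('man', 3), ('love', 2)]
```

Running the same function on the full text of a chapter would give a ranked list like the one a frequency tool displays, which can then be checked against a close reading.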
I think we can combine these methods with those of traditional literary research to deepen our understanding of certain works of literature. Combining traditional practices with present-day digital practices gives us information from two completely different types of research, and comparing two very different things can sometimes surface new and fresh information that we wouldn’t have extracted using traditional methods alone.
Averie,
you bring up good reasons why data mining is important to literary analysis. I think that when dealing with any literature, one can link the findings of a distant reading to a larger literary movement, characterizing some of daily life at the time of publication, as you say. I find the idea of using distant reading to further humanities research an important development that follows from NLP.
I found patterns in my text that parallel what I found in the close reading. The characters are focused on formality and on expressing their thoughts to each other with a certain etiquette that would not be present today, or in other places or classes at the time. The tone, too, is quite speculative and mysterious, based on a few words I found to occur frequently.