SS Blog Post 5: Data Mining and Digital Literature

    Data mining and analysis are controversial topics in the humanities. Data mining involves harvesting online information, while analysis involves finding patterns in the extracted data, for example by using Voyant Tools to generate word clouds. Chapter seven of the Digital Humanities Coursebook states, “made as part of the larger diatribe against the digital humanities, the defensive posture suggests that data mining was meant to replace methods of reading that have a long history in cultural practice” (Drucker, 110). Some academics who oppose the more technical methods of DH argue for more traditional methods of textual analysis, such as close reading. Data mining can only analyze humanities products, such as literature and art, that have been uploaded digitally. The coursebook therefore notes that those analyzing analog forms, such as 3D sculptures uploaded as 2D images, need to be transparent about what medium was used during processing. Digital processes such as data mining and analysis allow for a larger sample size, which helps prevent sweeping generalizations about certain authors or genres of texts. I believe that it is helpful to extract quantitative as well as qualitative data from literature and art to obtain a well-rounded analysis of a work or genre. Data mining with Voyant Tools shows us word patterns in literary works that we often would not pick up on when reading with the naked eye. We can and should apply Voyant Tools to traditional literary research by connecting the documented frequencies of specific words to textual concepts and ideas. For example, when I plugged my text, Chapter 1 of The Man in the Brown Suit by Agatha Christie, into Voyant, the largest word in the word cloud was “papa,” surrounded by supporting words that relate to him, such as his job and his forgetfulness.
The words with the highest frequency led me to make connections between Papa and how his presence would be important to the plot of the story: ‘how will his forgetfulness and the distractions of his job come into play?’ The words emphasized by Voyant reveal Christie’s techniques for foreshadowing character death, as Papa’s demise is later used as a plot device to spur the main character, Anne, into a trip to London that will be the main setting of the book.
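
For readers curious about what a word cloud is counting, here is a minimal Python sketch of word-frequency counting. It is not Voyant's actual implementation: the function name `top_words`, the tiny stopword list, and the sample sentence are all made up for illustration (the sentence is a placeholder, not a quotation from Christie).

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list; real tools use much longer ones.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "his", "was", "he", "with"}


def top_words(text, n=5):
    """Return the n most frequent non-stopword words as (word, count) pairs."""
    # Lowercase the text and pull out runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w not in STOPWORDS)
    return counts.most_common(n)


# Placeholder text invented for this example, not taken from the novel.
sample = "Papa forgot the time. Papa was busy with work, and Papa forgot dinner."
print(top_words(sample, 2))  # [('papa', 3), ('forgot', 2)]
```

A word cloud then simply draws the words with the highest counts in the largest type, which is why a frequently repeated word like “papa” dominates the picture.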





Comments

  1. I agree with your point about extracting both qualitative and quantitative data from literature and art to get a well-rounded analysis. Literature and art are subjective, so objective data mining tools can miss more complex meanings and interpretations. I do think that data mining can be helpful when analyzing patterns in large quantities of data because, as the textbook explains, data mining technology can work with a lot more content than a human can.
    Using Voyant Tools on my novel, Night and Day by Virginia Woolf, has been a little difficult. I'm still trying to figure out the best way to recognize patterns using the tools, so I think that comparing results with my group will help me see more. As of right now, the Word Cloud tells me that there is a lot of dialogue in the novel because the word "said" is used very often. It also shows me the names of main characters and which chapters they appear in more frequently.

    Replies
    1. I think "said" is significant in the Woolf texts!

  2. We can see in this chapter how important distant reading really is. As in Sophia's novel, death plays a very important role in my Agatha Christie novel as well. I learned more about the bubble tool, which shows us certain trends with words in our texts. It is essential for us to get quantitative data through close reading. On the other hand, it's good to see a text's qualitative data from Voyant Tools. In my project, the tool that has been the most effective is the word collage, which shows you the words that are used the most.

  3. It is really interesting that Voyant Tools' analytics for the first chapter of The Man in the Brown Suit can be used to interpret events that will occur later on in the book. I will also use Voyant's analysis to pick up on foreshadowing for my project on The Adventures of Sherlock Holmes. One pattern that I am noticing across different readings is that having prior familiarity with the text makes patterns much more understandable to researchers. In the case of your book, for example, your understanding of the text enabled you to decode Voyant's emphasis on otherwise meaningless words or phrases. Another pattern that I am picking up on throughout the text is the capacity for data mining techniques to exceed the limits of the human mind. I think that this is an interesting consideration with regard to how much time it would take an individual to finish reading a work. I wonder what patterns data mining would reveal when looking at a large group of texts from the same time period.

  4. I like that you brought up the point that it is equally important to get qualitative and quantitative data when reading any type of work or literature. If you as a reader truly want to get the most out of an analysis and the most significant amount of data, then you need to think about how much data you need, but also the quality of the data. This is why, when reading a book or long article, you aren't supposed to summarize it using the entire text; instead, you need to find the right amount of data to explain all of its key parts. Voyant gives us a jump start on doing so because it provides a broad overview of patterns and main ideas without our even having to go through the entire text. From my text, Northanger Abbey, just from plugging in a chapter I can already figure out the main character of the book and get a general sense of how it's going to be.

  5. I like how your post introduces your own analysis, goes into the book's analysis, and then finishes with a related text. You give a bit of your own opinion, reinforce it with a quote, then repeat that process three times to build a post that is both well-rounded and informative. I agree with your assertion that both quantitative and qualitative data are useful for analyzing both art and media, in the same way that both close and distant reading are important. Close reading and distant reading can be used before undertaking a full quantitative and qualitative analysis, so that you can fully understand a text as a reader before taking a top-down view as an analyzer. I also have Agatha Christie, and I similarly looked at which words were repeated most often to give depth to my analysis.

