Blog Post #5: Data Mining & Quantifying Literature

- February 25, 2024

Applying data mining to literature and art turns up patterns that are hard to see through reading alone. Our coursebook defines data mining as an automated method of analysis that searches digital files for patterns and extracts meaningful information. One significant advantage is that it can work with cultural materials in their original formats, including texts, images, and media, which gives rise to the concept of "distant reading." Distant reading shifts the focus from close examination of individual texts to broader patterns, trends, and statistical information.

Quantitative data from literature makes it possible to compare different works, authors, or literary traditions. It can identify an author's distinctive style, from vocabulary to recurring motifs. That in turn can help attribute a piece to its writer and show how writers or writing styles influence one another, whether individually or through literary movements. For example, an author might use a certain word or phrase to convey a theme, and the way that word or phrase is used can change over time. By finding what is similar and different across types of writing, scholars can trace how writing styles change throughout history.

Tools like Voyant support this exploration by showing recurring themes, patterns in how language is used, and stylistic elements across large collections of texts. By measuring these things, researchers can better understand the main themes of literary works and see bigger patterns in certain types of writing or time periods. Voyant offers several ways to view these patterns, such as bubblelines, links, and trend graphs. I think it is a good idea to embrace new approaches to analyzing literature and art. Combining quantitative results with close analysis helps us understand texts better and gain deeper insight into complicated literary subjects.
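
To make the idea of a trend graph more concrete, here is a minimal Python sketch of how a word's relative frequency could be tracked across a text. This is my own illustration, not Voyant's actual code: the function name term_trend, the default of four segments, and the simple letters-only tokenizer are all assumptions.

```python
import re

def term_trend(text, term, segments=4):
    """Relative frequency of `term` in each consecutive slice of `text`.

    Hypothetical helper for illustration; it is not part of Voyant.
    """
    # Simplistic tokenizer (an assumption): lowercase runs of letters only.
    words = re.findall(r"[a-z]+", text.lower())
    if not words:
        return []
    # Ceiling division, so the text is cut into at most `segments` slices.
    size = -(-len(words) // segments)
    trend = []
    for start in range(0, len(words), size):
        chunk = words[start:start + size]
        # Share of this slice's words that match the term.
        trend.append(chunk.count(term.lower()) / len(chunk))
    return trend

# Made-up sample sentence, not a quotation from any novel.
sample = "young man young man old man old man"
print(term_trend(sample, "young", segments=2))
```

Plotting the returned list as a line would give a rough picture of where a word clusters in a text, which is the kind of pattern a trends view is meant to surface.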

In my own exploration with Voyant, I entered Chapter 3 of Virginia Woolf’s novel Jacob’s Room. The analysis showed that the most frequently used words in the chapter were "young," "said," "Jacob," "men," and "man." This suggests that the novel revolves around a young man named Jacob, and the frequency of "said" points to a narrative that leans more on conversation than on characters' inner thoughts.
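
A word-frequency count like this one can also be approximated in a few lines of Python. The sketch below is only my approximation of the idea: the function name top_words and the tiny stopword list are assumptions (Voyant's own stopword list is much longer), and the sample text is made up rather than taken from the novel.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, far shorter than Voyant's).
STOPWORDS = {
    "the", "a", "an", "and", "of", "to", "in", "on", "at", "as", "for",
    "with", "was", "is", "had", "but", "that", "he", "she", "it", "his",
    "her", "i", "you",
}

def top_words(text, n=5):
    """Return the n most frequent non-stopword words with their counts.

    Hypothetical helper for illustration; it is not part of Voyant.
    """
    # Keep words with an optional inner apostrophe (straight or curly).
    words = re.findall(r"[a-z]+(?:['’][a-z]+)?", text.lower())
    counts = Counter(w for w in words if w not in STOPWORDS)
    return counts.most_common(n)

# Made-up sample text, not a quotation from Jacob's Room.
sample = "The young man said the young men said hello"
print(top_words(sample, 2))
```

Filtering out stopwords matters here: without it, words like "the" and "and" would crowd out the content words that hint at a chapter's themes.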

- Pat

Comments

Lily McDonough, February 25, 2024 at 12:09 PM
I enjoyed your definition of distant reading and how you connected it to the topic of quantitative data. In my eyes, distant reading is a skill used to identify quantitative data. I also enjoyed how you connected Voyant to your post and showed examples of data mining. I am also impressed with your interpretation of "Jacob's Room"; I am still struggling to find meaning in my novel based on the tools in Voyant. Well done!
Dr. M, February 26, 2024 at 3:17 PM
I do think "said" is significant, and it's interesting that "young" is such a frequent word. Is that because he is called "young Jacob" throughout the text? Combined with the recurrence of "man" and "men," you might infer it's a coming-of-age story... (But I have not read this text!)
