Metadata & Databases

- February 07, 2024

Metadata is structured information about a digital resource or asset. It is usually organized into fields such as creator, date, place, medium, and format. Digital resources need two levels of metadata: information about the original source (a film, an artwork, etc.) and information about the digital file itself (who made it, when, in what format, etc.) (Drucker, 52). Metadata is most often descriptive, but it can also be administrative or operational. One common example of metadata is a library record.

Metadata can be...

Descriptive – for example, the "Get Info" section of a file. Systems of naming, identifying, and describing objects are called metadata schemes.
Administrative – organizing data in large sets of records according to their use or type.
Operational – giving data a role or task, such as the relation of one file to another (Drucker, 52).

To put the importance of metadata into perspective, think of weather data. How valuable would that data be if we didn't know where it came from? We wouldn't know who it applied to.

Returning to metadata schemes, the ones referred to as "resource description" follow the Dublin Core standard. Dublin Core is a good place to begin a humanities project (Drucker, 55).
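To make this more concrete, here is a minimal Python sketch of what a Dublin Core style record could look like. The fifteen element names are the ones defined by the Dublin Core Metadata Element Set; the record itself, its values, and the helper function `unknown_fields` are invented for illustration and do not come from Drucker or from any real collection.

```python
# The fifteen elements of the Dublin Core Metadata Element Set.
DUBLIN_CORE_ELEMENTS = {
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
}

# Hypothetical record for a digitized magazine cover (all values are made up).
# "source" describes the original object; "format" describes the digital file.
record = {
    "title": "Magazine cover, spring issue",
    "creator": "Unknown illustrator",
    "date": "1925",
    "type": "Image",
    "format": "image/jpeg",
    "source": "Print magazine held in a library archive",
}


def unknown_fields(rec):
    """Return the field names in rec that are not Dublin Core elements."""
    return sorted(set(rec) - DUBLIN_CORE_ELEMENTS)


print(unknown_fields(record))
```

A check like this only confirms that the field names belong to the standard. It says nothing about whether the values are accurate, which still depends on knowing where the data came from.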

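Markup languages are sets of codes (tags) embedded in a digital text to describe its structure and meaning. Examples of markup languages include HTML (previously discussed) and SGML (Standard Generalized Markup Language). JSON (JavaScript Object Notation) often comes up alongside them, but strictly speaking it is a data-interchange format rather than a markup language.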

Many markup languages are used in the humanities, including XML (Extensible Markup Language), TEI (Text Encoding Initiative), and KML (Keyhole Markup Language).
XML is the standard format for producing semantic markup, which is the use of a markup language to convey information about the meaning of each element in a document (Drucker, 62).
Whether to use a markup language, and which one, depends on the nature of the project.
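As a rough illustration of what semantic markup makes possible, the sketch below parses a tiny XML snippet with Python's standard `xml.etree.ElementTree` module. The element names `persName`, `placeName`, and `date` (with its `when` attribute) are borrowed from TEI, but the snippet is not a valid TEI document, and the person, place, and date in it are invented.

```python
import xml.etree.ElementTree as ET

# Invented sample text. The tags say what each phrase *is*,
# not how it should look on screen.
SAMPLE = """
<letter>
  <persName>Jane Doe</persName> wrote from <placeName>Boston</placeName>
  on <date when="1900-01-15">the fifteenth of January</date>.
</letter>
"""

root = ET.fromstring(SAMPLE)

# Because the meaning is tagged, names, places, and dates can be
# pulled out without guessing from the wording of the sentence.
people = [el.text for el in root.iter("persName")]
places = [el.text for el in root.iter("placeName")]
dates = [el.get("when") for el in root.iter("date")]

print(people, places, dates)
```

This is the kind of tagging that makes a marked-up text searchable by person or place rather than only by exact words.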

Now we move on to databases...

Databases are organized collections of structured information, or data (Oracle)

Databases come in several forms – flat, hierarchical, graph, etc.

A relational database is composed of multiple tables of information that are connected to each other (Drucker, 71).

Benefits of a relational database – reduce redundancy, increase efficiency, decrease errors
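Here is a small sketch of those benefits using Python's built-in `sqlite3` module. The table names (`creators`, `works`), the column names, and the sample rows are all hypothetical. The point is only that each creator's name is stored once and linked to works by an id, so a correction is made in one place.

```python
import sqlite3

# An in-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE creators (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE works (
    id         INTEGER PRIMARY KEY,
    title      TEXT NOT NULL,
    creator_id INTEGER NOT NULL REFERENCES creators(id)
);
INSERT INTO creators (id, name) VALUES (1, 'Jane Doe');
INSERT INTO works (title, creator_id) VALUES ('Cover A', 1), ('Cover B', 1);
""")


def works_with_creators(connection):
    """Join the two tables so each work is listed with its creator's name."""
    cursor = connection.execute(
        "SELECT works.title, creators.name "
        "FROM works JOIN creators ON creators.id = works.creator_id "
        "ORDER BY works.title"
    )
    return cursor.fetchall()


# Fix the creator's name once; every linked work picks up the change.
conn.execute("UPDATE creators SET name = 'Jane Q. Doe' WHERE id = 1")
rows = works_with_creators(conn)
print(rows)
```

In a single flat table the name would be repeated on every row, and a correction would have to be made on each one, which is where redundancy and errors creep in.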

Object-oriented databases work with complex data objects (MongoDB), and they combine operations and entities in their design
Principles of database management – modularity, data modeling, and relations

These chapters helped me understand some of the struggles that digital humanists face in organizing and identifying data for their research projects. They included many terms and technologies that I had either never heard of or didn't understand. I don't think that learning how to code is an integral part of curating digital humanities projects, but I do think it's important to know where your data came from and how it was made. An ongoing theme in the book is that data are never neutral and are always made from some angle. Even measurement scales like Fahrenheit and Celsius are manmade.

When I clicked "inspect" on the Robots Reading Vogue project, I noticed that it was made using HTML, the most common markup language on the web.

There isn't much information on the website about how the data was processed, apart from the fact that all materials were gathered from the Vogue archives. I will have to look into it.

Comments

Eve Huot, February 10, 2024 at 11:12 AM
Your explanations of metadata and databases were clear and helped me understand the concepts further. I liked the example you used about weather data, as it demonstrates how important metadata is. Chapter four helped me understand my project better because it explained markup languages. The Women Writers Project uses a schema that is a customized version of TEI guidelines. When I first read this section of the website, I was confused about what TEI meant, but now I understand that it is a markup language that makes tagging, interpreting and researching the content easier.
Dr. M, February 12, 2024 at 2:41 PM
Data are never neutral! Once again, great breakdown! The Robots webpages are certainly made with html, but it would be worth digging into the visual analytics.