Introduction to Digital Humanities

RELI/ENGL 39, Fall 2015, University of the Pacific

Author: Kat

Palladio Featuring Cushman

Screen Shot 2015-10-29 at 2.36.20 PM

With this particular map, the points indicate Genre 1 of the Cushman photographs, and the size is related to whether or not the photo is untitled. It has a satellite tile background, and no specific timeline, so it shows the entirety of the Cushman Collection photographs that we downloaded previously.

Digital Technology and the Improvement of Maps

According to Seed, digital technology has played a very important role in pointing out the differences between original maps and their reprints. First of all, digitizing maps has improved the portability of maps, making it easier to carry a digital image of the original map rather than the maps themselves, which were often printed in large, heavy books on thick, coated paper. Scanners also allowed images to be reproduced at a higher quality, making copies of maps more accurate. The increase in digitization of maps and the better access resulted in the discovery that many printed maps are inaccurate to the original, despite beliefs to the contrary.

Seed says that the printed reproduction of maps can be highly misleading to researchers. Publishers or printers often “touch up” or “improve” the original maps so that they are more aesthetically pleasing, but doing so alters the spatial history of a map. Imaging departments generally treat maps as pictures and attempt to correct the color or contrast, straighten crooked lines, and eliminate bumps or wrinkles before putting them in a book. According to Seed, a map is not a picture or an illustration. It is not meant to be aesthetic; it is meant to convey a specific meaning. While these alterations made by reproduction companies are done with good intentions, they completely change the meaning that the original map was trying to convey. The specific colors an original map uses carry meaning, and its lines indicate spatial relationships, so changing any of these results in a completely different meaning than what was originally intended.

Seed says that when creating or digitizing maps, curators/librarians should be brought into the process of evaluating the digital image of a map rather than being excluded, which can prevent the previously mentioned alterations from happening in the first place. Seed also states that, when a map is altered for publicity purposes, a copyright notice should be placed on the altered image to acknowledge the changes made. This can prevent any mistakes made when researchers use those reproductions as they would the original.

historical-map-of-sicily-bjs-1

This historical map of Sicily (found on Google Images), when evaluated with Seed’s methods, probably isn’t accurate to the original. Both the color and the contrast are very bright and stand out. The lines are all straight, and thus most likely do not convey the actual spatial relationships on the island of Sicily.

Cushman Collection Charts

Who knew charts could actually be fun? With the Cushman Collection, it was pretty cool to upload it to Google Fusion Tables and have it lay out pie charts with pretty colors. I think, however, my favorite one was this:

Screen Shot 2015-10-22 at 2.34.24 PM

It’s super pretty, and looks a bit like a dandelion, not to mention that you can move it around and it looks kind of derpy whenever you drag it. This chart, however, seems to be trying to convey exactly what kind of slide condition occurred on what date the photograph was taken. Since there are far more dates than there are slide conditions, there are many more nodes (the orange ones) that connect back to the slide conditions (the blue). This charted relationship can be important for the photographer or the photographic community because it reveals the commonality of a specific slide condition. All-in-all, it’s pretty interesting what a chart can reveal, even if you don’t have a precise intention when creating it.

Drucker and Digital Visualizations

analytics-marketing-data-technology-ss-1920

So, not only is Drucker’s article about the different important aspects of digital visualizations and such, but the author seems to like to use big words to make it seem more difficult than it probably is. Either way, this article was very difficult to grasp, so, as a tip, don’t read it when you’re already tired.

After reading through all of Drucker’s pretentious vocabulary, I tried to get to the core of this whole data vs. capta thing. Drucker seems to say that it is important for data to be reconfigured into capta so that it can be expressed in a graphic display. Now, what is the difference between the two, you ask? Drucker says that capta is “taken actively,” whereas data is assumed to be known and so can be observed and recorded. Essentially, data is a given, and to reconfigure it into capta, it has to be reproduced using a humanities-driven thought process, thus making it “taken and constructed”.

Drucker not only makes a point about reconstructing data into capta, but she also emphasizes that the representation of knowledge must be acknowledged. Drucker declares that the history of knowledge itself is basically the constantly changing forms of knowledge that we, as humans, have had throughout time. Knowledge has only ever been changed or transformed in the different cultures and times, and so has not been explicitly new, making the representation of knowledge important to what it actually means.

Knowledge representation is key to visualization, mainly because it enables one to see the relationships and patterns between different pieces of information. Knowledge, Drucker says, must be carefully scrutinized and contain theoretical insight in order for it to be used in a graphic display.

Following Drucker’s obscure reasoning and incorporating what Yau said in the chapter of his book, charts and other graphic displays are both knowledge and the representation of knowledge. A graphic display uses set information to exhibit knowledge, but it also shows the relationships and patterns that may emerge upon comparing the pieces of that information. Thus, visualizations of data can reveal new information through the knowledge they already possess and present, and they also allow new interpretations to be gleaned from what they are showing.

Limitations of Digital Archives

 

This week we’ve been talking about digital archives of cultural objects and how they get published online, as well as the reactions they can cause in the cultural groups themselves and other audiences. Digital archives are intended to allow easy access to research materials, but by delving deeper, we can see that there are limitations to what is published online, as well as the issue of what, exactly, gets published, and what it can say about the whole archive.

In Amy Earhart’s article, she talks a lot about how what is included or excluded from an archive can reveal a lot about what the archive is about, whether intentional or not. She gives the example of MONK (Metadata Offer New Knowledge), which combined the contents of several other archives to form an archive that could provide a visual analysis of the literature and documents that came from the American 19th century. However, in doing this, they overlooked the fact that the majority of the documents were not written by people of color, and thus only provided one general perspective of the time period: the predominant view of white culture. Other archives that attempted to address the issue of not having content written by people of color found that many of the texts have been lost. There are several excuses for this, mainly that the digital world was viewed by many of its authors/contributors to be free from the classifications of race, gender, or class, and thus would not address such things. However, this also highlights issues within the digital humanities field, particularly that of selection (of what goes online) and historical structure. Thus, scholars of the digital humanities are attempting to address these issues by, as Smith says, “construct[ing] a digital canon that will weigh content and technological choices equally.”

In Jerome McGann’s article, he also talks about the limitations of online archives, as well as how essential they are. He describes online archives as being enormously helpful to scholars all across the globe, who can access things digitally and thus conduct research more easily. But he also states that there are limitations to these archives, mainly because of the scholarly design of their texts. There are many texts that haven’t been published online because they are hard to obtain, are too costly, or are lost.

Screen Shot 2015-09-30 at 6.14.56 PM

The Real Face of White Australia is an online archive that is intended to show the overlooked immigrants in Australia, and how the majority of the people are not white at all, but of a different race. However, while the archive is intended to show this, it does not have very much information beyond the fact that it is depicting all the overlooked members of “White Australia.” Clicking on the pictures doesn’t yield very much information, so getting to the point of why this specific person was included in the archive is difficult.

 

 

Metadata

 

So… Metadata. Anne Gilliland explains that metadata is, “data about data”, and thus is all the available information about information objects. That probably doesn’t really help explain anything at all. The thing is, though, that metadata doesn’t seem to quite have a set definition, as there are so many different components, types, and aspects of it. But Gilliland breaks metadata down and explains it as having three components: the content, context, and structure of the information object in question.

Prominent examples of metadata in today’s culture include libraries, museums, and archives that use metadata to provide access to their materials, as well as to the context those materials are in, which gives their information value. Libraries, museums, and archives thus use metadata as a means of cataloguing their information objects so that other people can use them for their own research purposes. More specifically, metadata can provide a means of description and resource discovery, not only in libraries, museums, and archives, but in just about anything.

As I said before, there are several different kinds of metadata, which are categorized by their purpose and function. However, all of the different kinds of metadata are unified under a certain set of aspirations and thus functions, which Gilliland states as being: “creation, multiversioning, reuse, and recontextualization of information objects; organization and description; validation; searching and retrieval; utilization and preservation; and disposition,” of all information. Metadata is meant to attain and accumulate knowledge over time, thus expanding our information about all things informative.

However, metadata can be used for more than just contextual and descriptive information. It can be used to identify individual patterns through the information provided, thus supplying a means of infringing on privacy. In Robert Lee Hotz’s Wall Street Journal article, he states that metadata can look at a variety of patterns and identify them with individuals, based on their unique patterns. He gives the example of looking at shoppers’ patterns and how data analysts were able to identify who the shoppers were based on what they bought and looked at, as well as how much time they spent shopping.

So I guess the real question that is elicited from the concept of metadata is, how much information is too much, when it provides the means for invading our privacy?

 

My Love/Hate Relationship with Technology

 

So… as whoever actually reads this has probably guessed, my relationship with technology is a very intense love/hate one. Meaning, I love it, but it hates me with a passion. That’s not for lack of trying, I’ll try very hard to understand technology, but it kind of just slams the door in my face and says, “Yeah. Good luck with that.” So rude. I can understand the basic functions of computers, like, ooh if I hit the “Pages” button on my computer I will be able to type stuff, or hey, if I want the internet I have the choice of Google Chrome or Safari. It makes me feel like I’m as dumb as a box of rocks, but that’s okay, because basic functions are the way to go.

I guess my experience relates to the experiences that Williams and the podcast describe because I didn’t have a lot of access to computers when I was younger. I had one desktop computer that was shared by the whole family, and hey, YOU try pushing your older brother off the computer when you want to use it. The point is, I didn’t have a whole lot of access to mess around with computers, or develop my relationship with them. The podcast and Williams talk about similar experiences in that a lot of disabled people (as Williams says) don’t have computer programs that address their disability and work around it, so that they can still use computers and share and learn new things. The podcast also talks about how a lot of women in the 80s didn’t have a lot of access to computers either, which is why there was a drop in female computer science engineers.

The issues that Williams and the podcast describe are important because, as Williams says, it’s just morally right for everyone to have equal access to computers. Also, it enables the flow of information to continue, unhindered, and for women and people with disabilities to contribute whatever ideas they may have. So, as Williams says, they may contribute an idea that others would never have thought of (like the blind woman who was able to hear and understand things that came out of her speakers at a really fast pace), thus broadening the variety of perspectives and areas of research.

Specifying the Digital Humanities

I think one thing we all learned from our readings last Thursday was that “specifying” the digital humanities isn’t exactly easy. There are multiple reasons for this, of course, not the least of which is that, as an emerging field, all the scholars of digital humanities are having a particularly difficult time deciding whether the basis for digital humanities should be theorizing about it or actually practicing it. There are several other conflicts, including how the “digital” part should interact with the “humanities” part of digital humanities. Many scholars are unsure whether people should be using technology to study the humanities, or if technology should be studied in terms of the humanities. Observing all these disagreements, you can see why it’s pretty difficult to define exactly what the digital humanities is.

I think, perhaps, the most important idea that Spiro introduced for defining the digital humanities is a set of common principles shared among digital humanities scholars. This would unite the interests and possibly the general aims of digital humanities research, even though there are already several divisions regarding what that research should actually be. Spiro strongly implied that collaboration was the most important aspect of digital humanities research, as well as openness (which, I might add, could be depicted as a facet of collaboration), diversity, and experimentation. Altogether, Spiro’s values can be read as establishing a more united identity for the field of digital humanities, despite the multiple conflicts that occur within the parameters of digital humanities work (like the two I mentioned in the first paragraph).

Mark Sample, while his article was much shorter and less repetitive than Spiro’s, also got his point across that, even though the field of digital humanities is already experiencing several divisions about how the research should be conducted, collaboration was key to the main goal of digital humanities, that is: spreading knowledge. Sample states that it doesn’t matter how scholars go about doing their research, as long as they share it in the end.

The digital humanities is particularly beneficial to scholars of all fields because, as Sample says, it enables us to more easily share our research, no matter what it is about. Digital humanities provides the means for the important communication that Spiro describes, as long as we do exactly what we should as digital humanities scholars, that is, collaborate about our findings.

I think, perhaps, the most evident aspect of digital humanities that we, as a class, have already experienced is when we form our groups and communicate our different perspectives on the readings or whatever other topic we’re discussing that day. We collaborate, and thus gain a broader sense of what we are talking about through the different perspectives. How cool is that?

 

Utilizing Voyant in the Digital Humanities

 

Voyant is an extremely useful and clever tool to use in the digital humanities… especially when you’re looking at vocabulary. However, when looking at something other than vocabulary and word frequencies within the content, it’s pretty much useless.

When first entering Voyant, there’s a colorful word cloud that visually depicts how often a word will appear within the dataset, and one can remove the more common words like “the” or “and” by going to the “Stopwords” options. Then the more interesting words, the words that are more able to show the point of the content, appear.

Screen Shot 2015-09-02 at 12.54.24 PM

Clearly, these words are much more interesting than “the” and “and”. Not to say those words are unimportant or anything, but, well… you get the point. Looking at the word cloud as a whole, it seems like the content of the dataset is really interesting, I mean, look at all those cool words: “death,” “tortures,” “shall,” “martyrdom”, et cetera, et cetera. This obviously implies that the content is a lot more complex than “and” or “the” would entail.
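The counting behind a word cloud with stopwords removed can be sketched in a few lines of Python. This is only a rough illustration, not how Voyant itself works: the stopword list, the function name `word_frequencies`, and the sample sentence are all made up for the example.

```python
import re
from collections import Counter

# Hypothetical, very short stopword list (an assumption for this sketch;
# a real tool would use a much longer one).
STOPWORDS = {"the", "and", "of", "to", "a", "in"}

def word_frequencies(text, stopwords=STOPWORDS):
    """Count lowercase words in text, skipping anything in stopwords."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(w for w in words if w not in stopwords)

# Made-up sample sentence standing in for the real dataset.
sample = "The martyrdom and the death of the martyrs, and the tortures of death."
freqs = word_frequencies(sample)

# Print the words from most to least frequent.
for word, count in freqs.most_common():
    print(word, count)
```

Once “the”, “and”, and “of” are filtered out, only the more telling words are left to be counted, which is the same idea as turning on the “Stopwords” option.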

 

 

 

Moving on from the word cloud (difficult, right? There’s so many pretty colors), one can see that there are a lot more tools that can be utilized in examining the vocabulary content of the dataset. The summary shows how many documents are in the dataset, the longest and shortest of those documents, the highest vocabulary densities in the whole set, and the frequencies of the words. The corpus reader, just to the right of the word cloud and summary, shows the content of the dataset in its natural form, along with certain words that you can select to be highlighted.

Now, possibly the neatest thing about Voyant is the “Words in the Entire Corpus” tool, as it shows you the most common word frequencies (which can also be filtered by using the “stopwords” option), and allows you to compare certain word frequencies throughout the dataset.

Screen Shot 2015-09-02 at 12.54.50 PM

Here, I compared the words “men” and “beasts”, just because they seem pretty opposite in definition, and it’d be neat to see how many times they’re used in the same document. What I found was that there was always a notable difference in the word frequencies of each document (besides documents 10) scilitan and 12) readme, in which neither word appears at all). While “men” would be used multiple times within a document, “beasts” would appear quite infrequently, if at all, and if “beasts” was used generously in the document, “men” would seldom appear.
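A comparison like this one, counting two words document by document, can be roughly sketched in Python. The function name `term_counts_by_document`, the document names, and the sample texts are hypothetical stand-ins, not the actual dataset or Voyant's own method.

```python
import re

def term_counts_by_document(documents, terms):
    """Return {document name: {term: count}} for each term of interest."""
    table = {}
    for name, text in documents.items():
        # Split into lowercase whole words so "them" is not counted as "men".
        words = re.findall(r"[a-z']+", text.lower())
        table[name] = {term: words.count(term) for term in terms}
    return table

# Made-up documents standing in for the real corpus.
documents = {
    "doc_a": "The men spoke, and the men were led away.",
    "doc_b": "The beasts were loosed; the beasts did not touch them.",
    "doc_c": "Nothing relevant here.",
}

counts = term_counts_by_document(documents, ["men", "beasts"])
for name, row in counts.items():
    print(name, row)
```

Each row shows how often the two words turn up in one document, which is the same side-by-side view the frequency tool gives.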

Interesting, right? It kind of makes you wonder what these words were being used for. And that’s exactly the problem with Voyant.

 

It’s undeniable that Voyant has its uses, but it doesn’t quite have a knack for finding the context in which a word will appear, without searching through the whole “Corpus Reader” tool to find it. You just don’t know if, in the documents provided, men are being called beasts instead of men, or if they really are alluding to men. Sure, the Corpus Reader can help with that, but it can be pretty tedious to have to search through the whole thing for two words that repeat over one hundred times to see how they are used in context. So it really does seem like you would have to read just to see how a word is used instead of clicking on the nice, pretty words provided in the “Cirrus” tool to find out just what the texts are about.
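The context check described above, seeing a word together with its neighbors instead of reading the whole text, is often called a keyword-in-context listing. Here is a minimal Python sketch of the idea, done outside of Voyant; the function name `keyword_in_context`, the `window` parameter, and the sample sentence are assumptions made up for illustration.

```python
import re

def keyword_in_context(text, keyword, window=4):
    """Return each occurrence of keyword with `window` words on either side."""
    words = re.findall(r"[\w']+", text)
    hits = []
    for i, word in enumerate(words):
        if word.lower() == keyword.lower():
            left = " ".join(words[max(0, i - window):i])
            right = " ".join(words[i + 1:i + 1 + window])
            # Mark the keyword with brackets so it stands out in the line.
            hits.append(f"{left} [{word}] {right}".strip())
    return hits

# Made-up sentence standing in for the real documents.
sample = ("They were thrown to the beasts, but the men who watched "
          "were no better than beasts themselves.")

for line in keyword_in_context(sample, "beasts", window=3):
    print(line)
```

With a listing like this, it would be easier to tell whether “beasts” means actual animals or is being used to describe men, without scrolling through every document.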

Test Post