Introduction to Digital Humanities

RELI/ENGL 39, Fall 2015, University of the Pacific

Author: Under the Ash Tree

Navigating Networks with Palladio and Google Fusion Tables

So, how ’bout them networks, huh? They’re a pretty intriguing way to look at data sets, if I do say so myself. Unlike with a regular old chart or graph, networks allow you to explore different connections between “types” of data, such as the example of books and authors Scott Weingart gives in his article, Demystifying Networks. He breaks down the complexities of networks into very simple terms, explaining their benefits through, essentially, relations of stuff. In the case of the books example, books and authors are both “stuff,” and a network would show the relations between them through a fancy display of nodes and edges. And while Weingart warns that networks shouldn’t be used for everything, they can definitely come in handy for a lot of data, particularly sets that only deal with one or two types of “stuff.”
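
For anyone who wants to see the “stuff and relations” idea spelled out, here’s a minimal sketch in Python. The book and author names are made up for illustration (they aren’t from Weingart’s article), and this is only one simple way to store a network, not the way Palladio or Google Fusion Tables does it.

```python
from collections import defaultdict

# Hypothetical (book, author) pairs, invented purely for illustration.
edges = [
    ("Book A", "Author X"),
    ("Book B", "Author X"),
    ("Book B", "Author Y"),
    ("Book C", "Author Z"),
]

# Two types of "stuff": every distinct book and every distinct author is a node.
books = {book for book, _ in edges}
authors = {author for _, author in edges}

# Each pair is an edge; record which nodes every node touches.
neighbors = defaultdict(set)
for book, author in edges:
    neighbors[book].add(author)
    neighbors[author].add(book)

# Degree = how many edges touch a node. Tools often use it to size nodes.
degree = {node: len(adjacent) for node, adjacent in neighbors.items()}

for node in sorted(degree):
    print(node, degree[node])
```

With only two types of “stuff,” every edge runs from a book to an author, which is exactly the kind of data Weingart says networks handle well.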

Of course, perhaps my favorite thing about networks is that they’re mobile. You can move around the different nodes, make them bigger or smaller, change their colors, and each new thing you do to them gives you a new perspective on the data you’re networking. Let’s take a look at some, shall we?

[Embedded network graph from Google Fusion Tables]

Now, the first network I’ve embedded here is from Google Fusion Tables. It was made using a sample data set taken from Palladio, a free site that allows you to create maps and networks at your leisure. The data set displays a list of influential people who went to Monaco at some point in their lives, along with several attributes about them – their dates of birth and death, birthplaces, etc. In my little network here (which you, humble reader, can drag around if you so desire), I’ve chosen to display connections between people’s places of birth and places of death. The colors of the nodes differentiate between the two types of data – yellow is for birthplace, blue is for death place. And Google Fusion Tables has kindly sized the nodes for me, so it’s easier to see which birth and death places were common among these people (the biggest nodes being the places that came up the most, like, unsurprisingly, Monaco). I also chose to make the links between nodes directional, giving them cute little arrows so you can get a better sense of how these connections flow. The wider arrows, like the bigger nodes, show a greater number of connections between nodes than the thinner arrows do.
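
Here’s a rough sketch of the counting that could sit behind a network like this one. The rows are hypothetical (they are not the Palladio sample data), and I’m assuming that arrow width follows the number of people sharing a birthplace-to-death-place pair and that node size follows how often a place appears. I don’t know how Google Fusion Tables actually computes its sizes.

```python
from collections import Counter

# Hypothetical rows of (name, birthplace, death place), invented for illustration.
people = [
    ("Person 1", "Paris", "Monaco"),
    ("Person 2", "Monaco", "Monaco"),
    ("Person 3", "Paris", "Monaco"),
    ("Person 4", "London", "Paris"),
]

# Directed edges run from birthplace to death place.
# The count for each pair is the assumed arrow width.
edge_weight = Counter((birth, death) for _, birth, death in people)

# Assumed node sizes: how often a place shows up in each role.
# Birthplace and death place nodes are kept separate, like the two colors.
birth_size = Counter(birth for _, birth, _ in people)
death_size = Counter(death for _, _, death in people)

for (birth, death), weight in edge_weight.most_common():
    print(f"{birth} -> {death}: {weight}")
```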

Moving away from Google Fusion Tables, I have a few networks I created in Palladio as well. These came from a different data set – a spreadsheet that includes the names and relationships of people who helped one another in a historical documentation of the Holocaust. (The data was collected by Marten Durer and can be found here.)

[Image: Palladio network, givers to receivers]

This network up above (though a little blurry, I apologize) shows the web of connections between people in Durer’s list who both gave and received help from each other. I chose to highlight the recipients of help in a darker gray, and to size the nodes so that it was clear just how much help each one received from other people. Rita and Ralph Neumann were obviously the biggest recipients out of everyone, but they were not the only ones; they also gave help to several different people, as the network demonstrates with, say, their middle-ground connection to an Ausweis Nazi. It’s interesting to see the relations of aid that occurred between all these people, because even though some may have never met, or didn’t help each other out, they’re still connected through their helping of someone like Rita or Ralph.

[Image: Palladio network, form of help to receivers]

Now, this last network is a little different. Instead of focusing on givers and receivers, I decided to look strictly at receivers, and the types of help they received. Durer organized his data on types of help into a numeric system, hence all the numbers in this network instead of word labels. Though I can’t actually tell you what each number means, it is interesting to see how this kind of network differs from one dealing with two different lists of people. For one thing, the number of names on this list is a lot smaller. With all the givers gone, it’s easier to see how few people actually got some kind of help out of this data set. The dark gray nodes are still the receivers, but the light gray ones are types of help this time, and they’re much more varied in size than the giver nodes were. This allows us to see what kinds of help were most commonly received. The connections between receivers are established through kinds of help here, instead of through who gave help to them. So while it shows us nothing about human relations, it provides a different view of what actually happened when givers offered help to the recipients. (Who knows what happened with poor Herald up in no-man’s-land there. Apparently he needed his own special brand of help.)
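
To make the “connected through kinds of help” idea concrete, here’s a small sketch. The receiver names and help codes are hypothetical stand-ins (not values from Durer’s spreadsheet), and the grouping is just one simple way to read a two-type network, not what Palladio does internally.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical (receiver, help code) rows, invented for illustration.
rows = [
    ("Receiver A", 1),
    ("Receiver B", 1),
    ("Receiver B", 2),
    ("Receiver C", 2),
    ("Receiver D", 7),
]

# Group receivers under each numeric help code.
receivers_by_help = defaultdict(set)
for receiver, code in rows:
    receivers_by_help[code].add(receiver)

# A help node's size: how many receivers got that kind of help.
help_size = {code: len(group) for code, group in receivers_by_help.items()}

# Two receivers are linked if they share at least one kind of help.
shared = set()
for group in receivers_by_help.values():
    for a, b in combinations(sorted(group), 2):
        shared.add((a, b))

# Receivers whose help code nobody else has end up off on their own.
linked = {name for pair in shared for name in pair}
isolated = {receiver for receiver, _ in rows} - linked

print(help_size)
print(sorted(shared))
print(sorted(isolated))
```

In this toy data, “Receiver D” plays the part of the lonely node: its help code belongs to nobody else, so nothing ties it to the rest of the network.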

Looking at these different kinds of connections between people reminds me a lot of Kieran Healy’s article, Using Metadata to Find Paul Revere. Healy took a simple set of metadata with the names of colonial U.S. people like, of course, Paul Revere, and the different organizations those people were a part of, and used the metadata to create various connections between those people. One data set showed how the organizations were connected by how many members each had, how the people were connected by their organizations, and several other snippets of interesting info. By creating a network of the metadata, Healy was able to pinpoint Paul Revere as one of the centers of all this activity, for he was connected to a large number of people and organizations. What we’ve been able to do with Palladio and Google Fusion Tables is similar. Particularly with my Palladio examples, we can see how almost any metadata can be rearranged and viewed from different angles, in order to discover and display different relationships between people, places, and other quantifiable things.

Maps and their Mesmerizing Meanings

So for this blog post, I delved into two different digital map making tools in order to explore what kinds of things can be learned from maps: Palladio and Google Fusion Tables. The data set that acted as the centerpiece for my maps was the Cushman Collection, a spreadsheet archive of photographs from around the U.S., dating from 1938 to 1969 and taken by Charles W. Cushman. (The website where the photographs originally came from can be found here.) In creating different maps of this data, I was able to discover various things about the photographs, particularly some of the most recurrent locations where they were taken.

[Image: Palladio map 1]

This map at the left was the first I made in Palladio. It’s rather difficult to see, but each pink dot on the map represents a place in the U.S. where a particular photograph – or a group of photographs – was taken. Obviously some areas have more densely packed dots than others, like the West Coast. These dots act as markers of the places Cushman traveled and documented with his camera. Behind each little spot of pink is a bit of spatial history, specific to the year any given photograph was taken.

[Image: Palladio map 2]

The second map I made in Palladio was a little different. I made a slight change to the way my lovely pink dots were displayed, adding the “size points” feature so that areas with more photographs documented had larger dots. This gives a much more dramatic visualization of how many pictures were taken in places like California, versus how few came from places like the Northern states. Here we’re provided with a means of comparing the spatial history that these photographs show, and we can easily see how much these pictures might tell us about certain parts of the country over others.

[Image: Google Fusion Tables map 1]

Now then, moving on to Google Fusion Tables. The first map I created with this tool was very similar to my first Palladio map. I wanted to see what a simple layout of picture locations might look like between the two different tools. Instead of dots here, I used little location markers, like the ones you would see on a Mapquest search, but the principle is the same. In this case, each marker on the map stands for an individual photo, rather than some acting as representations of many photos. Clicking on one marker would tell me exactly what photo it represented, and the metadata that went along with that photo. We can still see here that some locations in the U.S. have a much more dense collection of pictures than others, but the impact seems greater when Google Fusion Tables provides so many more markers than Palladio does.

[Image: Google Fusion Tables map 2]

Out of curiosity, I tried creating a different kind of map with Google Fusion Tables than my previous three. It’s called a “heat map,” and from what I gather, it’s supposed to represent the densest parts of a data set on the map. Again, the image I took of my map is rather small, but there’s a very obvious red and yellow circle around the northeastern part of the country, meaning that a vast majority of the Cushman photographs probably come from there. Other places on the map have lighter green splotches on them, showing that there are indeed many photographs from there, but not nearly as many as in that big circle. While this map does give a better idea of the hot spots in the U.S. for these photographs, it does a disservice to the data as a whole, because it leaves out locations that didn’t have enough photos to be included. The Northern states and Texas aren’t included at all here, when it was clear that there were photos from these places on the other maps. So this kind of map eliminates some of the important spatial history of these photographs that the other maps represent more accurately.
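
Here’s a simplified sketch of why a density view can drop sparse places. It bins points into grid cells and only “draws” cells that pass a cutoff. The coordinates, the one-degree cell size, and the cutoff are all assumptions I made up for illustration. I don’t know the actual method Google Fusion Tables uses for its heat maps, so treat this as a stand-in, not a description of the tool.

```python
import math
from collections import Counter

# Hypothetical (latitude, longitude) points, not real Cushman photo locations.
points = [
    (40.7, -74.1),
    (40.9, -74.2),
    (40.1, -74.8),
    (40.5, -74.5),
    (31.2, -97.6),
    (47.4, -100.3),
]

CELL_DEGREES = 1.0  # assumed grid cell size
MIN_POINTS = 2      # assumed cutoff below which a cell is not drawn


def cell_of(lat, lon):
    """Return the grid cell (row, column) that contains a point."""
    return (math.floor(lat / CELL_DEGREES), math.floor(lon / CELL_DEGREES))


# Count how many points land in each cell.
density = Counter(cell_of(lat, lon) for lat, lon in points)

# Dense cells get drawn; sparse cells vanish even though they hold data.
shown = {cell: n for cell, n in density.items() if n >= MIN_POINTS}
hidden = {cell: n for cell, n in density.items() if n < MIN_POINTS}

print("shown:", shown)
print("hidden:", hidden)
```

The two lone points still exist in the data; they just never make it onto the picture, which is the same kind of loss I noticed with the Northern states and Texas.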

Observing these different maps, and the perks and drawbacks to each, reminds me of the article we read for class by Patricia Seed. In her writing, she drives home the point that maps are more than just pictures; they’re visualizations for conveying meaning. That comes from the spatial history of maps – from being able to see things that have occurred in the past, connections between different parts of the world, and pretty much anything else under the sun. To treat maps as mere pictures is to lose the most important element of them. But in order to convey the appropriate meaning, a map has to be suited for the job. Some maps, like the heat map I showed above, or maps that have been tampered with upon going digital, don’t display the spatial history of their data like they’re supposed to. And that can sometimes leave out the most important details of the data. Maps must be made and treated with respect and care, otherwise the stories they tell may be lost.

Google Fusion Charts!

I think after throwing this Cushman Collection of photos into Google Fusion Tables, I have sufficiently learned that I should not have the power of data displays in my hands. I have way too much fun playing around with them. Particularly this one here:

[Embedded network graph from Google Fusion Tables]

Network graphs though. They’re like jellyfish turned into graphs. And I even figured out how to embed it so everyone can play with it. In the case of this little graph, I was comparing the “Genre 1” and “Genre 2” categories of the Cushman photographs. I must confess, I’m not entirely sure how network graphs work, but I think what can be gleaned from this is that the bigger “nodes” are the genres that show up more often, and the lines between different nodes show genres that’re connected to each other. This is a pretty handy and fun way to figure out which genres the makers of this collection use the most, and what correlations there are between the genres.

I also decided to make a couple pie graphs about the genres, complete with Comic Sans font because I’m that obnoxious:

[Embedded pie chart: Genre 1]

[Embedded pie chart: Genre 2]

The first pie chart is for Genre 1, and the second is for Genre 2. I knocked the Genre 2 chart down from twenty slices to ten, because there weren’t enough categories in Genre 2 to have that many slices. These pie charts are a little different from the network graph, because although they have prettier colors, they don’t show the connections between these photographs’ different genres. They do make it much easier to see which genres are used the most often, and to compare their frequencies to each other, but Genre 1 and 2 remain very distinct categories with the pie charts. Clearly, the way you choose to display data is significant, because different kinds of graphs and charts can reveal very different things about data.
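
For a sense of what a slice limit does, here’s a minimal sketch. The genre labels are hypothetical (not the real Cushman categories), `pie_slices` is a helper I made up, and lumping the leftovers into an “Other” slice is my own assumption about how a slice cap might be handled, not a claim about Google Fusion Tables.

```python
from collections import Counter

# Hypothetical "Genre 2" values, invented for illustration.
genre_2 = [
    "Portraits", "Landscapes", "Portraits", "Architecture",
    "Landscapes", "Portraits", "Street scenes",
]


def pie_slices(values, max_slices):
    """Return (label, share) pairs for the most common categories.

    Anything past max_slices is lumped into one extra "Other" slice.
    """
    counts = Counter(values)
    total = sum(counts.values())
    top = counts.most_common(max_slices)
    slices = [(label, n / total) for label, n in top]
    leftover = total - sum(n for _, n in top)
    if leftover:
        slices.append(("Other", leftover / total))
    return slices


# With fewer categories than the cap, every category gets its own slice.
for label, share in pie_slices(genre_2, max_slices=10):
    print(f"{label}: {share:.0%}")
```

Notice that nothing in the output says which genres appear together on the same photograph. That is the part only the network graph shows.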

Observing the Inner Workings of Omeka

Well, if there’s one thing you don’t realize about the Internet before you actually do it yourself, it’s that making a website is a lot more challenging than it looks. And it already looks pretty challenging to start with. Working through Omeka to create our Digital Humanities class site eased the process a little, but it was still quite an uphill climb to get to our finished exhibits.

For starters, searching for the right information to include in an exhibit like ours is hard. I’d imagine ease of access would vary by exhibit topic, but for a website about Perpetua and Felicitas, I found it a lot more difficult than I thought it would be to scrounge up items for display. It seemed that every time I found an object that I wanted to use, someone else had put it up already, or it was under a license that kept it from being shared. There was one painting in particular that was being very stubborn, because it kept cropping up everywhere on Google, but I couldn’t find an original source for it, and so I couldn’t add it to the exhibit, even though I really liked it.
The mosaic at the left here was something I eventually did find and could use, but that was after sifting through mountains of other pictures and items. And then, even once I had some good content to share, there was so much information that had to come with it. I hadn’t anticipated how much detail would go into the metadata of our items. Half of the info boxes we filled out for each item were things that wouldn’t have even crossed my mind otherwise, much less have been put in the metadata if Omeka hadn’t pointed them out to me. In the scheme of things, our exhibits probably aren’t as extensive as a lot of other similar websites might be, but the work we put into them was still way more involved than I would’ve imagined. Sorting and classifying all those items and bits of metadata was pretty tricky. But it also made our items a lot easier to navigate in the end. So despite how much effort goes into making collections and figuring out how things should be grouped, classification adds a lot more coherence to jumbles of information.

And I feel like that’s something Omeka does really well as a tool. It provides a smorgasbord of ways to organize whatever data you want to throw at it. The structure of items, collections, and exhibits gives it a unique hierarchy, too, with each rung of the ladder allowing you to do different things with information. One Omeka website can show you a hundred ways to read the same images. You can zero in on a single item, or explore a broader topic with a full exhibit. The sky’s the limit, really, and that’s an advantage Omeka has over a tool like WordPress. This post I’m writing now is pretty much the epitome of what WordPress can do. It lets you blog. You can organize things by tags or categories, if you want, but it doesn’t give you the same complex kind of organizing that Omeka does.

Of course, that can be a place where Omeka falls short, too. I’ve already rambled a bit about how complicated using Omeka can be, especially if you’ve never done it before. So while it does give you the chance to expand upon and organize your information pretty much however you want, it has a much more complicated interface than WordPress. There’s a lot more that goes into an Omeka exhibit than a WordPress blog, and I think which one you use depends on what you intend for your own website.

On a slightly different tangent, throughout the process of our class building the Perpetua and Felicitas exhibits, I couldn’t help but be reminded of the blog post we read by Melissa Terras. I feel like I can sympathize with her on a deeper level than I could before. Her whole post was about how difficult it can be to find cultural information that’s available for sharing, and now that I’ve gone through that first-hand, her arguments seem a bit more justified. Though I’m still not quite on board with the idea of making everything accessible to share and reuse, I feel like Terras was right in calling out online museums and other sites on their lack of helpful resources. Their interfaces, while not extremely challenging, can be a bit frustrating to work with, especially if you don’t know exactly what you want from them. And it seems that so little is actually available on websites like that of the Metropolitan Museum. Many items in the online museum archives didn’t even have images attached to them, and some were under licenses that kept them from being shared or reused. I agree with Terras when she says that the rights to use or not use something need to be clearer, and more accessible, because navigating those museums for things related to Perpetua and Felicitas was a huge pain in the butt. If we’re going to go to the effort to share some cultural content online (emphasis on some, because again, not all of it should be shared), then the least the providers could do is make it easy to access.

Overall, I feel like the takeaway from our experience with Omeka is that being able to share, organize, classify, and analyze online content is an invaluable pursuit. Though I can’t say I would personally use Omeka again unless it was required, since it gave me a headache at some points, once we had everything pieced together, it was pretty cool being able to look at all the information we’d scrounged up. If you’ve got the patience and ambition to work with it, then Omeka has the potential to make some really awesome online exhibitions.