Voyant is an interesting piece of software that, technically speaking, works very poorly as a web-based tool. I find its uses and applications interesting and vast, yet the volume of data it processes is so large that when I was playing with particular tools, specifically the “sunburst” tool, my browser would stop responding and I’d have to start the whole process anew. Relating this back to modularity: I would find Voyant more useful as a downloadable standalone application because, in my experience, web-based tools are far more susceptible to failures like the one I experienced, and locally hosted applications don’t react in such an “if I’m going down I’m taking you with me” sort of manner. When one program fails without taking down everything else I’m doing (my email, this blog post, and my research), the end user is less frustrated. If I had to restart my entire computer every time Voyant crashed, I would lose efficiency and be more frustrated. Additionally, in a standalone application all of the tools would function without plug-ins. I was unable to get “lava” or “mandala” to work because I was missing some unspecified web plug-in. The picture looked like I was missing something from Adobe Flash, but looking at my available in-browser plug-ins, I’m not missing anything crucial, and I don’t expect Voyant to use pointedly specific plug-ins without telling the user what they are.


 

Alt text is cool right?

I would frame this and put it on my wall. Maybe give a poster of it to a favorite high school Lit teacher.

Now, moving on to why Voy-aunt (which rhymes with savant, not buoyant) is a particularly interesting tool to use in humanities research. I used the Shakespeare texts and found myself playing with the visual aspects of the program, such as “bubblelines,” which visualizes the words you input in a very nice, almost artwork-like fashion. What caught my eye was that, out of the words “good,” “shall,” “lord,” “come,” “sir,” and “love,” Shakespeare’s Comedy of Errors uses only the word “sir” throughout its text, which sets it apart as the singular, though still visually appealing, monochromatic line among a series of more psychedelic ones.

 

Next I used the “knots” tool, which, unlike “bubblelines” or “cirrus,” gave me no usable information. The ability to change the “angles” and “tangles” with no apparent correspondence to the data makes this tool seem very questionable.

Okay fine, my artwork at 12.

My artwork as a 6-year-old, or classic Microsoft screensaver?

And when I was clicking around in it, this message popped up, raising more than a few questions while actually providing me with more interesting subject matter than what the tool generated.

Why is "good" bold?

Who is this for? Why is it here? Why is my only option to say “OK” after it rants to me?

Some practical applications of this software would be comparing two translations of a text (let’s say Shakespeare again) to see exactly how the wording changes between the slight variations. Or we could take, say, the First Folio and run it against the Folger’s modern editions to see how the language has or has not changed over time, and how similar or different what we’re reading now is compared to the original texts. You could do the same with various translations of the Bible and compare word clouds to see whether one favors particular words over their synonyms, and why. A short comment on the word clouds: looking through the media library (sharing this blog means sharing the libraries too, if you noticed), I see how the cloud generated different shapes, patterns, and colors for the same data (credit for the clouds below goes to whoever uploaded them).

[Images: five word clouds, named “download”, “The dataset provided in Figure 1 is provided from the Test Corpus 2 files.”, “word cloud”, “Screen Shot 2015-09-02 at 12.54.24 PM”, and “Cirrus”]

Five different visualizations of word clouds or “Cirrus” for Dr. S’s test corpus.


A brief note on the shared media library: it is interesting to see what data is associated with each word cloud, such as file name. The diversity in the naming across these five clouds is more than I would expect.


Returning to practical applications of Voyant, I think it can be used for many purposes other than finding commonalities within a corpus, though the visualizations seem to be at their grandest when they draw on a large body of work. I could see myself using Voyant in many “this wasn’t made to do that but okay, I guess it works” kind of ways, such as:

  • Running a personal journal through Voyant and analyzing the recurring themes, people, and places mentioned in the text to better understand how I got to where I am today.
  • Running a series of lists through it, such as the lyrics to the Billboard Top 40 songs of any given day, to see a visual representation of the words you would probably hear if you turned your radio on. Another example would be using a list of ingredients for each menu item of a restaurant to see what their most used item is, and using that information to gain insight into how they may make their recipes.
  • Analyzing the code of a program through Voyant to see how often a certain function is used.
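
As a rough illustration of that last idea outside of Voyant, here is a minimal sketch in Python (my choice of language, nothing Voyant provides). The helper name `count_calls` and the sample snippet are made up for illustration, and the pattern is only a heuristic: it counts any name followed by an opening parenthesis, so it also picks up function definitions.

```python
import re
from collections import Counter

def count_calls(source):
    """Count names that are immediately followed by '(' in a source string."""
    # Heuristic only: an identifier, optional whitespace, then "(".
    # This also matches definitions such as "def greet(" and keywords
    # written like calls, so treat the numbers as approximate.
    names = re.findall(r"\b([A-Za-z_]\w*)\s*\(", source)
    return Counter(names)

# Hypothetical sample program, used only to show the idea.
sample = """
def greet(name):
    print(format_name(name))
    print(format_name(name.upper()))
"""

calls = count_calls(sample)
print(calls.most_common(3))
```

A word cloud of the same file would show similar information, but a plain count like this makes it easier to check one specific function by name.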

If Voyant were slightly more powerful and could search short key phrases (Name Surname, places that aren’t one word like Los Angeles or New York, or just common word combinations or descriptors like chocolate milk or tired student), I think it would become far more useful. As far as I can tell, the program does not account for parts of the upload that are not actually “part” of the text, such as the Project Gutenberg disclaimers at the start of each text in the Shakespeare upload. Because it leaves that information in, the data is skewed slightly beyond what you are actually analyzing. A system that let you choose which parts of the uploaded document function as text to analyze and which function as non-academic information would take Voyant one step further. Additionally, if it could count singular and plural forms as one item (or at least offered a setting to do so), it would offer better analysis when comparing two subjects, since at present it may miss information by comparing “love vs. hate” as opposed to “love/s vs. hate/s”.
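
To make those three wishes concrete, here is a minimal sketch of what such preprocessing could look like if done by hand before uploading. It is not how Voyant works internally. It assumes the text uses the “*** START OF ...” and “*** END OF ...” marker lines found in many Project Gutenberg files (the Shakespeare upload may use something different), and the function names `strip_gutenberg`, `fold_plural`, and `bigrams` are hypothetical. The plural folding is deliberately naive.

```python
import re
from collections import Counter

# Assumption: marker lines like "*** START OF ..." and "*** END OF ..."
# surround the real text. Files without them are returned unchanged.
START = re.compile(r"^\*\*\* ?START OF.*$", re.MULTILINE)
END = re.compile(r"^\*\*\* ?END OF.*$", re.MULTILINE)

def strip_gutenberg(text):
    """Keep only the text between the start and end marker lines."""
    start = START.search(text)
    body_from = start.end() if start else 0
    end = END.search(text, body_from)
    body_to = end.start() if end else len(text)
    return text[body_from:body_to]

def tokenize(text):
    """Lowercase the text and split it into runs of letters and apostrophes."""
    return re.findall(r"[a-z']+", text.lower())

def fold_plural(word):
    """Naive folding: drop one trailing 's' from words longer than 3 letters.

    This will mangle words like "this", so a real tool would need a
    proper stemmer or lemmatizer instead.
    """
    if len(word) > 3 and word.endswith("s") and not word.endswith("ss"):
        return word[:-1]
    return word

def bigrams(tokens):
    """Count adjacent word pairs such as "new york"."""
    return Counter(" ".join(pair) for pair in zip(tokens, tokens[1:]))

# Made-up sample text, not a real Project Gutenberg file.
raw = """Legal boilerplate about love.
*** START OF THE PROJECT GUTENBERG EBOOK SAMPLE ***
She loves New York. He hates New York. Love and hate.
*** END OF THE PROJECT GUTENBERG EBOOK SAMPLE ***
More boilerplate.
"""

tokens = tokenize(strip_gutenberg(raw))
singles = Counter(fold_plural(t) for t in tokens)
pairs = bigrams(tokens)
print(singles["love"], singles["hate"], pairs["new york"])
```

In this sample, “love” in the disclaimer line is ignored, “loves” and “love” are counted together, and “New York” is counted as a phrase rather than as two unrelated words.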

Now, as I round out this blog post, I would like to offer up a Cirrus of my own, along with some other analytics, that I made to visualize all of the blog posts so far (including this one up to this point in the text) and how frequently some words are used.

This is us. We sure like talking about Voyant, huh? Many words are repeated alongside their plurals too.

Collectively, we used a total of 1,267 unique words, said “Voyant” a total of 75 times, “Cirrus” a total of 7 times, and “fun” a total of 3. Though two of those were from one person, so they really liked using Voyant. To the one person who posted their blog while I was making and analyzing the above Cirrus, I’m sorry I couldn’t include you! Adding text to the corpus reader after initializing the program now suddenly seems like a useful feature too.
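
For anyone who wants to sanity-check numbers like these without Voyant, here is a minimal sketch of the same kind of count. The posts below are placeholders, not the real class blog posts, and `word_counts` is a hypothetical helper. Voyant may split words differently, so totals from a script like this would not necessarily match what the tool reports.

```python
import re
from collections import Counter

def word_counts(posts):
    """Return a Counter of lowercase words across a list of post texts."""
    counts = Counter()
    for post in posts:
        # Words are runs of letters and apostrophes; case is ignored.
        counts.update(re.findall(r"[a-z']+", post.lower()))
    return counts

# Placeholder posts, used only to show the idea.
posts = [
    "Voyant is fun. Cirrus is a Voyant tool.",
    "I used Voyant for Shakespeare.",
]

counts = word_counts(posts)
unique_words = len(counts)
print(unique_words, counts["voyant"], counts["cirrus"], counts["fun"])
```

Because `counts.update` can be called again at any time, adding a late blog post to the tally is just one more call, which is the “add text after initializing” feature I was wishing for above.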

-Luke