Playing with visual text analysis using Voyant

As I’ve started to dip my toes into the DH current, one thing I’ve been excited to play with is visual presentations of text analysis. Until now I hadn’t had a strong need for it, but with the approaching SCI survey of alt-academics and the analysis it will entail, I finally have a good reason to start exploring what’s out there.

The first tool I’ve checked out is Voyant (developed by Stéfan Sinclair and Geoffrey Rockwell as part of their project), which allows you to upload a document, point to a URL, or copy text; it can analyze a single document or a corpus. I uploaded my dissertation as a sample and, after stripping out articles and such (which the tool makes very easy), I got a nifty word cloud:

Below it, Voyant displays a list of words by frequency. Checking boxes next to one or more words gives a distribution of word appearance in the document or corpus. Here are three commonly appearing words charted through the diss:

I found it interesting to see that while I clearly used the word “trauma” a ton, the places where it appeared the most were in the intro and conclusion–suggesting that I relied on the term when I was pulling my argument together, but much less in the actual analysis. A section below the chart shows the context of the selected words in a table that can be sorted in a variety of ways. All the data in each section can be exported in a number of formats, too, for use in other sites or documents. (More than ever, I’m feeling pinched by having my blog hosted by, which doesn’t support things like iFrames; I hope to get a more flexible set-up going before too long.)
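
For readers curious about what this kind of analysis involves, here is a minimal Python sketch of the three views described above: word frequencies after stripping common words, the distribution of a word across the document, and each occurrence in context. It is not Voyant's implementation. The function names, the tiny stopword list, and the sample sentence are all assumptions made up for illustration.

```python
import re
from collections import Counter

# A tiny illustrative stopword list (an assumption; real tools ship much longer ones).
STOPWORDS = {"a", "an", "the", "and", "of", "in", "to", "is", "it", "that", "was", "on"}


def tokenize(text):
    """Lowercase the text and split it into word tokens, keeping apostrophes inside words."""
    return re.findall(r"[a-z]+(?:['’][a-z]+)*", text.lower())


def word_frequencies(tokens, stopwords=STOPWORDS):
    """Count every token that is not a stopword."""
    return Counter(t for t in tokens if t not in stopwords)


def distribution(tokens, word, segments=10):
    """Count occurrences of `word` in each of `segments` roughly equal slices of the text."""
    counts = [0] * segments
    n = len(tokens)
    for i, t in enumerate(tokens):
        if t == word:
            # Map token position i (0..n-1) onto a segment index (0..segments-1).
            counts[i * segments // n] += 1
    return counts


def keyword_in_context(tokens, word, window=3):
    """Return (left context, keyword, right context) for each occurrence of `word`."""
    rows = []
    for i, t in enumerate(tokens):
        if t == word:
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            rows.append((left, t, right))
    return rows


# Hypothetical sample text standing in for an uploaded document.
sample = (
    "Trauma opens the story. The middle is about memory and memory alone. "
    "In the end, trauma returns."
)
tokens = tokenize(sample)

print(word_frequencies(tokens).most_common(3))
print(distribution(tokens, "trauma", segments=3))  # [1, 0, 1]: start and end only
print(keyword_in_context(tokens, "memory", window=2))
```

Even in this toy example, the distribution shows the pattern I noticed in my dissertation: the word turns up at the beginning and the end, and not in the middle.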

There’s a lot more that Voyant can do, and I’m looking forward to playing with it (and other tools) a lot more as I get a clearer sense of what kind of analysis I want to do. More soon!


Learning by destruction

In preparation for my first THATCamp, I’ve been breaking things. I’m new to the DH world, and only recently have I been dipping my toe into the “hack” side of the hack/yack divide. Enchanted by why’s (poignant) guide, I explored a bit of Ruby; then, like many others, I tried (and, by the end of February, failed) Codecademy. Most recently, I’ve been learning a little bit about HTML and CSS (I’m mainly using this book by @jcmeloni).

While the first two attempts to gain concrete technical skills didn’t take me very far, this latest effort is yielding some real results. The difference, I think, has to do with motivation and goals. My first attempts at Ruby and JavaScript stemmed from a sense that learning a programming language was something I should do. Though the DH community has taken care to emphasize that coding isn’t everything, my thinking about it hasn’t changed–it’s an increasingly important literacy, and a basic level of knowledge is already important and will become more so, if for no other reason than to understand which problems are hard and which are easy. (My spouse, a programmer, is continually dismayed when he describes some cool new innovation, and I fail to be impressed, not realizing that it solves a very tricky problem.)

I didn’t abandon the lessons because I found them unimportant. I just couldn’t dig into them. This confused me; after all, if I’m good at anything, it’s learning things! Plus, why’s (poignant) guide and Codecademy take such different tacks that if one didn’t work for me pedagogically, it seemed the other should have. But still, I walked away from both of them.

What I’ve come to realize is that without an actual problem that could help me contextualize and apply the new skills, I was having a hard time making the connections I would need to really learn and understand what I was doing. Weeks in, I still didn’t really know what Ruby or JavaScript looked like in the wild, and so while I was enjoying making little snippets of code that did things (enjoying it a lot, actually), my interest in both tapered as other priorities came up.

I cracked open the (e-)book on HTML and CSS for completely different reasons. I had been working on the census of #alt-academics that I’ve written about before, as well as the not-yet-public survey that will be its more rigorous counterpart, and I was hitting some roadblocks. Most of these were stylistic: I wanted the logo to appear here, not there, and I wanted it to link back to the site (well, Wufoo wouldn’t let me get past that hurdle, but it wasn’t for lack of trying–and I succeeded on the survey); I wanted a wider margin around the text. In short, I wanted to have more control than the visual editing interface allowed. I picked up the lessons in HTML and CSS because I had a problem I was trying to solve, and that has made all the difference in the way the instruction clicks for me.

My pre-THATCamp efforts have been similar. I’m at a point where I want to start having more control over my site (I am an Order Muppet, after all), so I want to learn more about what I could do with WordPress beyond its ready-made themes; I also want to start doing more with visual and other multimedia materials in my research, so I want to learn more about Omeka. For both of these things, I need to know about web servers and FTP clients, both of which I’m sure are second nature to a ton of THATCamp participants, but they’re new for me. So I have been tinkering, with the guidance of ProfHacker and my stellar colleagues in the Scholars’ Lab.

And along the way, I’ve been breaking things. I had a single triumphant moment in which everything seemed to be working as it should–and then, I went one step further, and managed to completely lock myself out of MySQL, rendering the whole setup unusable.

Here’s where things got tricky, and a little interesting. I had to find a way to dig myself out, and I had no idea how to do that. Most of the troubleshooting instructions I found involved the command line–which I do not know how to use, much to my chagrin. (Again, following ProfHacker’s lead, I’ve learned how to do the simplest of tasks–but really, knowing how to create a text file wasn’t going to get me out of the trouble I was having!) I went down a time-consuming and frustrating rabbit hole. As I tried to figure out what to do, I realized a lot of things–among them, I didn’t totally know what MySQL was, why I needed it, or what had gone wrong.

That sounds bad, but it has actually been energizing. The risks for me are still low at this point, but the potential reward is high. I’m finally starting to get a sense of what I don’t know–whereas before, all I saw was an abyss of confusion. My questions at this point are still incredibly basic (and, to be honest, I’m not always comfortable asking them), but I feel like certain elements are slowly coming into focus.

Breaking things has given me problems to solve, which is where the opportunity and desire to learn seem greatest. This is not new news to most of the DH community–@samplereality has argued compellingly that DH is about destroying things, and @jessifer tweeted about giving his students a problem without giving them the tools to solve it. As a pedagogical strategy, it makes sense: that’s how we learn when we’re doing things on our own–with a sense of urgency and a problem to solve.

As I continue to think about reforming humanities graduate training, my own experience of trying to learn, failing, and then needing to learn and (at least partially) succeeding, will remain at the front of my mind. I still haven’t really figured out what all went wrong, but I managed to get things working again (that counts as hacking, right?) and have a much better sense of what questions to ask my fellow THATCampers. Had everything gone smoothly, I would have learned so much less.

#Alt-ac: Moving toward a broader humanities community

I’m back home in New York after several exhilarating days at the MLA Convention in Seattle. Despite my background in the humanities (I completed a Ph.D. in Comparative Literature at the University of Colorado in 2010), I had never attended an MLA Convention until this year. The surprisingly positive experience that I had, plus the mere fact that I made the decision to go this year, suggest the deep and exciting changes that are taking place within the association and in the humanities community more broadly.

Two main topics contributed to the unique atmosphere of this year’s convention (and have already received a ton of attention): alternate academic careers (#alt-ac or #altac) and the digital humanities (#dh). (Note! While there are many areas of resonance and overlap, they are not the same thing.) Neither needs another triumphal account of how it will Save the Humanities; still, I came away with strong favorable impressions of the ways these two topics are affecting the broader conversation, and the people involved in each deserve accolades for the excellent work they’re doing. Though I’m kind of in an alt-ac profession myself, I’m a newcomer to the conversation and don’t pretend my comments can address the full spectrum of the work being done, the people involved, or the issues that have been or should be raised.

As a graduate student, I never attended an MLA Convention because I decided not to go on the academic job market; I didn’t see much use in going to the convention if not for interviews. After completing my degree, I let my membership lapse, because again, I didn’t perceive much value for an academic outsider within the MLA. The convention didn’t sound fun; I had heard tales of a stressful environment, riddled with the tension of people waiting for interviews or presentations, with a cutthroat mentality imbuing even the panel sessions as people viewed one another as competitors rather than colleagues. Plus, the thing is huge, which I thought would make it difficult to connect with people. I decided to risk it because I am deeply excited by the work being done by a number of individuals and organizations, including some recipients of Sloan grants.

What I found when I got to Seattle couldn’t have been further from the scene of tension and anonymity that I had anticipated. As I discussed with Kathi Berens at the end of the conference, I was impressed by the generous encouragement and cheerleading that went on. I heard many, many people credit the excellent work of others during panel presentations, showing a great willingness to highlight good work even if doing so didn’t directly benefit them. People were friendly and happy to introduce themselves, and nobody was particularly surprised by my description of my own work outside of the university (and at an organization largely focused on STEM at that). True, the people I was interacting with most were either on alt-ac tracks themselves or highly informed about the trends in the alt-ac world, so it was a somewhat skewed sample. Nonetheless, I was so pleased that I could jump in and share ideas with people as a colleague, even though my email address no longer ends in .edu.

Much of the alt-ac conversation has already been well documented on Twitter (Brian Croxall’s storify gives a good sample), in blogs (Bethany Nowviskie’s latest entries are great and link to many other useful sites), and in the Chronicle. William Pannapacker seemed to surprise himself, undergoing a sort of conversion experience with regard to alt-ac, digital humanities, and even Twitter; oddly, I can relate to his sense of unexpected elation. I have had enormous respect for the alt-ac and digital humanities communities for a while, especially as I’ve come to engage with specific projects through my work at the Sloan Foundation, so it wasn’t surprising to me that I was enthusiastic about the work people were doing and discussing during the panels. Rather, what surprised me was the markedly positive tone that dominated many of the informal side conversations that I heard, as well as the Twitter backchannels on many sessions. (The way Twitter was used at the conference was amazing; my experience was deeply enriched by it.)

One transformative idea has really stuck with me, and it’s something I hope the MLA will consider. In his presentation called “Five Questions and Three Answers about Alt-Ac,” Brian Croxall proposed that the MLA shift its membership scope from those engaged in teaching languages and literatures to those who have studied languages and literatures. I think this is a fabulous idea. Everybody knows the academic job market is a problem, and there are multiple ways that the issue can and (I think) should be addressed (including, importantly, better and different training for graduate students). As I mentioned in my previous post, I think that at least some of the frustration that current and recent grad students feel when facing the job market could be alleviated by improved networking opportunities that allow them to see paths that their peers have taken. Engaging a broader range of humanities scholars under the umbrella of the MLA could really help with that transparency.

Happily, I learned from Fiona Barnett that HASTAC is launching a group that will take a big step toward helping establish such a network. If the MLA could also explicitly broaden its member base to include people like me, who are not employed by a university but who continue to feel compelled by and attached to current happenings in the humanities community, the variety of paths that scholars take would become much more apparent. It would be easier to maintain valuable and meaningful connections to people who share values, training, and sensibilities regardless of institutional affiliation, and members of the community could help one another by pointing toward (and developing new) resources applicable outside the narrow(ing) profession of professorship. Not insignificantly, the MLA could also gain dues-paying members this way, and would benefit from a breadth of perspectives that could strengthen its organizational health.

There are many questions that will need to be addressed for the alt-ac movement to continue to grow and thrive. For one thing, unemployment is high across all sectors right now, so alt-ac and digital humanities won’t provide a magic bullet that propels all of us into satisfying jobs; indeed, any job is hard to come by at the moment. Matt Gold (in “Whose Revolution? Toward a More Equitable Digital Humanities”) also pointed out that funding is already concentrating in the elite institutions. The people that are drawn toward alt-ac and digital humanities tend to be the kinds of people who like to get things done, though, so I am optimistic that questions will be raised and addressed in productive ways. The culture of humanities scholarship can start to change if the conversation about it shifts, and alt-ac is helping both to change the dialogue and to accomplish real work.

A great deal of movement is already happening naturally within the MLA community, and the MLA itself is doing a tremendous job in welcoming and encouraging such changes. The leadership of Russell Berman, as evidenced in his outstanding address at the convention (excerpted here); Rosemary Feal’s deep and energetic engagement with the alt-ac and digital humanities communities (including an unbelievably active and engaging Twitter feed throughout what must have been an unbelievably busy few days); and Kathleen Fitzpatrick’s crystal-clear and eloquent articulation of the issues facing scholarly communication are undoubtedly some of the big reasons that the MLA Convention felt the way that it did. I hope that the energy of these last few days is indicative of a catalytic moment that the association and the community will take advantage of. The timing is right, people are hungry, and a revitalization and expansion of how we understand the humanistic profession will benefit all of us, both inside the university and in the myriad other institutions that we call home.