Ugh, so clearly my discipline in posting is nonexistent. That’s fine. I’m still just dipping my toes in the water here, and mostly I want to document my learning curve, such as it is. To start, here’s a word cloud for Bram Stoker’s Dracula:

Fig. 1: Word Cloud for Bram Stoker’s Dracula

This past July I got the opportunity to take part in an NEH Summer Seminar at the University of Iowa ("Religion, Secularism, and the Novel"), where the group read a number of canonical novels (Crusoe, Silas Marner, Dracula, On the Road, and Home). I enjoyed the seminar and loved all the so-called traditional humanities stuff: deep discussion, attentiveness to rhetoric and form, and all the other things that we literary scholars do so well. I also fiddled with some of the newer DH methods on the same texts we were reading (excluding Robinson’s Home). At the time it was really just basic computational stuff: counting tokens, plotting lexical dispersion, making silly little word clouds, etc.

Here’s a “lexical dispersion plot” for the word “time” in Stoker’s Dracula:

Time
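For anyone who hasn’t met one before, a lexical dispersion plot just marks where in the running text each occurrence of a word falls. The sketch below is not the code behind the figure; it is the underlying idea in plain Python, run on a made-up sentence rather than the novel. The helper names (`tokenize`, `word_offsets`, `text_strip`) are hypothetical, and the tokenizer is deliberately crude.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def word_offsets(tokens, word):
    """Return the token positions at which `word` occurs."""
    return [i for i, tok in enumerate(tokens) if tok == word]


def text_strip(offsets, n_tokens, width=40):
    """Draw a crude one-line dispersion plot: '|' marks a hit, '.' a gap."""
    cells = ["."] * width
    for off in offsets:
        # Scale the token position down to one of `width` cells.
        cells[off * width // n_tokens] = "|"
    return "".join(cells)


# Toy sample standing in for the novel (not a quotation from Dracula).
sample = "Time passed slowly. There was no time to lose, and time ran out."
tokens = tokenize(sample)
offsets = word_offsets(tokens, "time")
print(offsets)
print(text_strip(offsets, len(tokens), width=len(tokens)))
```

On a whole novel you would keep `width` well below the token count, so each cell stands for a stretch of text. NLTK’s `Text.dispersion_plot` draws the same kind of picture with matplotlib.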

Here’s a dispersion plot for some of the major characters in Dracula:

Character Dispersion Plots

Here are some common nouns in Dracula:

Common Nouns

Another Dispersion Plot

I didn’t find anything too profound, although it did surprise me slightly that a word like “time” occurs as often as it does in Dracula: a simple count finds it 373 times. The full exploratory notebook for this is available here. One caveat: the dispersion plots aren’t currently rendering in Jupyter Notebooks, which seems to be a known issue as of 24 December 2019.
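A “simple count” of this kind can be sketched in a few lines of standard-library Python. This is an illustration on a toy sentence, not the notebook’s code, and `word_counts` is a hypothetical helper. The exact total for a real novel depends on how you tokenize and whether you fold case, so a different tokenizer may not reproduce the figure above exactly.

```python
import re
from collections import Counter


def word_counts(text):
    """Count lowercase word tokens in `text`."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


# Toy sample (not from the novel); for the real thing, read the full
# plain text of the book from a file and pass that in instead.
sample = "No time to lose. Time and time again, the time slipped away."
counts = word_counts(sample)
print(counts["time"])
print(counts.most_common(3))
```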

I also spent a little time messing around with spaCy, tinkering with the part-of-speech (POS) tagger and other things on the four texts. Here are some counts of different parts of speech in the four novels:

Counts of Different Parts of Speech
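Counts like these come from tallying the `pos_` attribute that spaCy puts on every token. Here is a minimal sketch, with some assumptions: the model name `en_core_web_sm` is a guess at a reasonable default rather than something confirmed here, and the function names are hypothetical.

```python
from collections import Counter


def pos_counts(doc):
    """Tally coarse part-of-speech tags for any iterable of tokens
    that expose a `pos_` attribute (a spaCy Doc does)."""
    return Counter(tok.pos_ for tok in doc)


def pos_counts_for_file(path, model="en_core_web_sm"):
    """Parse a plain-text file with spaCy and tally its POS tags.

    Assumes spaCy and the named model are installed. The import is
    deferred so that `pos_counts` stays usable without them.
    """
    import spacy

    nlp = spacy.load(model)
    with open(path, encoding="utf-8") as fh:
        text = fh.read()
    # spaCy rejects texts longer than nlp.max_length, so raise the
    # limit when a whole novel is passed in as one string.
    nlp.max_length = max(nlp.max_length, len(text) + 1)
    return pos_counts(nlp(text))
```

Calling `pos_counts_for_file("dracula.txt").most_common()` (the filename is a placeholder) would list the tags from most to least frequent.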

Since Dean, in On the Road, is always trying to live in the eternal now, it makes total sense that the verb “to be” would show up all around him.

Dean Verbs

The code to generate the above figure is available here.

One of the day’s questions about On the Road at the NEH Seminar had to do with what exactly it is like to read a book like On the Road after the #MeToo movement. Some quick counting of the verbs used to describe Marylou, for instance, comes up as follows:

 ('be', 34),
 ('have', 10),
 ('want', 10),
 ('know', 9),
 ('go', 9),
 ('say', 8),
 ('sleep', 7),
 ('see', 6),
 ('get', 5),
 ('sit', 5),
 ('make', 5),
 ('take', 5),
 ('tell', 4),
 ('find', 4),
 ('jump', 3),
 ('run', 3),
 ('do', 3),
 ('drive', 3),
 ('wait', 3),
 ('lean', 3)

Looking at Sal:

 ('be', 25),
 ('go', 18),
 ('say', 14),
 ('get', 10),
 ('tell', 8),
 ('think', 8),
 ('have', 7),
 ('want', 5),
 ('know', 4),
 ('come', 4),
 ('dig', 4),
 ('do', 3),
 ('call', 3),
 ('let', 3),
 ('make', 3),
 ('see', 3),
 ('ask', 2),
 ('arrive', 2),
 ('remember', 2),
 ('find', 2)

And now at Dean:

 ('be', 198),
 ('say', 105),
 ('go', 44),
 ('have', 41),
 ('take', 29),
 ('come', 28),
 ('see', 27),
 ('know', 26),
 ('tell', 25),
 ('get', 25),
 ('yell', 21),
 ('do', 17),
 ('drive', 17),
 ('want', 14),
 ('cry', 14),
 ('sit', 13),
 ('look', 12),
 ('talk', 11),
 ('sleep', 11),
 ('stand', 11)
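One way to get lists like these out of spaCy is to walk the dependency parse and count the verbs whose grammatical subject is the character’s name. To be clear, this is a sketch of one possible approach, not necessarily how the counts above were produced; `verbs_for` is a hypothetical helper, and it only catches explicit mentions of the name, not pronouns such as “he” or “she”.

```python
from collections import Counter


def verbs_for(doc, name):
    """Count lemmas of verbs whose grammatical subject is `name`.

    Works on any iterable of tokens exposing spaCy's `text`, `dep_`
    and `head` attributes (a parsed spaCy Doc does).
    """
    counts = Counter()
    for tok in doc:
        # "nsubj" is the nominal-subject label in spaCy's English models.
        if tok.text == name and tok.dep_ == "nsubj":
            head = tok.head
            # "be" and "have" may be tagged AUX rather than VERB.
            if head.pos_ in ("VERB", "AUX"):
                counts[head.lemma_] += 1
    return counts
```

With a parsed text in hand, something like `verbs_for(nlp(text), "Dean").most_common(20)` would give a top-twenty list in the same shape as the ones above.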

As I say, I’m still just dipping my toes in … more to come I’m sure.

P.S. Code used to generate things here can be found in this repo.