Distant Reading

Approaching this distant reading project, I had a few concerns about textual analysis. On the whole, textual analysis produces more interesting results when there is a larger corpus from which to work. As such, I spent a considerable amount of time constructing a corpus of Frank O’Hara’s oeuvre spanning his active years from 1950 until his death in 1966. With this larger corpus, comprising 45-50 poems, we can start to make some broader claims about O’Hara’s poetry both in toto and across time.

The digitization process took a variety of forms. First, I scoured the web for readily available O’Hara poems from websites like Poets.org and O’Hara’s official homepage. Because of copyright and ownership issues, this yielded only a limited corpus. From there, I alternated between scanning and OCRing text and manually transcribing the poetry. During this collection process, I also tried to make sure that I had a fairly equal representation of poems across time. Somewhat arbitrarily, I divided the poems into three periods: 1950-1955 (City Winter and Oranges), 1956-1960 (Meditations in an Emergency and Second Avenue), and 1961-1965 (Odes and Lunch Poems).

All told, the corpus consists of about 8,300 words, of which 2,300 are unique. The corpus was not lemmatized, so inflected forms of a word are counted separately, and the following analysis filters out the words on the Taporware list of English stopwords.
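Figures like these can be reproduced with very little code. What follows is only a minimal sketch, not the tool actually used for this project: the `tokenize` and `corpus_stats` functions are hypothetical names, the tokenization rule is an assumption, and the tiny `STOPWORDS` set is a stand-in for the much longer Taporware list.

```python
import re
from collections import Counter

# Stand-in stopword set (an assumption); the real Taporware list is far longer.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "it", "i"}

def tokenize(text):
    """Lowercase the text and split it into words, keeping internal apostrophes."""
    return re.findall(r"[a-z]+(?:['’][a-z]+)*", text.lower())

def corpus_stats(text, stopwords=STOPWORDS):
    """Return total word count, unique word count, and the top non-stopwords."""
    tokens = tokenize(text)
    content = [t for t in tokens if t not in stopwords]
    return {
        "tokens": len(tokens),        # every word, stopwords included
        "types": len(set(tokens)),    # unique word forms (no lemmatization)
        "top": Counter(content).most_common(5),
    }

if __name__ == "__main__":
    # Placeholder text, not a line from the corpus.
    print(corpus_stats("The cat and the dog like the cat"))
```

Because nothing is lemmatized here, ‘love,’ ‘loves,’ and ‘loved’ would each count as a separate unique word, which matches how the corpus was treated.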


To begin, we can start with the “Wordle test” by looking simply at the word frequencies. As we saw looking solely at “Chinese New Year,” the most frequently occurring non-stopword in the corpus is the word ‘like,’ with 69 total occurrences. Again, while ‘like’ is primarily a functional word in O’Hara’s poetry (only a very small percentage of occurrences express a sense of pleasure), it is interesting to see a notable dip in its frequency across time.


As you can see, there is a huge dip in the middle period of O’Hara’s work, where he does not favor the word nearly so much. The 1956-1960 period has only 11 total occurrences, compared to 25 and 33 in the early and late periods, respectively. It is difficult to go very far in analyzing how much of a stylistic shift this really represents for O’Hara, except insofar as he uses fewer similes in this middle period. Certainly, attempting to analyze this device relative to, say, his use of metaphor gets particularly problematic, as looking for metaphor is more difficult to automate.
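A per-period comparison of this kind can be sketched as follows. This is an illustration under stated assumptions rather than the method used above: `word_by_period` is a hypothetical helper, the period texts are placeholders, and the rate per 1,000 words is an addition of mine, included because raw counts are only comparable if the three periods are about the same size.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into words, keeping internal apostrophes."""
    return re.findall(r"[a-z]+(?:['’][a-z]+)*", text.lower())

def word_by_period(periods, word):
    """Map each period label to (raw count, occurrences per 1,000 words)."""
    results = {}
    for label, text in periods.items():
        tokens = tokenize(text)
        count = tokens.count(word)
        # Normalizing by length guards against periods of unequal size.
        rate = 1000 * count / len(tokens) if tokens else 0.0
        results[label] = (count, rate)
    return results

if __name__ == "__main__":
    # Placeholder strings standing in for the concatenated poems of each period.
    periods = {
        "1950-1955": "like a bird like a stone",
        "1956-1960": "no simile here at all",
    }
    for label, (count, rate) in word_by_period(periods, "like").items():
        print(label, count, round(rate, 1))
```

Note that a count of ‘like’ is only a rough proxy for simile, since it also picks up the verb and other functional uses.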

Looking beyond ‘like,’ it makes sense to turn to ‘love.’ Across the corpus ‘love’ is one of the most common non-stopwords; I might argue the most common, as the ones ahead of it are the aforementioned functional ‘like’ and the similarly functional adverbial form of ‘just.’ Not only is it among the most common non-stopwords, it is almost perfectly evenly distributed across the three periods (an 8/7/8 distribution). Consistently, O’Hara is concerned with love in all its various aspects, from finding it to making it, a fact similarly reflected in his balanced usage of ‘love’ in both noun and verb forms (a 43% to 57% split, respectively).


It is, however, interesting to notice that despite his diffuse appreciation of love across his oeuvre, only two of the occurrences of ‘love’ actually refer to the act of love-making. Following this train of thought, there is an interesting pattern to O’Hara’s attention to matters of sex and love-making: while O’Hara is not shy about discussing sex, much of it is left to innuendo and play rather than direct statement. Even the word ‘sex’ in any form is notably absent from this corpus, occurring only 3 times, as “heterosexuality,” “homosexuality,” and “sexual experience.” In fact, this final occurrence perfectly displays O’Hara’s tendency towards innuendo and playfulness. In his poem “Ave Maria,” O’Hara opens by urging “Mothers of America / let your kids go to the movies! / get them out of the house so they won’t know what you’re up to” (1-3). While the lines are not explicitly about sex, I have no doubt that the reader is meant to imagine the suggestion. After all, in most normal circumstances, what might the mothers of America be “up to” that they would not want their children to be aware of? I’m sure more fertile minds might be able to come up with many possibilities, but I maintain that the parental ‘primal scene’ seems the most logical supposition. Continuing through the poem, O’Hara argues that if the kids are allowed to go to the movies, “they may even be grateful to you / for their first sexual experience / which only cost you a quarter / and didn’t upset the peaceful home” (12-15). This, I believe, reinforces the sexual reading of the poem.

Even looking back at “Chinese New Year,” there are at least two such innuendos or insinuations in that poem: “so / what if I did look up your trunks and see it” (39-40) and “I’m tired of always going down […] priceless words like come” (49-50). While the referents are not explicit, per se, it doesn’t take much reading into the lines to understand them.

This same trend, O’Hara’s love of playful innuendo, can be traced throughout his body of work (see also “Homosexuality”). But this is one of the interesting challenges of applying textual analysis to a poet like O’Hara, and to poetry more broadly. Automated textual analysis tools can point to many trends, but it still seems to require human reading to understand his wink-wink, nudge-nudge style of play.
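To make that limitation concrete, here is a minimal sketch of the kind of search behind the count of ‘sex’ in any form. It is an illustration only: `forms_containing` is a hypothetical helper, and the sample string is assembled from the words and phrases quoted above rather than taken from the corpus files.

```python
import re
from collections import Counter

def forms_containing(text, stem):
    """Count every distinct word form that contains the given stem."""
    tokens = re.findall(r"[a-z]+(?:['’][a-z]+)*", text.lower())
    return Counter(t for t in tokens if stem in t)

if __name__ == "__main__":
    sample = (
        "heterosexuality homosexuality "
        "their first sexual experience "
        "so they won't know what you're up to"
    )
    # Finds the three explicit forms; the innuendo in the last phrase is invisible.
    print(forms_containing(sample, "sex"))
```

A search like this surfaces every explicit form of a word, but a phrase like “what you’re up to” contains nothing for it to match, which is exactly where the human reader has to take over.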

