Stylometry and the Shakespearean Clinic

Part 6 of "Critically Examining Oxfordian Claims"

I finally have some time to sit down and write about stylometric issues, as I've been promising to for a while.

1) Why don't I start with the claims of Oxfordians. Mark Anderson, way back in his second post as E of O, cited a study by Nina Green which found that the Earl of Oxford used "Shakespeare rare words" (i.e. those found in the Shakespeare canon ten times or fewer) at a rate of 30 percent in his letters, roughly the same rate found in Shakespeare's plays. Now, I don't know how accurate Green's numbers are or exactly what standards she used, but I'll assume for the sake of argument that the 30 percent figure is right. Unfortunately, it's virtually worthless as evidence of authorship, because it's been established that simply counting the Shakespeare rare words in a text cannot distinguish Shakespeare from other authors. A good summary of this issue is an article by M.W.A. Smith in the Spring-Summer 1989 Shakespeare Newsletter called "Linkages of Rare Words to Deduce Shakespearean Chronology and Authorship." Linkages of rare words have been used as evidence that Shakespeare wrote all kinds of disputed works, including King Leir, The Troublesome Reigne of King John, and Edmund Ironside, but the same kinds of linkages can be found between Shakespeare and works known to be by other authors, such as Kyd's Spanish Tragedy and Greene's James IV. Rare-word linkages can be used to approximately date works known on other grounds to be by Shakespeare, but they cannot distinguish Shakespeare from other authors. However, Don Foster's SHAXICON, which I'll discuss below, provides a wealth of evidence based on rare-word patterns which indirectly excludes Oxford from authorship of the plays.
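
To make concrete what a rare-word count involves, here is a minimal Python sketch. It is not Green's procedure, whose standards I don't know; it assumes that the "rate" is the share of a text's distinct words that appear in the canon between one and ten times. The names `rare_word_rate` and `canon_counts` are hypothetical, and the counts are toy numbers standing in for a real concordance.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def rare_word_rate(text, canon_counts, threshold=10):
    """Share of the distinct words in `text` that occur in the canon
    between 1 and `threshold` times (an assumed definition of the rate)."""
    words = set(tokenize(text))
    if not words:
        return 0.0
    rare = {w for w in words if 1 <= canon_counts.get(w, 0) <= threshold}
    return len(rare) / len(words)

# Toy stand-in for canon word counts (hypothetical numbers, not real data).
canon_counts = Counter({
    "the": 27000,
    "love": 2000,
    "honour": 900,
    "multitudinous": 2,
    "incarnadine": 1,
})

sample = "The multitudinous love the incarnadine honour"
rate = rare_word_rate(sample, canon_counts)
print(rate)
```

A single number like this says nothing about whose rare words they are, which is why two different authors can easily land on the same rate.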

2) Next, let's consider the study of Ward Elliot and Robert Valenza, which got some media attention a few years ago. What they did was take twenty-six poets roughly contemporary with Shakespeare (or more accurately, people who wrote poetry, since the list included such names as Queen Elizabeth and Francis Bacon) and compare their poetry with Shakespeare's according to a variety of factors; they found that none of the candidates tested was close to Shakespeare, with Oxford coming near the bottom of the list. For example, Shakespeare consistently used relative clauses less frequently than his contemporaries, and hyphenated compound words much more frequently; Oxford's writing shows neither of these characteristics. The test that got the most attention, though, was modal analysis, a method developed by Valenza based on his work in signal processing. This test involves taking fifty-two keywords (common in Shakespeare's writing, but not the most common words), breaking up Shakespeare's poetry into 500-word blocks, and determining his pattern of using these words relative to each other. They found that Shakespeare had a very consistent pattern across his career, and that none of the claimants was close to Shakespeare. The closest claimant was Sir Walter Raleigh, whose modal score was 2.4 standard errors away from Shakespeare's, with "not much more than a two percent chance of common authorship" (in the words of Elliot and Valenza); Oxford's modal score was 18.37 standard errors away from Shakespeare's, ranking him 22nd out of the 26 claimants tested. Elliot and Valenza wrote an article describing their method and results ("A Touchstone for the Bard," in Computers and the Humanities, v.25, no.4, p.199) and a shorter article concentrating on the Earl of Oxford's claim (in Notes and Queries, December 1991).
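
The bookkeeping behind a test like this can be sketched in Python. This is not Valenza's modal analysis, which rests on signal-processing methods I won't try to reproduce; it only illustrates cutting a text into fixed-size blocks, tallying keyword rates in each block, and expressing a difference in units of standard error. The function names and the toy numbers are hypothetical.

```python
import re
from math import sqrt
from statistics import mean, stdev

def blocks(text, size=500):
    """Split text into consecutive blocks of `size` words,
    dropping any short remainder at the end."""
    words = re.findall(r"[a-z']+", text.lower())
    return [words[i:i + size] for i in range(0, len(words) - size + 1, size)]

def keyword_profile(block, keywords):
    """Relative frequency of each keyword within one block."""
    total = len(block)
    return [block.count(k) / total for k in keywords]

def standard_errors_away(reference_scores, candidate_score):
    """Distance of a candidate's score from the reference mean,
    in units of the reference's standard error of the mean."""
    se = stdev(reference_scores) / sqrt(len(reference_scores))
    return abs(candidate_score - mean(reference_scores)) / se

# Toy text: twelve words, cut into four-word blocks for illustration.
toy_text = "thou art the sun " * 3
toy_blocks = blocks(toy_text, size=4)
profile = keyword_profile(toy_blocks[0], ["thou", "sun", "moon"])

# Hypothetical per-block scores for the reference author, and one candidate score.
distance = standard_errors_away([1.0, 2.0, 3.0], 4.0)
print(len(toy_blocks), profile, distance)
```

The point of measuring in standard errors is that it scales a difference by how much the reference author's own blocks vary, so a very consistent author makes even a modest gap look large.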

Oxfordians have predictably attacked this study with all the contumely they can muster, but as far as I can see their protests contain more energy than substance. A number of Oxfordian attacks on the study appeared in the Shakespeare Newsletter in 1990, based on preliminary reports of the results, but Elliot defended himself very ably in the Winter 1990 issue of the Newsletter, showing that the criticisms were unfounded, being based on incomplete or incorrect information. He concluded that "we do not claim to have the last word on this subject, but if there is a convincing refutation of our results out there, we are still waiting to hear it." I would have to agree with him, based on all I've read about the study. As Elliot says, his study should not be taken as the last word on the subject, but the extreme differences he found between Shakespeare's and Oxford's poetry, and the uniformity within Shakespeare's poetry, are at the very least strong evidence against the notion that one author wrote both sets of works.

One objection Oxfordians commonly raise at this point is that Oxford's poetry under his own name was written in his youth, while the works of Shakespeare were allegedly written in his maturity; thus the differences just reflect his development as a poet. One reply to this objection is that Oxfordians routinely backdate the plays a decade or more so that they will fit into Oxford's lifetime; thus their own dating scenario undermines their "youth vs. maturity" argument by dating the plays to roughly the same time period as Oxford's acknowledged poetry. However, a more serious problem for the "youth vs. maturity" scenario is the fact that E & V, anticipating such claims, did some preliminary tests which indicate that other poets stayed remarkably consistent in their modal scores over periods of many decades. Milton's early poems (written from age 19 to 25) and his Samson Agonistes (finished at age 64) both showed strong consistency with Paradise Lost, written in his middle years, and all of these were distinct from Shakespeare. Spenser's Epigrams and Sonnets, written in his teens, and his Amoretti, written in his forties, are both consistent with his Shepherd's Calendar, written in his late twenties. Spenser's Faerie Queene, contemporary with the Amoretti, had anomalous modal scores, which may have something to do with the fact that E & V used the Shakespeare keywords for these tests, rather than keywords custom-chosen for Milton and Spenser; in any case, the anomalies are nothing like the huge difference between Oxford and Shakespeare.

I'm going to stop here for this posting, and have a separate posting on Don Foster's SHAXICON, which deserves an extended discussion.

