by Donald Foster, Vassar College
First, what is it? SHAXICON is a lexical database that indexes all of the
words that appear in the canonical plays 12 times or fewer, including a
line-citation and speaking character for each occurrence of each word.
(These are called "rare words," though they are not rare in any absolute
sense--"family [n.]" and "real [ad.]" are rare words in Shakespeare.) All
rare-word variants are indexed as well, including the entire "bad" quartos
of H5, 2H6, 3H6, Ham, Shr, and Wiv; also the nondramatic works, canonical
and otherwise (Ven, Luc, PP, PhT, Son, LC, FE, the Will, "Shall I die,"
et al.); the additions to Mucedorus and The Spanish Tragedy, the
Prologue to Merry Devil of Edmonton, all of Edward III and Sir Thomas More
(hands S and D); Ben Jonson's Every Man in His Humour (both Q1 and F1) and
Sejanus (F1); and more; but these other texts have no effect on the
12-occurrence cutoff that sets the parameters for SHAXICON's lexical
What SHAXICON demonstrates is that the rare words in Shakespearean
texts are not randomly distributed either diachronically or
synchronically, but are "mnemonically structured." Shakespeare's active
lexicon as a writer was systematically influenced by his reading, and by
his apparent activities as a stage-player. When writing, Shakespeare was
measurably influenced by plays then in production, and by particular
stage-roles most of all. Most significant is that, while writing, he
disproportionately "remembers" the rare-word lexicon of plays concurrently
"in repertory"; and from these plays he always registers disproportionate
lexical recall (as a writer) of just one role (or two or three smaller
roles); and these remembered roles, it can now be shown, are most probably
those that Shakespeare himself drilled in stage performance.
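
The kind of index described above (every word occurring 12 times or fewer in the canonical plays, each occurrence tagged with a line-citation and a speaking character) can be pictured with a minimal Python sketch. This is an illustration only, not SHAXICON's actual software or file format; the function names, the tuple layout, and the toy texts "PLAY1", "PLAY2", and "POEM1" are all invented for the example.

```python
from collections import defaultdict

CUTOFF = 12  # "rare" = 12 or fewer occurrences in the canonical plays


def build_rare_word_index(occurrences, canonical_texts, cutoff=CUTOFF):
    """Index rare words with their (text, citation, speaker) occurrences.

    occurrences: iterable of (word, text, citation, speaker) tuples.
    Only occurrences in canonical_texts count toward the cutoff; hits in
    other texts are indexed but never change which words are rare.
    """
    counts = defaultdict(int)
    by_word = defaultdict(list)
    for word, text, citation, speaker in occurrences:
        by_word[word].append((text, citation, speaker))
        if text in canonical_texts:
            counts[word] += 1
    return {w: hits for w, hits in by_word.items()
            if 1 <= counts.get(w, 0) <= cutoff}


def role_lexicon(index, text, speaker):
    """Set of rare words spoken by one character in one text."""
    return {w for w, hits in index.items()
            if any(t == text and s == speaker for t, _, s in hits)}


# Toy data (hypothetical, not real citations).
canonical = {"PLAY1", "PLAY2"}
toy = [("the", "PLAY1", f"1.1.{i}", "KING") for i in range(1, 14)]  # 13 hits
toy += [
    ("family", "PLAY1", "1.2.5", "KING"),
    ("family", "POEM1", "12", "NARRATOR"),  # indexed, not counted
    ("real", "PLAY2", "2.1.9", "FRIAR"),
]
index = build_rare_word_index(toy, canonical)
print(sorted(index))                        # ['family', 'real']
print(role_lexicon(index, "PLAY1", "KING"))  # {'family'}
```

Keeping the cutoff count separate from the list of occurrences mirrors the point that the non-canonical texts are indexed without affecting the 12-occurrence threshold.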
SHAXICON electronically maps Shakespeare's language so that we can now
usually tell which texts influence which other texts, and when. Moreover,
when collated with the OED or with early modern texts in a normalized
machine-readable format, SHAXICON provides an incomplete record of
Shakespeare's apparent reading. The main value of this resource has less
to do with biographical novelties, however, than with problems of textual
transmission, dating, probable authorship of revisions, early stage
history, and the like. And because SHAXICON is a closed system, human
bias in measuring lexical influence of this sort is effectively
eliminated. The evidentiary value of supposed "verbal parallels" is no
longer a matter of private intuition or subjective judgment, but
quantifiable, using a stable lexical index (and measurable against a
virtually limitless cross-sample of machine-readable texts).
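
To make "quantifiable" concrete, here is one way a role's lexical recall could be compared against expectation. It is a sketch under a stated assumption, not the procedure SHAXICON actually uses: it assumes that the "expected" overlap for a role is proportional to that role's share of the play's rare-word lexicon, and it reports the fractional excess of observed over expected overlap. The function name and the toy word sets are hypothetical.

```python
def excess_recall(later_lexicon, role_lexicons):
    """Return {role: observed/expected - 1} for each role in one play.

    later_lexicon: set of rare words in a later text.
    role_lexicons: {role: set of rare words spoken by that role}.
    ASSUMPTION: expected overlap is proportional to the size of the
    role's rare-word lexicon (an even, proportional distribution).
    """
    overlaps = {r: len(words & later_lexicon)
                for r, words in role_lexicons.items()}
    total_overlap = sum(overlaps.values())
    total_size = sum(len(words) for words in role_lexicons.values())
    if total_overlap == 0 or total_size == 0:
        return {r: 0.0 for r in role_lexicons}
    result = {}
    for role, words in role_lexicons.items():
        expected = total_overlap * len(words) / total_size
        result[role] = (overlaps[role] / expected - 1.0) if expected else 0.0
    return result


# Toy example: two roles of equal lexical size, invented word sets.
roles = {"A": {"a", "b", "c", "d"}, "B": {"e", "f", "g", "h"}}
later = {"a", "b", "c", "e"}
print(excess_recall(later, roles))  # {'A': 0.5, 'B': -0.5}
```

On this toy input, role A is recalled 50% more often than a proportional distribution would predict and role B 50% less often; a real analysis would also need to account for sample size and dating.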
In 1991, I published a 3-part report in SNL (see Summer, Fall, Winter
1991) about SHAXICON (the database was not then completed, and not yet
dubbed), in which I made (in a few cases, mistaken) projections concerning
Shakespeare's apparent stage roles (based on entries for about a third of
the final lexical sample). The few botched projections derived in part
from key-punching errors--e.g., "Pand" (Pandarus of TRO) was often being
entered for "CPan" (Pandulph of JN), and "QnElz" (R3) for "QnEliz" (3H6);
and in part from unavoidable limitations, explained in the SNL series,
concerning the variable "richness" of character-specific lexicons, which
could not be measured until the whole canon was indexed. These problems
have been eliminated.
The following list represents a corrected catalogue of those roles that
Shakespeare is most likely to have acted. These assignments vary somewhat
in statistical significance, depending on sample size, etc. A fuller
report (with instructions on how to run cross-checks and fully automated
statistical analysis) will appear in my "SHAXICON Notebook" (a written
commentary that has yet to be completed). In the meantime, here follows a
list of Shakespeare's most likely stage-roles, as statistically derived.
Keep in mind that this catalogue cannot be proven to represent historical
actuality. SHAXICON handily selects Adam of AYL and the Ghost of Ham as
probable Shakespeare roles, both of which are supported by hearsay
evidence from the 17th century; the remaining roles find no external
historical confirmation (although Davies mentions that Shakespeare played
some kings, and SHAXICON indicates that Shakespeare played king-roles in
AWW, 1H4, 2H4, HAM, LLL, PER, and probably MAC). Having studied the
evidence from every conceivable angle, I'd say that the assignments below
are good bets, even despite the lack of archival evidence to back them up,
for the disproportion in Shakespeare's persistent recall of these roles is
quite striking relative to other roles in the corresponding texts. There
are a few texts (principally ADO, MV, and Jonson's EMI) in which
Shakespeare may have played two different roles in two successive seasons
of the same theatrical "run." But the statistical weight of Shakespeare's
selective recall of particular roles is in most instances pretty clear;
in fact, when multiple roles are identified by SHAXICON as probably
Shakespearean, they are in most instances roles that are easily doubled
(exceptions and problems are noted below).
MOST PROBABLE SHAKESPEARE ROLES, BASED ON THE POET'S PERSISTENT AND
MEASURABLE RECALL OF PARTICULAR CHARACTER-SPECIFIC LEXICONS:
- ADO: Leonato; later switching to Friar (Q version registers
higher lexical recall for Leonato, F1 version higher for Friar). This
could be viewed as a problem, since the same actor cannot have played both
roles simultaneously, yet Shakespeare clearly "remembers" both roles
(unlike all other principal parts in ADO, which he "forgets").
- ANT: Agrippa, Philo, Proculeius, Thidias, and Ventidius, probably
simultaneously [!] (thus requiring some accommodation at 3.2.1 for
Vntd/Agri), and probably with Proculeius taking Agrippa's lines in 5.1
(hence the textual crux recently discussed on SHAKSPER).
- AWW: King of France
- AYL: Adam; adding old Corin the Shepherd in two revivals of AYL.
- COR: Shakespeare role uncertain. Shakespeare's writing after
Coriolanus registers disproportionate recall of the Sicinius role, but
without the hugely lopsided excess in lexical recall that obtains for the
designated Shakespeare roles in most other plays.
- CYM: 1.Gent (I.i), Philario (I.iv, II.iv), and Jupiter (V.iv)
- EMI-F (Jonson): SHAXICON indicates that F1 may represent a major
Elizabethan revision of Q1, followed by a minor Jacobean revision (as per
established textual scholarship on EMI). SHAXICON also confirms that
Shakespeare probably knew the play in performance: in 1598, and again in
1604, words from EMI come pouring into Shakespeare's writing, forming very
distinct peaks of lexical influence just when we know that EMI was,
indeed, acted by the King's Men (and again in 1612-13). But lexical
influence by character (entirely independent of general lexical overlap)
gives mixed signals: after both Q1 and F1 EMI, Shakespeare's dramatic and
non-dramatic writing registers disproportionate recall of the
Lorenzo-Kno'well role (esp. the F1 Old Kno'well); but Judge Clement (esp.
the Q1 Clement) seems to influence some of the post-EMI non-dramatic
texts; and the Thorello-Kitely role has an inexplicably high correlation
with Shakespearean texts both before and after the production of EMI. It
is possible that Shakespeare alternated roles in at least one version of
EMI, but the senior Lorenzo/Kno'well figure is the one showing the most
pervasive and persistent lexical influence on his own writing. One
suspects that Jonson's extensive and multiple revisions in EMI are
responsible for the slightly fogged picture presented by the
character-distributions for this play.
- ERR: Egeon (I.i, V.i) and Dr. Pinch (IV.iv).
- 1H4: King Henry.
- 2H4: King Henry (and perhaps Rumor, but only briefly).
- H5: SHAXICON indicates that Shakespeare probably played the French
Constable and Exeter in the Quarto version (in 1599, while also playing
Exeter in a revival of 1H6). In F1 Henry V, Shakespeare appears to have
performed Bishop Ely and Montjoy, and probably, on some occasions, the
Chorus. (The influence of the Chorus-role is less strongly marked than
that of Ely and Montjoy, but still more pronounced than for other roles
in the play.) The Chorus-role is easily doubled with Montjoy--but tripling
with Ely raises a problem at I.i.0, when the Chorus walks offstage and Ely
walks on. Shakespeare may possibly have performed Ely and Montjoy in
some productions, the Chorus and Montjoy in others.
- 1H6: Bit parts only, most probably including Vernon or Exeter, but
possibly Bedford, Lucy, Mortimer, Suffolk, Warwick, or Winchester
(insufficient pre-1H6 sample, and uneven figures thereafter).
- 2H6: Suffolk (also Suffolk in the "bad" 2H6-Q, which appears
certainly to antedate the F1 version, as has been argued by Steve
- 3H6: Warwick (Old Clifford in the "bad" 3H6-Q, which appears
certainly to antedate the F1 version, as has been argued by Steve Urkowitz).
- H8: Prologue and 1.Gentleman; or none (statistically uncertain,
due to insufficient post-H8 lexical sample).
- HAM: Ghost, 1.Player, Mess-Gent. of 4.5 (and perhaps also a role in
the Mousetrap, most probably Lucianus). In F1 Hamlet, the Mess-Gent role
is partly folded into Horatio; but given Shakespeare's persistent recall
of the role even after 1601, it seems likely that the F1 variant in this
instance represents a casting-change made later than the bulk of
Shakespeare's F1 revisions.
- JC: Shakespeare role(s) uncertain, due to apparent revision and
shortening. Most probably, Decius; and, somewhat less probably, Flavius.
Note: Decius-Flavius doubling is not possible in the F1 version unless
F1 has been shortened from an earlier version. In F1, at I.ii.0, Flavius
and Decius enter as mutes; but the very text of JC I.ii offers some
evidence that the text has, indeed, been shortened at this point (e.g., in
the same scene, at I.ii.285, Casca reports that "Murellus and Flavius, for
pulling scarfs off Caesar's images, are put to silence"; but, if we may
believe the F1 stage direction at I.ii.0, Casca was on stage with Murellus
and Flavius moments earlier--from I.ii.0 to at least I.ii.214--and Casca
hasn't heard boo about Caesar's images in the interim). SHAXICON thus
seems to confirm the view that JC-F1 is a shortened text (albeit with some
added bits, e.g., the second account of Portia's death, which are indexed
in SHAXICON under JC-b). I am inclined to accept the assignments of both
Decius and Flavius to Shakespeare, but there is room for doubt.
- JN: Cardinal Pandulph.
- LLL: Ferdinand (possibly with one brief stint as Boyet).
- LR: Albany. (The Albany role is reduced in the revised F1 version.
This is one of several designated Shakespeare roles that appear to have
been cut or reduced ca. 1612. It is doubtful that Albany was performed by
Shakespeare subsequent to the 1612 revision.)
- MAC: Duncan (and perhaps one or two bit roles after Duncan's
assassination, including the nameless Lord of III.vi) -- but perhaps
Banquo. The evidence of Shakespeare's role in this equivocating play is
itself equivocal, probably as a result of late revision, but possibly
because Shakespeare alternated roles, playing Banquo in some productions,
Duncan in others.
That MAC was revised ca. 1612 seems altogether likely from the evidence
of SHAXICON (principally in I.v.1-30, I.v.71-3, IV.iii all, and
V.ix.1-19, indexed in SHAXICON under MAC-b). Simon Forman's eye-witness
account of MAC as acted in 1611 suggests that the ur-MAC had a larger
Duncan-role than in the F1 version. It has recently been argued on
SHAKSPER that there was an Elizabethan MAC (not extant) on which the 1606
version was based; if these theories of revision are correct, even in
part, they would provide a satisfactory explanation for the irregularities
in the SHAXICON data for MAC.
- MM: Escalus; possibly switching to Friar Peter in a late revival,
later than 1610.
- MND: prob. Theseus, but with very irregular figures. SHAXICON
traces enormously high Theseus-"influence" on the post-1594 poems, but
relatively slight Theseus-"influence" on the post-1594 plays. Although
Shakespeare's post-MND writing registers lexical recall of the
Theseus-role that is significantly higher than for any other role in MND,
there is inexplicably high recall of the Helena role as well, especially
in texts written 1602-4.
- MV: Shakespeare seems to have played Antonio in all productions;
but Morocco is a second "remembered" role, especially as manifest, albeit
intermittently, in the lexicon of the post-1594 poems and in the 1595-6
plays. No other role in the play comes close to these two parts in lexical
"influence" upon the poet's post-MV writing. Morocco is doubtful:
Shakespeare's averaged recall of the Antonio role, after 1594, is a full
55% higher than would obtain in an exact distribution, while his averaged
recall of the Morocco role is only 2% higher than expected. But Morocco
tends to register its strongest influence on Shakespeare's writing when
Antonio doesn't, and vice versa; perhaps Shakespeare alternated roles. He
cannot easily have played both simultaneously, at least not in the Q1 or
- OTH: Brabantio (and possibly a second small role after Brabantio's
farewell in I.iii; but this possibility has not yet been tested). The
Brabantio role is reduced in the Q1 version (as revised and cut from the
antecedent F1 version ca. 1612); this is one of several designated
Shakespeare roles that were cut, omitted, or folded into another character ca.
1612, all of them evidently signalling Shakespeare's retirement from the
stage. SHAXICON identifies a final "run" of OTH in 1611-13, first in the
F1, then in the Q1 version; but it is doubtful that Brabantio was
performed by Shakespeare later than 1612.
- PER: SHAXICON suggests that PER is a very early play (ur-PER), the
palimpsest of which is imperfectly represented by acts I-II of PER-Q. The
play was clearly revised by Shakespeare, with altogether new or greatly
re-written acts III-V. This revision may have taken place as late as
1606/7 (as customarily dated), but a date of 1600 appears more likely.
SHAXICON offers no support for the view of the Oxford editors that PER-Q
represents a Wilkins-Shakespeare collaboration, yet it leaves open such a
possibility insofar as Wilkins could be shown to have tinkered some with
acts I-II. (This could be tested by indexing other texts by Wilkins, and
running lexical cross-checks.) Shakespeare appears to have acted both
Antiochus and (at least when doubling was needed) Simonides, and he may
have performed or read Gower's part from time to time, most notably ca.
1608/9 (cf. notes on H5-F1, another script for which Shakespeare registers
sporadically high recall of the chorus-role, especially ca.
1608/9--perhaps the company was short-handed in that year). With a reduced
cast, he may have simultaneously performed Antiochus, Simonides, Cleon
(without the dumb shows) and Gower, all of which register higher than
expected lexical recall, the last two only intermittently. Because of its
episodic structure, Pericles can be performed with a small cast, and the
SHAXICON data suggest that Pericles was indeed performed, on occasion,
with at least one player performing multiple roles.
- R2: Gaunt (in I.i - I.iii, II.i), the Gardener (III.iv), the Lord
(IV.i), and probably also the Groom (V.v). Troublesome dating: SHAXICON
seems to indicate that R2 derives from an earlier play, and that R2 was
revised immediately after 1H4 (but prior to publication of R2-Q1). This
finding is at odds with all past textual scholarship on the play, which
has been nearly unanimous in viewing R2 as a text begun and completed ca.
- R3: Clarence (in I.i, I.iv, and V.iii) and Scrivener (III.vi).
Possibly also Third Citizen (II.iii) in a late revival.
- ROM: Friar Lawrence in ROM-Q1,-Q2, and -F1 versions, plus Chorus
in ROM-Q2 version (The Chorus-role was evidently omitted altogether in a
late revival of ROM, as is intimated by its lexical distribution in the
SHAXICON files, and confirmed by its omission from ROM-F1).
- SEJ (Jonson): Macro (I.i, II.iii, III.i, IV.ii); probably also
(but less well-marked) Sabinus (I.i, II.iii, III.i, IV.iii), with some
accommodation for a costume change after IV.ii (but Jonson reports in F1
that he has revised Sejanus, which means that this problem at IV.iii.0 may
not actually have come up in the performed text).
- SHR: Lord, and perhaps also Pedant.
- TGV: Antonio and the Duke.
- TIM: Poet in TIM-a (representing ur-F1 version, the parts of
TIM-F1 customarily ascribed to Shakespeare); no role apparent in TIM-b
(widely supposed to represent Middleton or late-Shakespearean revision).
SHAXICON suggests that TIM-F1 is a late, unfinished revision (ca. 1613) of
a play first acted in 1601. TIM-F1 appears not to be a collaborative text
- TIT: probably Aaron or old Lucius, or possibly alternating between
these roles (neither is as strongly marked statistically as most other
roles identified in this catalogue).
- TMP: no Shakespeare role apparent; possibly Shipmaster or Antonio
(insufficient post-TMP sample).
- TNK: no Shakespeare role apparent; possibly Prologue and Doctor
(insufficient post-TNK sample).
- TNT: Antonio (later adding Valentine [I.i]).
- TRO: Uncertain. Possibly, in a brief run of TRO-F1 ca. 1602/3,
the Prologue (perhaps also Agamemnon or Ulysses, or a second bit part; not
yet fully tested). The Q1 version may not have been acted until about
- WIV: In WIV-F1, Shakespeare appears to have played Ford, but only
in two evidently brief runs. In WIV-Q Shakespeare appears almost
certainly to have played the Host. Though one of the "bad" quartos, Q1
Wives appears certainly to antedate the F1 version; but SHAXICON casts
doubt on Shakespeare's authorship of Q1 Wives.
- WT: Shakespeare roles uncertain; probably Archidamus (I.i),
Antigonus (II.i, II.iii, III.iii), and 3rd Gentleman (V.i) in 1610/11,
switching to Polixenes in 1612; but the post-WT lexical sample is too
small to speak with confidence.
WHAT YOU NEED TO USE SHAXICON:
In advance of publication we're drawing on the expertise of people in
various fields so that when it's finally distributed SHAXICON will be
fully intelligible even to those users without expertise in computers,
statistics, and/or textual scholarship. I'm shooting for 1996
publication, but cannot guess what technical problems may arise in the
interim. CD-ROM may be too slow to be practicable, but disk-space may
otherwise be a problem for many users.
- Disk space. In its present form, SHAXICON sucks up 40+ megs just
for the raw data, plus another 20 megs or so for the commentary, help
files, and graphics; plus another 20 megs or so for the software. But
don't start erasing those electronic games just yet in order to make room
for it. The main database for SHAXICON is now complete, purged of errors,
and generally usable; but it's not yet ready for prime time: SHAXICON now
runs on ETC Word-Cruncher, which is limited in its capabilities and
requires way-too-much manual labor (keying in lexical searches, etc.).
We're now using Excel for the summary figures and graphics, which is a big
time-saver--but we're likely to change over, prior to publication, to a
slicker and more fully automated database-management system so that
SHAXICON is more user-friendly in ALL respects. I'm inquiring after
Oracle, 4D, and Fox.
Address queries to Professor Donald Foster, firstname.lastname@example.org.