Contingency tables can answer the question "How many times does each value occur in my data frame?" For example, we can check, in the inaugural speeches dataset, how often we find each president's name:

> xtabs(~president, data = inaugural)

Or we can ask, again about the inaugural speeches dataset: How many inaugural speeches were there that mentioned "America" never, just once, twice, and so on?

> xtabs(~America, data = inaugural)
America
 0  1  2  3  4  5  6  7  8 10 11 15 19 20 21
27  6  4  2  1  2  3  3  1  1  1  2  1  1  1

Take a moment to think about what this xtabs() output shows. In this case, it is a count of counts: How often does each count of the word "America" appear? The top row lists how many times "America" occurs in a speech, and the bottom row lists how many speeches have that count. For example, 27 speeches never mention "America".

We can send the output of xtabs() directly to barplot() for graphical inspection.

> barplot(xtabs(~America, data = inaugural))

Cross-analyzing two columns

xtabs() becomes especially useful when we cross-analyze two different columns. This is better demonstrated with the dataset "verbs" from the languageR package. (You will need to install this package before you can use it. In the "Packages and Data" menu of R, choose the Package Installer. When you install a package, don't forget to tick "also install dependencies".)

The "verbs" dataset describes corpus occurrences of the dative alternation, that is, occurrences of either the pattern "John gave Mary the book" or "John gave the book to Mary". It encodes a number of different characteristics (features) of each occurrence, so that one can check what circumstances coincide with (and maybe lead to) which form of the alternation. Here is how to check how often the receiver ("Mary" in the example above) was realized as an NP versus a PP, and how often it was animate versus inanimate:

> library(languageR)
> xtabs(~RealizationOfRec + AnimacyOfRec, data = verbs)
                AnimacyOfRec
RealizationOfRec animate inanimate
              NP     521        34
              PP     301        47

When you send this table to barplot(), it draws one bar for each value of AnimacyOfRec ("animate" and "inanimate"), with the NP and PP counts stacked as two differently colored segments:
> barplot(xtabs(~RealizationOfRec + AnimacyOfRec, data = verbs))

If you would like to compare the NP and PP counts for each value of AnimacyOfRec, it helps to ask barplot() to draw the differently colored bars next to one another rather than on top of one another:

> barplot(xtabs(~RealizationOfRec + AnimacyOfRec, data = verbs), beside = TRUE)
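
If you want to try these steps without installing languageR first, the following is a minimal sketch on a small made-up data frame. The data frame toy and its columns realization and animacy are invented for illustration and are not part of the "verbs" dataset. The argument legend.text = TRUE is an optional addition that labels the bar colors with the row names of the table.

```r
# Hypothetical toy data, invented for illustration (not taken from languageR)
toy <- data.frame(
  realization = c("NP", "NP", "PP", "NP", "PP", "PP", "NP"),
  animacy     = c("animate", "animate", "animate", "inanimate",
                  "inanimate", "animate", "animate")
)

# One-way table: how often does each value of 'realization' occur?
one_way <- xtabs(~realization, data = toy)
print(one_way)

# Two-way table: rows are 'realization', columns are 'animacy'
two_way <- xtabs(~realization + animacy, data = toy)
print(two_way)

# Side-by-side bars, one group per animacy value;
# the legend shows which color stands for NP and which for PP
barplot(two_way, beside = TRUE, legend.text = TRUE)
```

Leaving out beside = TRUE gives the stacked version of the same plot, so the two calls can be compared directly on the toy data.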