Reflections on the SOTU
So I mentioned a few days ago that I didn’t watch George W.’s State of the Union this year. But I was still very interested in it. So I pulled down the text of the speech and had a read. There were a few things in there that annoyed me, as usual, and a few pleasant surprises such as the bit about reducing our oil dependence. But overall, it reminded me of every other State of the Union address. That got me thinking… are they really all the same, or is it just my perception?
Thanks to the Internet and a wide variety of text hacking tools under Cygwin, I knew I’d be able to find out. I pulled down the text of the SOTU speeches from 2002-2006, plus Bush’s first policy address in 2001. After hacking the heck out of them in a case-insensitive way, I was able to compare the frequency of the words in each speech. I wasn’t sure I’d find anything interesting. But here are a few of the results.
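For anyone who wants to try this at home, here is a rough sketch of the counting step in Python rather than the Cygwin tools I actually used. The function names (`word_counts`, `track`) and the sample text are made up for illustration; the sample sentences are not quotes from the speeches.

```python
import re
from collections import Counter


def word_counts(text):
    """Count words case-insensitively; apostrophes stay inside a word."""
    words = re.findall(r"[a-z]+(?:['’][a-z]+)*", text.lower())
    return Counter(words)


def track(speeches, terms):
    """Map each term to its count in every speech, in year order."""
    years = sorted(speeches)
    counts = {year: word_counts(speeches[year]) for year in years}
    return {
        term.lower(): [counts[year][term.lower()] for year in years]
        for term in terms
    }


# Made-up sample text standing in for the real speech files.
sample = {
    2002: "War on terror. The war goes on.",
    2003: "Peace, not WAR.",
}
trend = track(sample, ["war", "peace"])
print(trend)
```

Each list lines up with the sorted years, so it can go straight into a spreadsheet or plotting tool. These are raw counts; since the speeches differ in length, dividing by the total word count of each speech might give a fairer comparison.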
The first thing I noticed was this arc. While the occurrence of “peace” is pretty random, with a generally upward trend, the use of the word “war” has a pronounced hump. It looks like “war” is going out of vogue. Perhaps the administration thinks they should focus less on the war, since the American public seems tired of it.
There’s a clear predominance of “terror” in general over “Al Qaeda” in particular (note that the use of “Osama” or “Laden” does not add much to this either). It’s interesting to see how “terror” was popular for some time before both “Iraq” and “Al Qaeda” gained strength. This mirrors the slow spin-up to the war with Iraq after September 11. And again you can see that “Iraq” is slowly disappearing from the scene.
But “Iraq” is still quite dominant over other countries. As it should be. On this chart you can see the increase in mentions of “Iraq” and “Afghanistan” that coincide with our wars in those countries. The scary bit is the slow increase in the use of “Iran”. Looks like the “Korea” bump didn’t quite rise to the level of war; how much further up does “Iran” have to go?
Lastly, just a comment on Bush’s speech style. What you can see on this chart is a gradual decline in the use of the word “I”, and an increase in the use of “we”, “our”, and “us”. I think this mirrors Bush’s attitude. When he first got into office, he was The Man, and he was going to take care of things. “I” this, “I” that, etc. As he has matured in his presidency his rhetoric has improved, and he increasingly uses these more encompassing terms. This is an improvement for sure; the question is, is it just an improvement in rhetoric, or an improvement in attitude? I hope the latter. But it’s at least nice to see his speeches improving. 🙂
These are just a few thoughts. Nothing earth-shaking, or ground-breaking. Just interesting. Perhaps you would interpret the data differently than I. What do you think?
I think for the most part it looks like the talk of war is coming to a close. That doesn’t necessarily mean the end of the war, but at least it shows a reaction to the public’s disapproval of his war efforts.
All in all I hope that his actions inspire future Presidents to make changes for the better.
Neato. For a sanity check, do the same analysis with vowel usage. I wonder if the letter “e” is a good indicator of the direction of federal deficit.
This is really great stuff. I’m diggin’ it.
I think your analysis is about accurate. What I would find interesting (but painful if not automated) is to do this for all words, exclude the non-meaningful ones (and, is, etc.), plot them out, and find which word occurrences show statistically significant changes (like if the president mentioned Cookie Monster and never had before), then start extrapolating from that. You could probably find a ton of cool patterns.
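The automated version of that idea might start out something like the sketch below. It is a much simpler stand-in for a real significance test: it just drops a tiny, illustrative stop-word list and flags words in the latest speech that never showed up in the earlier ones. The names (`STOP_WORDS`, `content_counts`, `new_words`) and the sample sentences are hypothetical.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list; a real one would be much longer.
STOP_WORDS = {"a", "an", "and", "the", "is", "of", "to", "in", "we", "i"}


def content_counts(text):
    """Count words case-insensitively, skipping stop words."""
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(w for w in words if w not in STOP_WORDS)


def new_words(earlier_texts, latest_text):
    """Return words in latest_text that appear in none of the earlier texts."""
    seen = set()
    for text in earlier_texts:
        seen.update(content_counts(text))
    latest = content_counts(latest_text)
    return sorted(w for w in latest if w not in seen)


# Made-up sentences, not taken from any actual speech.
earlier = ["The economy is strong.", "We will defend the economy."]
latest = "Cookie Monster is in the economy."
print(new_words(earlier, latest))
```

Deciding which changes are actually statistically significant would take more than this, since a word that appears once may just be noise.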