Methodological Cautions for Ngrams

The Google Ngrams results I’ve posted definitely show something, but they need to be interpreted with caution, particularly when imputing cause rather than mere association.

1. The frequency of the use of a word says nothing about the nature of the usage, or whether the thing mentioned is being discussed in a positive or negative light.

2. Sometimes the use of a word will decline because it’s now taken for granted. For instance, “feminism” appears less often in print today because feminism has gone from being a revolutionary ideology to a status quo institutional power group. If the use of a word declines to its former levels after a 40-year spike, this does not imply that all the effects of that spike have been undone, but it does perhaps indicate that the time is ripe to undo them.

3. Ngrams appears to yield substantially different results for different corpora. “English,” which is the default, appears to be the most reliable – I’ve zeroed out “American English” and “British English” with search terms that “English” yields a clear result for. The general tendency is for “English,” “English Million,” and “English Fiction” to give roughly the same qualitative results, while “American English” and “British English” sometimes show different trends. It seems unlikely that this is the result of English-language books that are neither American nor British: given the predominance of American and British English, the shift in international English needed to move the overall result that far seems too large to be plausible. The discrepancy is unsettling, though, and I’d like a better explanation. The basic findings on the decline of feminism are robust with respect to corpus, but some of the others are not.

4. What’s done that doesn’t have to be written about is just as important as what is written about. Indeed, exercising power without being written about is a very powerful position. That said, we shouldn’t discount the long-term power of ideas and written discourse to both reflect and shape the direction of society.
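The corpus check in point 3 can be made a little more concrete. What follows is only a minimal sketch, not the procedure used for the posted results: it assumes the yearly frequencies for each corpus have already been copied out of the Ngram Viewer by hand, and it reduces "qualitative result" to the sign of a least-squares slope, which is a crude stand-in for eyeballing the curve. The function names and the numbers in the usage example are hypothetical placeholders, not real Ngrams data.

```python
from statistics import mean


def trend_slope(years, freqs):
    """Ordinary least-squares slope of frequency against year."""
    y_bar, f_bar = mean(years), mean(freqs)
    num = sum((y - y_bar) * (f - f_bar) for y, f in zip(years, freqs))
    den = sum((y - y_bar) ** 2 for y in years)
    return num / den


def trend_direction(years, freqs, tolerance=0.0):
    """Classify a series as 'rising', 'falling', or 'flat' by its slope."""
    slope = trend_slope(years, freqs)
    if slope > tolerance:
        return "rising"
    if slope < -tolerance:
        return "falling"
    return "flat"


def robust_across_corpora(years, series_by_corpus):
    """Return (agree, directions): agree is True only if every corpus
    shows the same trend direction over the given years."""
    directions = {
        name: trend_direction(years, freqs)
        for name, freqs in series_by_corpus.items()
    }
    return len(set(directions.values())) == 1, directions


# Placeholder numbers for illustration only (NOT real Ngrams output).
years = [1980, 1990, 2000, 2008]
series = {
    "English": [4.0, 5.0, 3.5, 2.5],
    "American English": [4.2, 5.1, 3.6, 2.4],
    "British English": [2.0, 2.5, 3.0, 3.4],
}
agree, directions = robust_across_corpora(years, series)
print(agree, directions)
```

With these placeholder series the check reports disagreement, because the "British English" numbers rise while the other two fall. A finding would count as robust in this narrow sense only when every corpus gives the same direction.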

In summary, all these signs I’ve pointed out from Ngrams are not triumphs. They are, however, opportunities.


About Pechorin

A Hero of Our Time