My Semantic Programmed link to help out the students
Initial Trend Identification (Google N-grams):
Start with Google N-grams to quickly identify broad, long-term frequency trends for words or phrases. For example, you might notice a sharp increase in the use of "metaverse" or a decline in "phonograph."
This helps you pinpoint interesting periods or terms for further investigation.
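A minimal sketch of the idea behind a frequency trend, assuming you have exported raw yearly counts and yearly corpus sizes by hand. The numbers below are invented for illustration and are not real Google N-grams data; `per_million` and `largest_jump` are hypothetical helper names.

```python
def per_million(counts, totals):
    """Convert raw counts per period into occurrences per million tokens."""
    return {year: counts[year] / totals[year] * 1_000_000 for year in counts}


def largest_jump(freqs):
    """Return the (earlier, later) pair of adjacent periods with the biggest rise."""
    years = sorted(freqs)
    pairs = zip(years, years[1:])
    return max(pairs, key=lambda p: freqs[p[1]] - freqs[p[0]])


# Invented toy numbers, NOT real corpus data.
raw_counts = {1970: 2, 1980: 10, 1990: 240, 2000: 600}
total_tokens = {1970: 20_000_000, 1980: 25_000_000,
                1990: 30_000_000, 2000: 40_000_000}

freqs = per_million(raw_counts, total_tokens)
for year in sorted(freqs):
    print(year, round(freqs[year], 2))
print("Largest rise between:", largest_jump(freqs))
```

Normalizing by corpus size matters because later periods usually contain far more text, so raw counts alone would exaggerate any rise.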
Contextual and Detailed Analysis (COCA/COHA):
Once a trend is identified in Google N-grams, switch to COCA and COHA to understand why the trend occurred and how the language changed.
Semantic Shift: If "gay" increased in frequency, use COCA/COHA to see whether its primary meaning shifted from "merry" to "homosexual", and in which period.
Collocations and Usage Patterns: Examine the words that frequently appear with your target term. Has "virtual" started to collocate more with "reality" over time?
Grammatical Patterns: How is the word being used grammatically? Is "impact" now used as a verb more often than as a noun?
Genre and Register: Analyze the distribution of the term across different genres (spoken, academic, fiction, etc.) in COCA/COHA. A word might be rising in overall frequency but only in specific contexts.
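Homograph Resolution: Use the part-of-speech (POS) tagging in COCA/COHA to separate homographs that differ in word class (e.g., "bank" as a noun vs. a verb). Senses that share a part of speech (a river "bank" vs. a financial "bank") are not distinguished by the tags, so read the concordance lines to tell them apart.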
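The checks above (collocations, keyword-in-context lines, and part-of-speech proportions) can be sketched on a tiny hand-made sample. This is only an illustration of what the corpus interfaces compute for you: the sentence, the tags, and the helper names `collocates`, `kwic`, and `pos_share` are all assumptions, not part of COCA or COHA.

```python
from collections import Counter


def collocates(tokens, target, window=2):
    """Count words that occur within `window` tokens of each `target` hit."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == target:
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            counts.update(t for j, t in enumerate(tokens[lo:hi], start=lo)
                          if j != i)
    return counts


def kwic(tokens, target, width=3):
    """Build keyword-in-context lines: left context, [keyword], right context."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == target:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append(f"{left} [{tok}] {right}")
    return lines


def pos_share(tagged, word):
    """Proportion of each POS tag among the occurrences of `word`."""
    tags = Counter(tag for w, tag in tagged if w == word)
    total = sum(tags.values())
    return {tag: n / total for tag, n in tags.items()} if total else {}


# Invented sample text and tags, for illustration only.
tokens = ("the virtual reality headset made virtual meetings "
          "feel like a shared virtual reality").split()
tagged = [("impact", "NOUN"), ("impact", "VERB"),
          ("impact", "NOUN"), ("bank", "NOUN")]

print(collocates(tokens, "virtual", window=1).most_common(3))
print("\n".join(kwic(tokens, "virtual")))
print(pos_share(tagged, "impact"))
```

Running the same counts separately on each decade of a corpus would show whether a collocate or a grammatical use is gaining ground over time.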
Validation and Triangulation:
Use the academic corpora to validate findings from Google N-grams. A trend observed in Google N-grams might be noisy or an artifact of the corpus (for example, OCR errors or shifts in the mix of books scanned for each year). Confirming it in COCA/COHA, with their higher data quality and contextual information, strengthens your findings.
Conversely, a pattern observed in COCA/COHA might be too specific to a particular genre or time slice. Using Google N-grams can help determine if it's a broader, more general trend.
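One simple way to triangulate is to line up the normalized frequencies from both sources over the periods they share and check whether they move together. The sketch below uses a plain Pearson correlation as a rough agreement check; the figures are invented, `shared_series` and `pearson` are hypothetical helpers, and with so few data points the result is only suggestive.

```python
from math import sqrt


def shared_series(a, b):
    """Align two {period: frequency} dicts on the periods present in both."""
    periods = sorted(set(a) & set(b))
    return periods, [a[p] for p in periods], [b[p] for p in periods]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two shared periods")
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        raise ValueError("a constant series has no defined correlation")
    return cov / (sx * sy)


# Invented per-million frequencies, NOT real data from either source.
ngram_freq = {1980: 0.4, 1990: 8.0, 2000: 15.0, 2010: 18.0}
corpus_freq = {1990: 5.0, 2000: 11.0, 2010: 12.5, 2020: 13.0}

periods, xs, ys = shared_series(ngram_freq, corpus_freq)
print(periods)
print(round(pearson(xs, ys), 3))
```

A high value only says the two curves have a similar shape; it does not rule out that both sources share the same bias.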
Example Research Flow:
Hypothesis: The term "sustainable development" has become more common in academic discourse since the late 20th century.
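Google N-grams: Search for "sustainable development" to see its overall frequency trend over time. You'll likely see a sharp rise from the 1980s onwards. This supports the general trend, but because the results are not broken down by genre, it cannot show whether the rise is specific to academic writing.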
COCA/COHA:
Search for "sustainable development" in COHA (for the historical record) and COCA (for contemporary use, from 1990 onwards).
Filter by genre (e.g., "academic" in COCA or "non-fiction" in COHA) to check whether the increase is specific to academic discourse or a general societal trend reflected across all genres.
Examine collocations and keyword-in-context (KWIC) concordance lines to see how the term's meaning has evolved or what other concepts it's frequently associated with (e.g., "environmental," "economic," "social").
Analyze grammatical patterns to see if it's primarily used as a noun phrase or if new adjectival/verbal uses have emerged.
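The genre step of this flow can be sketched as a comparison of growth rates. The example below assumes you have copied hit counts and sub-corpus sizes per decade and genre out of the corpus interface; all numbers are invented, and `growth` is a hypothetical helper rather than a corpus feature.

```python
def growth(hits, sizes, genre, start, end):
    """Ratio of per-million frequency in `end` to that in `start` for one genre."""
    def per_million(period):
        return hits[(period, genre)] / sizes[(period, genre)] * 1_000_000
    return per_million(end) / per_million(start)


# Invented counts for "sustainable development", NOT real COCA/COHA figures.
hits = {("1990s", "academic"): 300, ("2010s", "academic"): 900,
        ("1990s", "news"): 40, ("2010s", "news"): 100}
sizes = {cell: 20_000_000 for cell in hits}  # tokens per sub-corpus (assumed)

for genre in ("academic", "news"):
    print(genre, round(growth(hits, sizes, genre, "1990s", "2010s"), 2))
```

If the academic ratio is clearly larger than the ratios for other genres, that is consistent with the hypothesis; similar ratios everywhere would point to a general societal trend instead.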
By combining the "big data" overview of Google N-grams with the "deep data" contextual richness of COCA and COHA, researchers can conduct more robust, nuanced, and reliable studies of language change and usage.