Adina Williams (Facebook AI) presents:
Measuring the predictability of nominal classification elements
Since Shannon originally proposed his mathematical theory of communication in the middle of the 20th century, information theory has been an influential scientific perspective at the interface between linguistics, cognitive science, and computer science. In this talk, I provide two examples of how information theory, powered by state-of-the-art NLP systems trained on large-scale multilingual corpora, can help us uncover new facts about linguistic typology. More specifically, I measure how well classification elements (East Asian classifiers and grammatical gender) can be predicted from other words. While typological descriptions often point to clear edge cases where the choice of classification element is nearly completely determined (for example, only one classifier can be used to modify the noun "horse", but several can be used to modify the noun "man"), it is unknown whether such predictability persists throughout an entire language. To address this, I measure mutual information between (i) classifiers and nouns in Mandarin Chinese, and (ii) noun gender and adjectives or verbs in six languages. For all comparisons, mutual information was statistically significant, raising the tantalizing possibility that systematic, lexical semantic properties might underlie classification element predictability. These results invite comparison with other grammatical phenomena long argued to be idiosyncratic, such as declension and conjugation class. More broadly, these studies can be viewed as an initial step towards a general methodological program that measures information-theoretic quantities over large, ecologically natural, written corpora to shed light on cognitive scientific questions about language structure and use.
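To make the core quantity concrete, here is a minimal sketch of a plug-in mutual information estimate between classifiers and the nouns they modify, computed from co-occurrence counts. The (classifier, noun) pairs and counts below are hypothetical toy data, not the talk's actual corpus or estimator; the real studies use far larger corpora and NLP pipelines to extract such pairs.

```python
# Plug-in estimate of mutual information I(C; N) between classifiers C
# and nouns N, from a toy list of (classifier, noun) co-occurrence pairs.
import math
from collections import Counter

# Hypothetical pairs echoing the abstract's example: "horse" takes a
# single classifier, while "man" can take several.
pairs = [
    ("pi", "horse"), ("pi", "horse"), ("pi", "horse"),
    ("ge", "man"), ("wei", "man"), ("ming", "man"),
]

def mutual_information(pairs):
    """Plug-in MI in bits: sum over (c, n) of p(c,n) * log2(p(c,n) / (p(c)p(n)))."""
    total = len(pairs)
    joint = Counter(pairs)                  # counts of (classifier, noun)
    cls = Counter(c for c, _ in pairs)      # marginal classifier counts
    noun = Counter(n for _, n in pairs)     # marginal noun counts
    mi = 0.0
    for (c, n), k in joint.items():
        p_cn = k / total
        mi += p_cn * math.log2(p_cn / ((cls[c] / total) * (noun[n] / total)))
    return mi

print(round(mutual_information(pairs), 3))
```

In practice one would also need to correct for the upward bias of the plug-in estimator on finite samples (e.g. via permutation baselines) before calling a nonzero value significant.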