Disentangling effects of digitization on linguistic diversity
About 7000 languages are spoken around the globe, representing remarkable linguistic and cultural diversity. However, research has shown that linguistic diversity has declined over the past few decades and continues to vanish rapidly, a trend that cannot be attributed to a single factor. Large-scale studies on language endangerment and linguistic diversity have already accounted for environmental and socio-economic effects. Our project adds to this research by considering the effect of digitization on linguistic diversity.
Our approach unfolds in four work packages. In the first, we study the effects of measures associated with digitization on linguistic diversity in the non-digital sphere. We correlate country-level estimates of digitization, linguistic diversity, and other covariates, crucially also considering the diachronic dimension. In the second work package, we focus on linguistic diversity in the digital sphere on a global scale. We derive country-level estimates of linguistic diversity from social-media data and a diachronically layered web corpus. In the third work package, we zoom in on a single region, Québec (CA). Using a mixed-methods approach, we investigate the relationship between digitization and linguistic diversity in that region by drawing on regional (digital) data and results from a qualitative survey on language choice and attitudes. Finally, in the fourth work package, we bring together and visualize our findings.
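To make the country-level correlation in the first work package more concrete, the following minimal sketch computes a diversity score per country and correlates it with a digitization measure. It rests on assumptions that are not part of the project description: Greenberg's diversity index (one minus the sum of squared speaker shares) stands in for whatever diversity estimate the project uses, the share of internet users stands in for the digitization measure, and the three countries and their numbers are invented toy data. Other covariates and the diachronic dimension are left out.

```python
from math import sqrt


def greenberg_index(speaker_counts):
    """Greenberg's diversity index: 1 minus the sum of squared speaker shares.

    Assumption: used here only as an illustrative diversity measure.
    """
    total = sum(speaker_counts)
    if total <= 0:
        raise ValueError("speaker counts must sum to a positive number")
    return 1.0 - sum((n / total) ** 2 for n in speaker_counts)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical toy data: speakers per language and share of internet users.
countries = {
    "A": {"speakers": [900, 100], "internet_share": 0.9},
    "B": {"speakers": [500, 300, 200], "internet_share": 0.5},
    "C": {"speakers": [250, 250, 250, 250], "internet_share": 0.2},
}

diversity = [greenberg_index(c["speakers"]) for c in countries.values()]
digitization = [c["internet_share"] for c in countries.values()]

# Correlation between the digitization proxy and the diversity index.
r = pearson(digitization, diversity)
print(f"diversity: {diversity}, r = {r:.3f}")
```

With real data, a rank-based correlation or a regression model that includes the other covariates would likely be more appropriate than a plain Pearson coefficient.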