ICT Call 2010, ICT10-049

Machine Learning Techniques for Modeling of Language Varieties


Principal Investigator:
Harald Trost
Status:
Completed (01.01.2011 – 28.02.2014), 38 months
Funding amount:
€ 529,000

 
Abstract:

Language varieties are gaining importance in man-machine interaction. Using them in speech-based communication enables computer systems to reflect the socio-cultural identity of their users. Current language technology cannot yet deliver on this. A few synthetic voices with localized pronunciation exist, but language varieties are multi-faceted and deviate from the standard language on several levels.
We will develop algorithms capable of capturing and reproducing all major idiosyncrasies displayed by a language variety, be they syntactic, lexical or phonological. The task can be viewed as machine translation with some unique properties: the difficulty posed by the scarcity of available data is counterbalanced by the relative proximity between the varieties and the standard language. Our approach will therefore rely on optimal selection of data and smart use of linguistic knowledge. Standard German and Viennese varieties serve as a test bed for implementing and exploring our techniques.

 
