About me

My research in computational linguistics focuses on the typology of morphological structure.

During my PhD, I studied inflection classes (declensions or conjugations) and their typological variation. I am currently a post-doctoral researcher at the Max Planck Institute for the Science of Human History, in the department of Linguistic and Cultural Evolution in Jena, working on morphophonological evolution with Erich Round.

I see computational tools as an opportunity to systematize linguistic analyses, a way to study large amounts of data precisely, and a necessary methodological step towards typological investigation.

Interests

  • Computational Linguistics
  • Computational approaches to linguistic theory
  • Quantitative typology
  • Word and Paradigm morphology
  • Inflected lexicons

Education

  • PhD in Linguistics, 2018

    Université Paris 7

  • MA in Language Sciences / computational linguistics, 2014

    Université Paris 7

  • BA in Language Sciences / computational linguistics, 2012

    Université Paris 7

  • BA in Modern Literature, 2010

    Université Paris 7

Publications

(in press). One lexeme, many classes: inflection class systems as lattices. One-to-Many Relations in Morphology, Syntax and Semantics. PDF
Descriptions of inflection classes usually take the form of broad or fine-grained (Stump & Finkel 2013) partitions of the set of lexemes, or link both in a hierarchic system of classes (Corbett & Fraser 1993; Dressler & Thornton 1996). Recent efforts to infer those automatically (Brown & Hippisley 2012; Lee & Goldsmith 2013; Bonami 2014) all rely on the assumption that the …
(2020). Automated Parsing of Interlinear Glossed Text from Page Images of Grammatical Descriptions. Proceedings of The 12th Language Resources and Evaluation Conference. PDF
Linguists seek insight from all human languages; however, accessing information from most of the full store of extant global linguistic descriptions is not easy. One of the most common kinds of information that linguists have documented is vernacular sentences, as recorded in descriptive grammars. Typically these sentences are formatted as interlinear glossed text (IGT). Most descriptive grammars, …
(2020). Opening the Romance Verbal Inflection Dataset 2.0: A CLDF lexicon. Proceedings of The 12th Language Resources and Evaluation Conference. PDF
We introduce the Romance Verbal Inflection Dataset 2.0, a multilingual lexicon of Romance inflection covering 73 varieties. The lexicon provides verbal paradigm forms in broad IPA phonemic notation. Both lexemes and paradigm cells are organized to reflect cognacy. Such multi-lingual inflected lexicons annotated for two dimensions of cognacy are necessary to study the evolution of inflectional …
(2018). Classifications flexionnelles: Étude quantitative des structures de paradigmes. Université Sorbonne Paris Cité - Université Paris Diderot (Paris 7), PhD thesis under the supervision of Olivier Bonami. PDF
This dissertation adopts the Word and Paradigm approach and elaborates computational tools to investigate precisely the similarity structure of inflection class systems based on inflectional lexicon. We study Arabic, Yaitepec Chatino, Zenzontepec Chatino, English, French, Navajo and European Portuguese verbs as well as Russian nouns.
(2017). When segmentation helps. Implicative structure and morph boundaries in the Navajo verb. First International Symposium on Morphology (ISMo). PDF Slides
Recent work in Word and Paradigm morphology argues that the implicative structure of paradigms is expressed in terms of relations between surface words, and that studying the structure of paradigms in terms of sub-word units is misleading if not outright impossible (Ackerman et al., 2009; Blevins, 2006, 2016; Bonami & Beniamine, 2016). The argument typically rests on the observation that a word …

Talks

Simulating paradigm Evolution

analogical change and morphomic patterns

Segmentation in morphology: wh-en, wh-ere, how?

invited talk

The segmentation problem in inflectional morphology

Datasets

Inflected lexicon of Russian Nouns in IPA notation

This inflected lexicon of Russian Nouns is based on data generated by a DATR fragment for the nominal system of Russian (Dunstan Brown …

Romance Verbal Inflection Dataset 2.0

The Romance Verbal Inflection Dataset 2.0 is a multilingual lexicon of Romance inflection covering 73 varieties. It provides verbal …

Software

Feature Viz

This script generates natural class lattices for phoneme inventories defined by distinctive features. It is useful to visualize the natural classes implied by distinctive features.
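The idea can be illustrated with a minimal Python sketch. This is not the Feature Viz code itself: the toy inventory, the feature names, and the functions `natural_classes` and `hasse_edges` are all hypothetical, chosen only for illustration. The sketch assumes that a natural class is any non-empty set of phonemes picked out by a conjunction of feature values, and that the lattice orders these classes by set inclusion.

```python
# Toy inventory with three binary features (hypothetical, for illustration only).
INVENTORY = {
    "p": {"voice": "-", "nasal": "-", "labial": "+"},
    "b": {"voice": "+", "nasal": "-", "labial": "+"},
    "m": {"voice": "+", "nasal": "+", "labial": "+"},
    "t": {"voice": "-", "nasal": "-", "labial": "-"},
    "d": {"voice": "+", "nasal": "-", "labial": "-"},
    "n": {"voice": "+", "nasal": "+", "labial": "-"},
}


def natural_classes(inventory):
    """Return every non-empty set of phonemes definable by a conjunction
    of feature values (including the whole inventory)."""
    # Extent of each single feature value, e.g. [+nasal] -> {m, n}.
    pairs = {(f, v) for spec in inventory.values() for f, v in spec.items()}
    basic = {
        frozenset(p for p, spec in inventory.items() if spec.get(f) == v)
        for f, v in pairs
    }
    # Close under intersection, starting from the full inventory.
    classes = {frozenset(inventory)}
    for extent in basic:
        classes |= {c & extent for c in classes}
    classes.discard(frozenset())  # the empty set is left out of this sketch
    return classes


def hasse_edges(classes):
    """Return (smaller, larger) pairs where no class lies strictly between:
    the edges one would draw in a lattice diagram."""
    return [
        (small, big)
        for small in classes
        for big in classes
        if small < big and not any(small < mid < big for mid in classes)
    ]


if __name__ == "__main__":
    classes = natural_classes(INVENTORY)
    for cls in sorted(classes, key=lambda c: (len(c), sorted(c))):
        print("{" + ", ".join(sorted(cls)) + "}")
```

In this toy inventory, {m, n} is a natural class (the nasals), whereas {p, m} is not, since the only feature value the two segments share, [+labial], also selects b.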

IPA Keyboard

A keyboard layout for Onboard Keyboard that makes it easy to type International Phonetic Alphabet symbols in UTF-8 on Linux.

Qumín

Qumín (Quantitative Modelling of Inflection) is a set of scripts written during my PhD to explore the structure of inflection class systems.