Eesthetic: Estonian Paradigms in Phonemic Notation

This is a human-readable rendition of a JSON file defining a frictionless package. It was generated automatically.

  • name: eesthetic
  • licenses:
  • keywords: estonian, paradigms, lexicon, morphology, paralex
  • profile data-package
  • contributors
  • [1]
    • title Sacha Beniamine
    • role author
  • [2]
    • title Matthew Baerman
    • role contributor
  • [3]
    • title Mari Aigro
    • role contributor
  • [4]
    • title Maria Copot
    • role contributor
  • [5]
    • title Jules Bouton
    • role contributor
  • version 1.0.5
  • languages_iso639 ['ekk']
  • basepath /home/jules/Documents/Nextcloud_LLF/Documents/Thèse/Informatique/Lexiques/Estonian

This package describes the following tables:

cells

Paradigm cells - This table is located in estonian_cells.csv.

  • The identifier column (or primaryKey) is ['cell_id']

Columns defined by cells-schema:

  • cell_id (string): Cell identifier. The set of feature values as would appear in a gloss, separated by dots, e.g. prs.ind.1sg or f.pl

    • constraints: a cell_id is obligatory; it must be unique; it must match the regular expression (impers|iness|trans|abess|part|elat|term|ipfv|quot|cond|pers|ptcp|nom|gen|ill|all|abl|ess|com|ind|imp|1sg|2sg|3sg|1pl|2pl|3pl|neg|prs|pst|sup|inf|ger|ad|sg|pl)(\.(impers|iness|trans|abess|part|elat|term|ipfv|quot|cond|pers|ptcp|nom|gen|ill|all|abl|ess|com|ind|imp|1sg|2sg|3sg|1pl|2pl|3pl|neg|prs|pst|sup|inf|ger|ad|sg|pl))*.
  • ekilex_cell (any)

  • vabamorf_cell (any)

  • unimorph (string): Cell in UniMorph format. The cell, written following the UniMorph schema.

  • POS (string): Part of Speech. The relevant part of speech for this item. This must refer to a PartOfSpeech entity from the lexinfo (https://lexinfo.net/) ontology.

    • constraints: a POS must be one of the values: verb, numeral, conjunction, noun, adposition, determiner, article, adverb, pronoun, fusedPreposition, adjective, symbol, particle, conditionalParticle, demonstrativePronoun, interjection, semiColon, diminutiveNoun, possessivePronoun, prepositionalAdverb, compoundPreposition, interrogativeRelativePronoun, possessiveParticle, plainVerb, letter, interrogativeDeterminer, relativePronoun, postposition, fusedPronounAuxiliary, interrogativeOrdinalNumeral, indefiniteOrdinalNumeral, strongPersonalPronoun, possessiveRelativePronoun, ordinalAdjective, collectivePronoun, commonNoun, infinitiveParticle, comparativeParticle, partitiveArticle, invertedComma, lightVerb, emphaticPronoun, distinctiveParticle, genericNumeral, possessiveAdjective, reflexivePossessivePronoun, colon, coordinationParticle, presentParticipleAdjective, fusedPrepositionPronoun, cardinalNumeral, indefiniteDeterminer, numeralFraction, questionMark, generalAdverb, superlativeParticle, point, indefiniteMultiplicativeNumeral, comma, closeParenthesis, futureParticle, personalPronoun, reflexivePersonalPronoun, adverbialPronoun, reciprocalPronoun, openParenthesis, pastParticipleAdjective, negativePronoun, relativeDeterminer, existentialPronoun, pronominalAdverb, relativeParticle, exclamativeDeterminer, multiplicativeNumeral, reflexiveDeterminer, modal, unclassifiedParticle, properNoun, allusivePronoun, interrogativeCardinalNumeral, bullet, subordinatingConjunction, irreflexivePersonalPronoun, possessiveDeterminer, negativeParticle, indefinitePronoun, generalizationWord, coordinatingConjunction, deficientVerb, adjective-i, impersonalPronoun, indefiniteCardinalNumeral, adjective-na, qualifierAdjective, affirmativeParticle, mainVerb, fusedPrepositionDeterminer, indefiniteArticle, weakPersonalPronoun, suspensionPoints, interrogativeMultiplicativeNumeral, affixedPersonalPronoun, auxiliary, circumposition, copula, demonstrativeDeterminer, participleAdjective, 
      exclamativePoint, interrogativePronoun, presentativePronoun, punctuation, definiteArticle, slash, exclamativePronoun, preposition, conditionalPronoun, relationNoun, interrogativeParticle.

    • rdfProperty: https://www.paralex-standard.org/paralex_ontology.xml#POS

  • comment (string): Comment. Human-readable comment.
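
The cell_id constraint above can be checked outside of a validation tool. The following is a minimal sketch, assuming the pattern has to match the whole value (hence `fullmatch`); the helper name `is_valid_cell_id` is hypothetical and not part of the package.

```python
import re

# Feature values copied from the cell_id constraint above.
VALUES = (
    "impers|iness|trans|abess|part|elat|term|ipfv|quot|cond|pers|ptcp|"
    "nom|gen|ill|all|abl|ess|com|ind|imp|1sg|2sg|3sg|1pl|2pl|3pl|"
    "neg|prs|pst|sup|inf|ger|ad|sg|pl"
)
CELL_ID = re.compile(rf"({VALUES})(\.({VALUES}))*")


def is_valid_cell_id(cell_id: str) -> bool:
    """Return True if the whole string is a dot-separated list of known values."""
    return CELL_ID.fullmatch(cell_id) is not None


print(is_valid_cell_id("nom.sg"))       # True
print(is_valid_cell_id("prs.ind.1sg"))  # True
print(is_valid_cell_id("f.pl"))         # False: "f" is not a value in this package
```

Note that f.pl, which the generic column description gives as an example, is rejected here because this package does not define a gender value.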

lexemes

Lexemes - This table is located in estonian_lexemes.csv.

  • The identifier column (or primaryKey) is ['lexeme_id']
  • foreignKey: ``

Columns defined by lexemes-schema:

  • lexeme_id (string): Identifier for the lexeme. Lexeme identifiers. Often, they are identical to the label (lemma). However, they must be unique to paradigms, distinguishing homonyms with different inflection. For example, the animal mouse/mice and the computer peripheral mouse/mouses would both have the label 'mouse' but could be identified by the lexeme identifiers mouse_1 and mouse_2.

    • constraints: a lexeme_id is obligatory; it must be unique.
  • lemma (any)

  • wordId (any)

  • paradigmId (any)

  • POS (string): Part of Speech. The relevant part of speech for this item. This must refer to a PartOfSpeech entity from the lexinfo (https://lexinfo.net/) ontology.

    • constraints: a POS must be one of the values: verb, numeral, conjunction, noun, adposition, determiner, article, adverb, pronoun, fusedPreposition, adjective, symbol, particle, conditionalParticle, demonstrativePronoun, interjection, semiColon, diminutiveNoun, possessivePronoun, prepositionalAdverb, compoundPreposition, interrogativeRelativePronoun, possessiveParticle, plainVerb, letter, interrogativeDeterminer, relativePronoun, postposition, fusedPronounAuxiliary, interrogativeOrdinalNumeral, indefiniteOrdinalNumeral, strongPersonalPronoun, possessiveRelativePronoun, ordinalAdjective, collectivePronoun, commonNoun, infinitiveParticle, comparativeParticle, partitiveArticle, invertedComma, lightVerb, emphaticPronoun, distinctiveParticle, genericNumeral, possessiveAdjective, reflexivePossessivePronoun, colon, coordinationParticle, presentParticipleAdjective, fusedPrepositionPronoun, cardinalNumeral, indefiniteDeterminer, numeralFraction, questionMark, generalAdverb, superlativeParticle, point, indefiniteMultiplicativeNumeral, comma, closeParenthesis, futureParticle, personalPronoun, reflexivePersonalPronoun, adverbialPronoun, reciprocalPronoun, openParenthesis, pastParticipleAdjective, negativePronoun, relativeDeterminer, existentialPronoun, pronominalAdverb, relativeParticle, exclamativeDeterminer, multiplicativeNumeral, reflexiveDeterminer, modal, unclassifiedParticle, properNoun, allusivePronoun, interrogativeCardinalNumeral, bullet, subordinatingConjunction, irreflexivePersonalPronoun, possessiveDeterminer, negativeParticle, indefinitePronoun, generalizationWord, coordinatingConjunction, deficientVerb, adjective-i, impersonalPronoun, indefiniteCardinalNumeral, adjective-na, qualifierAdjective, affirmativeParticle, mainVerb, fusedPrepositionDeterminer, indefiniteArticle, weakPersonalPronoun, suspensionPoints, interrogativeMultiplicativeNumeral, affixedPersonalPronoun, auxiliary, circumposition, copula, demonstrativeDeterminer, participleAdjective, 
      exclamativePoint, interrogativePronoun, presentativePronoun, punctuation, definiteArticle, slash, exclamativePronoun, preposition, conditionalPronoun, relationNoun, interrogativeParticle.

    • rdfProperty: https://www.paralex-standard.org/paralex_ontology.xml#POS

  • inflection_class (string): Inflection class identifier. This identifier groups together lexemes of the same inflection class.

  • homonymNr (any)
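
The uniqueness constraint on lexeme_id can be illustrated with the mouse example from the description above. This is a sketch only: the rows are hypothetical and the helper `duplicated_ids` is not part of the package.

```python
from collections import Counter


def duplicated_ids(rows, key="lexeme_id"):
    """Return the values of `key` that occur more than once, in first-seen order."""
    counts = Counter(row[key] for row in rows)
    return [value for value, n in counts.items() if n > 1]


# Hypothetical rows: two homonyms sharing a lemma but not an identifier.
rows = [
    {"lexeme_id": "mouse_1", "lemma": "mouse"},
    {"lexeme_id": "mouse_2", "lemma": "mouse"},
]

print(duplicated_ids(rows))           # []
print(duplicated_ids(rows, "lemma"))  # ['mouse']
```

The lemma column may repeat, which is why it cannot serve as the primary key.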

tags

Tags mark rows which have commonalities - This table is located in estonian_tags.csv.

  • The identifier column (or primaryKey) is ['tag_id']

Columns defined by tags-schema:

  • tag_id (string): Tag id. The label for a set of forms which have something in common.

    • constraints: a tag_id is obligatory; it must be unique.
  • tag_column_name (string): Name of the tag column in the forms table. The name of the column in the forms table where this tag is used.

    • constraints: a tag_column_name is obligatory; it must match the regular expression [^ ]+_tag.
  • comment (string): Comment. Human-readable comment.
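
As an illustration of the tag_column_name constraint, the sketch below tests a few column names against the pattern `[^ ]+_tag`, assuming the pattern must match the whole value. The helper name `is_tag_column` is hypothetical.

```python
import re

# Pattern copied from the tag_column_name constraint above.
TAG_COLUMN = re.compile(r"[^ ]+_tag")


def is_tag_column(name: str) -> bool:
    """Return True if `name` is a space-free prefix followed by "_tag"."""
    return TAG_COLUMN.fullmatch(name) is not None


for name in ["overabundance_tag", "defectiveness_tag", "epistemic_tag"]:
    print(name, is_tag_column(name))       # all True

print(is_tag_column("overabundance tag"))  # False: contains a space
print(is_tag_column("_tag"))               # False: empty prefix
```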

sounds

Sound inventory with distinctive features - This table is located in estonian_sounds.csv.

  • The identifier column (or primaryKey) is ['sound_id']
  • missingValues: ``

Columns defined by sounds-schema:

  • sound_id (string): sound representation. These identifiers are specific to sounds.

    • constraints: a sound_id is obligatory; it must be unique.
  • CLTS_id (string): Identifier of this sound in CLTS. Reference to this sound in CLTS data.

  • label (string): label for this row. A human readable label for the row.

  • syllabic (any)

  • stress (any)

  • long (any)

  • extra-long (any)

  • consonantal (any)

  • sonorant (any)

  • continuant (any)

  • delayed release (any)

  • approximant (any)

  • tap (any)

  • trill (any)

  • nasal (any)

  • voice (any)

  • spread gl (any)

  • constr gl (any)

  • LABIAL (any)

  • round (any)

  • labiodental (any)

  • CORONAL (any)

  • anterior (any)

  • distributed (any)

  • strident (any)

  • lateral (any)

  • DORSAL (any)

  • high (any)

  • low (any)

  • front (any)

  • back (any)

  • tense (any)

  • diphthong (any)

  • diph_front (any)

  • diph_high (any)

  • diph_low (any)

  • C_high (any)

  • C_low (any)

  • C_front (any)

  • C_back (any)

  • palatalised (any)

forms

Inflected forms - This table is located in estonian_paradigms.csv.

  • The identifier column (or primaryKey) is ['form_id']

  • Formal relations (foreignKeys) with other tables:

    Each value in column Must refer to
    ['cell'] ['cell_id'] in table cells
    ['lexeme'] ['lexeme_id'] in table lexemes
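
The two relations above amount to a lookup of each forms row in the cells and lexemes tables. A minimal sketch, using hypothetical in-memory rows and a hypothetical helper `dangling_references` rather than the actual CSV files:

```python
def dangling_references(forms, cells, lexemes):
    """List (form_id, column, value) triples whose value has no target row."""
    cell_ids = {row["cell_id"] for row in cells}
    lexeme_ids = {row["lexeme_id"] for row in lexemes}
    problems = []
    for row in forms:
        if row["cell"] not in cell_ids:
            problems.append((row["form_id"], "cell", row["cell"]))
        if row["lexeme"] not in lexeme_ids:
            problems.append((row["form_id"], "lexeme", row["lexeme"]))
    return problems


# Hypothetical rows for illustration only.
cells = [{"cell_id": "nom.sg"}]
lexemes = [{"lexeme_id": "lex_1"}]
forms = [
    {"form_id": "form_1", "lexeme": "lex_1", "cell": "nom.sg"},
    {"form_id": "form_2", "lexeme": "lex_9", "cell": "nom.sg"},
    {"form_id": "form_3", "lexeme": "lex_1", "cell": "f.pl"},
]

print(dangling_references(forms, cells, lexemes))
# [('form_2', 'lexeme', 'lex_9'), ('form_3', 'cell', 'f.pl')]
```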

Columns defined by forms-schema:

  • form_id (string): Form table row identifiers. These identifiers are specific to form, lexeme, cell triples.

    • constraints: a form_id is obligatory; it must be unique.
  • lexeme (string): Reference to a lexeme identifier. Lexeme identifiers must be unique to paradigms.

  • cell (string): Reference to a cell identifier. The set of feature values as would appear in a gloss, separated by dots, e.g. prs.ind.1sg or f.pl

  • phon_form (string): Inflected form (phonemic or phonetic). The form, given in phonemic or phonetic notation, with sounds separated by spaces

    • constraints: a phon_form must match the regular expression (sʲːː|tʲːː|lʲːː|nʲːː|e͜iː|e͜oː|e͜ɑː|i͜eː|i͜oː|i͜uː|i͜ɑː|o͜eː|o͜iː|o͜ɑː|u͜eː|u͜iː|u͜oː|u͜ɑː|y͜iː|æ͜eː|æ͜iː|æ͜oː|æ͜uː|ø͜eː|ø͜iː|ø͜ɑː|ɑ͜eː|ɑ͜iː|ɑ͜oː|ɑ͜uː|ɤ͜eː|ɤ͜iː|ɤ͜oː|ɤ͜uː|sʲː|tʲː|fːː|kːː|pːː|sːː|tːː|vːː|ʃːː|lʲː|mːː|nʲː|rːː|lːː|nːː|e͜i|e͜o|e͜u|e͜ɑ|i͜e|i͜o|i͜u|i͜ɑ|o͜e|o͜i|o͜u|o͜ɑ|u͜e|u͜i|u͜o|u͜ɑ|y͜i|y͜ɑ|æ͜e|æ͜i|æ͜o|æ͜u|ø͜e|ø͜i|ø͜ɑ|ɑ͜e|ɑ͜i|ɑ͜o|ɑ͜u|ɤ͜e|ɤ͜i|ɤ͜o|ɤ͜u|eːː|iːː|oːː|uːː|yːː|æːː|øːː|ɑːː|ɤːː|fː|kː|pː|sʲ|sː|tː|tʲ|vː|ʃː|lʲ|lː|mː|nʲ|nː|rː|ŋː|hː|eː|iː|oː|uː|yː|æː|øː|ɑː|ɤː|f|k|p|s|t|v|z|ʃ|l|m|n|r|ŋ|h|j|w|e|i|o|u|y|æ|ø|ɑ|ɤ|ˈ)( (sʲːː|tʲːː|lʲːː|nʲːː|e͜iː|e͜oː|e͜ɑː|i͜eː|i͜oː|i͜uː|i͜ɑː|o͜eː|o͜iː|o͜ɑː|u͜eː|u͜iː|u͜oː|u͜ɑː|y͜iː|æ͜eː|æ͜iː|æ͜oː|æ͜uː|ø͜eː|ø͜iː|ø͜ɑː|ɑ͜eː|ɑ͜iː|ɑ͜oː|ɑ͜uː|ɤ͜eː|ɤ͜iː|ɤ͜oː|ɤ͜uː|sʲː|tʲː|fːː|kːː|pːː|sːː|tːː|vːː|ʃːː|lʲː|mːː|nʲː|rːː|lːː|nːː|e͜i|e͜o|e͜u|e͜ɑ|i͜e|i͜o|i͜u|i͜ɑ|o͜e|o͜i|o͜u|o͜ɑ|u͜e|u͜i|u͜o|u͜ɑ|y͜i|y͜ɑ|æ͜e|æ͜i|æ͜o|æ͜u|ø͜e|ø͜i|ø͜ɑ|ɑ͜e|ɑ͜i|ɑ͜o|ɑ͜u|ɤ͜e|ɤ͜i|ɤ͜o|ɤ͜u|eːː|iːː|oːː|uːː|yːː|æːː|øːː|ɑːː|ɤːː|fː|kː|pː|sʲ|sː|tː|tʲ|vː|ʃː|lʲ|lː|mː|nʲ|nː|rː|ŋː|hː|eː|iː|oː|uː|yː|æː|øː|ɑː|ɤː|f|k|p|s|t|v|z|ʃ|l|m|n|r|ŋ|h|j|w|e|i|o|u|y|æ|ø|ɑ|ɤ|ˈ))*.

    • rdfProperty: https://www.paralex-standard.org/paralex_ontology.xml#phon_form

    • missingValues: #DEF#
  • analysed_phon_form (string): Inflected form with analysis, such as segmentation markers (phonemic or phonetic). The form, given in phonemic or phonetic notation, with sounds separated by spaces, and analysis markers.

  • orth_form (string): Inflected form (orthographic). The form, given orthographically

  • analysed_orth_form (string): Inflected form with analysis, such as segmentation markers (orthographic). The form, given orthographically, with markers for analysis.

  • overabundance_tag (string): Tags for overabundant forms. Identifies sets of overabundant forms. For example, overabundant forms across lexemes might belong to a series of regular and irregular forms, or a series of short and long forms, etc.

    • constraints: an overabundance_tag must match the regular expression (voc_rad_plural|de_te_genitive|not_Q3_variant|sid_partitive|voc_partitive|id_partitive|voc_i_plural|de_te_plural|sse_illative|e_genitive|Q3_variant|aditive)(\|(voc_rad_plural|de_te_genitive|not_Q3_variant|sid_partitive|voc_partitive|id_partitive|voc_i_plural|de_te_plural|sse_illative|e_genitive|Q3_variant|aditive))*.

    • rdfProperty: https://www.paralex-standard.org/paralex_ontology.xml#overabundance_tag

  • defectiveness_tag (string): Tags for defectiveness status. Identifies sets of defective forms (e.g. pluralia tantum).

  • epistemic_tag (string): Tags for epistemic status. Identifies sets of forms with the same epistemic status.
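
Because sounds in phon_form are separated by spaces and overabundance tags by `|`, both columns can be unpacked with plain string splitting. The sketch below assumes a small subset of the sound inventory and a made-up form string; `segments`, `unknown_segments` and `split_tags` are hypothetical helpers.

```python
# Missing-value marker declared for phon_form above.
MISSING = "#DEF#"

# Small subset of the sounds listed in the phon_form constraint (illustration only).
SOUNDS = {"k", "ɑ", "l", "t", "ɑː", "ˈ"}


def segments(phon_form):
    """Split a phon_form into sounds; return None for the missing-value marker."""
    if phon_form == MISSING:
        return None
    return phon_form.split(" ")


def unknown_segments(phon_form, inventory=SOUNDS):
    """Return the segments of phon_form that are not in the inventory."""
    return [seg for seg in (segments(phon_form) or []) if seg not in inventory]


def split_tags(tag_value):
    """Split a '|'-separated tag cell into a list of tags."""
    return tag_value.split("|") if tag_value else []


print(segments("k ɑ l ɑ"))                     # ['k', 'ɑ', 'l', 'ɑ']
print(segments("#DEF#"))                       # None
print(unknown_segments("k ɑ x"))               # ['x']
print(split_tags("Q3_variant|sid_partitive"))  # ['Q3_variant', 'sid_partitive']
```

Splitting on a single space means that a doubled space yields an empty segment, which is reported as unknown; this agrees with the regular expression, which allows exactly one space between sounds.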

features-values

Grammatical features values - This table is located in estonian_features.csv.

  • The identifier column (or primaryKey) is ['value_id']

Columns defined by features-values-schema:

  • value_id (string): Grammatical Feature value identifier. Identifier for the grammatical feature value (as found in the cell)

    • constraints: a value_id is obligatory; it must be unique.
  • value_label (any)

  • feature (string): feature. The name of the dimension of this feature, e.g. case, tense, modality, voice, force, gender, evidentiality, person, number, polarity...

  • POS (string): Part of Speech. The relevant part of speech for this item. This must refer to a PartOfSpeech entity from the lexinfo (https://lexinfo.net/) ontology.

    • constraints: a POS must be one of the values: verb, numeral, conjunction, noun, adposition, determiner, article, adverb, pronoun, fusedPreposition, adjective, symbol, particle, conditionalParticle, demonstrativePronoun, interjection, semiColon, diminutiveNoun, possessivePronoun, prepositionalAdverb, compoundPreposition, interrogativeRelativePronoun, possessiveParticle, plainVerb, letter, interrogativeDeterminer, relativePronoun, postposition, fusedPronounAuxiliary, interrogativeOrdinalNumeral, indefiniteOrdinalNumeral, strongPersonalPronoun, possessiveRelativePronoun, ordinalAdjective, collectivePronoun, commonNoun, infinitiveParticle, comparativeParticle, partitiveArticle, invertedComma, lightVerb, emphaticPronoun, distinctiveParticle, genericNumeral, possessiveAdjective, reflexivePossessivePronoun, colon, coordinationParticle, presentParticipleAdjective, fusedPrepositionPronoun, cardinalNumeral, indefiniteDeterminer, numeralFraction, questionMark, generalAdverb, superlativeParticle, point, indefiniteMultiplicativeNumeral, comma, closeParenthesis, futureParticle, personalPronoun, reflexivePersonalPronoun, adverbialPronoun, reciprocalPronoun, openParenthesis, pastParticipleAdjective, negativePronoun, relativeDeterminer, existentialPronoun, pronominalAdverb, relativeParticle, exclamativeDeterminer, multiplicativeNumeral, reflexiveDeterminer, modal, unclassifiedParticle, properNoun, allusivePronoun, interrogativeCardinalNumeral, bullet, subordinatingConjunction, irreflexivePersonalPronoun, possessiveDeterminer, negativeParticle, indefinitePronoun, generalizationWord, coordinatingConjunction, deficientVerb, adjective-i, impersonalPronoun, indefiniteCardinalNumeral, adjective-na, qualifierAdjective, affirmativeParticle, mainVerb, fusedPrepositionDeterminer, indefiniteArticle, weakPersonalPronoun, suspensionPoints, interrogativeMultiplicativeNumeral, affixedPersonalPronoun, auxiliary, circumposition, copula, demonstrativeDeterminer, participleAdjective, 
      exclamativePoint, interrogativePronoun, presentativePronoun, punctuation, definiteArticle, slash, exclamativePronoun, preposition, conditionalPronoun, relationNoun, interrogativeParticle.

    • rdfProperty: https://www.paralex-standard.org/paralex_ontology.xml#POS

  • comment (string): Comment. Human-readable comment.

  • canonical_order (integer): Sorting order for visual presentation. The order in which items are canonically presented. Use integers to represent relative order; order is used per-item.

  • ekilex_value_label (any)

  • vabamorf_value_label (any)
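
One possible use of canonical_order is to sort cell identifiers for display. The sketch below assumes that each value in a cell is looked up by its value_id and that cells are compared value by value; the order numbers are invented for illustration and are not taken from estonian_features.csv.

```python
# Hypothetical rows: the value_ids exist in this package, the orders are made up.
feature_rows = [
    {"value_id": "nom", "feature": "case", "canonical_order": 1},
    {"value_id": "gen", "feature": "case", "canonical_order": 2},
    {"value_id": "sg", "feature": "number", "canonical_order": 1},
    {"value_id": "pl", "feature": "number", "canonical_order": 2},
]
ORDER = {row["value_id"]: row["canonical_order"] for row in feature_rows}


def cell_sort_key(cell_id):
    """Map a cell such as 'gen.pl' to the list of its values' canonical orders."""
    return [ORDER[value] for value in cell_id.split(".")]


cells = ["gen.pl", "nom.sg", "gen.sg", "nom.pl"]
print(sorted(cells, key=cell_sort_key))
# ['nom.sg', 'nom.pl', 'gen.sg', 'gen.pl']
```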