In computational linguistics, representing natural languages (NLs) in terms of formal grammars is a well-known and indeed wicked problem. Some formal grammars lack any representation of lexical and semantic information, reducing linguistic structure to syntax. On the other hand, approaches such as dependency grammars lose expressive power from a computational point of view. Adpositional grammars (adgrams) are a novel grammar formalism that overcomes these limitations. Adgrams are based on an intuition of Pennacchietti, who put together Brøndal’s logical description of prepositions, the Tesnièrian notion of valence, Langacker’s cognitive trajector/landmark dichotomy and Silvio Ceccato’s pioneering work in the field of machine translation (MT). The result is a quasi-formal description of adpositions as the junctors of language structure. This description is called adpositional space. Each adpositional space is made of four adpositional types: Plus, Minus, Slash and Times. The resulting structure is cognitively sound and formally inspiring, as adpositional trees (adtrees) can be built as special Porphyrian trees, going beyond the somewhat fuzzy Tesnièrian concept of dependency. This dissertation takes Pennacchietti’s work a step forward. Here adgrams take the ultimate unit of NLs to be the morpheme, not the word, and therefore offer a coherent theory of both morphology and syntax. Hence, collocation phenomena are treated as zero morphemes. Moreover, a sharp distinction is drawn in the dictionary between the adpositional space, essentially made of closed morphemes, and the lexicon, made of open morphemes. Moving again from Tesnièrian structural syntax, open morphemes always have a fundamental grammar character: stative (O), adjunctive (A, as stative modifiers), verbal (I), or circumstantial (E, as verbal modifiers).
The Tesnièrian approach is validated through Whorf’s research results comparing the grammars of typologically distant NLs. The second part shows that adgrams are computable, as they can be implemented with a strong, robust formalism. A concrete instance of the formal model of adgrams is given through Esperanto, a quasi-natural language (QNL, Lyons), showing the linguistic viability of the model. The formal model should be used within an ad hoc epistemological scenario, called ‘the translation game’, designed as a Gedankenexperiment à la Turing. A toy example is also given. The third part explains the implementation in full detail: Esperanto is implemented in exactly 179 logic formulas, 56 of which are predicates. The formal model promises to generalise to any NL, and how to do this is explained at various points in the third part. Finally, this dissertation shows that adgrams are a powerful NL grammar formalism that is at once cross-linguistic, cognitively grounded, formally robust and computationally sound.
Adpositional grammars: a multilingual grammar formalism for NLP / Gobbo, Federico. - (2009).
Adpositional grammars: a multilingual grammar formalism for NLP.
Gobbo, Federico
2009-01-01
File | Embargo | Description | Type | License | Size | Format
---|---|---|---|---|---|---
Phd thesis Gobbo completa.pdf | until 31/12/2100 | Full text of the thesis | Doctoral thesis | Not specified | 5.35 MB | Adobe PDF