AnaPro, Tool for Identification and Resolution of Direct Anaphora in Spanish

Toledo Gómez, Israel; Valtierra Romero, Erick; GUZMAN ARENAS, ADOLFO; CUEVAS RASGADO, ALMA DELIA; MENDEZ SEGUNDO, LAURA

URI: http://hdl.handle.net/20.500.11799/49151

Date: 2014-02-01

Abstract:

AnaPro is software that resolves direct anaphora in Spanish, specifically pronouns: it finds the noun or group of words to which a pronoun refers. It locates, in the preceding sentences, the referent or antecedent that the pronoun replaces. An example of a direct anaphora to be resolved is the pronoun “he” in the sentence “He is sad.” Much of the work on anaphora has been done for texts in English; thus, we focus specifically on Spanish documents. AnaPro directly supports text analysis (understanding what a document says), a nontrivial task since there are different writing styles, references, idiomatic expressions, etc. The problem grows if the analyzer is a computer, because computers lack the “common sense” that people possess. Hence, text analysis must be preceded by preprocessing in order to assign tags (noun, verb, ...) to each word, find the stems, disambiguate nouns, verbs, and prepositions, identify colloquial expressions, and identify and resolve anaphora, among other chores. AnaPro works on Spanish sentences. It is a novel procedure, since it is automatic (no user intervenes during the resolution) and it does not need dictionaries. It employs heuristic procedures to discover the semantics and help in the decisions; they are rather easy to implement and use limited knowledge. Nevertheless, its results are good (at least 81% correct answers). However, more tests will give a better idea of its quality.

Description:

Introduction. Anaphora is a relation of coreference between linguistic terms. According to Webster’s dictionary: “It is the use of a grammatical substitute (as a pronoun or a pro-verb) to refer to the denotation of a preceding word or group of words; also : the relation between a grammatical substitute and its antecedent.” Therefore, anaphora is a discourse relation. Anaphora resolution is very important in Natural Language Processing (NLP). This work is part of Project OM* (Ontology Merging), which seeks to build a large ontology by fusing smaller ontologies extracted from textual documents. An important part of the project is to analyze the sentences in a document with the goal of transforming that text into an ontology that captures its contents. A brief description of Project OM* follows.

Files in the digital object

Name: anapro_2.pdf

Size: 1.376 MB

Format: PDF

Description: main article

This item appears in the following collection(s)

Conacyt [10019]
Científica [376]

Document View

Title
AnaPro, Tool for Identification and Resolution of Direct Anaphora in Spanish
Author
Toledo Gómez, Israel
Valtierra Romero, Erick
GUZMAN ARENAS, ADOLFO
CUEVAS RASGADO, ALMA DELIA
MENDEZ SEGUNDO, LAURA
Publication date
2014-02-01
Publisher
Journal of Applied Research and Technology
Document type
Article
Keywords
Artificial Intelligence
Natural Language Processing
Text Analysis
Anaphora Resolution

Documents deposited in the Institutional Repository of the Universidad Autónoma del Estado de México are available in Open Access under the Creative Commons license: Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0)