dc.contributor Ledeneva, Yulia Nikolaevna
dc.contributor García Hernández, René Arnulfo
dc.contributor Hernández Castañeda, Ángel
dc.contributor.author Rojas Simón, Jonathan
dc.date.accessioned 2024-02-16T20:25:15Z
dc.date.available 2024-02-16T20:25:15Z
dc.date.issued 2023-06-12
dc.identifier.uri http://hdl.handle.net/20.500.11799/140066
dc.description Doctoral thesis in Computer Science es
dc.description.abstract Automatic Text Summarization (ATS) plays an essential role in managing textual information because it condenses text documents into summaries. The development of ATS keeps growing because the performance of proposed methods can be measured through the Evaluation of Text Summaries (ETS), yet evaluating summaries is a complex process. In the state of the art, ETS has traditionally been performed with the ROUGE system, which analyzes the content of summaries automatically. However, without human-made summaries (human references), this evaluation cannot be carried out. For this reason, the evaluation of summaries without human references has been proposed. Over the last two decades, the scientific community has proposed methods that do not depend on human references and instead use the source text as the reference document. ROUGE-C, LSA, and SIMetrix are widely used methods of this kind. However, they tend not to correlate highly with human assessment. Therefore, previous works (e.g., SECO-SEVA) have proposed optimizing their individual measures via linear optimization, yielding evaluations closer to human judgments. Although such optimization improved automatic evaluation, it involved adjusting the parameters of each measure, assuming the presence of different complexity levels in text documents and assessment measures. Thus, the performance of each method varies according to the complexity level of each source document. In document analysis and information retrieval, text complexity has been addressed from multiple perspectives, such as readability, vocabulary, and the quantity of information a text provides (informativeness). In general, text documents vary in the aforementioned features because they come from different sources of information. Therefore, any process of generation or evaluation of texts can vary.
Nevertheless, text complexity indexes have not been used in the ETS without human references to select the most appropriate measure for evaluating each summary (candidate summary). This thesis proposes a six-step methodology to select appropriate evaluation measures according to the complexity level of the source documents. The proposed selection combines 31 measures derived from the ROUGE-C, LSA, and SIMetrix methods, which use state-of-the-art techniques focused on content analysis. Across different experiments with a Genetic Algorithm (GA) and a Multilayer Perceptron (MLP), the proposed selection shows correlation improvements over other evaluation methods on well-standardized datasets such as DUC01 and DUC02. es
dc.language.iso eng es
dc.publisher Universidad Autónoma del Estado de México es
dc.rights restrictedAccess es
dc.rights.uri http://creativecommons.org/licenses/by-nc-sa/4.0 es
dc.subject Natural Language Processing es
dc.subject Evaluation of text summaries es
dc.subject Genetic algorithm es
dc.subject ROUGE es
dc.subject.classification ENGINEERING AND TECHNOLOGY es
dc.title Evaluation of text summaries without human references based on document complexity analysis es
dc.type Doctoral thesis es
dc.provenance Scientific es
dc.road Gold es
dc.organismo Unidad Académica Profesional Tianguistenco es
dc.ambito International es
dc.cve.CenCos 31201 es
dc.cve.progEstudios 81 es
dc.modalidad Thesis es


Document View

  • Title
  • Evaluation of text summaries without human references based on document complexity analysis
  • Author
  • Rojas Simón, Jonathan
  • Thesis advisor(s), compiler(s), or coordinator(s)
  • Ledeneva, Yulia Nikolaevna
  • García Hernández, René Arnulfo
  • Hernández Castañeda, Ángel
  • Publication date
  • 2023-06-12
  • Publisher
  • Universidad Autónoma del Estado de México
  • Document type
  • Doctoral thesis
  • Keywords
  • Natural Language Processing
  • Evaluation of text summaries
  • Genetic algorithm
  • ROUGE
  • Documents deposited in the Institutional Repository of the Universidad Autónoma del Estado de México are available in Open Access under the Creative Commons license: Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0)

restrictedAccess Unless otherwise noted, the license of this item is described as restrictedAccess