
Procesamiento del Lenguaje Natural, Revista nº 44, marzo de 2010

ISSN: 1135-5948

Artículos / Articles

Un Análisis Comparativo de Estrategias para la Categorización Semántica de Textos Cortos / A Comparative Analysis of Strategies for Semantic Short-Text Categorization. María V. Rosas, Marcelo L. Errecalde, Paolo Rosso. p. 11

Cómo mejorar UniArab con FunGramKB / Enhancing UniArab with FunGramKB. Carlos Periñán-Pascual, Ricardo Mairal Usón. p. 19

Criterios ontológicos en FunGramKB / Ontological commitments in FunGramKB. Carlos Periñán-Pascual, Francisco Arcas-Túnez. p. 27

Un estudio inicial sobre el resumen de argumentos de películas / An initial study on text summarisation in film stories. Yan Xu, Michael P. Oakes. p. 35

Segmentación textual / Text segmentation. Fernando Chicharro Esteban. p. 43

Sistema de Diálogo Multimodal para una Aplicación de Inteligencia Ambiental en una Vivienda / A Multimodal Dialogue System for an Ambient Intelligence Application in Home Environments. Nieves Ábalos, Gonzalo Espejo, Ramón López-Cózar, Zoraida Callejas. p. 51

Localización de Palabras basada en Grafos de Fonemas / Word Spotting based on Phoneme Graphs. Jon Ander Gómez Adrián, Marcos Calvo Lance, Emilio Sanchis Arnal. p. 59

Selección de características para la clasificación de preguntas multilingüe / Feature selection for multilingual question classification. David Tomás, José L. Vicedo. p. 67

Búsqueda de Respuestas multilingüe: ¿es buena idea buscar respuestas en otros idiomas distintos a los de la pregunta? / Multilingual Question Answering: is it a good idea to search for answers in languages different from that of the question? Miguel Angel García Cumbreras, L. Alfonso Ureña López, Fernando Martínez Santiago, José Manuel Perea Ortega. p. 75

Mejora de la Precisión del Análisis para el Español con Maltparser / Improving Parsing Accuracy for Spanish using Maltparser. Miguel Ballesteros, Jesús Herrera, Virginia Francisco, Pablo Gervás. p. 83

Uso de detección de bigramas para categorización de texto en un dominio científico / Using bigram detection for text categorization in a scientific domain. Arturo Montejo Ráez, María Teresa Martín Valdivia, José Manuel Perea Ortega, L. Alfonso Ureña López. p. 91

Simple_PLUS: una red de relaciones léxico-semánticas / Simple_PLUS: a network of lexical semantic relations. Nilda Ruimy. p. 99

Integración de los Sistemas de Diálogo para la Interacción en Redes Sociales / Spoken Dialogue Systems for Interaction in Social Networks. D. Griol, M.Á. Patricio, J.M. Molina, Á. Arroyo, Z. Callejas, R. López-Cózar. p. 107

Acceso Multilingüe a Sistemas de Ayuda y Bases de Datos en Línea / Multilingual Access to Online Help Systems and Databases. Alejandro Revuelta-Martínez, Luis Rodríguez, Ismael García-Varea. p. 115

3.4 Optimizing the lexicon format

The database management tool of the PSC lexicon, which is the one used for Simple_PLUS, does not allow for the computation of inheritance at the semantic level. Consequently, although many properties are largely shared and could therefore be inherited from their ancestors' entries, every single feature of a semantic unit has to be explicitly defined in its lexical entry. Undoubtedly, the addition of more than 5,000 relations has greatly increased this redundancy. Some entries (9) contain so much information that they become unmanageable unless an inheritance mechanism enters the picture, allowing an entry to represent overtly only those specific properties and links of a word that are essential to discriminate it from its closest semantically related words, especially its hyperonym. This presupposes, of course, high-quality encoding and, in particular, consistent taxonomic links. The implementation of inheritance, which is currently being tested (Del Gratta et al., 2008), is providing encouraging results, i.e. a dramatic reduction of explicitly encoded links. To give but one example, the lexical entry for the main meaning of the verb vendere [to sell], which is involved (as source or target term) in 273 semantic relations, is reduced by 250 links that are derived by inheritance, whereas only 23 specific relations are overtly represented.
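As an illustration of why inheritance shrinks entries so drastically, here is a minimal sketch of hyperonym-based relation inheritance. The data structures, unit names and relation names are invented for illustration; the actual SIMPLE/CLIPS tooling (Del Gratta et al., 2008) works differently:

```python
# Minimal sketch of relation inheritance in a SIMPLE-style lexicon.
# Unit names, relation names and the dict layout are illustrative,
# not the actual CLIPS tool's data model.

def effective_relations(unit, lexicon):
    """Collect the relations of a semantic unit, inheriting along its
    hyperonym chain; locally encoded relations override inherited ones."""
    chain = []
    current = unit
    while current is not None:          # walk up the taxonomy
        chain.append(current)
        current = lexicon[current].get("hyperonym")
    relations = {}
    for ancestor in reversed(chain):    # most general entry first
        for rel_type, target in lexicon[ancestor]["relations"]:
            relations[rel_type] = target  # more specific entries override
    return sorted(relations.items())

# Toy lexicon: only the discriminating links are stored on 'vendere#1';
# the rest would be derived from its ancestors.
lexicon = {
    "USemTransfer": {"hyperonym": None,
                     "relations": [("has_agent", "human"),
                                   ("has_patient", "entity")]},
    "vendere#1":    {"hyperonym": "USemTransfer",
                     "relations": [("has_patient", "merchandise"),
                                   ("has_instrument", "money")]},
}

print(effective_relations("vendere#1", lexicon))
# [('has_agent', 'human'), ('has_instrument', 'money'),
#  ('has_patient', 'merchandise')]
```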

Table 2: A semantic entry: to anaesthetize (only a fragment of the entry is recoverable)
  id = "ARG2anestetizzare#1"  semanticrole = "Role_Underspecified"  select_restr = "PLUS_ANIMATE"
  comment = "Shadow_argument"  select_restr = "USem3036anestetico"
  (* = SIMPLE relations; ** = relations borrowed from EuroWordNet)

(9) Such as, for example, those of high-frequency activity verbs, e.g. 'to work', encoding the link to their typical agents.

4 Concluding remarks

The extensive instantiation in Simple_PLUS of new relations linking events both to their participants and to co-participants in events helps gain deeper knowledge of the syntactic and semantic behaviour of word senses. It strengthens and enhances the representation of the semantic predicate.


On the one hand, while in the SIMPLE model a predicate's arguments are constrained through restrictions on their semantic type membership, the newly encoded links make it possible to move forward from the expression of combinatorial possibilities at the ontological level to their specification at the lexical level. On the other hand, the relations involving instruments, locations and results enrich the semantic description by providing knowledge about those adjuncts or extra-thematic roles which are part of a semantic scenario and are therefore essential for a full understanding of texts. Combined with the wealth of relations provided by the SIMPLE model, the newly encoded links between events and entities constitute powerful tools that contribute to performance gains in NLP applications and are most relevant for building and making explicit the semantic scenarios potentially useful to the Semantic Web.

References

Alonge, A., ed. 1996. Definition of the links and subsets for verbs. EuroWordNet Project LE24003, Deliverable D006, WP4.1, final version.

Climent, S., H. Rodríguez and J. Gonzalo. 1996. Definition of the Links and Subsets for Nouns of the EuroWordNet Project. Deliverable D005, EuroWordNet, LE24003, Computer Centrum Letteren, University of Amsterdam.

Cruse, D. A. 1986. Lexical Semantics. Cambridge University Press, Cambridge.

Del Gratta, R., N. Ruimy and A. Toral. 2008. SIMPLE-CLIPS ongoing research: more information with less data by implementing inheritance. In Proceedings of LREC 2008, CD-ROM, Marrakech (Morocco).

Dowty, D. 1991. Thematic Proto-Roles and Argument Selection. Language, 67(3):547-619.

Fellbaum, C., D. Gross and K. Miller. 1993. Adjectives in WordNet. In Miller et al., Five Papers on WordNet, pp. 26-39, Cognitive Science Laboratory, Princeton University.

Lenci, A., F. Busa, N. Ruimy, E. Gola, M. Monachini, N. Calzolari, A. Zampolli et al. 2000. SIMPLE Linguistic Specifications. LE4-8346 SIMPLE, Deliverables D2.1 & D2.2. ILC and University of Pisa, Pisa, 404 pp.

Moravcsik, J. M. 1975. Aitia as Generative Factor in Aristotle's Philosophy. Dialogue, 14:622-636.

Pustejovsky, J. and B. Boguraev. 1993. Lexical Knowledge Representation and Natural Language Processing. Artificial Intelligence, 63:193-223.

Pustejovsky, J. 1995. The Generative Lexicon. The MIT Press, Cambridge, MA.

Pustejovsky, J. 2001. Type Construction and the Logic of Concepts. In P. Bouillon and F. Busa, eds., The Syntax of Word Meanings, Cambridge University Press, pp. 91-123.

Pustejovsky, J. 2006. Type Theory and Lexical Decomposition. Journal of Cognitive Science, 6:39-76.

Roventini, A. et al. 2003. ItalWordNet: building a large semantic database for the automatic treatment of Italian. In A. Zampolli, N. Calzolari, L. Cignoni, eds., Computational Linguistics in Pisa, Special Issue, XVIII-XIX, Tomo II, pp. 745-791, IEPI, Pisa-Roma.

Ruimy, N., M. Monachini, R. Distante, E. Guazzini, S. Molino, M. Ulivieri, N. Calzolari, and A. Zampolli. 2002. CLIPS, a Multi-level Italian Computational Lexicon: a Glimpse to Data. In Proceedings of LREC 2002, Volume III, pp. 792-79, Las Palmas de Gran Canaria (Spain).

Ruimy, N. et al. 2003. A computational semantic lexicon of Italian: SIMPLE. In A. Zampolli, N. Calzolari, L. Cignoni, eds., Computational Linguistics in Pisa, Special Issue, XVIII-XIX, Tomo II, pp. 821-864, IEPI, Pisa-Roma.

Ruimy, N. 2006. Merging two Ontology-based Lexical Resources. In Proceedings of LREC 2006, pp. 1716-1721, Genova (Italy).

Ruimy, N. 2007. Enhancing SIMPLE Semantic Relations: A Proposal. In Proceedings of the 3rd Language & Technology Conference, pp. 119-123, Fundacja Uniwersytetu im. A. Mickiewicza, Poznań (Poland).

Ruimy, N. and A. Toral. 2008. More semantic links in the SIMPLE-CLIPS database. In Proceedings of LREC 2008, CD-ROM, Marrakech (Morocco).

Vossen, P., ed. 2002. EuroWordNet General Document, final version. Vrije Universiteit, Amsterdam.


Procesamiento del Lenguaje Natural, Revista nº 44, marzo de 2010, pp 107-114

received 20-01-10, revised 18-02-10, accepted 09-03-10

Integración de los Sistemas de Diálogo para la Interacción en Redes Sociales*
Spoken Dialogue Systems for Interaction in Social Networks

D. Griol, M.Á. Patricio, J.M. Molina (Dpto. de Informática, Univ. Carlos III de Madrid)
Á. Arroyo (Dpto. Sist. Int. Aplicados, Univ. Politécnica de Madrid)
Z. Callejas, R. López-Cózar (Dpto. Leng. y Sist. Informáticos, Univ. de Granada)
{dgriol,mpatrici}@inf.uc3m.es, [email protected], [email protected], {zoraida,rlopezc}@ugr.es

Abstract: With the development of the so-called Web 2.0 and the great interest and reach that social networks have now achieved, a large number of applications are being rapidly introduced that in turn give rise to new forms of communication and interaction among users. Furthermore, research in the field of dialogue systems currently faces a number of challenges related to the introduction of new input and output modalities, as well as applications and methodologies that allow these systems to be adapted to the specific characteristics and preferences of each user. Social networks and, in particular, virtual worlds thus provide a perfect setting in which to pursue these research objectives. In this paper we present a study of the integration of dialogue systems into social networks, describing the use of a dialogue system that provides academic information to develop a conversational avatar within the Second Life virtual world.
Keywords: Dialogue Systems, Social Networks, Virtual Worlds, Second Life

1. Introduction

The development of the so-called Web 2.0 has made it possible to introduce numerous applications that are used daily by a large number of users. These applications have given rise to new forms and channels of communication that are profoundly changing the way Internet users communicate.

* Work partially funded by the projects CICYT TIN2008-06742-C02-02/TSI, CICYT TEC2008-06732-C02-02/TEC, CAM CONTEXTS (S2009/TIC-1485) and DPS2008-07029-C02-02.

ISSN 1135-5948

Social networks have emerged in this context as a global consumer phenomenon over the last few years. According to (Nielsen, 2009), two thirds of Internet users visit social networks or blogs, an activity that accounts for 10% of the time they spend online.

© 2010 Sociedad Española para el Procesamiento del Lenguaje Natural


The relevance acquired by this type of activity is favouring the emergence of new ways of communicating, sharing information and interacting that also affect our daily lives (Ellison, Steinfield, and Lampe, 2007; Boyd and Ellison, 2007; Dwyer, 2007; Boyd and Heer, 2006). Along with the advance of these technologies, the last decade has seen enormous progress in the development of virtual worlds or "metaverses". These worlds can be defined as computer-simulated graphical environments "cohabited" by users through their avatars. Traditionally, virtual worlds were structured a priori by predefining the tasks users could perform. Nowadays, in social virtual worlds, social interaction plays a key role and users can shape their experiences in the virtual world according to their own decisions (Yoonhyuk and Hyunmee, 2009). Virtual worlds have thus turned into true social networks, useful for interaction among people from different places who can socialize, learn, be entertained, and so on.

Given the social potential of virtual worlds, they have become attractive to institutions, companies and researchers, who aim to develop virtual robots with the same appearance and capabilities as the avatars of human users. These virtual robots are called "metabots", a term coined as a contraction of metaverse and robot. A metabot is therefore a piece of software fully capable of interacting in one or more metaverses through one or more avatars. Metabots thus intensify the perception of the virtual world, providing the gestures, gazes, facial expressions and movements required by the communication process.

However, social interaction in virtual worlds is generally carried out in text mode through chat-like services. Our proposal is to enrich communication in these environments by adding conversational capabilities to metabots (López-Cózar and Araki, 2005; Griol et al., 2008). To this end, we propose the integration of dialogue systems to build intelligent metabots with the ability to converse orally (Eynon, Davies, and Wilks, 2009; Cassell et al., 2000) and, at the same time, to benefit from the visual modalities provided by these virtual worlds.

Our work focuses on two fundamental points. Firstly, given that it is very difficult to find work in the literature describing the integration of Speech Technologies and Natural Language Processing into virtual worlds, we show that this integration is possible. Through it, both technologies can benefit from the advantages of having voice as the most natural way of communicating, as well as from the visual modalities these virtual environments provide. Secondly, we show a practical application of this integration by using a dialogue system that provides academic information within the Second Life virtual world. In this way, the developed dialogue system can also benefit from the possibility of interacting with the large number of users these environments offer.

2. Second Life

Second Life (SL) is a three-dimensional virtual world developed by Linden Lab in 2003 and accessible through the Internet. A free client program called the Second Life Viewer lets its users, called "residents", interact with one another through avatars capable of movement, thus providing an advanced level of social network service. Residents can explore, meet other residents, socialize, take part in individual and group activities, and create and trade objects and services.

There are different means of communication among residents, the main ones being gestures, text messages and voice. Gestures are animations that simulate a given action; SL includes a tool for designing custom gestures. Residents can also use a chat facility, which allows the transmission of text messages. Finally, residents can likewise opt for voice, which lets users speak to each other in real time through their microphones.


SL is currently used successfully as a platform for education by many institutions, such as schools, universities, libraries and government entities (for example, Ohio University, the Royal Opera House in London, the Universidad Pública de Navarra, the Instituto Cervantes, the Universidad Politécnica de Madrid, the Universidad de Vigo, etc.).

We decided to use Second Life as the experimental laboratory for our research for several reasons. Firstly, because it is one of the most popular social virtual worlds: its population today amounts to millions of residents worldwide. Secondly, because it uses very advanced technologies for developing realistic simulations, which makes the avatars and the environment more believable and closer to real-world users. Thirdly, because SL's capacity for personalization is extensive and encourages innovation and user participation, which increases the naturalness of the interactions that take place in the virtual world.

To carry out our research we have an island in Second Life called TESIS (Tierra para la Experimentación en Sistemas Inteligentes Simulados). Different virtual facilities have been built on this island, where numerous educational activities take place. Figure 1 shows an image of the TESIS island during a virtual presentation of the talks given in one of our doctoral courses.

Figure 1: An image of the TESIS island in Second Life

3. Creating a conversational metabot for a specific domain

We have developed a conversational metabot, which we have called Demic, that provides academic information. Demic builds on the functionalities of a previously developed dialogue system called Universidad Al Habla (UAH) (Callejas and López-Cózar, 2005; Callejas and López-Cózar, 2008). The information the metabot provides can be classified into four categories: subjects, professors, doctoral studies and registration, as shown in Table 1.

The system follows the typical architecture of current spoken dialogue systems, including an automatic speech recognition module, a dialogue manager, a database access module, data storage, and spoken response generation by means of a language generator and a text-to-speech synthesizer. The dialogue manager used by Demic is built on VoiceXML documents created dynamically using PHP. In this way, the system's responses can be adapted to the context of the conversation and the dialogue state, thereby improving the naturalness of the interaction. For example, the help messages provided by the system take into account the topic that the user and the system are dealing with at a given moment (a sketch of this kind of dynamic generation is given below). The context is also used as a mechanism for deciding which confirmation strategy to use.

Figure 2 shows Demic interacting with a user's avatar, and Figure 3 shows an example of a dialogue acquired with Demic; turns labelled D correspond to the conversational avatar Demic and turns labelled U to the user. Demic manages an official Second Life client to connect to the virtual world. Figure 4 shows the architecture developed to integrate the conversational agent into Second Life. The dialogue system that governs the metabot sits outside the virtual world, relying on the necessary data and VoiceXML servers.
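The following sketch illustrates the kind of context-dependent VoiceXML generation described above. It is written in Python rather than the PHP actually used, and the prompts, dialogue states and help texts are invented for illustration; only the element names are standard VoiceXML:

```python
# Sketch of dynamic VoiceXML generation driven by the dialogue state,
# analogous to the PHP-based generation described above. The prompts,
# field names and help text are illustrative, not the UAH system's.

def build_vxml(state: str, topic: str) -> str:
    prompts = {
        "ask_subject": "¿Sobre qué asignatura deseas información?",
        "ask_professor": "Dime el nombre del profesor.",
    }
    # Context-sensitive help: mention the topic currently under discussion.
    help_text = f"Puedes preguntarme sobre {topic}."
    return f"""<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.1">
  <form id="{state}">
    <field name="answer">
      <prompt>{prompts[state]}</prompt>
      <help>{help_text}</help>
    </field>
  </form>
</vxml>"""

print(build_vxml("ask_professor", "profesores"))
```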


Subjects (Asignaturas)
  Information provided by the user (name and examples): name (e.g., Inteligencia Artificial); the degree in which it is taught, in case several subjects share the same name (e.g., Informática); group and, optionally, type, if the user asks about a specific group (e.g., A, Teoría).
  Information provided by the system: degree, lecturers, lecturer in charge, semester, credits, web page; for a specific group, timetable and lecturer.

Professors (Profesores)
  Information provided by the user: any combination of first name and surnames (e.g., José, José García, Sr. García); optionally the semester, if the user asks about office hours (e.g., primer semestre, segundo semestre).
  Information provided by the system: location (office), contact information (telephone, fax, email), groups and subjects, doctoral courses; office hours.

Doctoral studies (Estudios de Doctorado)
  Information provided by the user: name of the doctoral programme (e.g., Desarrollo de Software); name of the course, if the user requests information about a specific course (e.g., Programación Orientada a Objetos).
  Information provided by the system: department, person in charge; type, credits.

Registration (Matrícula)
  Information provided by the user: name of the registration period (e.g., Confirmación, Registro Provisional).
  Information provided by the system: start date, end date, description.

Table 1: Information provided by the conversational metabot

Figure 2: Image of Demic (on the left) in Second Life

D1: I am Demic. I can give you information about subjects, professors, doctoral studies and registration.
U1: I would like information about professors.
D2: Tell me the professor's name.
U2: José García.
D3: {Information about the professor} Do you need anything else?
U3: I would also like to know the registration period for the Degree in Computer Science.
D4: {Information about registration} Do you need anything else?
U4: No, thank you.
D5: Thank you!

Figure 3: Example of a dialogue acquired through interaction with Demic

The audio connection is implemented with code developed in C# .NET and the SpeechLib library: it captures the signal produced by the text-to-speech synthesizer and transmits it to the Second Life voice server module (SLVoice). This module is external to the client program used to display the virtual world (the Second Life Viewer) and is based on Vivox technology, which uses the RTP, SIP, OpenAL, TinyXPath, OpenSSL and LibCurl protocols for transmitting voice data. The Second Life viewer thus takes care of configuration, control and display functions, but not of the voice signal (whether coming from the microphone or from the Vivox voice server). In addition, we use the lip-sync utility provided by Second Life to synchronize the voice signal with the movements of the avatar's lips. This functionality also makes it possible to show visually that the conversational avatar is speaking at a given moment, since it displays the waveform next to the avatar's head.

Finally, we have integrated a keyboard emulator that transmits the text transcription generated by the conversational avatar directly to the Second Life chat, for the cases in which this text should also be delivered to the user. Using this functionality it is also possible to send through the chat a message different from the generated voice signal, so that this textual information can complement the information provided vocally (for example, when the information to be conveyed by voice is too long and can be simplified by a spoken explanation complemented with text).

Figure 4: Diagram of the architecture used to integrate the conversational metabot into Second Life

3.1. Selecting the visual appearance of the metabot in the virtual world

One of the first problems that arise in the development of metabots is to define the parameters within which we humans communicate in the metaverse, and how to make metabots sensitive to these parameters. Thus, to build metabots capable of integrating human behaviours we have defined a concept we call the Avatar Rank (Arroyo, Serradilla, and Calvo, 2009), which allows us to take these parameters into account to measure the "popularity of the avatar" in the virtual world.

The Avatar Rank concept is inspired by the ranking the Google search engine uses to order the links returned by a search, picturing the relations between the avatars' locations as a network of links. By "popularity" we mean the attention an avatar is receiving at a given moment. Hence, the higher the Avatar Rank value, the greater the attention being paid to the avatar by other avatars (whether human or robot) and, as a consequence, the larger the potential audience for the activity it is carrying out.

The Avatar Rank is built on a more understandable measure: the Score. This measure is a simplification for the case of one-way communication between two avatars. The score of A over B [Score(A, B)] is defined as the "level of attention" that A is receiving from B. To give our metabots the ability to distinguish when other avatars are addressing them, we compute Score(Metabot, AvatarN) and, if this score exceeds a given threshold (set to 0.3 in the experimentation we carried out), AvatarN can be considered to be interested in our metabot. Naturally, the computation of the Avatar Rank derives from a matrix of scores indicating the influence that each avatar or metabot exerts on the others and the influence the others exert on it. The Score is computed from two trigonometric measures: the distance between A and B, and the facial angle of A in B. In short, the Score is larger the more directly B is looking at A and the closer they are to each other. The following equations define the Avatar Rank value and the Score functions.

$$\mathrm{Score}(A \text{ over } B) = \mathrm{Score}(A \leftarrow B) = S_{a,b} = \mathrm{Score}(a, b)$$

$$\mathrm{Score}(a, b) = \frac{D_{max} - (\mathrm{distance}(a, b) - D_{min})}{D_{max}} \cdot \frac{\pi - |\mathrm{atan}(b - a) - \mathrm{rotation}(b)|}{\pi}$$

$$\mathrm{avatarRank}(X) = \frac{1}{N-1} \sum_{i=1}^{N} S_{X,i}$$

$$\mathrm{ScoreMatrix} = \begin{pmatrix} S_{1,1} & S_{1,2} & \cdots & S_{1,N} \\ S_{2,1} & S_{2,2} & \cdots & S_{2,N} \\ \cdots & \cdots & \cdots & \cdots \\ S_{N,1} & S_{N,2} & \cdots & S_{N,N} \end{pmatrix}, \qquad \forall i \in N,\; S_{i,i} = 0$$

Figure 5 shows a graphical representation of the Score function, which resembles a mathematical formulation of the concept of personal space.

Figure 5: The Score function. Left: spatial representation. Right: representation on a linear scale.
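A minimal sketch of these computations in Python, assuming 2-D positions and illustrative Dmin/Dmax values; only the 0.3 attention threshold comes from the text, and atan(b - a) is interpreted here as the angle of the vector from b to a:

```python
import math

# Sketch of the Score and Avatar Rank computations defined above.
# The 2-D geometry and the Dmin/Dmax values are assumptions; the
# 0.3 attention threshold is the one reported in the experiments.

D_MIN, D_MAX = 0.5, 20.0   # assumed perception range, in metres

def score(a, b):
    """Attention that avatar a receives from avatar b.
    a, b: dicts with 'pos' (x, y) and 'rotation' (facing angle, radians)."""
    ax, ay = a["pos"]; bx, by = b["pos"]
    distance = math.hypot(ax - bx, ay - by)
    proximity = (D_MAX - (distance - D_MIN)) / D_MAX
    # Angle from b towards a, compared with b's facing direction.
    facing = math.atan2(ay - by, ax - bx)
    alignment = (math.pi - abs(facing - b["rotation"])) / math.pi
    return max(0.0, proximity) * max(0.0, alignment)

def avatar_rank(x, others):
    """Mean attention that avatar x receives from the other N-1 avatars."""
    return sum(score(x, o) for o in others) / len(others)

metabot = {"pos": (0.0, 0.0), "rotation": 0.0}
user    = {"pos": (2.0, 0.0), "rotation": math.pi}  # 2 m away, facing the bot

if score(metabot, user) > 0.3:        # threshold used in the experiments
    print("the user seems interested in the metabot")
```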

3.2. Preliminary experience with the developed avatar

To measure the impact of the metabot's appearance on user interactions, we compared it with two chatbots that also interact on the TESIS island. The first is the chatbot Elbot (http://www.elbot.com), a participant in the 18th Loebner Prize for Artificial Intelligence held in 2008. To use it in this comparison, it was given a human appearance (the Chatterbox metabot). Additionally, we generated a metabot imitation, which we called Pretender. Whenever someone approaches this metabot, it searches a web portal of famous quotes and utters the best-scoring sentence. At the same time, the metabot also indexes the sentences it receives from other avatars, which are stored in an indexed list of heard sentences.

These metabots were in operation for several months on the TESIS island in Second Life, and every dialogue was logged. Figure 6 shows the results of the interactions received over two months by the three metabots. From the results, we can see that the Pretender metabot shows usage with the smallest number of dialogue turns, Demic increases this number, and the Chatterbox robot receives the largest number of interactions. The results indicate that the more skilled an avatar is at engaging in dialogue, the more conversations it receives.

Figure 6: Dialogue turns for the conversations with each metabot: Demic, Chatterbox and Pretender

Based on these results, another study was carried out in which three visual forms were chosen for Demic: fully human appearance, mere human features, and robotic appearance. Figure 7 shows the results of this evaluation. It can be observed that the physical appearance of the bot has a great influence on the results obtained: the more similar the avatar is to a human being, the larger the number of conversations obtained.

Figure 7: Dialogue turns as a function of the human appearance of the Demic avatar

Through the participation of university students and lecturers we have also acquired a series of dialogues using the same scenarios defined for a previous acquisition carried out with the dialogue system that governs the avatar, with real users outside the Second Life virtual world. Table 2 shows the statistics of the acquisition of 50 dialogues. The main conclusion that can be drawn from this preliminary study is the practical absence of differences in these statistics between the dialogues acquired using only the dialogue system and the dialogues acquired through interaction with the conversational avatar in Second Life.

Average number of user turns per dialogue: 4.99
Percentage of confirmations by the avatar: 13.51%
Avatar questions to request information: 18.44%
Avatar answers generated after a database query: 68.05%

Table 2: Statistics of the dialogues acquired through the participation of students and lecturers in Second Life

4. Conclusions

The development of social networks and virtual worlds offers a wide range of opportunities and new communication channels that can be incorporated into traditional interfaces. In this paper we have proposed a methodology for integrating dialogue systems into the creation of conversational metabots. Following this proposal, we have developed a metabot that provides academic information in Second Life and is able to interact orally with users' avatars. Second Life offers a great number of possibilities for evaluating these new channels and communication methodologies, given that its users can socialize, explore, meet other residents and access a large number of educational and cultural resources.

In the work presented here we have evaluated two main aspects of users' interaction with the avatar in these virtual worlds. The first is the influence of the graphical characteristics on the avatar's spoken communication with the different users. With the second study we have verified that the characteristics of the dialogues are preserved after the integration, while additionally having at our disposal a number of extra modalities offered by interaction in the virtual world.

As future work we want to evaluate new features that can be incorporated into the development of the conversational metabot to improve the communication process, carrying out a detailed analysis of the integration of the modalities SL offers for presenting information in addition to voice. In particular, we want to study the effect of the metabot's emotional behaviour on spoken communication, by integrating textures that endow the avatars' facial expressions with gestures and emotions.

References

Arroyo, A., F. Serradilla, and O. Calvo. 2009. Multimodal agents in Second Life and the new agents of virtual 3D environments. In Proc. of the 3rd International Work-Conference on The Interplay Between Natural and Artificial Computation (IWINAC'09), pages 506-516.

Boyd, D. and N. Ellison. 2007. Social Network Sites: Definition, History and Scholarship. Journal of Computer-Mediated Communication, 13(1).

Boyd, D. and J. Heer. 2006. Profiles as Conversation: Networked Identity and Performance on Friendster. In Proc. of the IEEE International Conference on System Sciences, pages 1279-1282, Kauai, Hawaii.

Callejas, Z. and R. López-Cózar. 2005. Implementing modular dialogue systems: a case study. In Proc. of Applied Spoken Language Interaction in Distributed Environments (ASIDE'05), Aalborg, Denmark.

Callejas, Z. and R. López-Cózar. 2008. Relations between de-facto criteria in the evaluation of a spoken dialogue system. Speech Communication, 50(8-9):646-665.

Cassell, J., J. Sullivan, S. Prevost, and E. F. Churchill. 2000. Embodied Conversational Agents. MIT Press.

Dwyer, C. 2007. Digital Relationships in the 'MySpace' Generation: Results from a Qualitative Study. In Proc. of the 40th Annual Hawaii International Conference on System Sciences (HICSS'07), pages 19-28, Hawaii.

Ellison, N., C. Steinfield, and C. Lampe. 2007. The Benefits of Facebook 'Friends': Social Capital and College Students' Use of Online Social Network Sites. Journal of Computer-Mediated Communication, 12(4).

Eynon, R., C. Davies, and Y. Wilks. 2009. The Learning Companion: an Embodied Conversational Agent for Learning. In Proc. of WebSci'09: Society On-Line, Athens, Greece.

Griol, D., L. F. Hurtado, E. Segarra, and E. Sanchis. 2008. A Statistical Approach to Spoken Dialog Systems Design and Evaluation. Speech Communication, 50(8-9):666-682.

López-Cózar, R. and M. Araki. 2005. Spoken, Multilingual and Multimodal Dialogue Systems. John Wiley & Sons.

Nielsen. 2009. Global Faces and Networked Places: A Nielsen Report on Social Networking's New Global Footprint. Nielsen Online.

Yoonhyuk, J. and K. Hyunmee. 2009. User goals in social virtual worlds: A means-end chain approach. Computers in Human Behavior, in press.


Procesamiento del Lenguaje Natural, Revista nº 44, marzo de 2010, pp 115-122

received 20-01-10, revised 19-02-10, accepted 05-03-10

Multilingual Access to Online Help Systems and Databases
Acceso Multilingüe a Sistemas de Ayuda y Bases de Datos en Línea

Alejandro Revuelta-Martínez, Luis Rodríguez and Ismael García-Varea
Departamento de Sistemas Informáticos, Universidad de Castilla-La Mancha
02071 Albacete, Spain
{Alejandro.Revuelta,Luis.RRuiz,Ismael.Garcia}@uclm.es

Abstract: In this paper we present the AMSABEL system, a natural-language-based information retrieval system. AMSABEL provides access to data stored in a relational database using unrestricted natural language. Queries can be expressed in either English or Spanish, thereby providing direct access to the database for non-expert users. The system relies on machine learning techniques to translate the original query from natural language into a structured query language and, therefore, does not depend on the contents of the database, as long as training data is available.
Keywords: Multilingual information access, natural language processing, SQL, statistical machine translation

1 Introduction

Using natural language to access databases has important advantages. Most users are not able to use formal query languages directly, and learning them can be difficult and time-consuming. Other kinds of interfaces can solve some of these problems, but typically at the expense of limiting flexibility. In contrast, users could employ the interface in a very natural way if they were able to request data directly in their native language.

The idea of querying a database using natural language goes back to the late sixties and early seventies (Androutsopoulos, Ritchie, and Thanisch, 1995). These systems, also called Natural Language Interfaces to Databases, were built to be used with a particular database and, therefore, were not easily adaptable to other databases. Besides, those early systems supported only a subset of natural language.

ISSN 1135-5948

Database access using natural language improved in the following years, with increases in the accuracy of the results as well as reductions in the complexity of adapting the systems to a new database. In the early nineties the Air Travel Information Service (ATIS) task (Price, 1990) was developed, consisting of a set of queries to an air travel database expressed in both spoken and written English. Human-generated results of the queries were also included. The ATIS task was created for use in speech recognition evaluation, but it has also been widely used for natural language database access systems. Some of the systems developed for this task were AT&T's CHRONUS (Pieraccini et al., 1992), BBN, MIT's TINA (Seneff, 1992), CMU's Phoenix (Ward and Issar, 1994) and SRI's Gemini (Dowding et al., 1993).

© 2010 Sociedad Española para el Procesamiento del Lenguaje Natural


Modern approaches aim at reducing the cost of accessing new databases; this is mainly achieved by reducing the configuration effort required from non-expert users (Minock, 2010) or by using machine learning, as in He and Young (2003), Griol et al. (2006) or the PRECISE system (Popescu, Etzioni, and Kautz, 2003).

In this paper we present a system which allows multilingual access to online help systems and databases (AMSABEL)(1). Our proposal is based on statistical methods but, unlike the previous approaches, AMSABEL uses techniques applied in the field of Statistical Machine Translation (SMT). Modern SMT approaches appeared in the early nineties (Brown et al., 1993) and achieve high translation precision. SMT systems are built by means of machine learning, thus avoiding the need to add expert knowledge, which is a common requirement of earlier machine translation approaches. We propose the use of SMT to translate from English or Spanish into the Structured Query Language (SQL), instead of into another natural language. This entails new and interesting problems, since the semantics of a formal language is, in general, more affected by translation errors. On the other hand, by using SMT methods our system inherits their portability and does not depend on the underlying database. We also introduce a simple dialogue process for this kind of system and a result-oriented evaluation methodology. Finally, we have implemented and tested a prototype of the AMSABEL system.

The system is expected to be highly useful. For example, it could be used by governments to provide access to official databases. The AMSABEL system could be used with little (or no) training by most citizens, and its multilingual architecture will also help foreigners.

This paper is organized as follows. Section 2 describes the task, the data and the corpus used for building and evaluating the system. Section 3 presents the AMSABEL system in detail. Section 4 describes the evaluation procedure and, finally, Section 5 presents conclusions and future work.

2 Task

The implemented system is used to access information about train routes among several cities. The structure of that information is taken from the Spanish railway company(2) and contains, for each route, the following data: train identification number, train type/service, departure and arrival cities, days of the week when the route is active, starting and ending route dates, departure and arrival times, ticket prices and ticket classes.

A relational database has been designed that includes all the information described above. The actual data has been generated automatically by choosing random values. Some constraints have been observed in the data generation to preserve the semantics of the database (e.g., a route's starting date must precede its ending date). Some other constraints, however, have been ignored, since they are not important from the system's point of view (e.g., the real distance between cities has not been considered when calculating trip lengths). The generated database stores information on 1094 routes covered by 50 different trains.

The parallel training corpora have been generated automatically from context-free grammars by means of syntax-directed translation schemes (Amengual et al., 2000), which are typically used to develop language processors or compilers. The process starts from the grammar's initial symbol and applies randomly chosen rules until a pair of sentences is generated. As a result, two parallel sentences (in two different languages) are produced, and this is iterated until a reasonable number of sentences is reached (a sketch of this generation scheme is given below). Three different grammars have been designed for the system, each of them with 90 rules. One grammar generates Spanish-SQL sentence pairs, another generates English-SQL pairs, and the last one generates Spanish-English pairs. The latter is used to translate an English query into Spanish when a user requests the help of a human operator.

Table 1 shows an example of three parallel sentences in English, Spanish and SQL. Table 2 describes the features of the training, validation and test sets. Test values for the English-Spanish pair are not provided because that experiment is not reported.

(1) http://amsabel.albacete.org


(2) http://www.renfe.es/horarios
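The sketch below illustrates the synchronized-grammar generation scheme referred to above with a toy two-nonterminal grammar. The rules, city names and SQL template are invented for illustration; the actual grammars have 90 rules each and also cover English:

```python
import random

# Sketch of parallel corpus generation with a synchronized context-free
# grammar, in the spirit of the syntax-directed schemes described above.
# This toy grammar has a handful of rules, not the actual 90-rule
# grammars used to build the AMSABEL corpora.

GRAMMAR = {
    # Each rule expands a nonterminal into a (Spanish, SQL) pair of
    # right-hand sides; aligned nonterminals are expanded together.
    "QUERY": [("¿cuánto cuesta un tren desde CITY1 a CITY2?",
               "SELECT Precio FROM Tren WHERE Origen='CITY1' "
               "AND Destino='CITY2'")],
    "CITY1": [("barcelona", "barcelona"), ("madrid", "madrid")],
    "CITY2": [("madrid", "madrid"), ("valencia", "valencia")],
}

def generate(symbol="QUERY"):
    """Pick a random rule for `symbol` and recursively expand the
    nonterminals it contains, keeping both sides in step."""
    spa, sql = random.choice(GRAMMAR[symbol])
    for nt in GRAMMAR:
        if nt in spa:                       # expand aligned nonterminals
            sub_spa, sub_sql = generate(nt)
            spa = spa.replace(nt, sub_spa)
            sql = sql.replace(nt, sub_sql)
    return spa, sql

random.seed(0)
for _ in range(3):
    print(generate())
```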


Eng: How much is a train from Barcelona to Madrid?
Spa: ¿Cuánto cuesta un tren desde Barcelona a Madrid?
SQL: SELECT DISTINCT Precio_estacion FROM Tren JOIN Billetes ON (Billetes.Tren = Tren.Id_tren) WHERE Origen = 'barcelona' AND Destino = 'madrid'

This SQL sentence returns a column showing the prices of all the tickets (when purchased at the station) of the trains that go from Barcelona to Madrid.

Table 1: Example of three parallel English-Spanish-SQL sentences.

                     Spa-SQL           Eng-SQL           Eng-Spa
Train
  # sentences       10000             10000             10000
  vocabulary size   4539 / 4032       4635 / 4132       4605 / 4161
  running words     138672 / 224265   139558 / 224947   139417 / 151743
Validation
  # sentences       1000              1000              1000
  running words     14873 / 20918     14494 / 21438     14659 / 15272
Test
  # sentences       748               752               -
  running words     9413 / 14558      9763 / 15039      -

Table 2: Training, validation and test task statistics (source / target figures). Test values for the English-Spanish pair are not provided because that experiment is not reported.

3 The AMSABEL system

A system has been implemented to interact with the database described in Section 2. This system allows a user to query information in either English or Spanish. In addition, it keeps constant communication with the user, requesting, if necessary, further information.

3.1 System overview

The whole system is designed following a client-server architecture with two kinds of clients and two kinds of servers (see Figure 1).

Figure 1: The AMSABEL system. The user's queries in natural language (NL) are translated into SQL, returned, and used to access the database.

• The user client gets the input from the user and sends it to the translation server. In addition, it receives the SQL sentence and accesses a specific database management system. Finally, it provides a platform for the user-system dialogue.

• The translation server translates natural language queries into SQL sentences. It takes part in the user-system dialogue by sending feedback to the user and attending to his requests and, if the user requests a human operator, this server forwards the query to the operator server.

• The operator server receives queries from users through the translation server, sending them to a free operator and, finally, sending back the operator's answer. This server can translate the input query when it is not in the operators' native language.

• The operator client is used to translate user queries into SQL sentences. The operator can also send back to the user a comment that will be displayed as part of the user-system dialogue.

Access to the database using natural language is divided into four different processes: preprocessing, translation, postprocessing and the dialogue system.


3.2 Preprocessing

In this step, the system input is prepared to be sent to the SMT engine by rewriting some parts of the sentence, in order to make it readable by the translation engine or to reduce translation complexity.

1. Character-level processing: characters are written in lowercase, consecutive spaces are collapsed into a single one, and special characters (e.g., accents or the 'ñ' letter) are formatted properly.

2. Language identification: the words in the query are compared against each supported language's vocabulary to determine which one yields the fewest out-of-vocabulary (OOV) words. If the query has many OOV words in all of them, an error message is displayed (this filtering prevents users from querying information outside the database domain or from using unsupported languages).

3. Spell checking: OOV words are replaced by the closest word in the task vocabulary according to the Levenshtein distance, provided a given threshold is not exceeded (see the sketch after this list).

4. Word-level processing: regular expressions are used to identify dates, numbers, days of the week and similar expressions, which are converted to the format expected by the translation engine.

5. Word categorization: a total of 8 categories are used to replace dates, departure cities, arrival cities, hours, days of the week, ticket classes and numeric values in the sentence.
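Item 3 above is a standard edit-distance correction. A minimal sketch, with an illustrative vocabulary and threshold (the paper does not give the actual threshold value):

```python
# Sketch of the OOV spell-checking step (item 3 above): replace each
# out-of-vocabulary word with the closest in-vocabulary word, unless
# the edit distance exceeds a threshold. Vocabulary and threshold are
# illustrative, not the system's actual values.

def levenshtein(s: str, t: str) -> int:
    """Classic dynamic-programming edit distance."""
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, 1):
        curr = [i]
        for j, ct in enumerate(t, 1):
            curr.append(min(prev[j] + 1,                  # deletion
                            curr[j - 1] + 1,              # insertion
                            prev[j - 1] + (cs != ct)))    # substitution
        prev = curr
    return prev[-1]

VOCABULARY = {"tren", "desde", "barcelona", "madrid", "cuesta"}
MAX_EDITS = 2   # assumed threshold

def spell_check(words):
    out = []
    for w in words:
        if w in VOCABULARY:
            out.append(w)
            continue
        best = min(VOCABULARY, key=lambda v: levenshtein(w, v))
        out.append(best if levenshtein(w, best) <= MAX_EDITS else w)
    return out

print(spell_check(["trenn", "dsde", "barcelona"]))
# ['tren', 'desde', 'barcelona']
```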

3.3 Translation

This step relies on SMT to translate either the processed input query into an SQL sentence, or the input query into the operator's mother language. From a formal point of view, SMT can be stated as the problem of finding the target sentence $\hat{e}$ which maximizes the probability $\Pr(e|f)$ for a given source sentence $f$:

$$\hat{e} = \operatorname*{argmax}_{e} \Pr(e|f) \qquad (1)$$

Using Bayes' theorem, Eq. (1) can be restated as:

$$\hat{e} = \operatorname*{argmax}_{e} \Pr(e) \cdot \Pr(f|e) \qquad (2)$$

In Eq. (2), $\Pr(e)$ is known as the language model probability and represents the probability of $e$ being a sentence in the target language. Most systems use smoothed n-gram models (Jelinek, 1998) to estimate this probability:

$$\Pr(e) = \Pr(e_1 \cdot e_2 \cdots e_n) \approx \prod_{i=1}^{n} p(e_i \mid e_{i-n+1}, \ldots, e_{i-1}) \qquad (3)$$

This factorization significantly reduces the parameters of the model, which are trained from data using maximum likelihood estimation. The AMSABEL system uses a trigram language model trained with the SRILM toolkit (Stolcke, 2002) on the SQL training corpus described in Section 2.

The second probability, $\Pr(f|e)$, is the translation model probability, which represents the probability that $f$ is a translation of $e$. In this paper this probability is obtained through phrase-based models (Koehn, Och, and Marcu, 2003). Phrase-based models split the input sentence $f$ into segments (phrases) of consecutive words. Then, each source phrase is translated into a target language phrase and, finally, the target phrases are reordered to obtain the translation $e$. This model is formalized in Eq. (4):

$$\Pr(f|e) = \Pr(\bar{f}_1^I \mid \bar{e}_1^I) = \prod_{i=1}^{I} \phi(\bar{f}_i \mid \bar{e}_i) \, d(a_i - b_{i-1}) \qquad (4)$$

where $\bar{f}_i$ is the $i$-th phrase in $f$, $\bar{e}_i$ is the $i$-th phrase in $e$, $\phi(\bar{f}_i \mid \bar{e}_i)$ is the probability that $\bar{f}_i$ is a translation of $\bar{e}_i$, and $d(a_i - b_{i-1})$ is the distortion model used for reordering target phrases. The order of a target phrase $\bar{f}_i$ depends on a probability distribution calculated using its start position ($a_i$) and the end position of $\bar{f}_{i-1}$ ($b_{i-1}$).

The phrase-based models used by the AMSABEL system are obtained from word alignments trained using the GIZA++ toolkit (Och and Ney, 2003) on the bilingual training corpora (English-SQL and Spanish-SQL) described in Section 2. Once the models have been trained, a search algorithm is employed to find the best translation of the source sentence.
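To make Eqs. (2)-(4) concrete, the sketch below scores a candidate SQL translation as the product of a trigram language model and a monotone phrase model. The probability tables are toy values invented for illustration; in the actual system they are estimated with SRILM and GIZA++, as described above, and the distortion term d(·) is omitted here:

```python
import math

# Sketch of scoring a candidate SQL translation with the noisy-channel
# decomposition of Eq. (2): a trigram language model, Eq. (3), times a
# phrase translation model, Eq. (4), assuming monotone phrase order.
# All probability values below are toy numbers.

LM = {  # trigram probabilities p(w3 | w1, w2)
    ("<s>", "<s>", "SELECT"): 0.9,
    ("<s>", "SELECT", "Precio"): 0.4,
    ("SELECT", "Precio", "FROM"): 0.8,
}

PHRASES = {  # phi(source_phrase | target_phrase)
    ("cuánto cuesta", "SELECT Precio"): 0.5,
    ("un tren", "FROM Tren"): 0.7,
}

def lm_logprob(words, floor=1e-4):
    """Eq. (3): trigram factorization with a smoothing floor."""
    ctx = ["<s>", "<s>"]
    total = 0.0
    for w in words:
        total += math.log(LM.get((ctx[0], ctx[1], w), floor))
        ctx = [ctx[1], w]
    return total

def translation_logprob(phrase_pairs, floor=1e-4):
    """Eq. (4) without the distortion term d(), i.e. monotone order."""
    return sum(math.log(PHRASES.get(pair, floor)) for pair in phrase_pairs)

target = ["SELECT", "Precio", "FROM"]
pairs = [("cuánto cuesta", "SELECT Precio"), ("un tren", "FROM Tren")]
score = lm_logprob(target) + translation_logprob(pairs)
print(f"log Pr(e) + log Pr(f|e) = {score:.2f}")
```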

The AMSABEL system uses the Moses beam-search decoder (Koehn et al., 2007) to translate the input. This technique can also efficiently compute the n-best translations, allowing the user to request different translations of a single input sentence (see Section 3.5).

3.4 Postprocessing

The postprocessing step first reverses the changes performed in the preprocessing step: special characters are expressed in SQL format, and categories are then replaced by their original values. After these changes the sentence should be in SQL format (see Table 1), but some errors can arise in the process, and any error in the sentence will prevent the system from returning a result. Therefore, the SQL sentence is parsed to remove any syntactic error while trying to preserve the semantics of the original query. To this end, the following steps are performed (a repair sketch follows this list):

• A syntactic parser checks that the sentence is a correct SELECT SQL query.

• The parser also checks that every table and field name exists in the database.

• If any error is found, the parser tries to correct it by synchronizing with the next field or table, or by inserting the expected keyword.

• If a fatal error occurs between the SELECT and FROM keywords, this fragment of the query is dropped and replaced by all the relevant fields of the tables that appear in the rest of the sentence.

• If any error occurs between the FROM and WHERE keywords, this part is dropped and completely regenerated from all the tables needed to form a correct SQL query.

At the end of this step the system, hopefully, has a complete and syntactically error-free SQL sentence.
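A hedged sketch of the SELECT-list repair described in the list above. The schema, field names and matching rules are illustrative; the actual parser is more thorough:

```python
import re

# Sketch of the SELECT-list repair described above: if the fragment
# between SELECT and FROM is unusable, drop it and regenerate it from
# the relevant fields of the tables that appear in the rest of the
# query. The schema and "relevant fields" are illustrative.

SCHEMA = {
    "Tren":     ["Id_tren", "Origen", "Destino", "Hora_salida"],
    "Billetes": ["Tren", "Precio_estacion", "Clase"],
}

def repair_select(sql: str) -> str:
    m = re.match(r"SELECT\s+(.*?)\s+FROM\s+(.*)", sql, re.IGNORECASE)
    if not m:
        return sql
    select_list, rest = m.groups()
    all_fields = {f for fields in SCHEMA.values() for f in fields}
    fields = [f.strip() for f in select_list.split(",") if f.strip()]
    if fields and all(f in all_fields for f in fields):
        return sql                    # SELECT list already valid
    # Fatal error between SELECT and FROM: rebuild the list from the
    # fields of every table mentioned in the rest of the sentence.
    tables = [t for t in SCHEMA if re.search(rf"\b{t}\b", rest)]
    new_list = ", ".join(f for t in tables for f in SCHEMA[t])
    return f"SELECT {new_list} FROM {rest}"

broken = "SELECT Precio estacion, FROM Tren WHERE Origen='barcelona'"
print(repair_select(broken))
# SELECT Id_tren, Origen, Destino, Hora_salida FROM Tren WHERE ...
```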

3.5 Dialogue system

The user-system interaction is not limited to typing a query and receiving an answer. Instead, a simple dialogue system assists the user in the query process.

First, the system reads a query from the user, which is then processed and translated into SQL, as explained in Sections 3.2, 3.3 and 3.4. If no error occurred, the user query is translated into SQL and used to access the database. Otherwise, an error message is generated and different solutions are suggested. Once a query has been correctly translated, it is used to retrieve the data, and the interface shows a natural language description of the information retrieved. This way, the user can check the answer. In case the returned information is not correct, the user is allowed to choose one of the following three operations:

• Request more information. If some fields are missing in the result, this option adds additional fields to the system outcome. It can be selected twice: the first time it adds fields directly connected with the user query, and the second time all the fields of the database are added (this information can still be useful because the task domain is very specific). This operation does not modify the constraints of the original query.

• Request another answer. This returns an alternative translation of the original query. Each time this option is selected, the next translation in the n-best list is returned.

• Request an operator. Users may need some information that the system is not able to return. In addition, fully automatic systems are not perfect and can introduce errors. Therefore, the system allows the user to send his query to a human operator, who will type a correct SQL sentence and, if necessary, can also return some feedback to the user.

The user can select the previous options in any order.

3.5.1 Feedback

Users can easily feel frustrated if the system does not work as they expect, even if they are not using it properly. Another issue is that, although natural language is the most natural way for humans to interact, users can also feel lost due to the lack of restrictions in the interface. To prevent these and other similar problems, the user receives feedback that keeps him informed about the query status and explains what caused any errors. The following information is returned as feedback:

• If the number of OOV words in the query is too high, the system returns a message explaining the task and the types of expected queries.

• After each query, a natural language sentence is returned describing the generated SQL sentence. This sentence describes the fields and tables retrieved, as well as the restrictions considered.

• The system informs the user if there is no more data available for a query.

• The user is also informed if no more translation alternatives are available.

• A message is shown if the user requests an operator but none is available at that moment.

• The system can also return the actual SQL sentence generated. This option is intended for advanced users and can be activated or deactivated.

• If the system detects an invalidly formatted value (e.g., a date), it asks the user to express it in a valid format.

Feedback is part of the user-system dialogue. Messages are returned and shown only when necessary, in the appropriate context, and they do not interrupt the interaction with the system. Instead, they try to assist the user and facilitate the use of the dialogue options to find a correct answer, reducing the user's frustration when the system is not able to find one.

4 Experimental framework

4.1 Evaluation

The evaluation has been performed using the test data sets described in Table 2. Those sets have been generated following the same process used to build the training corpora (see Section 2) and considering the same vocabulary, but filtering out sentences that returned an empty data set.

The evaluation performed in this paper tries to assess the usefulness of the system in the whole process of information retrieval. Hence, common machine translation metrics (e.g., WER or BLEU) are not really informative in the context of information retrieval, since they compare the differences between two sentences but cannot check the correctness of the result of the query. For example, just by removing a keyword, a correct SQL sentence can be turned into an incorrect one and, conversely, the order of the constraints can be completely modified without changing the result of the query. For all these reasons, the comparison is based on the results of the SQL queries. The answers that show the same information as the corresponding reference result are considered useful, even if they include additional fields. Too much information at once can overwhelm the user, but it can also be really helpful if it includes the expected results.

The exact procedure for the experiments carried out can be described as follows. For each sentence in the test set, a translation into SQL is performed and the database is accessed. Finally, the information returned by the database is used to classify the query into one of the following categories:

Q1 Exact information: the query shows the user exactly the same information as the reference.

Q2 More fields: the query returns the same rows as the reference and, at least, the same fields.

Q3 More rows: the information returned by the query includes all the information returned by the reference, but more rows are added.

Q4 Incorrect: the query does not include all the information of the reference.

Query types Q1 and Q2 can be considered successful outcomes, since they return what the user requested (Q2 just adds more details). Q3 (and, to a lesser extent, Q2) queries can be useful for navigational purposes because they return more information than requested. This fact can be used to refine a query, expand the data returned by a certain query, or simply give an idea of the database contents in a first approach to the system. Finally, Q4 queries do not help the user and, what is more, they can even mislead him by providing wrong answers.
4 Experimental framework 4.1 Evaluation The evaluation has been performed using the test data sets described in Table 2. Those sets have been generated following the same process used to build the training corpora (see Section 2) and considering the same vocabulary but filtering out sentences that returned an empty data set. The evaluation performed in this paper tries to assess the usefulness of the system in the whole process of information retrieval. Hence, common machine translation metrics (e.g., WER or BLEU) can not be really informative in the context of information retrieval, since they compare the differences between
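For concreteness, the classification can be expressed as a comparison of result sets. The sketch below is one reading of the category definitions above, not code from the paper; modeling each row as a mapping from field names to values is an assumption made for clarity.

```python
# A hedged sketch of the Q1-Q4 classification: each row is a dict mapping
# field names to values; rows are compared after projecting the result onto
# the reference fields.
def classify(result, reference):
    ref_fields = set(reference[0]) if reference else set()
    res_fields = set(result[0]) if result else set()

    def rows(data):
        return {tuple(sorted((f, r.get(f)) for f in ref_fields))
                for r in data}

    ref_rows, res_rows = rows(reference), rows(result)
    if not (ref_fields <= res_fields and ref_rows <= res_rows):
        return "Q4"                      # reference information is missing
    if res_rows == ref_rows:
        return "Q1" if res_fields == ref_fields else "Q2"
    return "Q3"                          # all reference rows, plus extras
```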

4.2 Experimental results

Table 3 shows the results of English and Spanish test queries when the user selects the "request more information" option (explained in Section 3.5) zero, one or two times. These results show that this option does not modify the number of Q1 queries, although it substantially increases Q2 and Q3 queries. In other words, this option increases the results that are useful both for navigation and for information. If this option is not used, Q2 and Q3 queries are a small percentage of the test queries. However, selecting this option once causes a great increase in Q2 queries, and selecting it twice yields the greatest increase in Q3 ones. At first, Spanish results are more accurate than English ones, but the use of this option has a greater impact on the English outcome. Using this option twice reduces the percentage of Q4 (useless) queries to less than 5% for Spanish and less than 3% in the case of English.

       Times    Q1     Q2     Q3     Q4
Eng.   0        66.6    3.7    5.1   24.6
       1        66.6   12.4    8.9   12.1
       2        66.6   14.0   16.5    2.9
Spa.   0        81.4    0.3    1.3   17.0
       1        81.4    4.2    5.4    9.1
       2        81.4    4.3   10.2    4.1

Table 3: Percentages of query types when the user selects, for each query, the option "more info" 0, 1 or 2 times.

Table 4 shows the results of English and Spanish test queries when the user selects the "request another answer" option (explained in Section 3.5) from zero to four times. This option causes an important increment in the number of Q1 (perfect) queries. The first use yields the greatest improvement in the results for both languages, increasing Q1 queries by approximately 10%. On the other hand, Q2 and Q3 queries are not significantly modified. This option has a greater impact on Spanish sentences, reducing its Q4 queries to less than 3%.

       Times    Q1     Q2     Q3     Q4
Eng.   0        66.6    3.7    5.1   24.6
       1        75.1    3.9    4.4   16.6
       2        78.3    3.5    3.3   14.9
       3        81.0    2.8    2.7   13.6
       4        81.8    3.1    2.7   12.5
Spa.   0        81.4    0.3    1.3   17.0
       1        92.0    0.3    0.8    7.0
       2        94.5    0.1    0.4    5.0
       3        95.9    0.1    0.3    3.7
       4        96.9    0.3    0.3    2.5

Table 4: Percentages of query types when the user selects, for each query, the option "request another answer" from 0 to 4 times.

Table 5 contains the results obtained when the "request another answer" and "more info" options are considered simultaneously. These results show that SMT combined with an SQL checker can obtain useful answers for most of the queries. The use of the proposed dialogue system can return the requested information (Q1 and Q2 results) in more than 92% of the queries, achieving more than 99% when also looking for navigational information (i.e., considering Q3 results as well).

       Q1     Q2     Q3    Q4
Eng.   81.8   10.4   7.1   0.8
Spa.   96.9    1.5   0.8   0.8

Table 5: Percentages of query types when the user selects, for each query, the option "request another answer" 4 times and the option "more info" two times.

5 Concluding remarks

In this paper, the problem of accessing databases using natural language has been presented, and a train timetable database access task has been described. The task has been used to develop a system that allows the retrieval of information using English or Spanish natural language (SMT techniques have been used to generate SQL queries). The process can be enhanced by means of a dialogue system that assists the user and simplifies the information retrieval. The system has been evaluated to assess its correctness and usability. The results obtained show that SMT can be considered when translating from natural languages into formal SQL. By adding an SQL parser and a simple dialogue system, AMSABEL can achieve correct and specific answers in more than 92% of the test queries, and useful responses in more than 99% of the test queries. As future work, it could be interesting to study the implementation and performance costs of the AMSABEL system on a different and larger database to assess the database independence. In addition, an evaluation of the system with real users could be carried out and compared to the results of the automatic evaluation techniques used in this paper. Finally, the interface could be made even more natural by using speech recognition.

Acknowledgements

Work supported by the European Social Fund and the Spanish Consejería de Educación y Ciencia de la Junta de Comunidades de Castilla-La Mancha under the PBI08-02107127 research project. Additionally, the authors wish to thank the anonymous reviewers for their valuable comments.

References

Amengual, J.C., A. Castaño, A. Castellanos, V.M. Jiménez, D. Llorens, A. Marzal, F. Prat, J.M. Vilar, J.M. Benedí, F. Casacuberta, M. Pastor, and E. Vidal. 2000. The EuTrans-I spoken language translation system. Machine Translation, 15(1–2):75–103.

Androutsopoulos, I., G.D. Ritchie, and P. Thanisch. 1995. Natural language interfaces to databases — an introduction. Natural Language Engineering, 1(1):29–81.

Brown, P.F., S.A. Della Pietra, V.J. Della Pietra, and R.L. Mercer. 1993. The mathematics of statistical machine translation: Parameter estimation. Computational Linguistics, 19(2):263–311.

Dowding, J., J.M. Gawron, D. Appelt, J. Bear, L. Cherny, R. Moore, and D. Moran. 1993. Gemini: A natural language system for spoken-language understanding. In Proc. of the Workshop on Human Language Technology, pages 43–48.

Griol, D., F. Torres, L. Hurtado, S. Grau, F. García, E. Sanchis, and E. Segarra. 2006. A dialog system for the DIHANA project. In Proc. of the Int. Conference on Speech and Computer, pages 131–136.

He, Y. and S. Young. 2003. A data-driven spoken language understanding system. In Workshop on Automatic Speech Recognition and Understanding, pages 583–588.

Jelinek, F. 1998. Statistical Methods for Speech Recognition. The MIT Press, Cambridge, MA, USA.

Koehn, P., H. Hoang, A. Birch, C. Callison-Burch, M. Federico, N. Bertoldi, B. Cowan, W. Shen, C. Moran, R. Zens, C. Dyer, O. Bojar, A. Constantin, and E. Herbst. 2007. Moses: Open source toolkit for statistical machine translation. In Proc. of the 45th Annual Meeting of the Association for Computational Linguistics, pages 177–180.

Koehn, P., F.J. Och, and D. Marcu. 2003. Statistical phrase-based translation. In Proc. of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology, pages 48–54.

Minock, M. 2010. C-Phrase: A system for building robust natural language interfaces to databases. Data & Knowledge Engineering, 69(3):290–302.

Och, F.J. and H. Ney. 2003. A systematic comparison of various statistical alignment models. Computational Linguistics, 29(1):19–51.

Pieraccini, R., E. Tzoukermann, Z. Gorelov, J.L. Gauvain, E. Levin, C.H. Lee, and J.G. Wilpon. 1992. A speech understanding system based on statistical representation of semantics. In Proc. of the 1992 Int. Conference on Acoustics, Speech and Signal Processing, pages 193–196.

Popescu, A.M., O. Etzioni, and H. Kautz. 2003. Towards a theory of natural language interfaces to databases. In Proc. of the 8th Int. Conference on Intelligent User Interfaces, pages 149–157.

Price, P.J. 1990. Evaluation of spoken language systems: The ATIS domain. In Proc. of the Workshop on Speech and Natural Language, pages 91–95.

Seneff, S. 1992. Tina: A natural language system for spoken language applications. Computational Linguistics, 18(1):61–86.

Stolcke, A. 2002. SRILM – an extensible language modeling toolkit. In Proc. of the 2002 Int. Conference on Spoken Language Processing, pages 901–904.

Ward, W. and S. Issar. 1994. Recent improvements in the CMU spoken language understanding system. In Proc. of the Workshop on Human Language Technology, pages 213–216.


Procesamiento del Lenguaje Natural, Revista nº 44, marzo de 2010, pp 123-130

recibido 21-01-10 revisado 15-02-10 aceptado 08-03-10

A Probabilistic Method for Ranking Refinement in Geographic Information Retrieval∗

Un Método Probabilístico para el Re-ordenamiento en Recuperación de Información Geográfica

Esaú Villatoro-Tello, R. Omar Chávez-García, Manuel Montes-y-Gómez, Luis Villaseñor-Pineda and L. Enrique Sucar
Laboratory of Language Technologies, Department of Computational Sciences, National Institute of Astrophysics, Optics and Electronics (INAOE), Mexico.
E-mail: {villatoroe, romarcg, mmontesg, villasen, esucar}@ccc.inaoep.mx

Resumen: Resultados recientes en la tarea de Recuperación de Información Geográfica (GIR) indican que los métodos de recuperación de información actuales son efectivos para recuperar documentos relevantes a las consultas geográficas, sin embargo tienen serias dificultades para generar un orden apropiado con los documentos recuperados. Motivado por estos resultados, este trabajo propone un método novedoso para re-ordenar la lista de documentos recuperados por un sistema GIR. El método propuesto está basado en un Campo Aleatorio de Markov (CAM), el cual combina el orden original obtenido por el sistema GIR, la similitud entre documentos, y un enfoque de retroalimentación de relevancia. La combinación de estas características tiene el propósito de separar los documentos relevantes de los que no lo son y así obtener un orden más apropiado. Se realizaron experimentos con los recursos del foro GeoCLEF. Los resultados obtenidos muestran la viabilidad del método para re-ordenar documentos geográficos y también muestran una mejora en la medida MAP (Mean Average Precision) comparados con el modelo tradicional de espacio vectorial.

Palabras clave: Recuperación de Información, Recuperación de Información Geográfica, Re-rankeo, Modelos Probabilísticos, Campo Aleatorio de Markov

Abstract: Recent evaluation results from Geographic Information Retrieval (GIR) indicate that current information retrieval methods are effective to retrieve relevant documents for geographic queries, but they have severe difficulties to generate a pertinent ranking of them. Motivated by these results, in this paper we propose a novel method to re-order the list of documents returned by a GIR system. The proposed method is based on a Markov Random Field (MRF) model that combines the original order obtained by the GIR system, the similarity among documents and a relevance feedback approach, all of them with the purpose of separating relevant from irrelevant documents, and thus obtaining a more appropriate order. Experiments were conducted with resources from the GeoCLEF forum. The obtained results show the feasibility of the method for re-ranking documents in GIR and also depict an improvement in mean average precision (MAP) compared to the traditional vector space model.

Keywords: Information Retrieval, Geographic Information Retrieval, Ranking Refinement, Probabilistic Models, Markov Random Fields

1 Introduction

∗ This work was done under partial support of CONACyT (scholarships 165545 and 258311/224392).

Information Retrieval (IR) deals with the representation, storage, organization, and access to information items (Baeza-Yates and Ribeiro-Neto, 1999); depending on the context, items may refer to text documents, images, audio or video sequences. Given some query, formulated in natural language by some user, the IR system is supposed to retrieve, sorted according to their relevance degree, the documents satisfying the user's information needs (Grossman and Frieder, 2004). The word relevant means that retrieved documents should be semantically related to the user information need. Hence, one main problem of IR is determining which documents are, and which are not, relevant. In practice this problem is usually regarded as a ranking problem, whose goal is to define an ordered list of documents such that documents similar to the query occur at the very first positions.

Over the past years, IR models such as the Boolean, vectorial, probabilistic and language models have represented a document as a set of representative keywords (i.e., index terms) and defined a ranking function (or retrieval function) to associate a relevance degree to each document with respect to its query (Baeza-Yates and Ribeiro-Neto, 1999; Grossman and Frieder, 2004). In general, these models have shown to be quite effective over several tasks in different evaluation forums, such as CLEF (http://www.clef-campaign.org/) and TREC (http://trec.nist.gov/). However, the ability of these models to effectively rank relevant documents is still limited by the ability of the user to compose an appropriate query. Because of this, IR models tend to fail when the desired results have implicit information requirements that are not specified in the keywords. Such is the case of Geographic Information Retrieval (GIR), a specialized IR branch where the search for documents is based not only on conceptual keywords, but also on geographical terms (e.g., geographical references) (Jones and Purves, 2008). For example, for the query "Cities near active volcanoes", expected documents should mention explicit city and volcano names. Therefore, GIR systems have to interpret implicit information contained in documents and queries to provide an appropriate response to geographical queries.

Recent development of GIR systems (Mandl et al., 2008) evidences that: i) traditional IR systems are able to retrieve the majority of the relevant documents for most queries, but ii) they have severe difficulties to generate a pertinent ranking of them. To tackle this problem, recent works have explored the use of traditional re-ranking approaches based on query expansion, via either relevance feedback (Larson, Gey, and Petras, 2006; Gillén, 2007; Ferrés and Rodríguez, 2008; Larson, 2008) or knowledge databases (Wang and Neumann, 2008; Cardoso, Sousa, and Silva, 2008). Although these strategies are effective at improving precision values, it is known that query expansion strategies are very sensitive to the quality of the added elements, and may sometimes result in a degradation of the retrieval performance.

In this paper we propose a novel re-ranking strategy, which we apply in the context of Geographic Information Retrieval. Since retrieving relevant documents for geographic queries is not a problem for traditional IR systems, we focus on improving the order assigned to a set of retrieved documents, i.e., we focus on the ranking refinement problem. Our method combines the original order obtained by a GIR system, the similarity between documents obtained with textual features, and a relevance feedback approach, all of them with the purpose of separating the relevant documents from those that are not relevant, and thus obtaining a more appropriate order for the results generated by the base GIR system.

The proposed method is based on a Markov random field (MRF) model, in which each document in the list is represented as a random variable that can be relevant or not relevant. The relevance feedback is incorporated in the initialization of the model, making these documents relevant. The energy function of the MRF combines two factors: the similarity between the documents in the list (internal similarity), and external information obtained from the original order and the similarity of each document with the query (external similarity). Taking these factors into account and assigning a weight to each, the MRF is solved (obtaining the most probable configuration) so that it separates the relevant documents from the rest. Based on this result, the list of documents is reordered (re-ranked) and given as the final result to the user.

The rest of the paper is organized as follows. Section 2 discusses some related work in the field of geographic information retrieval. Section 3 presents the proposed method. Section 4 describes the experimental platform used to evaluate our ranking strategy. Section 5 presents the experimental results. Finally, Section 6 depicts our conclusions and future work.

2 Related Work

Formally, a geographic query (geo-query) is defined by a tuple <what, relation, where> (Henrich and Luedecke, 2007). The what part represents the generic (non-geographical) terms employed by the user to specify the information need; it is also known as the thematic part. The where term is used to specify the geographical areas of interest. Finally, the relation term specifies the "spatial relation" that connects what and where.

GIR has been evaluated at the CLEF forum since 2005, under the name of the GeoCLEF task (Mandl et al., 2008). Its results evidence that traditional IR methods are able to retrieve the majority of the relevant documents for most geo-queries, but they have severe difficulties to generate a pertinent ranking of them. Due to this situation, recent GIR methods have focused on the ranking subtask. Commonly employed strategies are: i) query expansion through some feedback strategy, ii) re-ranking retrieved elements through some adapted similarity measure, and iii) re-ranking through some information fusion technique. These strategies have been implemented following two main approaches: first, techniques that pay attention to constructing and including robust geographical resources in the process of retrieving and/or ranking documents; and second, techniques that ensure that geo-queries can be treated and answered employing very little geographical knowledge.

As an example of the first category, some works employ geographical resources in the query expansion process (Wang and Neumann, 2008; Cardoso, Sousa, and Silva, 2008; García-Cumbreras et al., 2009). Here, they first recognize and disambiguate all geographical entities in the given geo-query by employing a GeoNER (Geographical Named Entity Recognizer) system. Afterwards, they employ a geographical ontology or thesaurus to search for these geo-terms and retrieve other related geo-terms. Then, the retrieved geo-terms are given as feedback elements to the GIR machine. Other approaches that focus on the ranking refinement problem propose algorithms that consider the existence of geo-tags; a geo-tag indicates the geographical focus of a certain item and, as can be seen in (Borges et al., 2007), geo-tagging and geo-disambiguation are both major problems in GIR. In these approaches, the ranking function measures levels of topological space proximity among the geo-tags of the retrieved documents and the geo-queries (Martins et al., 2007). In order to achieve this, geographical resources (e.g., geographical databases) are needed. In contrast, approaches that do not depend on any robust geographical resource have proposed and applied variations of the query expansion process via relevance feedback, where no special consideration for geographic elements is made (Larson, Gey, and Petras, 2006; Gillén, 2007; Ferrés and Rodríguez, 2008; Larson, 2008), and they have achieved good performance results.

There are also works focusing on the ranking refinement problem that consider the existence of several lists of retrieved documents (from one or many IR machines). The ranking problem is then seen as an information fusion problem, without any special processing for the geo-terms contained in the retrieved documents. Some simple strategies only apply logical operators to the lists (e.g., AND) in order to generate one final re-ranked list (Ferrés and Rodríguez, 2008), while other works apply techniques based on information redundancy, e.g., CombMNZ or Round-Robin (Larson, Gey, and Petras, 2006; Villatoro-Tello, Montes-y-Gómez, and Villaseñor-Pineda, 2008; Perea-Ortega et al., 2008).

Recent evaluation results indicate that there is no notable advantage of knowledge-based strategies over methods that do not depend on any geographic resource (Villatoro-Tello, Montes-y-Gómez, and Villaseñor-Pineda, 2009). Motivated by these results, our proposed method does not make any special consideration for geographical elements, i.e., for measuring similarity among documents we consider all their textual components. Also, our method does not require accessing the entire collection again; it considers only the list provided by the GIR system. Our main hypothesis is that, by employing information obtained through a feedback strategy, it is possible to perform an accurate ranking refinement process while avoiding the drawbacks of query expansion techniques. In addition, based on the fact that geo-queries often contain implicit information, our intuition is that by considering full documents in the process of re-ranking it is possible to make explicit some of the implicit information contained in the original geo-queries.

3 Proposed Method

A general outline of the proposed method is given in Figure 1. Given a query, the GIR system retrieves from a given collection of documents a list of files sorted according to a relevance criterion. From this list, some relevant documents are selected based on a relevance feedback approach. For each document in the list, the textual features are extracted. The text contained in each document in the list, the query given by the user, and a subset of documents selected via relevance feedback are combined to produce a re-ordered list. This re-ranking is obtained based on a Markov random field (MRF) model that separates the relevant documents from the irrelevant ones, generating a new list by positioning the relevant documents first, and the others after. Next we give a brief review of MRFs, and then we describe in detail each component of the proposed method.

Figure 1: Block diagram of the proposed method. As input, it takes the original list obtained by a GIR system. Then, it considers a subset of relevant documents obtained via relevance feedback, and the textual features of the documents in the list. These elements, together with the original order, are integrated through the use of a MRF that divides the relevant documents from the rest to build a new, re-ordered list.

3.1 Markov Random Fields

Markov Random Fields (Li, 1994) are probabilistic models which combine a priori knowledge given by some observations and knowledge given by the interaction with neighbors. Let F = {F1, F2, ..., Fn} be random variables on a set S, where each Fi can take a value fi in a set of labels L. This F is called a random field, and the instantiation of each of these Fi ∈ F as an fi is what is called a configuration of F; so, the probability that a random variable Fi takes the value fi is denoted by P(fi), and the joint probability is denoted as P(F1 = f1, F2 = f2, ..., Fn = fn). A random field is said to be an MRF if it has the property of locality, i.e., if the field satisfies the following property:

P(fi | fS−{i}) = P(fi | fNi)

where S − {i} represents the set S without the i-th element, fNi = {fi' | i' ∈ Ni}, and Ni is the set of neighboring nodes of the node fi. The joint probability can be expressed as:

P(f) = e^(−Up(f)) / Z

where Z is called the partition function or normalizing constant, and Up(f) is called the energy function. The optimal configuration is found by minimizing the energy function Up(f), obtaining a value for every random variable in F.

3.2 Model

In our case we consider a MRF in which each node corresponds to a document in the list. Each document is represented as a random variable with two possible values: relevant and irrelevant. We consider a fully connected graph, such that each node (document) is connected to all other nodes in the field; that is, we define a neighborhood scheme in which each variable is adjacent to all the others. Given that the number of documents in the list is relatively low (1000 in the experiments), considering a complete graph is not a computational problem, and it allows us to consider the relations between all documents in the list.

For representing the documents, and for evaluating the internal and external similarities, we consider all the words contained in each document (except stopwords), without any special consideration for geographic elements. To describe the documents we use a binary bag-of-words representation, in which each vector element represents a word from the collection vocabulary; the query is represented in the same manner. The internal and external similarities are considered via the energy function described next.
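To make the representation concrete, the following fragment builds the word sets that stand for the binary vectors. It is an illustration only, not the authors' code: the tokenizer and the stopword list are simplifying assumptions, and word sets are used because the similarity measures of the next section operate on set intersections and unions.

```python
# Illustrative sketch: binary bag-of-words representations as word sets.
# The tokenization and the (tiny) stopword list are assumptions.
import re

STOPWORDS = {"the", "a", "of", "in", "and"}  # normally much larger

def bag_of_words(text):
    tokens = re.findall(r"[a-z]+", text.lower())
    return frozenset(t for t in tokens if t not in STOPWORDS)

docs = [bag_of_words(d) for d in ("Cities near active volcanoes",
                                  "The volcano erupted near the city")]
query = bag_of_words("cities near volcanoes")
```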

3.3 Energy Function

The energy function of the MRF combines two factors: the similarity between the documents in the list (internal similarity), and external information obtained from the original order and the similarity of each document with the query (external similarity). The internal similarities correspond to the interaction potentials and the external similarities to the observation potentials. The proposed energy function takes into account both aspects and is defined as follows:

U(f) = Vc(f) + λVa(f)

where Vc is the interaction potential; it considers the similarity between the random variable f and its neighbors, representing the support that neighboring variables give to f. Va is the observation potential and represents the influence of external information on the variable f. The weight factor λ favors Vc (λ < 1), Va (λ > 1), or both equally (λ = 1). Vc is defined as:

Vc(f) = Ȳ + (1 − X̄)   if f = irrelevant
Vc(f) = X̄ + (1 − Ȳ)   if f = relevant

where Ȳ represents the average distance between the variable f and its neighbors with irrelevant value, and X̄ represents the average distance between the variable f and its neighbors with relevant value. The distance metric used to measure the similarity between variables is defined as 1 − dice(f, g), where dice(f, g) represents the Dice coefficient (Mani, 2001), defined as dice(f, g) = 2|f ∩ g| / |f ∪ g|. Va is defined as follows:

Va(f) = (1 − dist(f, q)) × g(posinv(f))   if f = irrelevant
Va(f) = dist(f, q) × g(pos(f))            if f = relevant

The Va potential is obtained by combining two factors. The first indicates how similar, dist(f, q), or how different, 1 − dist(f, q), the variable f is to the query q, where dist(f, q) is defined as dist(f, q) = |f ∩ q| / |q|. The second is a function that converts the position in the list given by the base IR machine into a real value; the function used is g(x) = exp(x/20)/exp(5) (Chávez, Sucar, and Montes, 2010). The intuitive idea of this function is that it first increases slowly, so that the top documents have a small potential, and then increases exponentially to amplify the potential of the documents at the bottom of the list. The function pos(f) returns the position of the document f in the original list, and posinv(f) returns the inverse position of f in this list. Having described each potential, the complete energy function is:

U(f) = Ȳ + (1 − X̄) + λ[(1 − dist(f, q)) × g(posinv(f))]   if f = irrelevant
U(f) = X̄ + (1 − Ȳ) + λ[dist(f, q) × g(pos(f))]            if f = relevant

The initial configuration of the MRF is obtained by relevance feedback. That is, the subset of documents selected via relevance feedback is initialized as relevant, and all other documents as irrelevant. Then, the MRF configuration of minimum energy (MAP) is obtained via stochastic simulation using the ICM algorithm (we also experimented with simulated annealing, with similar results). At the end of this optimization process, each variable (document) has a value of relevant or irrelevant. Based on these values, a new re-ordered list is produced by positioning first the documents that are relevant according to the MRF, and then the non-relevant ones.
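The optimization loop can be sketched as follows. This is a simplified rendering of the energy minimization under the definitions above, not the authors' implementation: the function and variable names are ours, the update schedule is a plain ICM sweep with a fixed number of iterations, and whether feedback documents stay clamped to relevant is left open in the paper, so here they are re-evaluated like any other node.

```python
# Hedged sketch of ICM over the fully connected field defined above.
import math

def g(x):                                   # position-to-value function
    return math.exp(x / 20.0) / math.exp(5)

def icm_rerank(docs, query, feedback, lam=0.3, sweeps=10):
    """docs: word sets in original ranking order; feedback: relevant indices."""
    n = len(docs)
    dice = lambda a, b: 2 * len(a & b) / len(a | b) if (a | b) else 0.0
    dist = lambda a, b: len(a & b) / len(b) if b else 0.0
    labels = ["R" if i in feedback else "I" for i in range(n)]

    def energy(i, label):
        to_rel = [1 - dice(docs[i], docs[j])
                  for j in range(n) if j != i and labels[j] == "R"]
        to_irr = [1 - dice(docs[i], docs[j])
                  for j in range(n) if j != i and labels[j] == "I"]
        x_bar = sum(to_rel) / len(to_rel) if to_rel else 0.0
        y_bar = sum(to_irr) / len(to_irr) if to_irr else 0.0
        s = dist(docs[i], query)
        if label == "R":                    # pos(f) = i + 1
            return x_bar + (1 - y_bar) + lam * s * g(i + 1)
        return y_bar + (1 - x_bar) + lam * (1 - s) * g(n - i)  # posinv(f)

    for _ in range(sweeps):                 # plain ICM sweeps
        for i in range(n):
            labels[i] = "R" if energy(i, "R") < energy(i, "I") else "I"
    # relevant documents first, original order preserved within each group
    return sorted(range(n), key=lambda i: (labels[i] != "R", i))
```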

4 Experimental Setup

4.1 Datasets

For our experiments we employed the GeoCLEF document collection, composed of news articles from the years 1994 and 1995. The articles cover both national and international events and, as a consequence, the documents contain several geographic references.

4.2 Topics

We worked with the topics from GeoCLEF 2005 to GeoCLEF 2008. A total of 25 topics or queries were issued for each year, totalling a set of 100 queries by the last campaign in 2008. Table 1 shows the structure of each topic: the main query or title, a brief description, and a narrative, each one enclosed between its corresponding pair of labels.

Title:       Car bombings near Madrid
Description: Documents about car bombings occurring near Madrid
Narrative:   Relevant documents treat cases of car bombings occurring in the capital of Spain and its outskirts

Table 1: Topic GC030: Car bombings near Madrid

For our experiments we employed the title and the description fields.

4.3 Evaluation

The evaluation of results was carried out using measures that have demonstrated their pertinence for comparing IR systems, namely the Mean Average Precision (MAP) and the precision at N (P@N). MAP is defined as follows:

MAP = (1/|Q|) Σ_{i=1}^{|Q|} [ (Σ_{r=1}^{m} P_i(r) × rel_i(r)) / n ]

where P_i(r) is the precision at the first r documents, rel_i(r) is a binary function which indicates whether the document at position r is relevant or not for the query i, n is the total number of relevant documents for the query i, m is the number of relevant documents retrieved, and Q is the set of all queries. Intuitively, this measure indicates how well the system puts relevant documents in the first positions. It is worth pointing out that, since our IR machine was configured to retrieve 1000 documents for each query, MAP values are measured at 1000 documents. On the other hand, P@N is defined as the percentage of relevant items within the first N positions of the result list.

Results

MAP Year

worst

median

best

GeoCLEF 2005

0.1022

0.2600

0.3936

GeoCLEF 2006

0.0732

0.2700

0.3034

GeoCLEF 2007

0.1519

0.2097

0.2850

GeoCLEF 2008

0.1610

0.2370

0.3037

Table 2: Results obtained in the GeoCLEF As can be seen in Table 5 our baseline results are very close to the median MAP obtained among participants in each of the GeoCLEF tracks (Table 2), except for the year 2007. However, remember that the majority of the GeoCLEF participants employ one or several geographical resources, or even more

Experiments definition

We conducted a series of experiments with the following objectives: i) to test the results of the proposed method compared with

6 An open-source system designed to facilitate research in information retrieval (http://www.lemurproject.org/)

128

A Probabilistic Method for Ranking Refinement in Geographic Information Retrieval

each year of the GeoCLEF track (Table 2). Notice that the proposed method yields to better results when the value of lambda is small (e.g 0.3). So it seems that, at least for this collection, the information from the neighbors is more valuable than the information from the original order and the similarity with the query.

robust IR machines in order to retrieve relevant documents. Given this fact, we consider that our Lemur IR configuration is yielding acceptable baseline results. Table 5 show a comparision between the results of the original list retrived by the Lemur IR machine and the results obtained with the proposed method for some different configurations of parameters. Notice that the values shown are an average of the values obtained for the 25 queries for each year. Also notice that for each of the considered measures, all variants of the proposed method improve the values of its corresponding baseline. Year

6

This paper proposed a method for improving the ranking of a list of retrieved documents by a GIR system. Based on a relevance feedback approach, the proposed method integrates the similarity between the documents in the list (internal similarity); and external information obtained from the original order and the query (external similarty), via a MRF to separate the relevant and irrelevant images in the original list. Experiments were conducted using the resources of the forum GeoCLEF from years 2005 to 2008. For our experiments we avoid using any specialized geographical resource, since our main goal was to prove the pertinece of the method employing olny textual (document’s words) features. Results showed that considering only one document as feedback, the proposed method improved the MAP up to 9%, 39%, 22% and 24% for years 2005, 2006, 2007 and 2008 respectively. An initial analysis idicates that for this collection, greater importance is given to the information from neighbors, obtained from the textual similarity between documents. As future work we are considering including instead of textual features, geographical features, and we also intend to include a combination of both (textual and geographical) features to exploit the advantages of both.

Experiment P@5 P@10 P@20 MAP Baseline

0.5200 0.4360 0.3380 0.3191

GeoCLEF

F1-L0.3

0.5440 0.4520 0.3440 0.3486

2005

F5-L0.3

0.9840 0.5760 0.3800 0.4627

F10-L0.0

0.9840 0.9320 0.5040 0.5910

Baseline

0.3200 0.2560 0.1960 0.2618

GeoCLEF

F1-L0.3

0.3680 0.2760 0.2060 0.3658

2006

F5-L0.0

0.8160 0.4600 0.2800 0.5881

F10-L0.3

0.8160 0.6520 0.3580 0.6942

Baseline

0.2400 0.2160 0.1620 0.1612

GeoCLEF

F1-L0.5

0.3040 0.2400 0.1720 0.1970

2007

F5-L0.3

0.7920 0.4360 0.2580 0.3909

F10-L0.3

0.7920 0.6600 0.3560 0.4960

Baseline

0.3840 0.2960 0.2440 0.2347

GeoCLEF

F1-L0.0

0.3840 0.2840 0.2360 0.2911

2008

F5-L0.0

0.8160 0.4720 0.3120 0.4068

F10-L0.5

0.8160 0.7320 0.3960 0.4959

Conclusions

Table 3: A comparison between the results obtained by the VSM base ranker and the proposed method with some of its variants. The number after the letter F indicates the number of documents taken for relevance feedback, the number following the letter L indicates the value of λ

References Baeza-Yates, R. and B. Ribeiro-Neto. 1999. Modern Information Retrival. Addison Wesley.

Results show that an improvement of 9%, 39%, 22% and 24% for years 2005, 2006, 2007 and 2008 respectively, is reached when only one document is selected as feedback; and as expected, as more documents are given as feedback, better performance is obtained. It is also important to notice, that when selecting one document as feedback element, reached results improve the median values from Table 2, except for year 2007. Aditionally, observe that when more elements are given as feedback (5, 10), MAP values are even better than the best result obtained for

Borges, K. A., A. H. F. Laender, C. B. Medeiros, and C. A. Davis Jr. 2007. Discovering geographic locations in web pages using urban addresses. In Proceedings of Workshop on Geographic Information Retrieval GIR, Lisbon, Portugal. ACM Press. Cardoso, N., P. Sousa, and M. J. Silva. 2008. The university of lisbon at geoclef 2008. In 129

Esaú Villatoro-Tello, R. Omar Chávez-García, Manuel Montes-y-Gómez, Luís Villaseñor-Pineda, L. Enrique Sucar

Working notes for the CLEF 2008 Workshop, Aarhus, Denmark, September.

2005, volume 4022 of Lecture Notes in Computer Science. Springer.

Ch´avez, O., E. Sucar, M. Montes. 2010. Image re-ranking based on relevance feedback combining internal and external similarities. In The 23rd International FLAIRS Conference, Daytona Beach, Florida, USA. In press.

Li, S. 1994. Markov random field models in computer vision. Computer Vision — ECCV ’94, pages 361–370. Mandl, T., P. Carvalho, G. M. Di Nunzio, F. Gey, R. R. Larson, D. Santos, and C. Womser-Hacker. 2008. Geoclef 2008: The clef 2008 cross-language geographic information retrieval track overview. In Evaluating Systems for Multilingual and Multimodal Information Access, volume 5706 of Lecture Notes in Computer Science. Springer.

Ferr´es, D. and H. Rodr´ıguez. 2008. Talp at geoclef 2007: Results of a geographical knowledge filtering approach with terrier. In Advances in Multilingual and Multimodal Information Retrieval: 8th Workshop of the Cross-Language Evaluation Forum, CLEF 2007, volume 5152 of Lecture Notes in Computer Science. Springer.

Mani, Inderjeet. 2001. Automatic Summarization (Natural Language Processing, 3 (Paper)). John Benjamins Publishing Co, June.

Garc´ıa-Cumbreras, M. A., J. M. PereaOrtega, M. Garc´ıa-Vega, and L. A. Ure˜ naL´opez. 2009. Information retrieval with geographical references. relevant documents filtering vs. query expansion. Information Processing and Management.

Martins, B., N. Cardoso, M. S. Chaves, L. Andrade, and M. J. Silva. 2007. The university of lisbon at geoclef 2006. In Evaluation of Multilingual and Multi-modal Information Retrieval: 7th Workshop of the Cross-Language Evaluation Forum, CLEF 2006, volume 4730 of Lecture Notes in Computer Science. Springer.

Gill´en, Rocio. 2007. Monolingual and bilingual experiments in geoclef2006. In Evaluation of Multilingual and Multi-modal Information Retrieval: 7th Workshop of the Cross-Language Evaluation Forum, CLEF 2006, volume 4730 of Lecture Notes in Computer Science. Springer.

Perea-Ortega, J. M., L. A. Ure˜ na, D. Buscaldi, and P. Rosso. 2008. Textmess at geoclef 2008: Result merging with fuzzy borda ranking. In Working notes for the CLEF 2008 Workshop, Aarhus, Denmark, September.

Grossman, D. A. and O. Frieder. 2004. Information Retrieval, Algorithms and Heuristics. Springer, second edition.

Villatoro-Tello, E., M. Montes-y-G´ omez, and L. Villase˜ nor-Pineda. Ranking refinement via relevance feedback in geographic information retrieval. In Mexican International Conference on Artificial Intelligence (MICAI 2009), Guanajuato, Mexico, November 2009. Lecture Notes in Artificial Intelligence 5845, Springer, 2009.

Henrich, A. and V. Luedecke. 2007. Characteristics of geographic information needs. In Proceedings of Workshop on Geographic Information Retrieval, GIR’07, Lisbon, Portugal. ACM Press. Jones, C. B. and R. S. Prurves. 2008. Geographical information retrieval. International Journal of Geographic Information Science, 22(3):219–228.

Villatoro-Tello, E., M. Montes-y-G´ omez, and L. Villase˜ nor-Pineda. 2008. Inaoe at geoclef 2008: A ranking approach based on sample documents. In Working notes for the CLEF 2008 Workshop, Aarhus, Denmark, September.

Larson, Ray R. 2008. Cheshire at geoclef 2008: Text and fusion approaches for gir. In Working notes for the CLEF 2008 Workshop, Aarhus, Denmark, September. Larson, R. R., F. Gey, and V. Petras. 2006. Berkeley at geoclef: Logistic regression and fusion for geographic information retrieval. In Accessing Multilingual Information Repositories: 6th Workshop of the Cross-Language Evaluation Forum, CLEF

Wang, Riu and Gunter Neumann. 2008. Ontology-based query construction for geoclef. In Working notes for the CLEF 2008 Workshop, Aarhus, Denmark, September. 130

Procesamiento del Lenguaje Natural, Revista nº 44, marzo de 2010, pp 131-138

recibido 21-01-10 revisado 19-02-10 aceptado 06-03-10

A Multimodal Dialogue System for Playing the Game "Guess the card"∗

Un sistema de diálogo multimodal para jugar el juego de "Adivina la Carta"

Ivan Meza, Elia Pérez, Lisset Salinas, Héctor Avilés, Luis A. Pineda
IIMAS, UNAM, Ciudad Universitaria
{ivanvladimir,liz,haviles}@turing.iimas.unam.mx, [email protected], [email protected]

Resumen: En este artículo se presenta un sistema conversacional en español hablado y con visión computacional que juega el juego de "adivina la carta" con el público en una cabina de la exhibición permanente del museo de las ciencias Universum de la Universidad Nacional Autónoma de México. Se presenta el modelo conceptual así como la arquitectura del sistema. Se incluye también la transcripción de un fragmento de un diálogo real colectado en el museo, así como una evaluación preliminar del sistema. Se concluye con una reflexión acerca de los alcances de la presente metodología.

Palabras clave: Sistema de diálogo, Administración del diálogo, Sistemas multimodales con habla y visión

Abstract: In this paper a dialogue system with spoken Spanish and computer vision that plays the game "Guess the card" with members of the general public in a permanent stand of the science museum Universum at the National Autonomous University of Mexico (UNAM) is presented. The conceptual and architectural guidelines for the construction of the system are presented. An excerpt of an actual dialogue collected at the museum is also included, along with a preliminary evaluation of the system. The paper is concluded with a reflection about the scope of the present methodology.

Keywords: Dialogue system, dialogue manager, Multimodal Speech and Vision Systems

1. Introduction

∗ We thank the support and effort at IIMAS of Fabian Garcia Nocetti, Wendy Aguilar, Hayde Castellanos, and the visiting students Carlos Hernández, Edith Moya, Aldo Fabian, Karen Soriano, Nashielly Vasquez, Miriam Reyes, Ramón Laguna and Tania Pérez. We also thank the support and help provided at the museum Universum by René Drucker, Lourdes Guevara, Gabriela Guzzy, Luis Morales, Emmanuel Toscano, Jimena Reyes, Brenda Flores, Germán Albizuri, Pablo Flores, Esteban Estrada, Esteban Monroy, Ana Lara, María Agonizantes, Diego Álvarez, Addina Cuerva, Claudia Hernández and León Soriano.

Over the last ten years we have been developing a technological infrastructure for the construction of spoken dialogue systems in Spanish supported by multimodal input and output, including the interpretation of images through computer vision and the display of pictures and animations to support the speech output. We are interested in applications in fixed stands, like the one presented in this paper, but also with mobile capabilities. In the latter case, we developed the robot Golem, which was able to act as the guide of a poster session through a spoken conversation in Spanish. We have now developed a new application to demonstrate this kind of technology in a permanent stand at the Universum science museum of the National Autonomous University of Mexico (UNAM). In this stand the system is able to play the game "Guess the card" with the public, mostly children, through a fluent conversation in spoken Spanish, where the linguistic behavior is coordinated with computer vision and the display of pictures to support the system's output. The stand has a table with ten cards with astronomical motifs on it, and the system chooses one of them; then the human user asks up to four questions to identify the card in question. When this interrogatory is finished, the system asks the user to show the card that he or she thinks was chosen by the system. Finally, the system interprets such a card through computer vision and acknowledges whether the user guessed the card, or tells him or her which one was the right one.

In this project we have focused on the definition and implementation of a generic architecture to support multimodal dialogues, and on the quick development of specific applications. The present architecture is centered around the notion of dialogue model specification and interpretation. Dialogue models are representations of small conversational protocols which are defined in advance through analysis. These units are then assembled dynamically during the interaction, producing rich and natural conversations. The central component of the system is the dialogue manager (DM); this program interprets the dialogue models continuously, and each interpretation act corresponds to a conversational transaction.

The main tenet of our approach is that dialogue acts are expressed and interpreted in relation to a conversational context that is shared between the speaker and the hearer. We posit that the interpretation context has two main parts: a global context that holds for the whole of the conversational domain and needs to be identified in advance through analysis, and a specific context that is built dynamically along each particular conversation. Dialogue models can be thought of as representations of the global context, and the history of the interpretations and actions that are performed in every particular conversation constitutes the specific context. Each particular application is defined as a set of dialogue models, but the DM is a generic, application-independent program.

In this paper we illustrate the present conceptual architecture with the application "Guess the card" at the Universum museum. Section 2 presents the main characteristics of the dialogue manager. The architecture of the system is presented and discussed in Section 3. Section 4 specifies the task and shows an excerpt of an actual dialogue collected at the stand. Section 5 illustrates the dialogue models for this application. A preliminary evaluation of the system is presented in Section 6. The implementation details are presented in Section 7. Finally, in Section 8 we present our conclusions about the implications of the present theory and methodology for the construction of this and similar systems.

2. Dialogue Manager

The dialogue manager interprets the conversational protocols codified in the dialogue models, and coordinates the system's perceptions, both linguistic and visual, with the system's actions. It also keeps track of the dynamic conversational context, which is required to make interpretations and perform actions that depend on the previous communicative events in the current conversation.

We are interested in modeling practical dialogues in which the conversational partners "visit" conversational situations with highly structured expectations about what can be expressed by the interlocutor or about the visual events that can occur in the world; we call these expected intentions or expectations. This information forms a part of the global context and is used in all interpretation acts, and also to produce the corresponding relevant actions.

Situations, expectations and actions of an application domain are encoded through dialogue models. A dialogue model is represented as a directed graph (cycles are permitted). Situations are represented as nodes, and edges are labeled with expectation and action pairs. If the expectation of an edge is satisfied by the current interpretation, then the corresponding action is performed. Situations can have one or more input and output expectation-action pairs. Situations are typed according to the modality of the expected input; the main types are listening (linguistic) and seeing (visual). There is also a special type of situation in which a full dialogue model is embedded; situations of this type are called recursive. When a recursive situation is reached, the current dialogue model is pushed down into a stack and the embedded model is interpreted, so the conversation as a whole has a stack structure. All dialogue models have one or more final situations, and when these are reached, the model's interpretation process is terminated. If there is a dialogue at the top of the stack, it is popped up and its interpretation is resumed; otherwise the dialogue as a whole is terminated. In this sense, dialogue models correspond to recursive transition networks (RTN), which have the same expressive power as context-free grammars.

All dialogue models have an error situation. When the input message or event is not expected, or cannot be assigned an interpretation, the system reaches an error situation and starts a recovery conversational protocol. In the default case, it produces a recovery speech act (e.g., "I didn't understand you, could you repeat it please?"); at this point the dialogue reaches again the situation in which the communication failure occurred and resumes the conversation with the same context. However, the error situation can also embed a full recovery dialogue to handle specific recovery patterns, to achieve grounding at the communication and agreement conversational layers (Clark and Schaefer, 1989; Pineda et al., 2007).

Expected intentions and actions are expressed through abstractions that are independent of the expression used by the interlocutor and of the actual patterns that appear on the visual field of the system. These abstractions allow the system to capture a wide range of possible concrete communication behavior. Accordingly, the analysis of a task domain corresponds to the identification of the speech act protocols that are observed empirically in the domain, and this analysis is codified in the dialogue models.

In our implementation, expectations are expressed through a declarative notation representing speech acts and actions. Actions are also specified declaratively through Multimodal Rhetorical Structures (MRS); these are lists of basic rhetorical acts, defined along the lines of Rhetorical Structure Theory (RST) (Mann and Thompson, 1988). Although the specification of an MRS is also modality independent, the basic rhetorical acts have an associated output modality. Accordingly, an MRS can be thought of as a "paragraph" in which some of its sentences are rendered through speech, while others may be rendered visually, as texts, pictures, animations and video.

The specification of speech acts and rhetorical acts can be expressed through concrete expressions (e.g., constants and grounded predicates), but they can also be expressed through propositional functions. The notation for transitions is illustrated in Figure 1, where the situation sf is reached from si if the corresponding expectation is satisfied; during this transition the MRS is performed by the system. The specification of a transition with a concrete expectation and a concrete action, which can be named through constants, is illustrated in Figure 2. The specification of a transition involving propositional functions is illustrated in Figure 3. In this case the expectation has some concrete information defined in advance in the dialogue model, but its satisfaction requires that some content information, represented by the variable x, is collected from the actual message or event in the world. Expectations and actions in dialogue models are parametric objects, as illustrated in Figure 3. Situations can also have parameters, and this mechanism permits the specification and flow of information along the conversation.

Figura 1: Specification of a situation's transition.

Figura 2: Specification of a concrete intention-action pair.

Figura 3: Specification through propositional functions.

Concrete expectations and actions interpreted and performed by the system (i.e., grounded interpretations and action specifications) are collected in the conversation history, where the stack structure of the dialogue is also preserved. In order to access the dialogue history, expectations and actions can also be specified through domain-specific functions, as illustrated in Figure 4. The arguments of these functions are domain-specific information, the current dialogue model, and the conversation history. When the labels of an edge are specified through functions, these are evaluated first, and the values of the functions determine the system behavior (i.e., the actual expectation that needs to be satisfied to follow the corresponding edge, the action that is produced by the system, or the conversational situation that is reached if the expectation is satisfied). This functional machinery also permits the resolution of terms and expressions on the basis of the discourse information (i.e., anaphoric inferences). The definition of these functions extends the expressive power of the formalism, but preserves an implicit graph-directed process, with the corresponding computational advantages. For this reason we call this formalism Functional Recursive Transition Networks (F-RTN).

Figura 4: Example of a functional arc.

mits also the resolution of terms an expressions on the basis of the discourse information (i.e., anaphoric inferences). The definition of these functions extends the expressive power of the formalism, but preserves an implicit graph directed process, with the corresponding computational advantages. For this reason we call this formalism Functional Recursive Transition Networks (F-RTN).
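To make the mechanism concrete, the following is a minimal sketch in Python (not the authors' actual implementation) of how an F-RTN edge with functional labels can be evaluated; the situation names, the "feature_question" history entries and the helper functions are illustrative assumptions only.

    # Minimal sketch of a Functional Recursive Transition Network (F-RTN) edge.
    # Edge labels (expectation, action, next situation) may be constants or
    # functions of the domain knowledge, the dialogue model and the history.

    def next_situation(domain, model, history):
        # Functional label: choose the next situation from the dialogue
        # history, e.g., loop while fewer than four questions have been asked.
        return "l4" if history.count("feature_question") < 4 else "n4"

    def action(domain, model, history):
        # Functional label: choose a rhetorical structure depending on progress.
        remaining = 4 - history.count("feature_question")
        return [("speech", f"You have {remaining} questions left.")]

    edges = {
        # situation -> list of (expectation, action, next situation)
        "n3": [("empty", action, next_situation)],
    }

    def follow(situation, interpretation, domain, model, history):
        """Evaluate functional labels first, then follow the matching edge."""
        for expectation, act, nxt in edges[situation]:
            exp = expectation(domain, model, history) if callable(expectation) else expectation
            if exp == interpretation:  # the expectation is satisfied
                mrs = act(domain, model, history) if callable(act) else act
                target = nxt(domain, model, history) if callable(nxt) else nxt
                return mrs, target
        return None, "error"  # unexpected input: start the recovery protocol

    history = ["feature_question", "feature_question"]
    print(follow("n3", "empty", {}, {}, history))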

Figura 4: Example of a functional arc.

3. Architecture

Linguistic and visual events need to be recognized and interpreted before they can be matched with the expectations defined in the dialogue models. That is, the input information needs to be interpreted in order to be used in the conversation. To model the relation between linguistic and visual interpretations and the information codified in the dialogue models we have developed a three-layer conceptual architecture; the top level corresponds to the interpretation of dialogue models, as mentioned. The bottom level corresponds to the recognition level, in which the external information (e.g., speech or images) is translated into an internal data structure. However, the product of a recognition process at this bottom level is thought of as an "uninterpreted image" or "pattern"; that is, as an instance of a data structure which has not been assigned a meaning. (For example, if a person does not know Greek, but is familiar with the Greek alphabet, he or she can recognize that a text written in Greek is in fact a text, but is unable to tell what it means. So, for this person, a string of Greek symbols is an uninterpreted image.) The architecture also contains an intermediate level, which is constituted by modality-specific interpreters; the role of these interpreters is to assign meanings to the uninterpreted images in terms of the expectations of the current conversational situation. In our scheme, expectations are also used as indices to memory objects, and an interpretation act consists of matching a memory object indexed by a current expectation with the uninterpreted image produced by the recognition device. The output of this process is the interpretation of the speech act performed by the interlocutor, or the interpretation of the event perceived in the world. Interpretations (i.e., the output of the interpretation process) and expectations are codified in the same format in the dialogue models. In our scheme, the bottom and intermediate levels correspond to low- and high-level perception, and we think of perception as a process that takes an external stimulus and produces its interpretation in relation to the context. The interpretation process is illustrated in Figure 5. For the present application we define a speech and a visual perception process. The goal of speech perception is to assign an interpretation to the speech act performed by the human user. When the DM reaches a listening situation it feeds the language interpreter with a set of expectations in a top-down fashion. In the present implementation each expected intention has an associated regular expression stored in memory, which codifies a large number of ways to state such an intention. The language interpreter recovers this regular expression and applies it to the text produced by the ASR system; if the match is successful, the expected intention, with the values recovered from the input speech, becomes the interpretation. Figure 5 illustrates this flow of information. Visual interpretation proceeds in the same way. In this case, when the DM reaches the seeing situation, it expects to see one among the ten cards, and this information is fed top-down from the DM to the visual interpreter. This in turn activates the vision recognition module, which provides a set of features codifying the image of the card in a bottom-up fashion. Visual expectations are also indices to the images of the cards, and this association is also codified in memory. For visual interpretation, the features of the external image are matched with the cards codified in memory, and the interpretation corresponds to the expectation with the largest number of matches. For the system's output, when an expectation is matched with the interpretation of the current input stimulus the system performs the corresponding action (i.e., as defined by its associated MRS). MRSs are also abstract specifications that need to be instantiated in terms of the specific input interpretation and the dynamic context. This instantiation is performed by modality-specific programs, and rendered by modality-specific devices (e.g., the speech synthesizer and the display drivers to render texts, pictures or videos).

Figura 5: Dialogue system architecture
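As a rough illustration of the top-down speech interpretation step just described, the following sketch matches the expectations active in a listening situation against the ASR output; the regular expressions and intention names are invented for the example and are not the system's actual patterns.

    import re

    # Each expected intention is associated with a regular expression stored
    # in memory, codifying many ways of stating that intention (Spanish toy
    # examples).
    expectation_patterns = {
        "feature(x)": re.compile(r"es (?:de color )?(?P<x>rojo|redondo|grande)"),
        "card(x)":    re.compile(r"es (?:el|la) (?P<x>sol|telescopio|luna)"),
    }

    def interpret(asr_text, active_expectations):
        """Return the first expected intention whose pattern matches the
        input, instantiated with the content values recovered from speech."""
        for expectation in active_expectations:
            match = expectation_patterns[expectation].search(asr_text.lower())
            if match:
                return expectation.replace("x", match.group("x"))
        return None  # no expectation satisfied: go to the error situation

    print(interpret("¿Es redondo?", ["feature(x)", "card(x)"]))  # feature(redondo)
    print(interpret("¿Es el sol?",  ["feature(x)", "card(x)"]))  # card(sol)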

4. Task

The present application plays the game "guess the card". The goal of the human user is to guess a card that has been chosen by the system from a set of ten cards. The cards have astronomical motifs (e.g., the sun, a telescope, etc.). Since the system is placed in a science museum oriented to children, most users are children aged between 10 and 14 years. In the stand there is a table with the cards on it; in a typical session the system introduces itself, asks for the name and age of the child, and explains the rules of the game. At this point, the system tells the user that it has chosen one of the cards. The child is then allowed to ask up to four questions about the features of the card in question. At the end of the interrogation, the child is asked to take the card that he or she thinks is the right one and place it in front of the system's camera; then, the system confirms whether the child has won the game, or tells the child which card was the right one. Table 1 shows an excerpt of a dialogue between a user and the system. Each row in the table contains the turn number, the speaker and the utterance, together with the visual input and output produced by the user and the system. In turns 1 to 3 the system greets the user and asks for his or her name. Turns 40 to 47 correspond to two questions with the corresponding system answers; this fragment includes two failures with the corresponding recovery strategies. Finally, in turns 48 to 52 the user shows the card and the system answers. In particular, in this dialogue the system identified the card successfully on the second attempt, and it was in fact the card chosen by the system.

Turn  Speaker  Interaction
1     S        Hola a todos me llamo Golem y éste es el juego de adivina la carta. Tenemos jugador, ¿cómo te llamas?/Hello everyone, my name is Golem and this is the game of guess the card. We have a player, what is your name?
2     U        Olimpia
3     S        Hola Olimpia, gusto en conocerte, ¿cuántos años tienes?/Hello Olimpia, nice to meet you, how old are you?
...
40    U        ¿redondo?/round?
41    S        Sí, vas bien. Lo que hay en la carta es redondo. Sólo te queda esta última pregunta así es que piénsala muy bien/Yes, you are right. In the card there is something round. This is your last chance, think it well
42    U        –
43    S        ¿Podrías repetirlo?/Could you repeat it?
44    U        –
45    S        ¿Podrías repetirlo?/Could you repeat it?
46    U        ¿Es un planeta?/Is it a planet?
47    S        Lo siento, en mi carta no aparece un planeta/I am sorry, there is no planet in the card
48    S        Ha finalizado tu ronda de preguntas, por favor muéstrame la carta que crees que escogí/You have no chances left, please show me the card you think I chose.
49    U        [The user shows the card]
50    S        No tengo muy buena vista, ¿es ésta la carta que me mostraste?/I do not have good eyesight, is this the card you showed me?
51    U        No no/No no
52    S        Ok, muéstrame la carta/Okay, show me the card

Cuadro 1: Excerpt of a dialogue between the user (U) and the system (S).

5. Dialogue Models for the Application

This application has two dialogue models. The main one defines the greeting part, the choosing of the card, the actual cycle of questions and answers (as a recursive situation) and the final salutation. The embedded model handles the interrogation and the verification of the child's choice. Figure 6 illustrates the embedded dialogue model (r1(C)), which has the card chosen by the system as its parameter. At this point the system asks the first question; no input is required, since the input edge is labeled with an empty expectation (the ε symbol). The system then reaches the listening situation l4 and waits for an answer. This situation has two output edges, so there are two active expectations at this point of the dialogue. One is that the user asks for a feature of the card (e.g., "is it red?") and the other is that the user states which card is in question (e.g., "is it the sun?"). This latter expectation is satisfied when the child feels that he or she has enough information, regardless of whether the four questions have been asked. Notice that this is an indirect speech act, as an assertion has been made through a question; however, the system performs the right interpretation in terms of the reference to the entity (i.e., the sun). Questions about features are interpreted through similar referential heuristics. If the user asks for a feature, the expectation feature(x) is satisfied, the MRS validate(C, X) is performed and the situation n3 is reached. This MRS checks whether the card chosen by the system has such a feature, and renders a relevant feedback text (through speech) and picture (displayed on the screen). However, if the child states the card, the system asks him or her to confirm whether he or she wants to end the game, and situation l5 is reached. Situation n3 has only one output edge with an empty expectation; however, both the MRS and the next situation of this edge are specified through functions that depend on the dialogue history, in particular on how many questions have been asked. So the actual rhetorical act produced at each transition varies depending on how advanced the dialogue is; in case there are still questions to be asked, the situation l4 is reached again; however, when the child has exhausted the four chances, the dialogue reaches the situation n4. This situation also has only one output edge with an empty expectation, and the situation s1 is reached deterministically. This is a seeing situation in which the system asks the child to show it the card, entering the final part of the dialogue. The remaining nodes and edges of the graph are understood in a similar way. A minimal encoding of this embedded model, with the machinery sketched above, is given after Figure 6.

Figura 6: Example of a dialogue model for guessing a card
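The following hypothetical Python sketch mirrors Figure 6, encoding r1(C) as a small situation graph; the expectation and action names follow the text above, and the two lambdas stand in for the history-dependent functional labels (here simplified to take the number of questions asked so far).

    # Sketch of the embedded dialogue model r1(C); C is the chosen card.
    EMPTY = "empty"

    def r1(C):
        return {
            "start": [(EMPTY, "ask_first_question", "l4")],
            # Listening situation with two active expectations:
            "l4": [("feature(x)", f"validate({C}, x)", "n3"),   # "is it red?"
                   ("card(x)",    "confirm_end_game",  "l5")],  # "is it the sun?"
            # MRS and next situation depend on how many questions were asked:
            "n3": [(EMPTY,
                    lambda asked: "next_question" if asked < 4 else "ask_show_card",
                    lambda asked: "l4" if asked < 4 else "n4")],
            "n4": [(EMPTY, "ask_show_card", "s1")],  # s1 is a seeing situation
        }

    model = r1("sun")
    print(model["l4"])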

6. Evaluation

We have performed a preliminary evaluation of our system using the user-satisfaction questionnaire of the PARADISE framework (Walker et al., 1997). For this, ten children played the game. All the children finished the game, and four of them guessed the right card. On average, there were 33.64 user turns. The visual system was able to identify the card at a rate of 1.18 tries per card. Table 2 summarizes the results obtained for the user-satisfaction questionnaire. The game by itself is hard for the children, as can be seen in the task ease and user expertise factors. The children usually hesitate about what to ask, even though they are presented with suggestions by the system. The quality of the ASR system needs to be improved considerably; in particular, the interpretation of the names and ages of the children has proven difficult. However, despite these shortcomings, the majority of the children would like to play with the system again.

Factor             Percentage
TTS Performance    90 %
ASR Performance    50 %
Task ease          50 %
Interaction pace   80 %
User expertise     50 %
System response    70 %
Expected behavior  60 %
Future use         90 %

Cuadro 2: Percentage of the "yes" answers to the user-satisfaction questionnaire.

7. Implementation

The dialogue manager is implemented in Prolog. The modality-specific interpreter and recognition modules are defined as independent processes, implemented with different programming languages and environments. For the control and communication between processes we use the Open Agent Architecture framework (Cheyer y Martin, 2001). For the ASR system we use the Sphinx3 system (Huerta, Chen, y Stern, 1999). In particular, for the system presented here we developed a speech recognizer for children, who are our main users. For this we collected the Corpus DIMEx100 Children, a corpus based on our previous work with the Corpus DIMEx100 Adults (Pineda et al., 2009): 100 children were recorded reading the same 5,000 sentences as in the adult version. With this setting we were able to get a 47.5% word error rate (WER) with a basic language model based on the sentences of the corpus. This performance is comparable with the 48.3% WER we obtain with the adult version of the corpus, which has been further validated (Pineda et al., 2009).

For visual perception we use feature extraction and matching based on the Speeded-Up Robust Features (SURF) algorithm (Bay et al., 2008). This algorithm consists of three main steps: i) detection of interest points, ii) description of interest points, and iii) object matching. Detection of interest points is based on the determinant of the Hessian matrix (approximated by simple weighted box filters) to detect extremal pixels (i.e., pixels with the darkest or lightest values) across a scale-space representation of the image; this representation is useful to achieve size invariance. The description of each interest point is composed of sums of 2D Haar wavelet responses, reflecting the intensity changes of square patches around the interest point. Integral images (Viola y Jones, 2004) are used to speed up the convolutions. Object matching is carried out by nearest-neighbor search, using the trace of the Hessian matrix to distinguish between bright interest points on dark backgrounds and the inverse setting. Although in this system we consider card identification only, these ideas can be extended to include different visual analysis tasks. For example, in (Aviles et al., 2010) we have used this architecture to identify posters, as well as poster regions chosen by users through pointing gestures, within the context of a tour-guide robot. Our SURF implementation is based on OpenCV (Bradski y Kaehler, 2008) with a naive nearest-neighbor search.
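As a rough sketch of the card-identification step, assuming an OpenCV build that includes the contrib SURF module (the card file names and the ratio-test threshold are illustrative additions, not taken from the paper):

    import cv2

    surf = cv2.xfeatures2d.SURF_create(hessianThreshold=400)
    matcher = cv2.BFMatcher(cv2.NORM_L2)  # brute-force nearest neighbors

    def descriptors(path):
        image = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
        _, des = surf.detectAndCompute(image, None)
        return des

    # Descriptors of the cards codified in memory (illustrative file names).
    cards = {name: descriptors(f"cards/{name}.png")
             for name in ["sun", "telescope", "moon"]}

    def identify(frame_path):
        """Return the card whose stored image gets the most good matches."""
        des = descriptors(frame_path)
        scores = {}
        for name, card_des in cards.items():
            matches = matcher.knnMatch(des, card_des, k=2)
            # A standard ratio test keeps only distinctive matches.
            scores[name] = sum(1 for pair in matches if len(pair) == 2
                               and pair[0].distance < 0.7 * pair[1].distance)
        return max(scores, key=scores.get)

    print(identify("camera_frame.png"))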

8. Conclusions

In this paper we have presented a multimodal application with spoken language and vision, developed within the framework for the specification and interpretation of dialogue models that we have developed over the years. The present system shows that this methodology and programming environment are mature enough to build real applications for the general public, in museums and similar kinds of environments, in a relatively short amount of time. Our methodology is focused on practical dialogues for task-oriented applications that can be characterized through generic conversational protocols, which can be found through analysis. The notion of global and specific context makes it possible to interpret dialogue or speech acts in a simple way, without extensive syntactic and semantic analysis, as the context constrains the possible interpretations very heavily. The expressive power of F-RTNs and the specification of abstract expectations and actions make it possible to model the conversational domain through simple protocols that nevertheless generate rich and diverse conversational behavior. The present application has been evaluated in a preliminary way, and the current results suggest that the quality of the ASR system needs to improve considerably for the construction of robust applications. Although the present system is operational, and most children are able to complete the game and are willing to play it again, showing a reasonable degree of user satisfaction, communication failures are still quite frequent, and there is considerable conversational effort expended on recovery dialogues. Nevertheless, we are confident that the present methodology has good potential for future applications.

Bibliografía

Aviles, Hector, Ivan Meza, Wendy Aguilar, y Luis Pineda. 2010. Integrating Pointing Gestures into a Spanish-spoken Dialog System for Conversational Service Robots. To appear.

Bay, Herbert, Andreas Ess, Tinne Tuytelaars, y Luc Van Gool. 2008. SURF: Speeded-Up Robust Features. Computer Vision and Image Understanding, 110(3):346–359.

Bradski, G. y A. Kaehler. 2008. Learning OpenCV: Computer Vision with the OpenCV Library. O'Reilly.

Cheyer, Adam y David Martin. 2001. The Open Agent Architecture. Journal of Autonomous Agents and Multi-Agent Systems, 4(1):143–148, March.

Clark, H. H. y E. F. Schaefer. 1989. Contributing to discourse. Cognitive Science, 13:259–294.

Huerta, J. M., S. J. Chen, y R. M. Stern. 1999. The 1998 Carnegie Mellon University Sphinx-3 Spanish broadcast news transcription system. En Proc. of the DARPA Broadcast News Transcription and Understanding Workshop.

Mann, W. C. y S. Thompson. 1988. Rhetorical structure theory: Towards a functional theory of text organization. Text, 8(3):243–281.

Pineda, L., V. Estrada, S. Coria, y J. Allen. 2007. The obligations and common ground structure of practical dialogues. Revista Iberoamericana de Inteligencia Artificial, 11(36):9–17.

Pineda, Luis, H. Castellanos, J. Cuétara, L. Galescu, J. Juárez, J. Llisterri, P. Pérez, y L. Villaseñor. 2009. The Corpus DIMEx100: Transcription and evaluation. Language Resources and Evaluation.

Viola, Paul A. y Michael J. Jones. 2004. Robust Real-time Object Detection. International Journal of Computer Vision, 57(2):137–154.

Walker, Marilyn A., Diane J. Litman, Candace A. Kamm, y Alicia Abella. 1997. PARADISE: A framework for evaluating spoken dialogue agents. Páginas 271–280.

Procesamiento del Lenguaje Natural, Revista nº 44, marzo de 2010, pp 139-145

recibido 21-01-10 revisado 17-02-10 aceptado 05-03-10

An approach to Recognizing Textual Entailment and TE Search Task using SVM
Una aproximación al RTE y a la Tarea de Búsqueda de Implicación Textual usando Máquinas de Soporte Vectorial

Julio Javier Castillo
Faculty of Mathematics, Astronomy and Physics
National University of Cordoba, Argentina
[email protected]

Abstract: This paper presents a Recognizing Textual Entailment system and a sub-system that addresses the Textual Entailment Search task. The system employs a Support Vector Machine classifier with a set of 32 features, which include lexical and semantic similarity, for both the two-way and the three-way classification tasks. Additionally, we show an approach to the problem of searching for entailment in a set of documents that uses co-reference analysis.
Keywords: Textual entailment, machine learning, lexical features.
Resumen: En este trabajo se presenta un sistema de Reconocimiento de Implicación Textual, y un subsistema que permite atacar el problema de la búsqueda de implicaciones. El sistema utiliza un clasificador de Máquina de Soporte Vectorial con un conjunto de 32 características, las cuales incluyen similaridad léxica y semántica para las tareas de clasificación de dos y tres vías. Adicionalmente, se presenta una aproximación inicial al problema de búsqueda de implicancias en un conjunto de documentos que utiliza análisis de correferencias.
Palabras clave: Implicancia Textual, aprendizaje automático, características léxicas.

1 Introduction

The objective of the Recognizing Textual Entailment (RTE) Challenge is to determine whether the meaning of a Hypothesis (H) can be inferred from a Text (T). In RTE5 the texts come from a variety of sources and include typographical errors and ungrammatical sentences. In contrast to previous RTE campaigns, RTE5 is based on only three application settings: QA, IE, and IR. There is also a new Textual Entailment Search Pilot Task, situated in the summarization application setting, whose goal is to find all the Texts in a set of documents that entail a given Hypothesis.

In this paper, we present a system that addresses the textual entailment recognition main task and the textual entailment search pilot task. The system applies a Support Vector Machine (SVM) approach to the problem of recognizing textual entailment. Our system works almost exclusively with lexical features, with the aim of exploring more deeply how lexical information can help in the RTE task. Thus, we use 31 lexical features and only 1 semantic feature based on WordNet. These features are used to characterize the relationship between text and hypothesis for both training and test cases.

The remainder of the paper is organized as follows: Section 2 describes the architecture of our system, whereas Section 3 shows the experimental evaluation and a discussion of the results. Finally, Section 4 summarizes the conclusions and lines for future work.

2 System Architecture

This section provides an overview of our system, which is based on a machine learning approach to RTE. We use a supervised machine learning approach to train an SVM classifier over a variety of lexical and semantic metrics. The output of every metric is treated as a feature and used in the training step, taking the RTE3 devset, the RTE4 annotated set, and the RTE5 devset as training datasets. Figure 1 presents a brief overview of the system.

Figure 1: General architecture of the system for RTE5.

First, the pairs are pre-processed with optional modules, as described later. Second, we compute 32 features that belong to two different categories: lexical and semantic metrics. Finally, we use an SVM classifier for the 2-way and 3-way classification tasks, classifying the RTE-5 test pairs into three classes: entailment, contradiction or unknown.

2.1. Preprocessing

The Preprocessing module has three optional submodules, as needed by the different features:

Tokenizer: The text-hypothesis pairs are tokenized with the tokenizer of the OpenNLP framework (http://opennlp.sourceforge.net/).
Stemmer: Text-hypothesis pairs are stemmed with Porter's stemmer (http://tartarus.org/~martin/PorterStemmer/) (Lesk, 1986).
Tagger: Text-hypothesis pairs are PoS tagged with the tagger of the OpenNLP framework.

We tested three runs differing only in the preprocessing stage. For RUN1 we use 800 pairs of the RTE3 devset, 1000 pairs of the RTE4 testset and 600 pairs of the RTE5 devset as training set; therefore, 2400 pairs are used for training purposes. RUN1 is trained with the union of the following datasets: RTE3 devset + RTE4 testset + RTE5 devset. On the other hand, RUN2 is trained with the union of the same datasets, but without the SUM sample test pairs; therefore, 2000 pairs are used as training set. Finally, RUN3 is the result of applying three Support Vector Machines, SVM_QA, SVM_IR, and SVM_IE, trained over RTE3-DS + RTE4-TS + RTE5-DS. SVM_QA is an SVM trained only on the QA pairs of the datasets RTE3 devset + RTE4 testset + RTE5 devset. In the same way, SVM_IR and SVM_IE are trained only on the IR and IE pairs, respectively. The training set for RUN3 is composed of 600 QA pairs, 700 IE pairs, and 700 IR pairs. Table 1 shows the training set composition used for every SVM.
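A minimal sketch of this metrics-as-features pipeline is shown below; the paper does not state which SVM implementation it uses, so scikit-learn and the two toy metrics here are placeholder assumptions standing in for the 32 real metrics.

    import difflib
    from sklearn.svm import SVC

    def word_overlap(t, h):
        t_words, h_words = set(t.lower().split()), set(h.lower().split())
        return len(t_words & h_words) / max(len(h_words), 1)

    def levenshtein_ratio(t, h):
        # Placeholder for the character-based edit-distance feature.
        return difflib.SequenceMatcher(None, t, h).ratio()

    METRICS = [word_overlap, levenshtein_ratio]  # ... up to 32 metrics

    def features(pair):
        t, h = pair
        return [metric(t, h) for metric in METRICS]

    train_pairs = [("John bought a car.", "John owns a car."),
                   ("The sky is blue.", "The sky is green.")]
    labels = ["ENTAILMENT", "CONTRADICTION"]  # 3-way also uses UNKNOWN

    clf = SVC(kernel="rbf")
    clf.fit([features(p) for p in train_pairs], labels)
    print(clf.predict([features(("Mary has a dog.", "Mary owns a dog."))]))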

Datasets         Pairs
RTE3-DS_QA       200
RTE4-TS_QA       200
RTE5-DS_QA       200
Total QA pairs   600
RTE3-DS_IE       200
RTE4-TS_IE       300
RTE5-DS_IE       200
Total IE pairs   700
RTE3-DS_IR       200
RTE4-TS_IR       300
RTE5-DS_IR       200
Total IR pairs   700

Table 1: Training set composition for the QA, IR and IE SVMs.

The motivation of the input features was to test our system over a wide range of lexical features and to determine whether this approach could improve our performance.

2.2. Features

We use a supervised machine learning approach to train a classifier over a variety of lexical and semantic metrics. Thus, we use the output of each metric as a feature, and train an SVM classifier. For this purpose, we use 32 features/metrics over Text (T) and Hypothesis (H), as explained below. The first 12 features do not require additional explanation. (1) Percentage of words of Hypothesis in the text. (2) Percentage of words of text in Hypothesis. (3) Percentage of bigrams of Hypothesis in Text. (4) Percentage of trigrams of Hypothesis in the text. (5) TF-IDF measure. (6) Standard Levenshtein distance (Levenshtein, 1966) (character based). (7) Percentage of words of Hypothesis in the text. (8) Percentage of words of text in Hypothesis (over stems). (9) Percentage of bigrams of Hypothesis in the Text (over stems). (10) Percentage of trigrams of Hypothesis in Text (over stems). (11) TF-IDF measure (over stems). (12) Levenshtein distance (over stems). (13) String similarity based on Levenshtein distance using WordNet, as defined in (Castillo, 2008). (14) Semantic similarity using WordNet (Castillo et al., 2008). (15) Longest common substring:

Given two strings, T of length n and H of length m, the Longest Common Substring (LCS) method (Levenshtein, 1966) finds the longest string which is a substring of both T and H. It is computed by dynamic programming.

lcs(T, H) = Length(MaxComSub(T, H)) / min(Length(T), Length(H))

In all practical cases, min(Length(T), Length(H)) will be equal to Length(H). Therefore, all values will be numerical values in the [0,1] interval. Before computing the LCS, texts were tokenized and stemmed.

(16) Block distance. (17) Chapman length deviation. (18) Chapman mean length. (19) Cosine similarity. (20) Dice similarity. (21) Euclidean distance. (22) Jaccard similarity. (23) Jaro. (24) Jaro-Winkler. (25) Matching coefficient. (26) Monge-Elkan distance (Agichtein et al., 2008). (27) Needleman-Wunsch distance. (28) Overlap coefficient. (29) QGrams distance. (30) Smith-Waterman distance. (31) Smith-Waterman with Gotoh. (32) Smith-Waterman with Gotoh windowed affine.

Features 1 to 5, 7, and 16 to 32 were treated as bags of words. On the other hand, features 8 to 12 were treated as bags of stems. Features 16 to 32 were calculated with the SimMetrics library (http://sourceforge.net/projects/simmetrics/) over the strings T and H, following the traditional definition of each metric.
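Feature (15) renders directly as code; the sketch below is an illustrative token-level implementation of the lcs(T, H) formula above, consistent with the tokenize-and-stem description (the example strings are invented).

    def lcs_ratio(t_tokens, h_tokens):
        """lcs(T, H) = Length(MaxComSub(T, H)) / min(Length(T), Length(H)),
        computed over token sequences by dynamic programming."""
        n, m = len(t_tokens), len(h_tokens)
        table = [[0] * (m + 1) for _ in range(n + 1)]
        longest = 0
        for i in range(1, n + 1):
            for j in range(1, m + 1):
                if t_tokens[i - 1] == h_tokens[j - 1]:
                    table[i][j] = table[i - 1][j - 1] + 1
                    longest = max(longest, table[i][j])
        return longest / min(n, m)

    t = "the cat sat on the mat".split()
    h = "the cat sat".split()
    print(lcs_ratio(t, h))  # 1.0: H is a contiguous substring of T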

2.3. Textual Entailment Search Pilot Task

In order to move towards more realistic scenarios and to start testing RTE systems against real data, textual entailment search is proposed. The Textual Entailment Search Pilot Task has the goal of analyzing the potential impact of textual entailment recognition on a real NLP application task. The Textual Entailment Search task consists of finding all the sentences in a set of documents that entail a given Hypothesis. The systems must find all the entailing sentences (Ts) in a corpus of 10 newswire documents about a common topic. Thus, the main difference with respect to the main task is that in the Entailment Search task both Text and Hypothesis have to be interpreted in the context of the corpus.

In this proposal, we show a textual entailment search subsystem based on coreference analysis. The assumption is that by using coreference analysis we will be able to recognize true and false entailments in the context of the corpus that T and H belong to. As coreference tool we use the OpenNLP toolkit. Our system has an extension to deal with the Textual Entailment Search problem: a new module that performs the following algorithm:

1) Append a Hypothesis hi to the document Dj.
2) Compute a coreference analysis over all documents Dj.
3) Identify all coreferences that refer to the same entity.
4) Take the longest reference and replace all its occurrences in the document.
5) Repeat for every Topic, Document and Text.

The following example shows how an entity is replaced by an equivalent entity, adding redundant information. The example was extracted from the RTE Search Pilot devset:

[French President Jacques Chirac, 16] [Chirac, 16]

Here the first string represents the noun phrase that is being referenced, and the number is a unique reference id in the whole document. Thus, the algorithm selects "French President Jacques Chirac" and replaces all references with the same id by this noun phrase. Sometimes the result will not be a syntactically correct sentence; however, it will be human understandable, and we expect that the overall sense of the sentence will not be changed. It is also important to note that the previous algorithm could, in some cases, transform the hypothesis. For example, hypothesis 28 in the testset is transformed as:

H28: "Bobby Fischer faced deportation to the United States."
H28-modified: "Fischer, an outspoken critic of the U.S government, faced deportation to the United States."

So, from a semantic point of view, H28-modified provides more information but is equivalent for our textual entailment task, because we replace the same entity in all its occurrences in the document. Once this process is performed, every pair of a document is taken and fed into the system, as explained before, following the RUN1 preprocessing procedure, but with True/False outputs.

2.3.1. Features used in the Textual Entailment Search Task

In the Textual Entailment Search Task we use only four features, namely (12), (13), (14) and (15), as explained before. We chose only a limited set of features due to computational constraints: processing all coreferences for all documents is a very slow process. The motivation of the input features is the following. Levenshtein distance (12) is motivated by the good results it obtains as a measure of similarity between two strings; computed over stems, this measure improves on Levenshtein over words. The lexical distance feature (13), based on Levenshtein distance, is interesting because it works at sentence level. Semantic similarity using WordNet (14) is interesting because it captures the semantic similarity between T and H at sentence level. Longest common substring (15) is selected because it is easy to implement and provides a good measure of word overlap.
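The mention-replacement step of the algorithm in Section 2.3 can be sketched as follows; the mention data structure is hypothetical, standing in for the output of the OpenNLP coreference tool.

    import re

    def expand_coreferences(text, mentions):
        """mentions: list of (surface_string, ref_id) pairs, e.g.
        [("French President Jacques Chirac", 16), ("Chirac", 16)].
        Replace every mention of an entity by the longest mention
        of its coreference chain."""
        longest = {}
        for surface, ref_id in mentions:
            if len(surface) > len(longest.get(ref_id, "")):
                longest[ref_id] = surface
        for surface, ref_id in sorted(set(mentions), key=lambda m: -len(m[0])):
            target = longest[ref_id]
            if surface == target:
                continue
            pattern = re.compile(r"\b%s\b" % re.escape(surface))
            # Split on the long form first, so mentions that are already
            # expanded are protected from partial re-replacement.
            pieces = text.split(target)
            pieces = [pattern.sub(target, p) for p in pieces]
            text = target.join(pieces)
        return text

    doc = "Chirac arrived on Monday. French President Jacques Chirac spoke later."
    mentions = [("French President Jacques Chirac", 16), ("Chirac", 16)]
    print(expand_coreferences(doc, mentions))
    # Both mentions now read "French President Jacques Chirac".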

3 Experimental Evaluation and Discussion of Results

3.1. Results: RTE5 main task

Our results on the RTE5 testset for the two-way and three-way classification tasks are summarized in Table 2. In this table, we compare our results with those obtained by a selected set of systems that submitted their results to the Text Analysis Conference (TAC) 2009, which is a common evaluation framework. Thus, the high score and low score of the RTE5 participants, and an ablation test, are shown below.

RTE Systems          Acc 2-way   Acc 3-way
Best System Score    0.7350      0.6833
Median Score 2-way   0.6117      ---
X_abl-1              0.5517      0.53
RUN1                 0.5517      0.5217
RUN2                 0.545       0.52
Median Score 3-way   ---         0.52
RUN3                 0.5483      0.5183
Low Score            0.50        0.4383

Table 2: Results obtained in the two-way and three-way classification tasks for the RTE5 testset.

We note that RUN1 consists of 2400 pairs, RUN2 consists of 2000 pairs, and RUN3 consists of 600 QA pairs, 700 IR pairs and 700 IE pairs. This suggests that RUN1 reaches our best performance because it has more samples in its training set, despite the fact that it includes the SUM sample pairs. However, RUN2 and RUN3 do not show a significant difference with respect to RUN1: for the two-way and three-way tasks, only slight and not statistically significant differences of 0.34% and 0.67%, respectively, are found between the best and worst RUN. The performance of all runs was clearly above the low scores; however, our results are far from the best system score. RUN1 was trained using the full RTE3 devset + RTE4 testset + RTE5 devset. Our best performance was achieved with RUN1: 55.17% and 52.17% accuracy for the two-way and three-way tasks, respectively. The accuracy of this run for the two-way task is placed 5% below the median score; on the other hand, it is placed 2.17% over the median score for the three-way task. Thus, we conclude that this lexical approach is very preliminary and needs to be improved in several ways.

An ablation test is a procedure that consists of "disconnecting" one module (using a knowledge resource) of the system, in order to assess the contribution of that module to the overall accuracy of the system. Ablation tests are very important because they allow collecting data for a better understanding of the impact of the knowledge resources used by RTE systems, and evaluating the contribution of each resource to system performance. We performed an ablation test of the WordNet resource. It is implemented by removing two features from the feature vector and working with 30 features: WordNet has been ablated from RUN1 by removing features 13 and 14 from the feature vector and then rerunning the system on the test set. The results obtained are named "X_abl-1" and shown in Table 2. A sketch of this procedure is given after Table 3.

Interestingly, the ablation of these two features does not produce any modification in the two-way classification task, and produces a very slight and not statistically significant increase of 0.83% in the three-way task. In addition, removing feature 14 (the only one that deals with semantic similarity) does not impact the overall classification. Table 3 shows the results obtained in the RTE two-way and three-way classification tasks for every RUN and subtask. The IR subtask always yields the best results, maybe because this dataset is the easiest subtask to predict.

Task   RUN1 3-way   RUN1 2-way   RUN2 3-way   RUN2 2-way   RUN3 3-way   RUN3 2-way
IR     0.65         0.69         0.63         0.67         0.63         0.65
IE     0.41         0.44         0.41         0.44         0.45         0.48
QA     0.5          0.52         0.52         0.54         0.48         0.53

Table 3: Accuracy results divided by task and run.

Finally, we note that, interestingly, using one SVM per task we obtain similar results while using only 700 pairs.
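The ablation procedure can be sketched under the same hypothetical scikit-learn pipeline used earlier; feature indices 13 and 14 are the WordNet-based features, and the variable names are illustrative.

    from sklearn.svm import SVC
    from sklearn.metrics import accuracy_score

    def ablate(X, ablated):
        # Drop the ablated feature columns (1-based indices, as in the paper).
        return [[v for i, v in enumerate(row, start=1) if i not in ablated]
                for row in X]

    def run_ablation(X_train, y_train, X_test, y_test, ablated=(13, 14)):
        full = SVC().fit(X_train, y_train)
        base = accuracy_score(y_test, full.predict(X_test))
        abl = SVC().fit(ablate(X_train, ablated), y_train)
        abl_acc = accuracy_score(y_test, abl.predict(ablate(X_test, ablated)))
        # Comparing the two accuracies measures the resource's contribution.
        return base, abl_acc

    # X_* are lists of 32-dimensional feature vectors, y_* the gold labels.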

3.2. Results: Textual Entailment Search Task

We use the Search Task Development Set (ST) and RTE3+ST as training sets. We take all true cases and automatically generate false cases from the development set in order to build a balanced training set. Table 4 summarizes our results for the Textual Entailment Search Task; the high score and low score of the RTE5 participants in TAC 2009 are also shown.

RTE Systems         F-Measure   Precision   Recall
High Score          0.4559      ---         ---
RUNs Median Score   0.3012      ---         ---
RUNnew              0.287       0.214       0.395
RUN1                0.1816      0.1016      0.855
Low Score           0.0955      ---         ---

Table 4: Result submission for the Textual Entailment Search Pilot Task.

Also, we performed additional experiments testing 3 machine learning algorithms: Decision Trees, SVM, and Multilayer Perceptron. We show only the best results, which were obtained by using RTE3+ST as training set and SVM as classifier; this RUNnew is shown in Table 4. In TAC 2009, eight teams submitted a total of 20 runs to this task. RUN1 is clearly above the system with the low score, and RUNnew is placed slightly at the median score of the runs. Despite our very simple approach, we think that several improvements could be made in order to improve the F-score of the system, refining the previous algorithm and using syntactic features and more semantic information.

4 Conclusion and Future Work

In this paper we use a set of lexical features and try to determine how lexical information helps in the textual entailment semantic task. We show an RTE system that performs two-way and three-way textual entailment; the best results are obtained on the three-way task. We address the Recognizing Textual Entailment main track, and we also describe an initial approach to the textual entailment search task. As a conclusion, we need a more balanced feature set, using not only lexical features but also syntactic and semantic features, in order to improve the accuracy of the system. Additionally, we need to compute correlations between all features in order to avoid redundant information when characterizing the RTE task. On the other hand, our approach to Textual Entailment Search is very simple and preliminary, and needs to be improved by using knowledge resources and a more in-depth coreference analysis. Future work is oriented to experimenting with additional lexical, syntactic and semantic similarity features and to testing the improvements they may yield.

5 References

Giampiccolo, D., Magnini, B., Dagan, I. and Dolan, B. 2007. The Third PASCAL Recognizing Textual Entailment Challenge. In Proceedings of the Workshop on Textual Entailment and Paraphrasing, pages 1–9, Prague.

Castillo, J. and Alonso, L. 2008. An approach using Named Entities for Recognizing Textual Entailment. TAC 2008, Gaithersburg, Maryland, USA.

Lesk, M. 1986. Automatic sense disambiguation using machine readable dictionaries: How to tell a pine cone from an ice cream cone. In SIGDOC.

Gusfield, Dan. 1999. Algorithms on Strings, Trees and Sequences: Computer Science and Computational Biology. CUP.

Levenshtein, V. 1966. Binary Codes Capable of Correcting Deletions, Insertions and Reversals. Soviet Physics Doklady, 10:707.

Inkpen, D., Kipp, D. and Nastase, V. 2006. Machine Learning Experiments for Textual Entailment. Proceedings of the Second RTE Challenge, Venice, Italy.

Dolan, Bill, Quirk, C. and Brockett, C. 2004. Unsupervised construction of large paraphrase corpora: exploiting massively parallel news sources. In COLING '04: Proceedings of the 20th International Conference on Computational Linguistics, page 350, Association for Computational Linguistics, Morristown, NJ, USA.

Zanzotto, F., Pennacchiotti, M. and Moschitti, A. 2007. Shallow Semantics in Fast Textual Entailment Rule Learners. In Proceedings of the Third Recognizing Textual Entailment Challenge, Prague.

De Marneffe, M. et al. 2006. Learning to distinguish valid textual entailments. In Proceedings of the Third Recognizing Textual Entailment Challenge, Italy.

Castillo, J. 2009. A Study of Machine Learning Algorithms for Recognizing Textual Entailment. RANLP 2009, Borovets, Bulgaria.

Agichtein, E. et al. 2008. Combining Lexical, Syntactic, and Semantic Evidence for Textual Entailment Classification. TAC 2008, Gaithersburg, Maryland, USA.

Tesis

Procesamiento del Lenguaje Natural, Revista nº 44, marzo de 2010, pp 149-150

recibido 1-10-09 revisado 08-02-10 aceptado 05-03-10

Análisis léxico robusto
Robust Lexical Analysis

Juan Otero Pombo
Departamento de Informática, Universidade de Vigo
Escola Superior de Enxeñaría en Informática, Edificio Politécnico
Campus de As Lagoas s/n, 32002 - Ourense
[email protected]

Resumen: Tesis doctoral de Informática realizada por Juan Otero Pombo bajo la dirección de los doctores Manuel Vilares Ferro (Universidade de Vigo) y Jorge Graña Gil (Universidade de A Coruña). La defensa tuvo lugar el 4 de junio de 2009 ante el tribunal formado por los doctores Guillermo Rojo Sánchez (Universidade de Santiago de Compostela), José Gabriel Pereira Lopes (Universidade Nova de Lisboa, Portugal), Jean-Éric Pin (Laboratoire d'Informatique Algorithmique: Fondements et Applications, CNRS, Francia), Leo Wanner (Universidad Pompeu Fabra) y Víctor Manuel Darriba Bilbao (Universidade de Vigo). La calificación obtenida fue Sobresaliente Cum Laude con mención de Doctor Europeo. Se puede obtener más información acerca de esta tesis en http://www.grupocole.org.
Palabras clave: Corrección ortográfica, tokenización, etiquetación morfosintáctica, recuperación de información
Abstract: PhD Thesis in Computer Science written by Juan Otero Pombo under the supervision of Dr. Manuel Vilares Ferro (Universidade de Vigo, Spain) and Dr. Jorge Graña Gil (Universidade de A Coruña, Spain). The author was examined on 4th June, 2009 by the committee formed by Dr. Guillermo Rojo Sánchez (Universidade de Santiago de Compostela, Spain), Dr. José Gabriel Pereira Lopes (Universidade Nova de Lisboa, Portugal), Dr. Jean-Éric Pin (Laboratoire d'Informatique Algorithmique: Fondements et Applications, CNRS, France), Dr. Leo Wanner (Universidad Pompeu Fabra, Spain) and Dr. Víctor Manuel Darriba Bilbao (Universidade de Vigo, Spain). The grade obtained was Sobresaliente Cum Laude, with a European Doctor mention. Further information is available at http://www.grupocole.org.
Keywords: Spelling correction, tokenization, part-of-speech tagging, information retrieval

1. Introduction

Although important advances have been made in recent years, the theoretical foundations of Natural Language Processing (NLP) are still in constant evolution. The development of base technology is therefore of special interest, since it is essential for tackling higher-level tasks such as machine translation, automatic summarization, question answering and, in particular, information retrieval (IR). In this sense, and as a first step in this scale of problems, the development and improvement of techniques to handle the lexicon acquires special relevance. More concretely, we have focused here on the development of a common framework that allows us to represent and resolve the ambiguities present at this level of analysis.

2. Objectives

The main objective of this thesis has been the development and evaluation of the base technology needed for NLP, more concretely in the field of lexical analysis, spelling correction and tagging. Our greatest efforts have been devoted to the development of a new regional spelling correction method over finite automata (FA), as an alternative to the classic global correction methods, integrating the techniques developed into the MrTagoo tagging tool in order to exploit the contextual morphosyntactic information embedded in the stochastic model underlying that tool, and to determine the degree of suitability of the correction alternatives obtained. Moreover, the minimization of the computational cost, from both the spatial and the temporal point of view, has been a priority throughout the whole project, through the use of finite-state technology and the integration of the implemented methods over a common data structure, applying dynamic programming techniques that yield important savings by avoiding the unnecessary repetition of computations. In this way, we have developed a robust lexical analysis tool capable of handling the three types of ambiguity that can arise in this phase: morphosyntactic ambiguity, which arises when different morphosyntactic tags can be assigned to a lexical unit; segmental ambiguity, which appears when the text can be divided into lexical units in more than one way; and lexical ambiguity, which is introduced by spelling correction methods when they offer several correction alternatives.

3. Results

First, we have developed a new regional spelling correction method over FAs whose distinguishing feature lies in the concept of region, which allows us to delimit the repair area of a misspelled word, in contrast to global correction methods, which apply the basic repair operations at every position of the word without taking into account the point at which the error is detected. To estimate the viability of the method, we also implemented a global correction method that served as a reference for evaluating the performance, coverage and precision of our proposal. The preliminary results corroborated not only that the performance of our regional correction technique was superior to that of the global method, but also that the difference between the two grew as the localization of the first error point improved. In addition, the regional method offered a smaller number of correction alternatives, because it restricts the zone of the FA to be explored to the region in which the error is detected. Another aspect to take into account was the hit rate of the method: 77% for our regional method versus 81% for the global one, although it was to be expected that the integration of linguistic information, from both the semantic and the syntactic point of view, should significantly reduce this difference in precision, which is below 4%, or might even eliminate it.

Our next step was to check whether the loss of precision of the regional method could be compensated in a contextual correction setting, since the fact that it returns a smaller number of alternatives could have a positive impact on the precision of the overall system. To this end, we integrated the implemented algorithms into a part-of-speech tagging tool capable of handling segmentation (tokenization) ambiguities. Our experiments did not corroborate the initial hypothesis, but they showed that the performance gain of the regional method over the global one, in terms of space and time, was even greater when these techniques were applied in a contextual correction setting. This is because, besides being more efficient from the computational point of view, the regional algorithm offers a smaller number of correction alternatives. This encourages us to continue searching for techniques and heuristics that determine when it is possible to opt for a regional correction.

Finally, we carried out tests to verify the practical usefulness of our proposal in an Information Retrieval setting in which the queries contain errors. To this end, we compared three methods. The first expands the queries with all the correction alternatives. The second applies our contextual corrector to determine which of the alternatives obtained best fits the context of the erroneous word. The third avoids the application of spelling correction methods altogether by using n-grams for indexing and retrieval. The results of these experiments reveal that the use of spelling correction techniques improves the performance of stemming-based IR systems. However, when the resources needed to apply these techniques are not available, or a language-independent system is of interest, the use of n-grams yields significant improvements. A factor to bear in mind is the length of the queries and documents, as well as the error rate present in the queries.
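The contrast between global and regional repair can be illustrated with a toy dictionary-based sketch (the thesis operates over finite automata, not a word list, so this is only a conceptual approximation; the example words are invented):

    ALPHABET = "abcdefghijklmnopqrstuvwxyzáéíóúñ"

    def repairs(word, positions):
        """Candidates obtained by one basic repair operation (insertion,
        deletion or substitution) applied only at the given positions."""
        out = set()
        for i in positions:
            out.add(word[:i] + word[i + 1:])              # deletion
            for c in ALPHABET:
                out.add(word[:i] + c + word[i:])          # insertion
                out.add(word[:i] + c + word[i + 1:])      # substitution
        return out

    def correct(word, dictionary, error_point=None):
        if error_point is None:             # global: repair at every position
            positions = range(len(word) + 1)
        else:                               # regional: only from the error on
            positions = range(error_point, len(word) + 1)
        return sorted(w for w in repairs(word, positions) if w in dictionary)

    dictionary = {"casa", "cosa", "mesa"}
    print(correct("cesa", dictionary))                  # ['casa','cosa','mesa']
    print(correct("cesa", dictionary, error_point=1))   # ['casa','cosa']

As in the thesis' experiments, the regional variant explores fewer positions and returns fewer alternatives, at a possible small cost in coverage.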

Procesamiento del Lenguaje Natural, Revista nº 44, marzo de 2010, pp 151-152

recibido 27-01-10 revisado 15-02-10 aceptado 08-03-10

Arabic Named Entity Recognition*
Reconocimiento de Entidades Nombradas en Textos Árabes

Yassine Benajiba
DSIC - Universidad Politécnica de Valencia
Center for Computational Learning Systems (CCLS), Columbia University, New York City, NY
[email protected]

Resumen: Tesis doctoral en Informática realizada por Yassine Benajiba y dirigida por el doctor Paolo Rosso (Univ. Politécnica de Valencia). El acto de defensa de tesis tuvo lugar en Valencia en mayo de 2009 ante el tribunal formado por los doctores Felisa Verdejo (UNED), Mona Diab (Columbia Univ.), Imed Zitouni (IBM T.J. Watson Research Center), Horacio Rodríguez (Univ. Politécnica de Cataluña) y Encarna Segarra (Univ. Politécnica de Valencia). La calificación obtenida fue Sobresaliente Cum Laude.
Palabras clave: Extracción de información, Reconocimiento de Entidades Nombradas, Procesamiento de idiomas con morfología compleja.
Abstract: PhD thesis in Computer Science written by Yassine Benajiba under the supervision of Dr. Paolo Rosso (Univ. Politécnica de Valencia). The author was examined in May 2009 in Valencia by the committee formed by Felisa Verdejo (UNED), Mona Diab (Columbia Univ.), Imed Zitouni (IBM T.J. Watson Research Center), Horacio Rodriguez (Univ. Politécnica de Cataluña) and Encarna Segarra (Univ. Politécnica de Valencia). The grade obtained was Sobresaliente Cum Laude.
Keywords: Information Extraction, Named Entity Recognition, Complex Morphology Languages Processing.

* This PhD thesis was supported by an AECI scholarship.

1. Introduction

In this Ph.D. thesis we have investigated the problem of the recognition and classification of Named Entities (NEs) within Arabic text, i.e. Arabic Named Entity Recognition (NER). In order to achieve this goal, we have explored a wide range of features, including lexical, morphological and syntactic ones; we have employed three different discriminative Machine Learning (ML) approaches; and we have validated our approach on 9 standard data-sets. We have studied the following problems: 1. The difference in performance when using a 2-step approach as an attempt to separate the problem of recognizing the NEs from the one of classifying them; 2. The relevance of using the rich morphology of the Arabic language in order to obtain a high-performance NER system; 3. The difference in performance when different ML approaches are employed, and the possibility of combining them; and 4. Transferring knowledge about NEs from another language, i.e. English, in order to enhance the performance of an Arabic NER system. Even though our study is focused on Arabic, as a language, and NER, as a task, the obtained results might be easily extrapolated to most morphologically complex/rich languages and most Information Extraction tasks.

2. Thesis overview

In this thesis, we have addressed the challenges raised by morphologically rich languages, Arabic in our case, for a supervised NLP task, i.e. NER. The document (Benajiba, 2009) is structured as follows. In Chapter 1, we introduce basic concepts and summarize the major contributions of the research work carried out. Chapter 2 describes the challenges of NLP in general, and of NER in particular, for the Arabic language.

In Chapter 3, the NER task is introduced. This includes both presenting the standard definitions of the task and giving an overview of the most influential research works in the NER field in general, and in the Arabic NER field in particular. In this chapter, we attempt to make it easier for the reader to see where exactly the contribution of this thesis stands. Chapter 4 describes the three discriminative ML approaches which are used in our research study, namely: Maximum Entropy (ME), Conditional Random Fields (CRFs) and Support Vector Machines (SVMs). All of these ML approaches are feature-based and have proved to be efficient for sequence classification problems. Chapter 5 is dedicated to presenting our 1-step and 2-step, ME-based, Arabic NER system. In this chapter we first report the results obtained when an ME-based classifier is used to build the system. Then, we report our results when the NER task is split into two sub-tasks, where the first one determines the spans of the NEs within the text and the second one assigns a class to each of them. This study is important because it provides empirical proof that enhancing the performance using a 2-step approach is only possible when the first step achieves a performance close to 100 points of F-measure. In Chapter 6 we switch our focus to the features and the ML approaches which we might resort to in order to boost our system. We conduct several experiments where ME, CRFs and SVMs are used with different feature sets. We validate these experiments on 9 different standard data-sets of different genres (newswire, broadcast news and weblogs). The major contribution of this study is that it has shown that different ML approaches might benefit differently from the available features, and that they also obtain different results for the different NE classes. This triggered the research work which we present in Chapter 7, where we combine different classifiers, each one trained for one NE class. Similarly, the feature selection is done for each classifier separately. The outputs of the different classifiers are finally combined in order to obtain a single outcome. Chapter 8 introduces a novel approach where we project knowledge about NEs from another language, i.e. English. This research study was conducted in collaboration with the IBM T.J. Watson Research Center, during a six-month internship of the Ph.D. student. The results show that a statistically significant improvement is always obtained. Finally, in Chapter 9 we draw our conclusions and discuss some interesting research directions.

3. Thesis contributions

The major contributions of the investigations carried out are: 1. A deep analysis of the differences in behavior of different ML approaches in the context of NER; 2. A multi-classifier approach which has been shown to lead to the best results; 3. A statistically significant improvement across the different data genres when additional knowledge about NEs is projected from a resource-rich language such as English; 4. The study of an effective approach to build an efficient and robust Arabic Named Entity Recognition system; 5. Empirical proof that the morphological richness of the Arabic language can be exploited to enhance the performance of an NER system; 6. The ANERcorp data-set, together with SVM and CRF Arabic NER models, which have been made publicly available for the research community (http://www.dsic.upv.es/grupos/nle/downloads.html). We have validated our results on 9 data-sets of 4 different genres, namely: newswire, broadcast news, Arabic Treebank and weblogs. Our experiments are easily replicable by anyone because they are all built on top of publicly available tools. We have also described in detail our incremental feature selection approach, which can easily be used when a new feature is added to the NER system. To our knowledge, our research study is the most extensive work reported, up to now, on Information Extraction for morphologically rich languages, and our results are very competitive on a worldwide scale. A summarized description of part of the research work can be found in (Benajiba et al., 2009).

References

Yassine Benajiba, Mona Diab, and Paolo Rosso. 2009. Arabic named entity recognition: A feature-driven study. Special issue on Processing Morphologically Rich Languages, IEEE Transactions on Audio, Speech and Language Processing, July.

Yassine Benajiba. 2009. Named entity recognition. Ph.D. Thesis dissertation, Universidad Politécnica de Valencia, May. http://users.dsic.upv.es/~prosso/resources/BenajibaPhD.pdf

Procesamiento del Lenguaje Natural, Revista nº 44, marzo de 2010, pp 153-154

recibido 08-01-10 revisado 17-02-10 aceptado 06-03-10

Parsing Schemata for Practical Text Analysis
Esquemas de Análisis Sintáctico para el Análisis Práctico de Textos

Carlos Gómez-Rodríguez
Universidade da Coruña
Campus de Elviña, s/n, 15071 A Coruña
[email protected]

Resumen: Tesis doctoral en Informática realizada por Carlos Gómez-Rodríguez bajo la dirección de los doctores Miguel A. Alonso Pardo (Univ. da Coruña) y Manuel Vilares Ferro (Univ. de Vigo). La defensa de la tesis tuvo lugar el 5 de junio de 2009 ante el tribunal formado por los doctores John A. Carroll (Univ. of Sussex), Giorgio Satta (Univ. degli Studi di Padova), Víctor Jesús Díaz Madrigal (Univ. de Sevilla), Leo Wanner (Univ. Pompeu Fabra) y Jesús Vilares Ferro (Univ. da Coruña). La calificación obtenida fue de Sobresaliente Cum Laude por unanimidad, con mención de Doctor Europeo.
Palabras clave: Esquemas de análisis sintáctico, análisis sintáctico, formalismos gramaticales, análisis de constituyentes, análisis de dependencias
Abstract: PhD thesis in Computer Science written by Carlos Gómez-Rodríguez under the supervision of Miguel A. Alonso Pardo (Univ. da Coruña) and Manuel Vilares Ferro (Univ. de Vigo). The author was examined on June 5, 2009 by the following committee: John A. Carroll (Univ. of Sussex), Giorgio Satta (Univ. degli Studi di Padova), Víctor Jesús Díaz Madrigal (Univ. de Sevilla), Leo Wanner (Univ. Pompeu Fabra) and Jesús Vilares Ferro (Univ. da Coruña). The grade obtained was unanimous Sobresaliente Cum Laude, with European Doctorate Mention.
Keywords: Parsing schemata, parsing, grammatical formalisms, constituency parsing, dependency parsing

1 Introduction

This dissertation provides several theoretical and practical tools that extend the applicability of Sikkel's theory of parsing schemata (Sikkel, 1997) in several different directions. First, a compilation technique is defined that can be used to obtain efficient implementations of parsers automatically from their corresponding schemata. This makes it possible to use parsing schemata to prototype and test parsing algorithms, without the need to manually convert the formal representation into an efficient implementation in a programming language. Second, the range of parsing algorithms that can be defined by means of schemata is extended with the definition of new variants of the formalism that can deal with error-repair parsers and dependency-based parsers. Apart from these tools themselves, the dissertation also introduces several research results that have been obtained by using them. The compilation technique is used to obtain implementations of different parsers for context-free grammars (CFG) and tree-adjoining grammars (TAG) and perform an empirical analysis of their behaviour with real-sized grammars. The extension of parsing schemata for error-repair parsing is used to define a transformation that can automatically add error-repair capabilities to parsers that do not have them. Finally, the extension of parsing schemata for dependency parsing is used to find formal relationships between several well-known dependency parsers, as well as to define novel algorithms for mildly non-projective dependency structures. An extended version of the thesis is scheduled for publication by Imperial College Press in Fall 2010 (Gómez-Rodríguez, 2010).

2 Overview

The thesis is structured in five parts. The first part is introductory, containing a first chapter which summarises the main goals and contributions of the thesis, and a second chapter defining the formalism of parsing schemata that will be used throughout the thesis. The second part presents a compiler for parsing schemata and several empirical studies of constituency-based parsers conducted with it. The third part introduces an extension of parsing schemata that can be used to describe error-repair parsers. The fourth part is devoted to a variant of schemata for dependency-based parsers. Finally, the fifth part summarises conclusions and discusses future work. A chapter-by-chapter breakdown of the three central parts follows.

2.1 Part II: Compiling Parsing Schemata

Chapter 3 presents a compiler able to automatically transform parsing schemata into efficient Java implementations of their corresponding algorithms. The system analyzes the deduction steps in the input schema in order to determine the best data structures and indexes to use, ensuring that the generated implementations are efficient. In Chapter 4, the compiler presented in Chapter 3 is used to generate implementations of three well-known CFG and TAG parsing algorithms and compare their empirical performance on several grammars taken from natural language corpora. In particular, TAG parsers are compared with the XTAG grammar, a real-life, wide-coverage feature-based TAG. Note that previous comparative studies of TAG parsers in the literature were done with "toy" grammars, but there were no such studies with wide-coverage grammars. In Chapter 5, the parsing schema compiler is used to conduct further empirical studies of CFG and TAG parsers. Artificially generated grammars are used to measure the parsers' empirical computational complexity with respect to string and grammar size, as well as the overhead caused by using TAG to parse context-free languages.
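As an illustration of what executing a parsing schema involves, the following sketch runs the classical CYK schema for grammars in Chomsky normal form with a generic agenda-driven deductive engine. This is a minimal illustration of the underlying idea only: the thesis compiler generates Java code and chooses specialised data structures and indexes per deduction step, whereas this Python sketch scans the chart naively.

def cyk_parse(words, unary, binary, start="S"):
    # unary: dict word -> set of nonterminals (rules A -> word)
    # binary: dict (B, C) -> set of nonterminals (rules A -> B C)
    # Items are triples (A, i, j) meaning A derives words[i:j].
    n = len(words)
    agenda = [(a, i, i + 1) for i, w in enumerate(words) for a in unary.get(w, ())]
    chart = set()
    while agenda:
        item = agenda.pop()
        if item in chart:
            continue
        chart.add(item)
        a, i, j = item
        # Binary deduction step: from (B, i, j) and (C, j, k) infer (A, i, k).
        for b, jj, k in list(chart):
            if jj == j:  # the other item starts where this one ends
                for x in binary.get((a, b), ()):
                    agenda.append((x, i, k))
            if k == i:   # the other item ends where this one starts
                for x in binary.get((b, a), ()):
                    agenda.append((x, jj, j))
    return (start, 0, n) in chart

# Example: S -> NP VP, NP -> "she", VP -> "runs"
# cyk_parse(["she", "runs"], {"she": {"NP"}, "runs": {"VP"}},
#           {("NP", "VP"): {"S"}})  ->  True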

2.2 Part III: Parsing Schemata for Error-Repair Parsers

Chapter 6 introduces error-repair parsing schemata: an extension of parsing schemata which can be used to define parsers that can robustly handle sentences with errors or inconsistencies. We show how this formalism can be used to describe existing algorithms and prove their correctness. The framework is then used to develop a generic technique for obtaining efficient robust parsers based on regional error repair, and empirical performance results are provided. In Chapter 7, the framework of error-repair parsing schemata is used to define a general transformation technique to automatically obtain robust, error-repair parsers from standard non-robust parsers. The technique can be applied to a wide range of parsers, and the schemata thus obtained can be implemented with global, regional or local error-repair techniques, using the compiler of Chapter 3.

2.3 Part IV: Parsing Schemata for Dependency Parsers

In Chapter 8, a variant of parsing schemata is defined that can be used to describe, analyse and compare dependency parsing algorithms. This extension is used to establish clear relations between several existing projective dependency parsers and prove their correctness (a toy example of one such projective parser is sketched below). Parsing schemata for non-projective dependency parsers are also presented. A variant of the formalism to represent parsers based on link grammar is also shown, including examples of how some existing dependency parsers can be adapted to produce parsers for link grammar. In Chapter 9, parsing schemata are used to solve the problem of efficiently parsing mildly non-projective dependency structures. Polynomial-time parsing algorithms are presented for various mildly non-projective dependency formalisms, including well-nested structures with gap degree bounded by a constant k, and a new class of mildly ill-nested structures for gap degree k which includes all the sentences present in a number of dependency treebanks.
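For readers less familiar with the algorithms being related, the following toy sketch shows one well-known projective dependency parser, the arc-standard shift-reduce parser. It is an illustration only, not a formulation from the thesis: it assumes a projective gold tree and uses the gold heads as an oracle, whereas a real parser would predict the actions.

def arc_standard(words, gold_heads):
    # gold_heads[i] is the index of word i's head, or -1 for the root.
    stack, buffer, arcs = [], list(range(len(words))), set()

    def complete(node):
        # True once node has already collected all of its dependents.
        return all(h != node or (node, d) in arcs
                   for d, h in enumerate(gold_heads))

    while buffer or len(stack) > 1:
        if len(stack) >= 2:
            s1, s0 = stack[-2], stack[-1]
            if gold_heads[s1] == s0 and complete(s1):   # left-arc: s0 -> s1
                arcs.add((s0, s1)); stack.pop(-2); continue
            if gold_heads[s0] == s1 and complete(s0):   # right-arc: s1 -> s0
                arcs.add((s1, s0)); stack.pop(); continue
        stack.append(buffer.pop(0))                     # shift
    return arcs

# Example: "She saw stars" with heads [1, -1, 1] (root = "saw"):
# arc_standard(["She", "saw", "stars"], [1, -1, 1]) -> {(1, 0), (1, 2)}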

References

Gómez-Rodríguez, Carlos. 2010. Parsing Schemata for Practical Text Analysis. Mathematics, Computing, Language, and Life: Frontiers in Mathematical Linguistics and Language Theory. Imperial College Press, London, UK.

Sikkel, Klaas. 1997. Parsing Schemata — A Framework for Specification and Analysis of Parsing Algorithms. Texts in Theoretical Computer Science — An EATCS Series. Springer-Verlag, Berlin-Heidelberg-New York.



received 08-01-10, revised 10-02-10, accepted 04-03-10

Sistemas de clasificación de preguntas basados en corpus para la búsqueda de respuestas
Corpus-based question classification in question answering systems

David Tomás
Depto. de Lenguajes y Sistemas Informáticos - Universidad de Alicante
Carretera San Vicente del Raspeig s/n 03690, Alicante, España
[email protected]

Resumen: Esta tesis se centra en el desarrollo de sistemas automáticos de clasificación de preguntas fácilmente adaptables a diferentes idiomas y dominios de trabajo. Estos sistemas se basan en técnicas de aprendizaje automático sobre corpus, siguiendo un enfoque estadístico del tratamiento del lenguaje humano. El objetivo es evitar en gran medida el uso de herramientas y recursos lingüísticos más allá de los propios corpus de aprendizaje, obteniendo sistemas que destacan por su flexibilidad y sus escasos requerimientos.
Palabras clave: Clasificación de preguntas, búsqueda de respuestas, aprendizaje supervisado, aprendizaje semisupervisado, aprendizaje mínimamente supervisado

Abstract: This thesis is focused on the development of question classification systems that are easily adaptable to different languages and domains. These systems are based on corpus-driven machine learning techniques, following a statistical approach to human language. The goal is to largely avoid the need for linguistic tools and resources beyond the training corpora themselves, obtaining flexible systems with few requirements.
Keywords: Question classification, question answering, supervised learning, semisupervised learning, minimally-supervised learning

1. Introduction

Question answering (QA) systems aim to find concrete answers to precise information needs formulated by users in natural language. In a QA system, a first step towards returning the answer requested by the user is to analyse and understand the question, that is, to know what is being asked. Question classification has emerged as a task in its own right within natural language processing and QA. Its goal is to identify automatically what is being asked, categorising questions into different semantic classes according to the expected answer type. Thus, for questions such as "Who is the president of the United States?" or "Where is the Eiffel Tower?", a question classification system would detect that a person or a place, respectively, is being asked about.

In this thesis a series of approaches to building flexible question classification systems has been developed, where flexibility is understood as the system's ability to adapt easily to different languages and domains. To this end, corpus-based machine learning techniques have been employed, allowing these systems to improve through experience without the need for human knowledge.

2. Contributions

In this work three approaches to the question classification task have been developed, each seeking to reduce, with respect to the previous one, the resources needed to build the classifier.

Supervised classification based on n-grams. In this first approach the classifier learns automatically from information obtained strictly from a training corpus. No other kind of linguistic tool or resource is required. The goal is to establish a reference system for those situations in which only a training corpus is available (a minimal sketch of a classifier of this kind is given after the task list below). Carrying out this approach requires the following tasks:

Determining the most appropriate learning algorithm for the question classification task.

Analysing different word-level learning features obtained exclusively from the training data.

Developing corpora in different languages for the multilingual classification task.

Developing corpora in different domains for the open-domain and restricted-domain classification task.

Evaluating and comparing the above algorithms and features on the corpora developed.

Studying different feature selection techniques.
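The following is a minimal illustrative sketch of this first approach, not the thesis system: a multinomial Naive Bayes classifier trained on nothing but word n-grams from a labelled question corpus, with hypothetical example questions and classes.

from collections import Counter, defaultdict
import math

def ngrams(question):
    toks = question.lower().split()
    feats = list(toks)  # unigrams
    feats += [" ".join(toks[i:i + 2]) for i in range(len(toks) - 1)]  # bigrams
    return feats

def train(samples):  # samples: list of (question, class) pairs
    counts, priors = defaultdict(Counter), Counter()
    for q, c in samples:
        priors[c] += 1
        counts[c].update(ngrams(q))
    return counts, priors

def classify(question, counts, priors):
    vocab = {f for c in counts for f in counts[c]}
    def score(c):
        total = sum(counts[c].values())
        s = math.log(priors[c])
        for f in ngrams(question):  # add-one smoothing over the vocabulary
            s += math.log((counts[c][f] + 1) / (total + len(vocab)))
        return s
    return max(priors, key=score)

samples = [("who is the president of the united states", "HUMAN"),
           ("where is the eiffel tower", "LOCATION"),
           ("who wrote don quixote", "HUMAN"),
           ("where was the first olympic games held", "LOCATION")]
model = train(samples)
print(classify("who discovered penicillin", *model))  # -> HUMAN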

Semisupervised classification using unlabelled texts. In this second approach the basic model defined above is enriched by complementing the information extracted from the training set with external semantic information. This information is obtained from unlabelled text automatically acquired from the Web. In this way the system's performance is improved using unlabelled data, giving rise to a semisupervised approach to question classification. The tasks carried out in this approach are:

Incorporating semantic information from unlabelled text, employing kernel methods and latent semantic analysis (LSA).

Comparing this approach with other systems that incorporate semantic information from complex linguistic resources.

Evaluating the system on different languages.

Minimally supervised classification over fine-grained taxonomies. In this third approach we tackle the problem of question classification over fine-grained taxonomies in the absence of training data. Starting from a small set of initial seeds defined by the user for each class, the system learns to discriminate between the classes automatically from information acquired from the Web. This yields a minimally supervised approach to classification over fine-grained taxonomies, a task that would traditionally require large training corpora to provide adequate coverage. The tasks carried out in this phase are:

Defining a model for the automatic acquisition of training samples, thus avoiding the need for large learning datasets.

Developing an algorithm that exploits these samples to build the classifier.

Developing an evaluation dataset over a fine-grained taxonomy that allows us to measure the system's performance.

Evaluating the system on different languages.

Comparing this approach with the learning systems commonly employed in this area, assessing the advantages it provides.

Additional information

PhD thesis in Computer Science written at the Universidad de Alicante by David Tomás Díaz under the supervision of Dr. José Luis Vicedo González. The defense took place on July 21, 2009 before a committee formed by Manuel Palomar (Univ. de Alicante), Patricio Martínez (Univ. de Alicante), Horacio Rodríguez (Univ. Politécnica de Cataluña), Paolo Rosso (Univ. Politécnica de Valencia) and Günter Neumann (DFKI). The grade obtained was unanimous Sobresaliente Cum Laude, with European Doctorate mention.1

1 The full text of the thesis is available at http://gplsi.dlsi.ua.es/gplsi09/lib/exe/fetch.php?media=tesis_david.pdf.



received 18-01-10, revised 18-02-10, accepted 05-03-10

Textual Entailment Recognition and its Applicability in NLP Tasks∗
Reconocimiento de Implicación Textual y su aplicabilidad en Tareas de PLN

Óscar Ferrández
Dept. de Lenguajes y Sistemas Informáticos (Universidad de Alicante)
Carretera San Vicente s/n 03690 Alicante España
[email protected]

Resumen: Tesis doctoral en Informática realizada en la Universidad de Alicante (UA) por Óscar Ferrández bajo la dirección del Dr. Rafael Muñoz Guillena. El acto de defensa de la tesis tuvo lugar en Alicante el 27 de julio de 2009 ante el tribunal formado por los doctores Manuel Palomar (UA), Andrés Montoyo Guijarro (UA), Arantza Díaz de Ilarraza (EHU/UPV), Luis Alfonso Ureña (UJA) y Raquel Martínez Unanue (UNED). Calificación: Sobresaliente Cum Laude por unanimidad.
Palabras clave: Implicación Textual, Aplicaciones PLN, Semántica de textos

Abstract: Ph.D thesis in Computer Science, specifically in the field of Computational Linguistics, written by Óscar Ferrández under the supervision of Dr. Rafael Muñoz Guillena. The author was examined on July 27th, 2009 by a panel formed by Dr. Manuel Palomar (UA), Dr. Andrés Montoyo Guijarro (UA), Dr. Arantza Díaz de Ilarraza (EHU/UPV), Dr. Luis Alfonso Ureña (UJA) and Dr. Raquel Martínez Unanue (UNED). The grade obtained was Sobresaliente Cum Laude.
Keywords: Textual Entailment, NLP Applications, Text Semantics

∗ This thesis has been partially funded by the EU Commission (project FP6-IST-033860) and by the Spanish Government (project TIN2006-15265-C06-01).

1. Introduction

Human languages are extremely rich and ambiguous, resulting in the fact that the same information can be expressed employing different words and linguistic structures. In other words, an ambiguous text might represent several distinct meanings, and a concrete meaning might be expressed in different ways. However, controlling language variability is something which as yet has not been attained. In terms of reasoning, there are many inferences easily detected by humans but extremely difficult for computers to address. While the problem of language variability is the context of this work, it has been concretely focused on textual entailment. Textual entailment has been defined as a generic framework for modelling semantic variability, which appears when a concrete meaning is described in different manners. Hence, language variability can be addressed by defining the concept of textual entailment as a one-way meaning relation between two text snippets: given two coherent fragments of text, the meaning of one of them must entail the meaning of the other; should this not occur, the entailment does not hold. Since many applications in Natural Language Processing (NLP) are highly influenced by the problem of language variability, solving textual entailment relations would help many NLP applications to increase their final performance by means of correct language variability disambiguation.

This thesis exposes the major topics in textual entailment by means of examples together with thorough discussions. As a result, an end-to-end textual entailment system was developed following the idea that textual entailment relations can be recognised at different linguistic levels, specifically lexical, syntactic and semantic, each performing a set of useful inferences to determine entailments. The final entailment decision is taken by a machine learning classifier which uses the set of inferences as features. Extensive evaluations over the PASCAL Recognising Textual Entailment datasets have been carried out throughout this thesis. Furthermore, another motivation as well as a contribution of this thesis consisted of applying our system to other Natural Language Processing tasks such as Question Answering, Automatic Text Summarization and the particular semantic task of linking Wikipedia categories to WordNet glosses.
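To make this architecture concrete, here is a minimal sketch of the general scheme under simplifying assumptions, not the thesis system itself: each linguistic level contributes a similarity feature, and a trained classifier (a tiny perceptron here, standing in for the machine learning classifier) takes the final entailment decision. The "syntactic" and "semantic" feature functions are toy stand-ins for the real inferences.

def lexical_feature(t, h):
    # Proportion of hypothesis words also present in the text.
    tw, hw = set(t.lower().split()), set(h.lower().split())
    return len(tw & hw) / max(len(hw), 1)

def syntactic_feature(t, h):
    # Toy stand-in for syntactic inferences: word-bigram overlap.
    bigrams = lambda ws: set(zip(ws, ws[1:]))
    tb, hb = bigrams(t.lower().split()), bigrams(h.lower().split())
    return len(tb & hb) / max(len(hb), 1)

SYNONYMS = {"purchased": "bought", "acquired": "bought"}  # toy lexicon

def semantic_feature(t, h):
    # Toy stand-in for WordNet-style inferences: overlap after mapping
    # words to a canonical synonym.
    norm = lambda s: {SYNONYMS.get(w, w) for w in s.lower().split()}
    return len(norm(t) & norm(h)) / max(len(norm(h)), 1)

def features(t, h):
    return [lexical_feature(t, h), syntactic_feature(t, h),
            semantic_feature(t, h), 1.0]  # 1.0 is a bias term

def train_perceptron(pairs, epochs=20):
    # pairs: (text, hypothesis, label) with label 1 if entailment holds.
    w = [0.0, 0.0, 0.0, 0.0]
    for _ in range(epochs):
        for t, h, label in pairs:
            fs = features(t, h)
            pred = 1 if sum(wi * fi for wi, fi in zip(w, fs)) > 0 else 0
            w = [wi + (label - pred) * fi for wi, fi in zip(w, fs)]
    return w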

2. Thesis Overview

Chapter 1 introduces the concept of textual entailment within the field of Artificial Intelligence and NLP. Chapter 2 is intended to provide the related work relevant to this thesis; apart from detailing the most up-to-date textual entailment systems, this chapter also describes resources used within the system development and testing, as well as the two main competitions and workshops in this field. Chapter 3 comprises a detailed description of our system, carefully explaining each inference and how it assists in solving entailment relations. Chapter 4 discusses the framework in which the system is evaluated, as well as analysing the results obtained. Chapter 5 presents the applicability of our system in NLP tasks other than pure textual entailment recognition; it permits us to make an extrinsic system evaluation, assessing the gain of applying our system to QA and summarization, among others. Chapter 6 gives conclusions together with some thoughts for future work. Chapter 7 presents the software developments carried out in this thesis. Appendix A illustrates the official results regarding each PASCAL Recognizing Textual Entailment challenge. Appendix B shows the official results corresponding to the different Answer Validation Exercise competitions. Appendix C presents the bar graphs for the information gain values of each feature. Appendix D gives a brief bio-sketch of the author and a summary of the research projects related to this thesis.

3. Thesis Contributions

With the ideas, reasoning and experiments exhibited in this thesis, we demonstrate that the combination of lexical, syntactic and semantic knowledge is the correct way to tackle entailment recognition. As our main contributions, we would like to highlight:

We have measured the impact of trivial lexical and syntactic inferences within the task of detecting entailments, concluding that these deductions play a crucial role in the final entailment decision.

Dealing with complex analyses (i.e. the semantic perspective), we have evaluated the benefits of using linguistic resources in order to recognise entailments. For instance, WordNet, FrameNet, VerbNet and VerbOcean allowed: (i) the use of semantic inferences based on synonyms, antonyms, etc.; (ii) more abstract semantic deductions using Frame Analysis; and (iii) measuring the importance of finding correspondences between verbs and entities.

Furthermore, we have implemented some new linguistic resources based on FrameNet: the Frame-to-Frame similarity metric and the FrameNet-WordNet alignment measure. Although in this thesis they have been used in order to discover entailments, they could also be useful in other NLP tasks and applications.

Regarding the applicability of our textual entailment system in other NLP tasks, it was successfully applied to Summarization, Question Answering and the task of linking Wikipedia categories to WordNet glosses.

In a nutshell, the investigations posed throughout this thesis reveal the importance of combining linguistic features derived from distinct perspectives, analyses and resources. This results in the implementation of a textual entailment system capable of making use of this variety of features (Ferrández, Muñoz, and Palomar, 2008; Ferrández, Muñoz, and Palomar, 2009).

References

Ferrández, Óscar, Rafael Muñoz, and Manuel Palomar. 2008. Studying the influence of semantic constraints in AVE. In LNCS 5706, CLEF 2008, pages 460–467.

Ferrández, Óscar, Rafael Muñoz, and Manuel Palomar. 2009. Alicante University at TAC 2009: Experiments in RTE. In TAC 2009 Workshop.



received 19-01-10, revised 16-02-10, accepted 08-03-10

Hierarchical self-refining consensus architectures and soft consensus functions for robust multimedia clustering
Arquitecturas de consenso jerárquicas auto-refinables y funciones difusas de consenso para agrupamiento robusto de datos multimedia

Xavier Sevillano
GTM – Grup de Recerca en Tecnologies Mèdia
La Salle. Universitat Ramon Llull
C/Quatre Camins, 2. 08022 Barcelona
[email protected]

Resumen: Tesis doctoral en Tecnologías de la Información y las Comunicaciones y su Gestión. El acto de defensa de tesis tuvo lugar en Barcelona en junio de 2009 ante el tribunal formado por los doctores Paolo Rosso (Univ. Politécnica de Valencia), Aristides Gionis (Yahoo! Research), Juan José Rodríguez (Univ. de Burgos), Jordi Turmo (Univ. Politècnica de Catalunya) y Ester Bernadó (La Salle - Univ. Ramon Llull). La calificación obtenida fue Sobresaliente Cum Laude.
Palabras clave: Agrupamiento robusto, consenso, agrupamiento difuso, multimedia

Abstract: PhD thesis in Information and Communication Technologies and their Management. The author was examined in June 2009 in Barcelona by the following committee: Paolo Rosso (Univ. Politécnica de Valencia), Aristides Gionis (Yahoo! Research), Juan José Rodríguez (Univ. de Burgos), Jordi Turmo (Univ. Politècnica de Catalunya) and Ester Bernadó (La Salle - Univ. Ramon Llull). The grade obtained was Summa Cum Laude.
Keywords: Robust clustering, consensus, fuzzy clustering, multimedia

1 Introduction

The robust design of clustering systems is a very relevant and challenging issue. This is due to the unsupervised nature of the clustering problem, which makes it difficult (if not impossible) to select a priori the configuration of the clustering system that gives rise to the most meaningful partition of a data collection. Furthermore, given the myriad of options –e.g. clustering algorithms, data representations, etc.– available to the clustering practitioner, such an important decision is often made with a high degree of uncertainty (unless domain knowledge is available, which is not always the case). For this reason, our approach to robust clustering intentionally reduces user decision making as much as possible: the clustering practitioner is encouraged to use and combine all the clustering configurations at hand, compiling the resulting clusterings into a cluster ensemble, upon which a consensus clustering is derived. The more similar this consensus clustering is to the highest quality clustering contained in the cluster ensemble, the greater the robustness to the indeterminacies inherent to clustering.

This PhD thesis is focused on the problem of robust clustering based on cluster ensembles, with a specific focus on the increasingly interesting application of multimedia data clustering and a view on its generalization to fuzzy clustering scenarios. More specifically, the main goal of this research work is to derive high quality consolidated clusterings upon cluster ensembles in a computationally efficient manner. This latter issue is especially relevant, as our particular approach to robust clustering indirectly entails the creation of large cluster ensembles, a fact that dramatically increases the execution time and memory usage of consensus clustering algorithms. Furthermore, our proposals find a natural field of application in multimedia data clustering, as the existence of multiple data modalities poses additional indeterminacies that challenge the obtention of robust clustering results. Finally, in order to generalize our approach to robust clustering based on cluster ensembles, we propose several voting based consensus functions for deriving fuzzy consensus partitions by combining the outcomes of multiple fuzzy clustering systems.

2

design simple tools that allow to predict accurately the most computationally efficient hierarchical consensus architecture for a given consensus clustering problem. – the self-refining consensus clustering proposal is capable of generating a bunch of self-refined consensus clusterings upon a previously obtained consensus partition, some of which are a largely improved version of the latter. We complement our approach by introducing a blind strategy for selecting the optimal self-refined consensus clustering among them.

Thesis overview

The PhD thesis is organized as follows: the first two chapters provide an introduction to the central matter of the thesis. In particular, chapter 1 presents an overview of the clustering problem, and chapter 2 reviews related work in the field of consensus clustering. Chapter 3 introduces hierarchical consensus architectures, our proposal for the computationally efficient derivation of consensus clusterings based on the application of the divide-and-conquer strategy on cluster ensembles. In chapter 4, we present self-refining consensus clustering, a fully unsupervised methodology for obtaining high quality consensus partitions based on excluding low quality components of the cluster ensemble from the consensus process. Chapter 5 shows how our proposals for robust clustering based on cluster ensembles naturally allow the simultaneous use of early and late multimodal fusion techniques, thus constituting a highly generic approach to the problem of multimedia data clustering. In chapter 6, we present voting based consensus functions for combining the outputs of multiple fuzzy unsupervised classifiers, a first step for porting our previous proposals to the more generic framework of fuzzy clustering. Finally, in chapter 7 we discuss the main conclusions of our work, outlining several future research lines of interest that stem from the investigation presented in this thesis.

3 Thesis contributions

The major contributions of this PhD thesis are the following:

– hierarchical consensus architectures constitute a highly parallelizable strategy for the fast derivation of consensus clusterings upon cluster ensembles. Besides defining random and deterministic hierarchical consensus architectures, we also design simple tools that allow the practitioner to predict accurately the most computationally efficient hierarchical consensus architecture for a given consensus clustering problem.

– the self-refining consensus clustering proposal is capable of generating a set of self-refined consensus clusterings upon a previously obtained consensus partition, some of which are a largely improved version of the latter. We complement our approach by introducing a blind strategy for selecting the optimal self-refined consensus clustering among them.

– multimedia clustering based on cluster ensembles starts by creating clusterings upon each separate modality and on feature-level fused modalities; after compiling them all into a multimodal cluster ensemble, a consensus clustering is created upon it. By doing so, we take advantage of modality fusion both at feature level (early fusion) and at decision level (late fusion).

– voting based consensus functions for soft cluster ensembles allow fuzzy consensus partitions to be obtained. Among our proposals, the BordaConsensus and CondorcetConsensus algorithms are pioneer consensus functions based on positional voting (a small illustrative sketch follows below).

Throughout this thesis, we have given great importance to the experimental evaluation of all our proposals, using a total of 16 unimodal and multimodal publicly available data collections. The results of these experiments have confirmed that, in the quest for the design of robust multimedia clustering systems in which user decision making is minimized, our cluster ensembles-based proposal succeeds in the computationally efficient derivation of high quality consensus partitions. In the future, we plan to set the user free from the obligation to select how many clusters the objects are to be grouped in, using this element as an additional factor of diversity at the time of creating the cluster ensemble.
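The following is a minimal sketch of the positional-voting idea behind a Borda-style consensus function for soft cluster ensembles. It is an illustration only, not the BordaConsensus algorithm from the thesis, and it assumes the cluster labels of all ensemble components have already been aligned (solving the label correspondence problem is a separate step).

import numpy as np

def borda_consensus(memberships):
    # memberships: list of (n_objects x k) arrays, one per fuzzy clustering,
    # where row i holds object i's membership degrees in the k clusters.
    n, k = memberships[0].shape
    scores = np.zeros((n, k))
    for m in memberships:
        # Positional voting: per object, the most-believed cluster receives
        # k-1 points, the next k-2, ..., the least-believed receives 0.
        ranks = np.argsort(np.argsort(m, axis=1), axis=1)
        scores += ranks
    # Normalise the accumulated scores into a fuzzy consensus partition.
    consensus = scores / scores.sum(axis=1, keepdims=True)
    return consensus, consensus.argmax(axis=1)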



received 18-01-10, revised 20-02-10, accepted 25-02-10

Sintetizador Paramétrico de Lengua de Signos Española∗
Spanish Sign Language Parametric Synthesizer

Fernando López-Colino
Departamento de Ingeniería Informática
Universidad Autónoma de Madrid. Fco. Tomás y Valiente 11
Madrid, 28049, Spain
[email protected]

Resumen: Este trabajo presenta los resultados de la tesis doctoral realizada por Fernando Jesús López Colino y dirigida por el Dr. José Colás Pasamontes. Esta tesis fue defendida el 15 de octubre de 2009 y calificada con Sobresaliente Cum Laude por unanimidad. La principal novedad lingüística aportada por la tesis es el uso del modelo fonológico propuesto por Liddell y Johnson en la síntesis de lengua de signos española (LSE). El modelo requiere la descripción del aspecto temporal de la LS, que no es contemplado por las notaciones existentes. Por ese motivo, la descripción fonémica de los signos y la evolución temporal se recogen en una base de datos relacional. En las evaluaciones de inteligibilidad con usuarios nativos de LSE se ha obtenido una tasa de reconocimiento media del 97 % para signos aislados, del 95 % para frases completas y del 85 % para construcciones clasificatorias.
Palabras clave: Lengua de signos española, Síntesis automática, Discapacitados auditivos, Construcciones clasificatorias

Abstract: PhD Thesis written by Fernando Jesús López Colino under the supervision of Dr. José Colás Pasamontes. The author was examined on October 15th, 2009 and the grade obtained was Sobresaliente Cum Laude by unanimous decision. The Thesis' main linguistic contribution is the use of the phonological model proposed by Liddell and Johnson applied to Spanish Sign Language (LSE) synthesis. This model requires defining the temporal aspect of Sign Language (SL), which is not described by existing notations. Therefore, the phonological description and the temporal aspects are stored in a relational database. The intelligibility evaluations, using LSE natives, showed the following recognition rates: 97% for isolated signs, 95% for complete sentences and 85% for classifier constructions.
Keywords: Spanish Sign Language, Automatic Synthesis, Deaf People, Classifier Constructions

∗ Thanks to the collaborators of the Fundación Confederación Nacional de Sordos de España (FCNSE) and to Almudena Sánchez for the LSE expertise they contributed, and to the FPU-UAM programme for its financial support.

1. Introduction

Sign language (SL) synthesis is a research area barely a decade old, a period in which numerous solutions have emerged. The most widely used technique is parametric synthesis, which starts from a description of the signs in terms of the seven kinesic formational parameters (PFQ) that make them up (also known as phonemes). Although this technique does not produce the most natural results (such as those obtained by manual animation systems (Huenerfauth, 2009)), it is the only one that offers the flexibility and degree of control needed to manage the complexity of SL, which is why it was chosen for this work. This work constitutes the first generic approach to the synthesis of Spanish Sign Language (LSE).

2. State of the Art

Current parametric synthesizers (Zwitserlood et al., 2004; Elliott et al., 2008; Kennaway, Glauert, and Zwitserlood, 2007) use an XML adaptation of the iconographic notations that describe the PFQ of the signs to be synthesized (Prillwitz et al., 1989). This approach is equivalent to speech synthesis, where a graphical notation is used to represent the phonemes. Using this approach presents certain drawbacks, such as the fact that describing a message to be synthesized is a complex task, or the lack of definition of the temporal dimension of the signs: each PFQ is defined as a sequence of units for which the instant at which each one is realised must be specified; likewise, SL synthesis requires defining the synchronization between those sequences.

3. Development

In this work we approach parametric SL synthesis from a new perspective, following the linguistic model proposed in (Liddell and Johnson, 1989). This model requires greater precision in the phonemic definition of the signs, together with a description of their temporal aspect. For this reason, we have opted to store these definitions in a relational database. This database not only provides more precise, detailed and flexible descriptions, but also allows parallel representations to be defined, both of the signs themselves and of the PFQ. Thanks to these parallel definitions, the same database can hold the representation of signs in different languages or dialects, and the synthesis of emotional variants of the signs can be addressed. The greater flexibility of the sign descriptions allows experts in this mode of communication, especially native signers, to participate both in the evaluation and in the definition of the signs themselves. To this end, a set of specific applications based on a visual interface has been designed for this task. Since the input notation no longer has to describe the signs through the PFQ, we can define an input notation at a higher level of abstraction, which simplifies the description of SL messages. This new notation, which we have called HLSML, covers the three semantic categories of SL: the fingerspelling dictionary, established signs and classifier constructions. Although HLSML focuses on the syntactic description of the signed message, it allows the phonemic description of the signs in the message when they are not present in the database. Another novel aspect of the synthesizer worth highlighting is the design of the avatar's skeleton. The avatar used includes special features for the independent phonemic management of SL.
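To visualise why the temporal dimension must be made explicit, here is a small illustrative model of a sign as parallel tracks of timed phonemes, one per formational parameter. The structures and values are hypothetical, for illustration only, and do not reproduce the thesis' database schema.

from dataclasses import dataclass

@dataclass
class TimedPhoneme:
    value: str    # e.g. a handshape or location label (hypothetical values)
    start: float  # onset, as a fraction of the sign's total duration
    end: float

@dataclass
class Sign:
    gloss: str
    tracks: dict  # formational parameter name -> list of TimedPhoneme

# A sign whose location changes halfway through while the handshape is held:
house = Sign("HOUSE", {
    "handshape": [TimedPhoneme("flat-hand", 0.0, 1.0)],
    "location":  [TimedPhoneme("head-height", 0.0, 0.5),
                  TimedPhoneme("chest-height", 0.5, 1.0)],
})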

4. Results

The contents generated by the synthesizer were evaluated by 11 deaf people, among them experts from the FCNSE. The evaluation consisted of different tests focused on measuring the intelligibility of isolated signs, complete sentences and classifier constructions in LSE. The recognition rate for isolated signs was 96% with free-form answers and 98% when five possible options were offered. For sentences composed of sequences of signs, a recognition rate of 95% was obtained. Finally, the evaluation of classifier constructions obtained a recognition rate of 85%.

References

Elliott, Ralph, John Glauert, Richard Kennaway, Ian Marshall, and Éva Sáfár. 2008. Linguistic modelling and language-processing technologies for avatar-based sign language presentation. Universal Access in the Information Society, 6(4):375–391.

Huenerfauth, Matt. 2009. A linguistically motivated model for speed and pausing in animations of American Sign Language. ACM Transactions on Accessible Computing, 2(2):1–31.

Kennaway, Richard, John Glauert, and Inge Zwitserlood. 2007. Providing signed content on the internet by synthesized animation. ACM Transactions on Computer-Human Interaction, 14(15):1–29.

Liddell, Scott K. and Robert E. Johnson. 1989. American Sign Language: The phonological base. Sign Language Studies, 64:195–278, Fall.

Prillwitz, Siegmund, Regina Leven, Heiko Zienert, Thomas Hanke, and Jan Herming. 1989. HamNoSys. Version 2.0; Hamburg Notation System for Sign Languages. An introductory guide. Signum-Verlag.

Zwitserlood, Inge, Margriet Verlinden, Johan Ros, and Sanny van der Schoot. 2004. Synthetic signing for the deaf: eSign. In Proceedings of the Conference and Workshop on Assistive Technologies for Vision and Hearing Impairment, Granada, Spain, June. CVHI.


General Information

SEPLN 2010
XXVI CONFERENCE OF THE SOCIEDAD ESPAÑOLA PARA EL PROCESAMIENTO DEL LENGUAJE NATURAL
Universidad Politécnica de Valencia – Valencia (Spain)
7-10 September 2010
http://www.sepln.org/

1 Presentation

The XXVI edition of the Annual Conference of the Sociedad Española para el Procesamiento del Lenguaje Natural (SEPLN) will be held within the framework of CEDI on 7, 8, 9 and 10 September 2010 at the Universidad Politécnica de Valencia. Satellite workshops are also planned for the same week.

The enormous amount of information available in digital format and in the different languages we speak makes it essential to have systems that allow access to that huge library that is the Internet in an increasingly structured way. In this same scenario, there is renewed interest in solving the problems of accessibility to information and of improving its exploitation in multilingual environments. Many of the formal foundations for adequately addressing these needs have been and continue to be established within the framework of natural language processing and its many facets: information extraction and retrieval, question answering systems, machine translation, automatic analysis of textual content, automatic summarization, text generation, and speech recognition and synthesis.

The main goal of the conference is to offer a forum to present the latest research and developments in the field of Natural Language Processing (NLP), both to the scientific community and to companies in the sector. It also intends to show the real possibilities of application and to present new R&D projects in this field. In addition, as in previous editions, the aim is to identify the future directions of basic research and of the applications planned by professionals, in order to contrast them with the real needs of the market. Finally, the conference aims to be a suitable setting for introducing other people interested in this area of knowledge.

2 Objectives

The main objective of this conference is to offer the scientific and business community of the sector the ideal forum for presenting the latest research and developments in the field of NLP, as well as to show the real possibilities of their application and to present new projects. In this way, the XXVI SEPLN Conference aims to be a meeting point for communicating results and exchanging opinions on the current development of this area. Furthermore, as in previous editions, the conference seeks to identify the future directions of basic research and of the applications planned by professionals, in order to contrast them with the real needs of the market. Likewise, the conference aims to be a suitable setting for introducing other people interested in this area of knowledge.

3 Topic Areas

Groups and researchers are encouraged to submit papers, project abstracts or demonstrations in any of the following topic areas, among others:
• Linguistic, mathematical and psycholinguistic models of language
• Development of linguistic resources and tools
• Grammars and formalisms for morphological and syntactic analysis
• Semantics, pragmatics and discourse
• Word sense disambiguation
• Monolingual and multilingual text generation
• Machine translation
• Speech synthesis
• Dialogue systems
• Audio indexing
• Language identification
• Monolingual and multilingual information extraction and retrieval
• Question answering systems
• Evaluation of NLP systems
• Automatic analysis of textual content
• Sentiment and opinion analysis
• Plagiarism analysis
• Text mining in the blogosphere and social networks
• Summarization
• NLP for the generation of educational resources
• NLP for languages with limited resources
• Industrial applications of NLP

4 Conference Format

The conference will last four days, with sessions devoted to the presentation of papers, posters, ongoing research projects and demonstrations of applications. Satellite workshops are also planned for 6 and 7 September.

5 SEPLN 2010 Executive Committee

Scientific Committee Chair:
• Lidia Moreno Boronat (Universitat Politècnica de València)
Workshop Coordinator:
• Paolo Rosso (Universitat Politècnica de València)
Demonstration Coordinators:
• Lluís Hurtado i Oliver (Universitat Politècnica de València)
• Jon Ander Gómez Adrián (Universitat Politècnica de València)
Coordinators of Papers, Posters and Projects:
• Ferran Pla Santamaría (Universitat Politècnica de València)
• Antonio Molina Marco (Universitat Politècnica de València)
Image and Editing Coordinator:
• Fernando García Granada (Universitat Politècnica de València)
Organization and Relations with CEDI 2010:
• Lidia Moreno Boronat (Universitat Politècnica de València)
• Natividad Prieto Sáez (Universitat Politècnica de València)
• Emilio Sanchis Arnal (Universitat Politècnica de València)

6 Advisory Board

Members:
• Prof. José Gabriel Amores Carredano (Universidad de Sevilla)
• Prof. Toni Badia i Cardús (Universitat Pompeu Fabra)
• Prof. Manuel de Buenaga Rodríguez (Universidad Europea de Madrid)
• Prof. Fco. Javier Calle Gómez (Universidad Carlos III de Madrid)
• Prof.ª Irene Castellón Masalles (Universitat de Barcelona)
• Prof.ª Arantza Díaz de Ilarraza (Euskal Herriko Unibertsitatea)
• Prof. Antonio Ferrández Rodríguez (Universitat d'Alacant)
• Prof. Mikel Forcada Zubizarreta (Universitat d'Alacant)
• Prof.ª Ana María García Serrano (Universidad Politécnica de Madrid)
• Prof. Koldo Gojenola Galletebeitia (Euskal Herriko Unibertsitatea)
• Prof. Xavier Gómez Guinovart (Universidade de Vigo)

• Prof. Julio Gonzalo Arroyo (Universidad Nacional de Educación a Distancia)
• Prof. José Miguel Goñi Menoyo (Universidad Politécnica de Madrid)
• Prof. José B. Mariño Acebal (Universitat Politècnica de Catalunya)
• Prof.ª M. Antonia Martí Antonín (Universitat de Barcelona)
• Prof.ª Mª Teresa Martín Valdivia (Universidad de Jaén)
• Prof. Patricio Martínez Barco (Universitat d'Alacant)
• Prof.ª Paloma Martínez Fernández (Universidad Carlos III de Madrid)
• Prof.ª Raquel Martínez Unanue (Universidad Nacional de Educación a Distancia)
• Prof.ª Lidia Ana Moreno Boronat (Universitat Politècnica de València)
• Prof. Lluís Padró (Universitat Politècnica de Catalunya)
• Prof. Manuel Palomar Sanz (Universitat d'Alacant)
• Prof. Ferrán Pla (Universitat Politècnica de València)
• Prof. Germán Rigau (Euskal Herriko Unibertsitatea)
• Prof. Horacio Rodríguez Hontoria (Universitat Politècnica de Catalunya)
• Prof. Kepa Sarasola Gabiola (Euskal Herriko Unibertsitatea)
• Prof. Emilio Sanchís (Universitat Politècnica de València)
• Prof. L. Alfonso Ureña López (Universidad de Jaén)
• Prof.ª Mª Felisa Verdejo Maillo (Universidad Nacional de Educación a Distancia)
• Prof. Manuel Vilares Ferro (Universidade de Vigo)
• Prof. Ruslan Mitkov (Universidad de Wolverhampton)
• Prof.ª Sylviane Cardey-Greenfield (Centre de recherche en linguistique et traitement automatique des langues, Lucien Tesnière. Besançon, France)
• Prof. Leonel Ruiz Miyares (Centro de Lingüística Aplicada de Santiago de Cuba)
• Researcher Luis Villaseñor-Pineda (Instituto Nacional de Astrofísica, Óptica y Electrónica, México)
• Researcher Manuel Montes y Gómez (Instituto Nacional de Astrofísica, Óptica y Electrónica, México)
• Prof. Alexander Gelbukh (Instituto Politécnico Nacional, México)
• Prof. Nuno J. Mamede (Instituto de Engenharia de Sistemas e Computadores Investigação e Desenvolvimento em Lisboa, Portugal)
• Prof. Bernardo Magnini (Fondazione Bruno Kessler, Italia)

7 Important Dates

Dates for the submission and acceptance of papers:
• Paper submission deadline: 1 May 2010.
• Notification of acceptance: 22 May 2010.
• Camera-ready version deadline: 27 May 2010.
• Early (reduced-fee) registration deadline: 15 July 2010.


Information for Authors

Paper format
• The maximum length admitted for contributions is 8 DIN A4 pages (210 x 297 mm), including references and figures.
• Papers may be written in English or Spanish. The title, abstract and keywords must be written in both languages.
• The format must be Word or LaTeX.

Submission of papers
• Papers must be submitted electronically through the website of the Sociedad Española para el Procesamiento del Lenguaje Natural (http://www.sepln.org).
• For papers in LaTeX format, the PDF file must be sent together with all the source files needed for LaTeX compilation.
• For papers in Word format, the PDF file must be sent together with the DOC or RTF file.
• For further information: http://www.sepln.org/revistaSEPLN/Instrevista.php

Registration Form for Institutions

Institution details: name, tax ID (NIF), telephone, e-mail, fax, address, town, postal code, province, and areas of research or interest.

Mailing details: address, town, postal code, province, telephone, fax and e-mail.

Bank details: name of the bank, address, postal code and town, province, bank code (4 digits), branch code (4 digits), control digits (2 digits) and account number (10 digits).

Direct debit order, addressed to the manager of the member's bank: "I kindly request that, from this date and until further notice, you pay the annual receipts corresponding to the current membership fees of the Sociedad Española para el Procesamiento del Lenguaje Natural (SEPLN)." Signed, with the full name of the signatory and the date.

Institutional membership fee: 300 €. Note: the lower part of the form must be sent to the member's bank.

Membership Form for Individuals

Personal details: surname(s), first name, ID number (DNI), date of birth, telephone, e-mail, address, town, postal code and province.

Professional details: workplace, address, postal code, town, province, telephone, fax, e-mail, and areas of research or interest.

Preferred mailing address: [ ] personal address [ ] professional address.

Bank details: name of the bank, address, postal code and town, province, bank code (4 digits), branch code (4 digits), control digits (2 digits) and account number (10 digits).

Direct debit order, addressed to the manager of the member's bank: "I kindly request that, from this date and until further notice, you pay the annual receipts corresponding to the current membership fees of the Sociedad Española para el Procesamiento del Lenguaje Natural (SEPLN)." Signed, with the full name of the signatory and the date.

Membership fees: 18 € (members resident in Spain) or 24 € (members resident abroad). Note: the lower part of the form must be sent to the member's bank.

Additional Information

Functions of the Editorial Board

The functions of the Editorial Board of the SEPLN journal are the following:
• To oversee the selection of, and take the decisions on, the contents of each issue of the journal
• Editorial policy
• Preparation of each issue
• Relations with reviewers and authors
• Relations with the scientific committee

The Editorial Board is formed by the following members:
L. Alfonso Ureña López (Director), Universidad de Jaén, [email protected]
Patricio Martínez Barco (Secretary), Universidad de Alicante, [email protected]
Manuel Palomar Sanz, Universidad de Alicante, [email protected]
Felisa Verdejo, UNED, [email protected]

Functions of the Advisory Board

The functions of the Advisory (Scientific) Board of the SEPLN journal are the following:
• To set, guide and redirect the scientific policy of the journal and the research lines to be promoted
• Representation
• Promotion of international dissemination
• Capacity to attract authors
• Evaluation
• Composition
• Prestige
• High specialisation
• International scope

The Advisory Board is formed by the following members:
José Gabriel Amores, Universidad de Sevilla
Toni Badía, Universitat Pompeu Fabra
Manuel de Buenaga, Universidad Europea de Madrid
Irene Castellón, Universitat de Barcelona
Arantza Díaz de Ilarraza, Euskal Herriko Unibertsitatea
Antonio Ferrández, Universitat d'Alacant
Mikel Forcada, Universitat d'Alacant
Ana García-Serrano, Universidad Politécnica de Madrid
Koldo Gojenola, Euskal Herriko Unibertsitatea
Xavier Gómez Guinovart, Universidade de Vigo
Julio Gonzalo, UNED
José Miguel Goñi, Universidad Politécnica de Madrid
José Mariño, Universitat Politècnica de Catalunya
M. Antonia Martí, Universitat de Barcelona
M. Teresa Martín, Universidad de Jaén
Patricio Martínez-Barco, Universitat d'Alacant
Raquel Martínez, UNED


Lidia Moreno, Universitat Politècnica de València
Lluís Padró, Universitat Politècnica de Catalunya
Manuel Palomar, Universitat d'Alacant
Ferrán Pla, Universitat Politècnica de València
German Rigau, Euskal Herriko Unibertsitatea
Horacio Rodríguez, Universitat Politècnica de Catalunya
Kepa Sarasola, Euskal Herriko Unibertsitatea
Emilio Sanchís, Universitat Politècnica de València
Mariona Taulé, Universitat de Barcelona
L. Alfonso Ureña, Universidad de Jaén
Felisa Verdejo, UNED
Manuel Vilares, Universidad de A Coruña
Ruslan Mitkov, Universidad de Wolverhampton, UK
Sylviane Cardey-Greenfield, Centre de recherche en linguistique et traitement automatique des langues, France
Leonel Ruiz Miyares, Centro de Lingüística Aplicada de Santiago de Cuba
Luis Villaseñor-Pineda, Instituto Nacional de Astrofísica, Óptica y Electrónica, México
Manuel Montes y Gómez, Instituto Nacional de Astrofísica, Óptica y Electrónica, México
Alexander Gelbukh, Instituto Politécnico Nacional, México
Nuno J. Mamede, Instituto de Engenharia de Sistemas e Computadores, Portugal
Bernardo Magnini, Fondazione Bruno Kessler, Italia

Letters to the Editor

Sociedad Española para el Procesamiento del Lenguaje Natural
Departamento de Informática. Universidad de Jaén
Campus Las Lagunillas, Edificio A3. Despacho 127. 23071 Jaén
[email protected]

Further Information

For further information about the Sociedad Española para el Procesamiento del Lenguaje Natural, see http://www.sepln.org. Back issues of the journal are available in the electronic journal: http://www.sepln.org/revistaSEPLN/revistas.php. The functions of the Editorial Board are described at http://www.sepln.org/revistaSEPLN/edirevista.php, and the functions of the Advisory Board at http://www.sepln.org/revistaSEPLN/lectrevista.php.

