27/07/2021 - 31/07/2021 | Cátedra Feminismos 4.0 Depo-UVigo
Laura Docío Fernández,



In this project we set out to build a feminist conversational agent (also known as a voice chatbot, or simply a chatbot) aimed at the diagnosis of mild cognitive impairment through analysis of the chatbot's conversations with patients. Its goals include, on the one hand, defining a good-practice guide for designing voice chatbots for tasks in which the interaction is with people with mild cognitive impairment, and on the other, designing the chatbot in such a way that it avoids gender bias.

Ref.: LC-01641480 – 101018166
01/01/2021 - 30/06/2022 | Unión Europea
Carmen García Mateo,



ELE, in collaboration with its sister project, the European Language Grid (ELG), will begin on 1 January 2021 and aims to support the development of a strategic research, innovation and deployment agenda to achieve digital language equality in Europe by 2030. ELE and ELG are aligned so that both will run for 18 months, culminating in a joint event in June 2022, META-FORUM 2022.

The wider consortium comprises five core partners (DCU, DFKI Berlin, Charles University Prague, ILSP Athens, and the University of the Basque Country, Donostia), language technology expertise from 9 networks, associations and initiatives, 9 companies and 30 research organisations. Together, this consortium of 53 partners from all over Europe will drive the strategic research, innovation and deployment agenda to achieve digital language equality in Europe by 2030.

A number of consultation events, round tables and stakeholder meetings are planned during the 18-month project and will ensure close collaboration with ELG. Research partners will produce updates to the 32 META-NET White Papers, detailing the situation of the 77 languages covered by the project. Each network initiative will produce a report collecting, consolidating and presenting its own vision, and companies will produce various technical deep dives for the different technology areas.

Ref.: RTI2018-101372-B-I00
01/01/2019 - 31/12/2021 | Ministerio de Economía y Competitividad
José Luis Alba Castro, Laura Docío Fernández,



This project will deepen the research and development of techniques that pave the way towards a full understanding and exploitation of the verbal and non-verbal communication channels that use facial expression, body language and hand expressiveness as their signal sources. The team is made up of members of GTM and one person from the UVigo GRADES group (Grammar, Discourse and Society).

Non-verbal communication is an interdisciplinary area in which linguists, psychologists, anthropologists, sociologists and neuroscientists have developed a plethora of theories. It can be defined as the transfer and exchange of messages in any and all modalities that do not involve words. These modalities are as diverse as facial expressions, gestures and body movements, non-verbal vocalisations, behaviour in interpersonal space and even physiognomy (face, body, clothing). Sign languages rely mainly on gestures of the hands and other parts of the body and have a visual grammar, so many of the techniques developed for gesture and speech recognition applications have been leveraged for sign language recognition (SLR). However, facial expressions and body movements, although crucial for SLR, have not yet received the attention they deserve in this scenario.

The overall objective of the project is therefore to develop new algorithms, systems and datasets, based on speech and video processing and on machine learning techniques, to extract multimodal information that makes it possible to decode the verbal and non-verbal communication channels of both spoken and sign languages.


This project pursues several goals with different R&D&I scope and time horizons for their achievement:

  1. Developing facial expression recognition techniques beyond the 6 basic expressions of emotion: compound emotions and communicative non-emotional facial expressions (short-term objective).
  2. Design and development of hand-sign recognition algorithms on static images and dynamic video streams (short-term objective).
  3. Building a progressively large dataset for Spanish SL and annotation tools to be used within the ELAN package (short-term objective).
  4. Design and development of a coding tool to translate trajectories of the hands, arms, torso and head, as well as facial expressions, into linguistically interpretable information in Spanish SL and into prosodic modifiers for ASR (long-term objective).
  5. Starting from the outputs of Goal 1, design and development of audiovisual techniques for enhanced detection of emotional and linguistic cues useful for richer closed captions: questions, laughter, sadness, sobbing, hesitation, turn-taking, etc. (long-term objective).
  6. Launching an international benchmarking contest on Spanish SL using the acquired dataset (short-term objective).
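As a toy illustration of the kind of classification Goal 1 describes, the sketch below assigns a facial feature vector to its nearest expression centroid. The class names, feature dimensions and centroid values are invented for illustration; a real system would use learned deep features rather than hand-picked numbers.

```python
from math import dist  # Euclidean distance (Python 3.8+)

# Hypothetical training summary: each expression class is represented by the
# centroid of its feature vectors (e.g. normalised facial-landmark geometry).
# Note the compound emotion, which goes beyond the six basic expressions.
CENTROIDS = {
    "happiness":         (0.9, 0.1, 0.2),
    "sadness":           (0.1, 0.8, 0.3),
    "happily_surprised": (0.8, 0.2, 0.9),  # compound emotion
}

def classify_expression(features):
    """Assign a feature vector to the nearest expression centroid."""
    return min(CENTROIDS, key=lambda label: dist(features, CENTROIDS[label]))
```

For example, `classify_expression((0.85, 0.15, 0.85))` falls closest to the compound-emotion centroid. Nearest-centroid is only the simplest possible decision rule; it stands in here for whatever classifier the project ultimately develops.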


The following activities were planned to achieve the short-term objectives and move towards the long-term ones. Follow the links for further details.

Activity 1: Understanding and describing nonverbal communication channels.

Activity 2: Compiling a dataset of Spanish SL

Activity 3: Developing facial expression recognition tools in communicative scenarios

Activity 4: Developing a hand-based signing recognition tool

Activity 5: Multimodal fusion and interpretation of nonverbal communication channels 

Activity 6: Result Dissemination

Ref.: TEC2015-65345-P
01/01/2016 - 01/01/2019 | Ministerio de Economía y Competitividad
Carmen García Mateo,



In TraceThem, we will carry out research into algorithmic techniques for multimedia and multilingual search, working in real environments where current techniques fail in performance, generalisation and scalability. Apart from traditional audiovisual content (TV shows, news, movies, series...), new scenarios and types of content have emerged in recent years (MOOCs, video blogs, tutorials...) where automating the search process for accessing content is a key aspect. Processing these multimedia documents involves the added difficulty that content often appears in different languages, posing a greater technological challenge: tools adapted to different languages are needed, which is not always possible due to the lack of resources or of tools enabling completely language-independent content indexing.

The information we intend to extract is always within a communicative context ("from" someone and "for" someone), so the characterisation of the people involved in this context will play a central role. We will focus on finding information about people and their way of interacting ("who they are", "what they say", "how they communicate", "how they are doing"), with a special interest in discovering people and content. The extraction of information related to people will be performed through audio processing, video processing and combined audio and video processing. To do this, we will focus on technologies and new solutions for: multimedia content analysis, voice and face biometrics, audio segmentation and speaker diarization, emotional state detection and detection of people interacting. Content extraction will be primarily performed by processing audio, using both language-dependent and language-independent search on speech.
The scientific and technical impact and dissemination of the project results will be fostered by participation in international competitive evaluations related to these topics. Such evaluations are important to this project because they provide datasets for the tasks that constitute the current technological challenges. Moreover, these competitions set up common experimental frameworks that enhance collaboration with other research groups and allow different algorithms to be compared, helping to discover the strengths and weaknesses of the algorithms and systems developed.
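To make the speaker-diarization step mentioned above concrete, here is a minimal sketch of greedy online clustering of speaker embeddings: each audio segment joins the most similar known speaker, or starts a new one when no similarity clears a threshold. The two-dimensional embeddings and the threshold value are invented for illustration; TraceThem's actual algorithms are not implied here.

```python
from math import sqrt

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm

def diarize(segment_embeddings, threshold=0.8):
    """Greedy online clustering of per-segment speaker embeddings.

    Returns one integer speaker label per segment ("who spoke when").
    """
    speakers, labels = [], []
    for emb in segment_embeddings:
        sims = [cosine(emb, s) for s in speakers]
        if sims and max(sims) >= threshold:
            labels.append(sims.index(max(sims)))   # reuse existing speaker
        else:
            speakers.append(emb)                   # enrol a new speaker
            labels.append(len(speakers) - 1)
    return labels
```

Real diarization systems typically extract embeddings with a neural network and cluster offline (e.g. agglomerative clustering); the greedy variant above is just the shortest self-contained way to show the idea.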

Here are two video demos of our research on "Semantic Indexing and Searching in Multimedia Contents" related to the TraceThem activities:

Ref.: LIFE 14 CCM/ES/001209
01/10/2015 - 30/09/2018 | Unión Europea
Soledad Torres Guijarro,



The main objective of the LifeDemoWave project is to demonstrate the feasibility of using wave power for electricity generation in order to reduce greenhouse gas emissions. In doing so, the specific objective (d) established in Article 14 of the LIFE 2014-2020 Regulation (EC) No 1293/2013 would be reached. The project will thus contribute to developing and demonstrating that the use of technologies such as wave power helps to mitigate climate change with easily replicable equipment. The project also aims to raise awareness in society, showing that this is a viable path towards clean energy, whose acceptance remains one of the biggest barriers today.

Additionally, this goal is in line with EU policy, which set out guidelines for a European strategy for sustainable, competitive and secure energy through the Green Paper of 8 March 2006. Furthermore, Directive 2009/28/EC established that by 2020, 20% of EU energy consumption had to come from renewable sources, although this figure only reached about 14% in the EU. On 20 January 2014, an action plan named «Blue Energy» was presented, highlighting support for wave and tidal power as one of the priority areas in the EU for mitigating climate change; wave power is the solution proposed in LifeDemoWave. The final objective of this project is to help implement these policies and support the adaptation of the applicable legislation in order to adopt these technologies, achieving what is established in section (a) of Article 14.

For demonstration purposes, two wave power generation prototypes of 25 kW each will be installed on the Galician coast (Galicia stands out for having up to 75 kW per metre of wave front); they will be reproducible and scalable to a high level. Another objective is to quantify the reduction of the carbon footprint and of pollutant emissions (NOx, NMVOC, SO2, NH3, PM2.5...).

The aim is to reach the target values of the performance indicators, thereby complying with EU environmental policies such as Directives 2008/50/EC, 2001/81/EC, 94/63/EC, etc. In its design and implementation, the LifeDemoWave project will also consider the environmental impact in the installation areas and its effect on biodiversity, trying to minimise these effects as far as possible and to quantify them explicitly.

Ref.: TEC2012-38939-C03-01
01/01/2013 - 13/12/2015 | Ministerio de Economía y Competitividad
Carmen García Mateo,



SpeechTech4All is a project funded by the Ministerio de Economía y Competitividad in the 2012 call of the Programa Nacional de Proyectos de Investigación Fundamental. The project has a duration of three years and involves the Universities of Vigo, Politècnica de Catalunya and the Basque Country. Carmen García Mateo, head of the Multimedia Technologies Group at the AtlantTIC research centre of the Universidade de Vigo, will coordinate the project.

The project is devoted to advanced research in the main speech technologies (speech recognition, machine translation, text-to-speech synthesis) in all the official languages spoken in Spain, to the recognition of the speaker's emotional state, and to the construction of multimodal (voice and face) and multilingual (Spanish, Galician, Catalan and Basque) experimental frameworks that showcase the work carried out.

The project will yield research advances in each of the aforementioned technologies. Some examples of these advances are the pursuit of universal access to synthetic-voice personalisation services, the development of domain-adaptation techniques in machine translation, and the development of systems for detecting the speaker's state through joint processing of voice and face. The project is expected to take part in competitive evaluation campaigns, including those organised by the Red Temática en Tecnologías del Habla and by Interspeech.

In order to give visibility to the advances achieved in all these technologies, and to illustrate the markedly social character that the project aims to have, two demonstrators are defined:

 1) The first, which integrates most of the technologies addressed, consists of multilingual subtitling of audiovisual material related to the field of education: documentaries, lectures, seminars...

 2) The second addresses one of the genuine applications of speech technologies: giving a voice to people who, for various reasons, have a severe level of oral disability, using a synthesiser adapted to that person's specific characteristics.
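The first demonstrator ultimately produces subtitle files. As a small, self-contained sketch of that final formatting step (the SubRip/SRT layout is a de-facto standard; the cue data here is invented, and the upstream recognition and translation stages are omitted):

```python
def srt_timestamp(seconds):
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    ms = round(seconds * 1000)
    h, rem = divmod(ms, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

def to_srt(cues):
    """Render (start_s, end_s, text) cues as the body of an SRT file."""
    blocks = []
    for i, (start, end, text) in enumerate(cues, 1):
        blocks.append(f"{i}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}")
    return "\n\n".join(blocks) + "\n"
```

For example, `to_srt([(0.0, 1.25, "Hola")])` yields a cue numbered 1 spanning `00:00:00,000 --> 00:00:01,250`. In the demonstrator, the cue text would come from the ASR and machine translation stages in the viewer's chosen language.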

Ref.: TEC2009-14094-C04-04
01/01/2010 - 31/12/2012 | Ministerio de Ciencia e Innovación
Eduardo Rodríguez Banga,

01/01/2010 - 31/12/2012 | Unión Europea
Carmen García Mateo,



META, the Multilingual Europe Technology Alliance, brings together researchers, commercial technology providers, private and corporate language technology users, language professionals and other information society stakeholders. META will prepare an ambitious, joint international effort towards furthering Language Technology as a means towards realising the vision of a Europe united as one single digital market and information space.