One of the main barriers to the advancement of automatic sign language recognition is the scarcity of language resources specifically designed for training artificial intelligence algorithms. This shortage also affects Spanish Sign Language (LSE). With LSE_Lex40_UVIGO we aim to help close this gap by building an LSE corpus of isolated signs (glosses) and sentences, annotated and temporally segmented. The design, recording and tagging of the corpus is a joint project of the research groups GTM and GRADES of the University of Vigo.

This project is carried out in collaboration with the Association of Deaf People of Vigo (ASORVIGO) and the Federation of Associations of Deaf People of Galicia (FAXPG). Most of the signers are deaf; the others are sign language interpreters or sign language students.

The project was presented at the CNLSE Congress of Spanish Sign Language, held on 19–20 September 2019. Click here to watch the video of the presentation at the CNLSE Congress 2019.

The corpus grows as new recordings are obtained from volunteers, and it will keep growing as long as the research groups have funding for this line of research.

The status of the corpus as of May 2020 is reflected in the article below:

Laura Docío et al.
LSE_UVIGO: A Multi-source Database for Spanish Sign Language Recognition
Proceedings of the 9th Workshop on the Representation and Processing of Sign Languages: Sign Language Resources in the Service of the Language Community, Technological Challenges and Application Perspectives, 2020 | PDF


This dataset was partially used to build an automatic recognizer of 40 isolated signs for the CVPR 2021 workshop ChaLearn Looking at People: Sign Language Recognition in the Wild. In the accompanying paper, “Isolated Sign Language Recognition with Multi-Scale Spatial-Temporal Graph Convolutional Networks”, we trained models on larger datasets from other sign languages and used their parameters to fine-tune a model on LSE_Lex40.
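As an illustration of that transfer-learning setup, the sketch below fine-tunes a pretrained skeleton-based backbone on a 40-class sign vocabulary. This is a minimal PyTorch sketch under assumed names: SkeletonBackbone, SignClassifier, the checkpoint path and the class counts are hypothetical placeholders, not the actual MS-G3D code or configuration used in the paper.

```python
# Minimal transfer-learning sketch (PyTorch). Backbone, checkpoint path and
# class counts are hypothetical placeholders, not the paper's actual setup.
import torch
import torch.nn as nn

class SkeletonBackbone(nn.Module):
    """Stand-in for a spatial-temporal graph convolutional feature extractor."""
    def __init__(self, feat_dim=256):
        super().__init__()
        # Operates on clips shaped (batch, xyz channels, frames, keypoints).
        self.features = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=(9, 1), padding=(4, 0)),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Linear(64, feat_dim),
        )

    def forward(self, x):
        return self.features(x)

class SignClassifier(nn.Module):
    """Backbone plus a linear classification head."""
    def __init__(self, backbone, num_classes, feat_dim=256):
        super().__init__()
        self.backbone = backbone
        self.head = nn.Linear(feat_dim, num_classes)

    def forward(self, x):
        return self.head(self.backbone(x))

# 1) Model sized for the larger source-language dataset; load its pretrained
#    parameters (path is a placeholder).
source_model = SignClassifier(SkeletonBackbone(), num_classes=2000)
# source_model.load_state_dict(torch.load("pretrained_source.pt"))

# 2) Reuse the pretrained backbone, replace the head for the 40 LSE signs.
target_model = SignClassifier(source_model.backbone, num_classes=40)

# 3) Fine-tune with a smaller learning rate on the backbone than on the head.
optimizer = torch.optim.SGD([
    {"params": target_model.backbone.parameters(), "lr": 1e-4},
    {"params": target_model.head.parameters(), "lr": 1e-2},
], momentum=0.9)
criterion = nn.CrossEntropyLoss()

# One dummy training step on random data, just to show the update loop.
clip = torch.randn(8, 3, 32, 27)          # (batch, xyz, frames, keypoints)
labels = torch.randint(0, 40, (8,))
loss = criterion(target_model(clip), labels)
loss.backward()
optimizer.step()
```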