Research can bridge the country’s multilingual divide.
Researchers at the International Institute of Information Technology, Hyderabad (IIIT-H), have initiated one of the largest crowdsourcing speech projects to connect voice with vernacular languages and build an automatic speech recognition system.
With formidable expertise in language and speech processing, IIIT Hyderabad has joined hands with the government to work on an Automatic Speech Recognition (ASR) module for the translation of Indian languages.
The project is being headed by Prakash Yalla, head, technology transfer office, and Anil Kumar Vuppala, associate professor, speech processing centre.
With his credo of ‘Technology in the Service of Society’, Padma Bhushan Prof. Raj Reddy provides the inspiration. Reddy is the chairman of the institute’s governing council, a recipient of the prestigious ACM Turing Award and the Legion of Honour, University Professor of Computer Science and Robotics, and Moza Bint Naseer Chair at the School of Computer Science at Carnegie Mellon University.
Reddy has been championing the cause of Indian language Alexa and strongly believes that AI can help bridge the country’s multilingual divide.
Building AI-enabled automatic speech recognition systems requires thousands upon thousands of hours of speech data, along with transcribed text, for each language.
With Reddy’s vision of reaching out to the common man, conversational AI assumes importance. Hence, datasets containing speech recorded in as natural a setting as possible are crucial. The project is looking at crowdsourcing speech data as an affordable option.
The team is initially inviting volunteers to contribute Telugu speech data. “The idea is to collect around 2,000 hours of spoken Telugu over the course of a year. For this, we’re planning on liaising with academic institutions across Andhra Pradesh and Telangana and conducting Just-A-Minute and debate competitions. Another approach is via the existing Telugu Wikipedia community, consisting of learned scholars and lovers of the language,” Yalla said.
The team is also working with industry partners such as OzoneTel and Pactera Edge and leveraging their network to get access to data.
The initial collection of Telugu speech data is expected to lead to protocols and systems being put in place for crowdsourcing data for all Indian languages. “If everything works, it’ll become a nationwide data collection exercise, probably the largest ever, and we’ll make it available to people free,” Yalla said.
Over 50 per cent of the Indian population uses devices embedded with AI-based speech recognition technology. But in this multilingual country, which has 22 official languages and 12 different scripts, these voice-enabled devices are dominated by English-speaking assistants.