Streaming ASR Porting and Optimization for SG2002

Index:S2312

Award:15000Chinese Yuan

Challenge Description

Natural voice interaction is an important form of human-machine interfacing. However, as high-precision ASR models are computationally intensive, these computation projects are usually run on the cloud, worsening the experience.

This challenge aims to port high-precision streaming ASR to Sophgo’s SG2002 processor. You may choose between Chinese and English models. The goal is to achieve minimum word error rate whilst running on limited RAM (256MBytes) and a rate of RTF<1.

You may reference Kaldi, Wenet, and other open source speech recognition projects for your port.

Rubrics

Offline speech recognition using SG2002’s on-board microphone.
You are allowed to use RVV0.7, TPU, and other computational resources available on the SG2002.
The judging will take into account memory usage, real time factor (RTF), word error rate (WER).
The model must run within SG2002’s 256MBytes of RAM.
Real time factor: Lower is better, the model must support streamed recognition at RTF<1.
Word error rate: The model must achieve a WER must be lower than 10% to be considered usable; 5% is ideal.
The host will assess both accuracy and performance and assign scores to each submission based on a predetermined weighted scale. The highest scoring submission wins.

**Assessment Platform:

Submission

Please submit your completed project at https://github.com/plctlab/rvspoc-s2312-asr-sg2002 as a pull request.
Please outline the software environment needed to run your submission in your pull request body.
During the competition, all optimised artifacts may be submitted as the following:
1. Binary files.
2. Encrypted code (please send decryption details to rvspoc@cyberlimes.cn).
3. Source code.
After the results are announced, you will need to publish source code for all submissions.
The committee will close pull request submissions after the challenge’s run time (on ~~February 16~~February 29, 2024) and commence assessment of submissions.

Assessment

The assessment platform for this challenge is SG2002 (LicheeRV Nano/Milk-V duo 256) ¹.
The committee will use the following software configuration to assess the submissions on the hardware platform outlined in item 1:

To be announced.

The submissions must satisfy all seven (7) requirements outlined in the assessment rubrics.
The details of the challenges may change during the competition as we optimise and clarify the rubrics and instructions. Please keep an eye out for updates. The assessment committee reserves the right to final interpretation of all rubrics and instructions.

Notice on Intellectual Properties and Open Source Licenses

All challenge submissions must be open source and committed to a specified repository. The participant(s) (author) holds rights to their work. The host encourages contributing any changes made to the upstream.

LicheeRV Nano/Milk-V duo 256 purchase link:
- LicheeRV Nano： https://sipeed.com/licheerv-nano
- Milk-V duo 256： https://milkv.io/duo (please select the 256M model)
↩︎