
FPGA Implementation of CNN-LSTM Classifier in Speech Emotion Recognition System

2024-05-14


Gao, Zhaogang; Xiao, Wan'ang; Zhou, Weixin; Yang, Zhenghong. Source: 2023 International Conference on High Performance Big Data and Intelligent Systems (HDIS 2023), pp. 47-52, 2023.

Abstract:

Speech emotion recognition is a key technology in human-computer interaction: it equips computers with the ability to recognize and understand human emotions by establishing associations between emotional states and speech information. However, speech emotion recognition remains largely a laboratory technology and has not yet been deployed at scale. We design an FPGA-based speech emotion recognition system that deploys a CNN-LSTM neural network model. The model is designed using high-level synthesis (HLS); the neural network is constructed on the programmable logic (PL) side, while its scheduling and control are handled on the processing system (PS) side. The system captures speech and analyzes emotion in real time, and could be used in future wearables, smart homes, and smart robots to improve the human-computer interaction experience. We conducted experiments on the TESS (Toronto Emotional Speech Set) dataset, achieving an accuracy of 97.86%.

©2023 IEEE. (19 refs.)
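The abstract does not give the model's layer configuration, so the following is only a minimal NumPy sketch of the kind of CNN-LSTM inference pipeline it describes: a 1-D convolution over speech feature frames, an LSTM that summarizes the sequence, and a softmax classifier. All dimensions, weights, and the use of MFCC-like features are illustrative assumptions, not the paper's architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def conv1d(x, w, b):
    # x: (T, C_in); w: (K, C_in, C_out); b: (C_out,). Valid convolution + ReLU.
    K, _, C_out = w.shape
    T = x.shape[0] - K + 1
    out = np.zeros((T, C_out))
    for t in range(T):
        out[t] = np.tensordot(x[t:t + K], w, axes=([0, 1], [0, 1])) + b
    return np.maximum(out, 0.0)

def lstm_last(x, Wx, Wh, b):
    # x: (T, D); gates packed as [i, f, g, o]. Returns the final hidden state.
    H = Wh.shape[0]
    h = np.zeros(H)
    c = np.zeros(H)
    sig = lambda z: 1.0 / (1.0 + np.exp(-z))
    for t in range(x.shape[0]):
        z = x[t] @ Wx + h @ Wh + b
        i, f = sig(z[:H]), sig(z[H:2 * H])
        g, o = np.tanh(z[2 * H:3 * H]), sig(z[3 * H:])
        c = f * c + i * g
        h = o * np.tanh(c)
    return h

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# Toy dimensions (assumed): 40 feature frames x 13 coefficients -> 7 emotion classes.
T, C_in, C_out, H, n_classes = 40, 13, 8, 16, 7
x = rng.standard_normal((T, C_in))          # stand-in for one utterance's features
w_conv = rng.standard_normal((3, C_in, C_out)) * 0.1
b_conv = np.zeros(C_out)
Wx = rng.standard_normal((C_out, 4 * H)) * 0.1
Wh = rng.standard_normal((H, 4 * H)) * 0.1
b_lstm = np.zeros(4 * H)
W_fc = rng.standard_normal((H, n_classes)) * 0.1
b_fc = np.zeros(n_classes)

feat = conv1d(x, w_conv, b_conv)       # CNN extracts local temporal patterns
h = lstm_last(feat, Wx, Wh, b_lstm)    # LSTM summarizes the whole sequence
probs = softmax(h @ W_fc + b_fc)       # per-class emotion probabilities
```

In an HLS deployment like the one described, loops of exactly this shape (the convolution window and the per-timestep LSTM update) are what get unrolled or pipelined into PL logic, while the PS side feeds in frames and reads back `probs`.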



