Botao Zhao

Researcher

I earned my bachelor’s degree from Sichuan University, double-majoring in Biotechnology and Software Engineering (2015–2019), and then completed my master’s studies at the Institute of Brain-inspired Artificial Intelligence, Fudan University (2019–2022). I am currently a Senior Algorithm Engineer at Ping An Technology, working on AI applications in financial services and healthcare. My research interests span embodied intelligence, robotics, medical AI, and voiceprint recognition, with a focus on developing AI solutions for healthcare and finance. I have published over 10 papers in top-tier conferences and journals, including AAAI and Medical Image Analysis (MIA), and I am grateful for the support and recognition of the academic community. I also serve as a reviewer for international journals and conferences, such as European Radiology and ICME. My goal is to explore open challenges at the intersection of embodied intelligence and biomedicine, and to bring AI technologies into practical use in healthcare and finance to address real-world problems.

Interests
  • Embodied intelligence
  • Medical AI
  • Robotics
  • MLLM
  • Voiceprint recognition
  • TTS

Publications

  1. EMO-RL: Emotion-Rule-Based Reinforcement Learning Enhanced Audio-Language Model for Generalized Speech Emotion Recognition, (2025), ‡Co-first Author, In EMNLP2025 (CCF-B)
  2. Generalized Audio Deepfake Detection Using Frame-level Latent Information Entropy, (2025), †First Author, In ICME2025 (CCF-B)
  3. ACCon: Angle-Compensated Contrastive Regularizer for Deep Regression, (2025), †First Author, In AAAI2025 (CCF-A)
  4. Retrieval-Augmented Audio Deepfake Detection, (2024), In ICMR2024 (CCF-B)
  5. Cross-grained Contrastive Representation for Unsupervised Lesion Segmentation in Medical Images, (2023), In ICCV2023 (CCF-A)
  6. nnSpeech: Speaker-Guided Conditional Variational Autoencoder for Zero-Shot Multi-Speaker Text-to-Speech, (2022), †First Author, In ICASSP2022 (CCF-B)
  7. Denoising of 3D MR Images Using a Voxel-Wise Hybrid Residual MLP-CNN Model to Improve Small Lesion Diagnostic Confidence, (2022), In MICCAI2022 (CCF-B)
  8. AUCseg: An Automatically Unsupervised Clustering Toolbox for 3D-Segmentation of High-Grade Gliomas in Multi-Parametric MR Images, (2021), †First Author, In Front. Oncol. (IF=5.74)

Chinese Journal Articles

  1. Research Progress and Prospects of Embodied Agents Based on Multimodal Large Models, (2025), †First Author, In 《大数据》, 11(03) (CCF-T2)
  2. An End-to-End Seismic Wave Denoising Method Based on Deep Convolution and Self-Attention Mechanisms, (2025), †First Author, In 《大数据》 (CCF-T2)