Our research experts

Shabnam Ghaffarzadegan, Dr.

Senior research scientist in Artificial Intelligence, audio analytics

“Humans, animals and even plants use sound to communicate and understand the environment. To have a truly intelligent AI system, we need to equip machines with sound perception capability as well.”

Shabnam Ghaffarzadegan, Dr.

My primary research interests are in human-machine collaboration, spanning audio, language, and cognitive processing. In my research, I combine audio signal processing with domain-specific machine learning solutions. Specifically, I develop advanced audio scene classification and audio event detection solutions. The goal of my work is to equip machines with an understanding of audio and speech from the environment similar to that of humans. The results of my work are used to enhance machine intelligence and enable alternative forms of human-machine interaction.

Curriculum vitae

The University of Texas at Dallas, Richardson, USA

2013-2016
EE PhD graduate, automatic speech recognition, human-machine systems

Educational Testing Service (ETS), Princeton, USA

2015
Research intern, automatic scoring of non-native spontaneous speech

Amirkabir University of Technology, Iran

2009-2012
EE MS graduate, blind audio source separation and localization

Selected publications

  • UT-VOCAL EFFORT II: Analysis and constrained-lexicon recognition of whispered speech (2014)
    • Shabnam Ghaffarzadegan, Hynek Bořil, John HL Hansen
    • 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
  • Generative modeling of pseudo-whisper for robust whispered speech recognition (2016)
    • Shabnam Ghaffarzadegan, Hynek Bořil, John HL Hansen
    • IEEE/ACM Transactions on Audio, Speech, and Language Processing
  • Model and feature based compensation for whispered speech recognition (2014)
    • Shabnam Ghaffarzadegan, Hynek Bořil, John HL Hansen
    • Fifteenth Annual Conference of the International Speech Communication Association (Interspeech)
  • Generative modeling of pseudo-target domain adaptation samples for whispered speech recognition (2015)
    • Shabnam Ghaffarzadegan, Hynek Bořil, John HL Hansen
    • 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
  • Deep neural network training for whispered speech recognition using small databases and generative model sampling (2017)
    • Shabnam Ghaffarzadegan, Hynek Bořil, John HL Hansen
    • International Journal of Speech Technology
  • An Ontology-Aware Framework for Audio Event Classification (2020)
    • Yiwei Sun, Shabnam Ghaffarzadegan
    • 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
  • Self-supervised attention model for weakly labeled audio event classification (2019)
    • Bongjun Kim, Shabnam Ghaffarzadegan
    • 2019 27th European Signal Processing Conference (EUSIPCO)
  • Deep Multiple Instance Feature Learning via Variational Autoencoders (2018)
    • Shabnam Ghaffarzadegan
    • Proceedings of the AAAI Workshop on Artificial Intelligence Applied to Assistive Technologies and Smart Environments
  • Occupancy Detection in Commercial and Residential Environments Using Audio Signal (2017)
    • Shabnam Ghaffarzadegan, Attila Reiss, Mirko Ruhs, Robert Duerichen, Zhe Feng
    • Annual Conference of the International Speech Communication Association (Interspeech)
  • A Real-Time Audio Monitoring Framework with Limited Data for Constrained Devices (2019)
    • Asif Salekin, Shabnam Ghaffarzadegan, Zhe Feng, John Stankovic

Interview with Shabnam Ghaffarzadegan, Dr.

Senior research scientist in Artificial Intelligence, audio analytics

Please tell us what fascinates you most about research.

Research gives me the opportunity to explore the unknown and learn new things at every stage. Every project reveals a new perspective and new challenges, even in a field I have worked in for a long time. The chance to turn mistakes into learning experiences is also very valuable to me.

What makes research done at Bosch so special?

For me the essential factor is the opportunity to conduct core research that I am passionate about while having a direct impact on products and everyday life. The idea that my research is not just limited to papers and has a real impact is very valuable. Another exciting factor is the multi-disciplinary teams at Bosch, which bring new insights into each project from different fields and perspectives. Also, because Bosch is a multi-cultural company, we have the opportunity to understand people’s needs in different parts of the world and their acceptance of new technology. This information lets us tailor our products to be useful worldwide.

What research topics are you currently working on at Bosch?

The focus of my research at Bosch is combining domain-specific audio and speech technologies with general Artificial Intelligence concepts to develop a “Smart Ear” for machines. I incorporate audio perception into AI systems to help them understand the environment and navigate the real world better.

What are the biggest scientific challenges in your field of research?

There are many challenges in the field of audio analytics, to name a few:

1) Large variance within each audio category and its context, such as hardware, noise conditions, and acoustic environment. Different microphones capturing the sound, a loud party versus a quiet house or a street, a small room versus a conference hall: all of these pose challenges to our systems. Our goal is to develop systems that are robust to these environmental and contextual variations.

2) The unlimited audio vocabulary of the real world, which makes it impossible to predict the lexicon of a given task. Unlike spoken words, which draw on a limited alphabet, the variations in environmental sounds are unlimited. Just think about the different sounds you hear every day. As a result, it is impossible to teach an AI system all the possible sounds in the world. We need to build a smarter AI agent that knows when it doesn’t know something.

3) Limited availability of annotated data, which is essential for deep learning solutions, due to the unlimited vocabulary and contextual variations.

4) Users’ privacy concerns about a system that listens to them continuously. As researchers, we need to ensure users’ privacy, limit possible attacks that might compromise our AI systems, and be transparent about the information we use from users.
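As a toy illustration of the second challenge (an agent that knows when it doesn’t know), one common approach is to reject a classifier’s prediction when its confidence falls below a threshold. The label set, threshold, and scores below are invented for illustration and are not from any Bosch system:

```python
import numpy as np

LABELS = ["glass_break", "baby_cry", "dog_bark"]  # toy label set
THRESHOLD = 0.7  # minimum softmax confidence to accept a prediction


def softmax(scores):
    """Convert raw scores to probabilities (numerically stable)."""
    e = np.exp(scores - np.max(scores))
    return e / e.sum()


def classify_with_rejection(scores, threshold=THRESHOLD):
    """Return a label, or 'unknown' when the model is not confident enough."""
    probs = softmax(np.asarray(scores, dtype=float))
    best = int(np.argmax(probs))
    if probs[best] < threshold:
        return "unknown"  # the agent admits it doesn't know this sound
    return LABELS[best]


# One confident prediction, one ambiguous one
print(classify_with_rejection([4.0, 0.5, 0.2]))  # prints "glass_break"
print(classify_with_rejection([1.0, 0.9, 0.8]))  # scores too close, prints "unknown"
```

A fixed softmax threshold is only the simplest form of open-set handling; real systems typically calibrate the threshold on held-out data or use dedicated out-of-distribution detectors.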

How do the results of your research become part of solutions "Invented for life"?

Effective machine audio perception, along with other modalities such as vision and natural language, can make the next generation of smart-life technology a reality. For example, our technology can act as a security system that detects glass breaking, a smoke alarm, a baby crying, or a dog barking and alerts the user. It can teach smart speakers not to interrupt when people are talking or when there are other loud noises in the environment. Last but not least, audio perception can help an automobile understand its environment, e.g. detecting police or emergency vehicles passing by so it can take proper action.
