From Speech to Underwater Acoustics: A Transfer Learning Framework for Real-Time Passive Diver Detection Using Keyword Spotting Models
Abstract
Passive acoustic detection of divers faces challenges such as low signal-to-noise ratios (SNRs), data scarcity, and the latency of conventional methods. This paper proposes Keyword Spotting for Diver Detection (KWS-DD), a transfer learning framework that repurposes speech-oriented KWS models for data-efficient diver detection. Diver inhalation signatures are treated as acoustic "keywords," enabling the transformer-based HuBERT architecture, pre-trained on speech, to be adapted to identify quasi-periodic respiratory events in underwater audio. The core innovation lies in adapting a state-of-the-art speech model to non-speech inhalation acoustics: the approach eliminates the need to accumulate multiple respiratory cycles, enabling real-time detection from minimal domain-specific data (120 inhalation samples). Deployed in diverse marine conditions, the framework achieved 94.4% accuracy and a 94.6% F1-score on inhalation sounds, extending detection range by more than 50% over conventional methods, which proved unreliable beyond 10 meters in low-SNR environments. The framework also reduces false alarms caused by boat noise and generalizes to external datasets, validating cross-domain transferability. This work bridges AI-based speech processing and passive sonar signal processing, offering a resource-efficient solution for real-time underwater surveillance.
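The adaptation described in the abstract, treating inhalation events as binary "keywords" for a speech-pretrained HuBERT classifier, can be sketched with the Hugging Face `transformers` library. This is an illustrative sketch, not the paper's implementation: all hyperparameters below are assumptions, and a tiny randomly initialized configuration is used so the example runs offline, where a real system would start from pretrained weights (e.g. `facebook/hubert-base-ls960`) and fine-tune on labelled inhalation clips.

```python
import torch
from transformers import HubertConfig, HubertForSequenceClassification

# Tiny, randomly initialized config so the sketch runs without downloading weights.
# In practice: HubertForSequenceClassification.from_pretrained("facebook/hubert-base-ls960",
# num_labels=2) to transfer the speech-pretrained encoder.
config = HubertConfig(
    hidden_size=32,
    num_hidden_layers=2,
    num_attention_heads=2,
    intermediate_size=64,
    classifier_proj_size=32,
    num_labels=2,  # binary "keyword": inhalation vs. background
)
model = HubertForSequenceClassification(config)
model.eval()

# One second of 16 kHz audio stands in for a hydrophone frame that may contain an inhalation.
waveform = torch.randn(1, 16000)
with torch.no_grad():
    logits = model(input_values=waveform).logits  # shape: (batch, num_labels)
probs = logits.softmax(dim=-1)  # per-frame scores for [background, inhalation]
```

Because each frame is classified independently, a detection can be emitted as soon as a single inhalation-like event is observed, which is what allows the method to avoid accumulating several respiratory cycles before deciding.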

