Archives of Acoustics, 39, 2, pp. 203-214, 2014

Auditory Display Applied to Research in Music and Acoustics

Audio Acoustics Lab., Faculty of Electronics, Telecommunications and Informatics, Gdansk University of Technology

This paper presents the relationship between Auditory Display (AD) and the domains of music and acoustics. First, some basic notions of the Auditory Display area are briefly outlined. Then, research trends and system solutions that fall within the scope of AD are discussed for the fields of music technology, music information retrieval, music recommendation, and acoustics. Finally, an example of an AD solution based on gaze tracking that may facilitate the music annotation process is shown. The paper concludes with a few remarks on directions for further research in the domains discussed.
Keywords: Auditory Display, Music, Acoustics, Music Technology, Music Information Retrieval, Sonification, Music Annotation.


DOI: 10.2478/aoa-2014-0025

Copyright © Polish Academy of Sciences & Institute of Fundamental Technological Research (IPPT PAN)