The folded space of machine listening


  • Domenico Napolitano University Suor Orsola Benincasa of Naples
  • Renato Grieco, Dr. Conservatory San Pietro a Majella of Naples



Machine listening, listening space, media archeology, acoustic fingerprint, audio watermarking, copyright detection algorithms.


The paper investigates new machine listening technologies through a comparison of phenomenological and empirical/media-archeological approaches. While phenomenology associates listening with subjectivity, empiricism takes into account the technical operations involved in listening processes in both human and non-human apparatuses. Based on this theoretical framework, the paper undertakes a media-archeological investigation of two algorithms employed in copyright detection: “acoustic fingerprinting” and “audio watermarking”. In the technical operations of sound recognition algorithms, empirical analysis suggests the coexistence of a multiplicity of spatialities: from the “sound event”, which occurs in three-dimensional physical space, to its mathematical representation in vector space, and to the one-dimensional informational space of data processing and machine-to-machine communication. Recalling Deleuze’s definition of “the fold”, we define these coexistent spatial dimensions in techno-culturally mediated sound as “the folded space” of machine listening. We go on to argue that the issue of space in machine listening consists in the virtually infinite variability of the sound event being subjected to automatic recognition. The difficulty lies in reconciling the theoretically enduring information transmitted by sound with the contingent manifestation of sound affected by space. To make machines able to deal with the site-specificity of sound, recognition algorithms need to reconstruct the three-dimensional space on a signal processing level, in a sort of reverse-engineering of the sound phenomenon that recalls the concept of “implicit sonicity” defined by Wolfgang Ernst.
While the metaphors and social representations adopted to describe machine listening are often anthropomorphic – and the very term “listening”, when referring to numerical operations, can be seen as a metaphor in itself – we argue that both human listening and machine listening are co-defined in a socio-technical network, in which the listening space no longer coincides with the position of the listening subject, but is negotiated between human and nonhuman agencies.

