Publications by authors named "Jonas Braasch"

Room-oriented immersive systems are human-scale built environments that enable collective multi-sensory immersion in virtual space. Although such systems are seeing increasing application in public settings, our understanding of how humans interact with the virtual environments displayed within them remains limited. Synthesizing knowledge from virtual reality ergonomics and human-building interaction (HBI) allows us to investigate these systems meaningfully.

This paper proposes an experiential method for learning acoustics and the consequences of room design through the rapid creation of audio-visually congruent, walkable auralizations. An efficient method produces auralizations of acoustical landmarks for a 128-channel wave-field synthesis system, using a two-dimensional ray-tracing algorithm and publicly available floor plans. Late-reverberation parameters are calculated using additional volumetric data.
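
The early-reflection stage of such an auralization pipeline can be sketched with a first-order image-source model, a standard alternative to ray tracing for simple rectangular rooms. The room dimensions, source, and receiver positions below are illustrative, not taken from the paper:

```python
import math

SPEED_OF_SOUND = 343.0  # m/s at roughly 20 degrees C

def first_order_reflections(room_w, room_h, src, rcv):
    """Return the direct-path and first-order wall-reflection delays
    (in seconds) for a 2-D rectangular room, via image sources.
    A toy stand-in for ray tracing over a real floor plan."""
    sx, sy = src
    images = [
        (sx, sy),                # direct sound
        (-sx, sy),               # mirror across left wall (x = 0)
        (2 * room_w - sx, sy),   # mirror across right wall
        (sx, -sy),               # mirror across bottom wall (y = 0)
        (sx, 2 * room_h - sy),   # mirror across top wall
    ]
    rx, ry = rcv
    return [math.hypot(ix - rx, iy - ry) / SPEED_OF_SOUND
            for ix, iy in images]

# illustrative 10 m x 8 m room
delays = first_order_reflections(10.0, 8.0, src=(2.0, 3.0), rcv=(7.0, 5.0))
```

The direct sound always arrives first; each reflection delay feeds one tap of the early part of the rendered impulse response, while the late tail is handled statistically.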

Deep learning is an established tool for classification tasks on complex, multi-dimensional data. Because audio recordings have both spectral and temporal structure, these computational frameworks make long-term monitoring of bioacoustic recordings more feasible. Unfortunately, such neural networks are rarely designed for open-set classification, in which examples belonging to the training classes must not only be classified correctly but also, crucially, be separated from any spurious or unknown classes.
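
One common baseline for the open-set problem, not necessarily the method used in this work, is to reject inputs whose top softmax score falls below a confidence threshold; everything here (logit values, threshold) is illustrative:

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def open_set_predict(logits, threshold=0.7):
    """Return the winning class index, or -1 ("unknown") when the
    network's top softmax score is below the confidence threshold --
    a simple thresholding baseline for open-set rejection."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return best if probs[best] >= threshold else -1
```

A confidently classified example keeps its label, while an ambiguous one (near-uniform scores) is flagged as unknown rather than forced into a training class.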

Emotion is a central component of verbal communication between humans. Owing to advances in machine learning and affective computing, automatic emotion recognition is increasingly possible and sought after. To examine the connection between emotional speech and perceptions of group dynamics, such as leadership and contribution, a new dataset (14 group meetings, 45 participants) was collected for analyzing collaborative group work based on the lunar survival task.

The standard autocorrelation model of pitch perception posits that the pitch of a stimulus can be predicted from the first major peak of a summary autocorrelation function (SACF) after the zero-delay peak. Models based on this theory are capable of predicting a wide range of pitch phenomena. There are, however, a number of cases where the approach fails.
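
The core prediction of the autocorrelation model can be sketched in a few lines: the pitch corresponds to the lag of the dominant autocorrelation peak after the zero-delay peak. This sketch simplifies the SACF (no auditory filterbank, and it takes the strongest peak within a plausible pitch range rather than literally the first); all parameters are illustrative:

```python
import math

def autocorr_pitch(signal, sr, min_f=80.0, max_f=500.0):
    """Estimate pitch (Hz) as the lag of the strongest autocorrelation
    peak after the zero-delay peak, searched over a plausible pitch
    range -- a bare-bones version of the autocorrelation model."""
    n = len(signal)
    lo = int(sr / max_f)              # shortest candidate period
    hi = min(int(sr / min_f), n - 1)  # longest candidate period
    best_lag, best_val = lo, float("-inf")
    for lag in range(lo, hi + 1):
        acf = sum(signal[i] * signal[i + lag] for i in range(n - lag))
        if acf > best_val:
            best_val, best_lag = acf, lag
    return sr / best_lag

# 200 Hz pure tone at an 8 kHz sampling rate
sr = 8000
sig = [math.sin(2 * math.pi * 200 * t / sr) for t in range(800)]
```

For a pure tone this recovers the fundamental; the failure cases mentioned in the abstract involve stimuli where the first major SACF peak and the perceived pitch diverge.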

Human listeners localize sounds to their sources despite competing directional cues from early room reflections. Binaural activity maps computed from a running signal can provide useful information about the presence of room reflections, but must be inspected visually to estimate auditory cues. A machine-learning model was constructed to validate the presence of these cues and to extract them automatically.

When two similar sounds are presented from different locations, with one (the lead) preceding the other (the lag) by a small delay, listeners typically report hearing one sound near the location of the lead source; this is called the precedence effect (PE). This study asks several questions about the underlying mechanisms that produce the PE. (1) How might listeners' relative weighting of cues at the onset versus the ongoing portion of a stimulus affect the perceived lateral position of long-duration lead/lag noise stimuli? (2) What factors influence this weighting? (3) Are the mechanisms invoked to explain the PE for transient stimuli applicable to long-duration stimuli? To answer these questions, lead/lag noise stimuli were presented over headphones with a range of durations, onset slopes, and lag-to-lead level ratios.
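
A lead/lag stimulus of the kind described can be sketched as a delayed, attenuated copy of a noise burst; the sampling rate and parameter values here are illustrative, not the experiment's:

```python
import random

def lead_lag_stimulus(duration_ms, delay_ms, lag_level_db, sr=44100, seed=0):
    """Build a lead/lag noise pair: the lag channel is a delayed,
    level-scaled copy of the lead -- a minimal model of a direct
    sound plus a single reflection for headphone presentation."""
    rng = random.Random(seed)
    n = int(sr * duration_ms / 1000)
    lead = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    d = int(sr * delay_ms / 1000)            # lead/lag delay in samples
    gain = 10 ** (lag_level_db / 20)         # lag-to-lead level ratio
    lag = [0.0] * d + [gain * x for x in lead[:n - d]]
    return lead, lag
```

Varying `duration_ms`, the onset ramp (not modeled here), `delay_ms`, and `lag_level_db` reproduces the stimulus dimensions the abstract lists.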

Masked detection thresholds were measured for a noise target in the presence of a masker composed of (1) a lead/lag noise pair with the lead interaural time difference (ITD) set either the same as or opposite to that of the target, (2) a diotic masker, or (3) a dichotic pair of decorrelated noises. If the precedence effect actually eliminated the second, later-arriving stimulus, a spatial release from masking would be expected when the lead ITD is opposite that of the target. Results for a range of lead/lag delays suggest that the precedence effect is not the result of a perceptual removal of the lag.

This study investigates how virtual head rotations can improve a binaural model's ability to segregate speech signals. The model takes two mixed speech sources spatialized to unique azimuth positions and localizes them. The model virtually rotates its head to orient itself for the maximum signal-to-noise ratio for extracting the target.
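
The idea of rotating toward the best signal-to-noise orientation can be sketched as a grid search over head angles. A cardioid sensitivity pattern stands in for the binaural front end here; the pattern, step size, and angles are illustrative assumptions, not the model's actual machinery:

```python
import math

def best_head_rotation(target_az, masker_az, step_deg=5):
    """Grid-search head orientations (degrees) for the one that
    maximizes a toy target-to-masker energy ratio. The cardioid
    directivity is a stand-in for a real binaural analysis."""
    def gain(source_az, head_az):
        # cardioid aimed along the head's facing direction
        return 0.5 * (1.0 + math.cos(math.radians(source_az - head_az)))
    best_head, best_ratio = -180, float("-inf")
    for head in range(-180, 180, step_deg):
        ratio = gain(target_az, head) / max(gain(masker_az, head), 1e-9)
        if ratio > best_ratio:
            best_ratio, best_head = ratio, head
    return best_head
```

With the target at +30 degrees and the masker at -30 degrees, the search turns the pattern's null toward the masker (head at +150 degrees), mirroring how a listener might orient for the best extraction of the target.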

Currently, architectural room acoustic metrics make no real distinction between a room impulse response and the auditory system's internal representation of a room. These metrics are generally based on impulse responses, and indirectly assume that the internal representation of the acoustic features of a room is independent of the sound source. However, while a room can be approximated as a linear, time-invariant system, auditory processing is highly non-linear and varies a great deal over time in response to different acoustic inputs.

A precedence effect model is described that can use a binaural signal to robustly localize a sound source in the presence of multiple reflections for the frontal horizontal plane. The model also estimates a room impulse response from a running binaural signal and determines the spatial locations and delays of early reflections, without any prior or additional knowledge of the source. A dual-layer cross-correlation/auto-correlation algorithm is used to determine the interaural time difference of the direct sound source component and to estimate a binaural activity pattern.
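
The cross-correlation layer of such a model can be illustrated with a minimal ITD estimator: the interaural delay is taken as the lag that maximizes the cross-correlation between the two ear signals. This is a sketch of the general principle, not the paper's dual-layer algorithm; the sampling rate and lag range are illustrative:

```python
import random

def estimate_itd(left, right, sr, max_itd_s=0.001):
    """Estimate the interaural time difference (seconds) as the lag
    maximizing the cross-correlation of the ear signals, searched
    over a physiologically plausible +/- 1 ms range."""
    max_lag = int(max_itd_s * sr)
    n = len(left)
    best_lag, best_val = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        val = sum(left[i] * right[i + lag]
                  for i in range(max(0, -lag), min(n, n - lag)))
        if val > best_val:
            best_val, best_lag = val, lag
    return best_lag / sr

rng = random.Random(1)
noise = [rng.uniform(-1.0, 1.0) for _ in range(1000)]
shifted = [0.0] * 5 + noise[:-5]  # right ear lags the left by 5 samples
```

The second (autocorrelation) layer of the model then examines each ear signal for delayed copies of itself to locate the reflections, which this sketch omits.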

The purpose of this study was to gather behavioral data concerning the precedence effect as manifested by the localization-dominance of the leading elements of compound stimuli. This investigation was motivated by recent findings of Shackleton and Palmer [(2006). J.

When a pair of sounds arrive from different directions with a sufficiently short delay between them, listeners hear a perceptually fused image with a perceived location that is dominated by the first arriving sound. This is called the precedence effect. To test the limits of this phenomenon, 200-ms noise stimuli were presented over headphones to model a temporally overlapping direct sound (lead) with a single reflection (lag) at inter-stimulus intervals (ISIs) of 0-5 ms.

When sound reflects from an irregular architectural surface, it spreads spatially and temporally. Extensive research has been devoted to prediction and measurement of diffusion, but less has focused on its perceptual effects. This paper examines the effect of temporal diffusion on echo threshold.

A number of precedence-effect models have been developed to simulate the robust localization performance of humans in reverberant conditions. Although they are able to reduce reverberant information for many conditions, they tend to fail for ongoing stimuli with truncated on/offsets, a condition human listeners master when localizing a sound source in the presence of a reflection, according to a study by Dizon and Colburn [J. Acoust.

This paper describes a system for modeling, recognizing, and classifying sound textures. The described system translates contemporary approaches from video texture analysis, creating a unique approach in the realm of audio and music. The signal is first represented as a set of mode functions by way of the Empirical Mode Decomposition technique for time/frequency analysis, before expressing the dynamics of these modes as a linear dynamical system (LDS).
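
The LDS stage can be illustrated in one dimension: given the trajectory of a single mode, fit the transition coefficient of x[t+1] = a * x[t] by least squares. The real system fits matrices to multi-channel mode trajectories; this scalar version and its sample data are purely illustrative:

```python
def fit_lds_transition(states):
    """Least-squares fit of the scalar linear dynamical system
    x[t+1] = a * x[t]: a = sum(x[t] * x[t+1]) / sum(x[t]^2).
    A one-dimensional toy version of the LDS stage."""
    num = sum(states[t] * states[t + 1] for t in range(len(states) - 1))
    den = sum(states[t] * states[t] for t in range(len(states) - 1))
    return num / den

# a geometrically decaying mode trajectory with true coefficient 0.9
decaying_mode = [0.9 ** t for t in range(20)]
```

The fitted transition coefficients summarize the dynamics of each mode and serve as the features on which textures are recognized and classified.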

Despite many studies investigating auditory spatial impressions in rooms, few have addressed the impact of simultaneous visual cues on localization and the perception of spaciousness. The current research presents an immersive audiovisual environment in which participants were instructed to make auditory width judgments in dynamic bimodal settings. The results of these psychophysical tests suggest the importance of congruent audio-visual presentation to the ecological interpretation of an auditory scene.

Background: Directional sounds simulated using nonindividualized head-related transfer functions (HRTFs) often result in front-back confusion.

Objective: This study was designed to examine how manipulating these nonindividualized HRTF spectra can reduce front-back confusion in headphone-simulated directional sounds.

Method: HRTFs for six ear-level directions were studied (azimuths of 0, 45, 135, 180, 225, and 315 degrees).

Although there have been numerous studies investigating subjective spatial impression in rooms, only a few of those studies have addressed the influence of visual cues on the judgment of auditory measures. In the psychophysical study presented here, video footage of five solo music/speech performers was shown for four different listening positions within a general-purpose space. The videos were presented in addition to the acoustic signals, which were auralized using binaural room impulse responses (BRIR) that were recorded in the same general-purpose space.

In this investigation, three swell systems used in pipe organs (the swell box, the crescendo wheel, and the historic wind swell) were measured and compared. The crescendo wheel was found to have the most effective dynamic range: for frequencies near 2 kHz, the increase in sound pressure level between the softest and the loudest settings could be up to 50 dB. The maximum dynamic range of the wind swell and the swell box was found to be 10-20 dB in the same frequency range.
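
The dynamic ranges quoted above follow from the standard sound-pressure-level relation: a level difference in dB is 20 times the base-10 logarithm of the pressure ratio, so a 50 dB range corresponds to a pressure ratio of about 316:

```python
import math

def dynamic_range_db(p_soft, p_loud):
    """Sound-pressure-level difference (dB) between the loudest and
    softest settings, given their RMS pressures: 20 * log10(ratio)."""
    return 20 * math.log10(p_loud / p_soft)
```

For example, a tenfold pressure increase yields 20 dB, in the middle of the 10-20 dB range measured for the wind swell and the swell box.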
