Publications by authors named "Andrey Anikin"

What information is encoded in the cries of human babies? While it is widely recognized that cries can encode distress levels, whether cries reliably encode the cause of crying remains disputed. Here, we collected 39,201 cries from 24 babies recorded in their homes longitudinally, from 15 days to 3.5 months of age, a database we share publicly for reuse.

Formants (vocal tract resonances) are increasingly analyzed not only by phoneticians in speech but also by behavioral scientists studying diverse phenomena such as acoustic size exaggeration and articulatory abilities of non-human animals. This often involves estimating vocal tract length acoustically and producing scale-invariant representations of formant patterns. We present a theoretical framework and practical tools for carrying out this work, including open-source software solutions included in R packages soundgen and phonTools.
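
To make this concrete, apparent vocal tract length (VTL) can be estimated from measured formant frequencies under the standard approximation of the vocal tract as a uniform tube closed at one end, where formant i is expected at F_i = (2i - 1) * c / (4 * VTL). The sketch below (in R, since the text refers to R packages) uses a hypothetical helper, estimate_vtl, rather than the actual functions in soundgen or phonTools:

```r
# Minimal sketch: estimate apparent VTL from measured formants,
# assuming a uniform tube closed at one end:
#   F_i = (2i - 1) * c / (4 * VTL),  i.e. formant spacing dF = c / (2 * VTL)
estimate_vtl <- function(formants_hz, c = 35000) {  # c = speed of sound, cm/s
  i <- seq_along(formants_hz)
  # regress measured formants on (2i - 1) / 4 without an intercept:
  # the slope equals c / VTL, so VTL = c / slope
  slope <- coef(lm(formants_hz ~ I((2 * i - 1) / 4) - 1))[[1]]
  c / slope
}

# Example: evenly spaced formants of a schwa-like adult male vowel
estimate_vtl(c(500, 1500, 2500, 3500))  # ~17.5 cm
```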

Cat purring, the unusual, pulsed vibration that epitomizes comfort, enjoys a special status in the world of vocal communication research. Indeed, it has long been flagged as a rare exception to the dominant theory of voice production in mammals. A new study presents histological and biomechanical evidence that purring can occur passively, without needing muscle vibration in the larynx controlled by an independent neural oscillator.

Across many species, a major function of vocal communication is to convey formidability, with low voice frequencies traditionally considered the main vehicle for projecting large size and aggression. Vocal loudness is often ignored, yet it might explain some puzzling exceptions to this frequency code. Here we demonstrate, through acoustic analyses of over 3,000 human vocalizations and four perceptual experiments, that vocalizers produce low frequencies when attempting to sound large, but loudness is prioritized for displays of strength and aggression.

Humans have evolved voluntary control over vocal production for speaking and singing, while preserving the phylogenetically older system of spontaneous nonverbal vocalizations such as laughs and screams. To test for systematic acoustic differences between these vocal domains, we analyzed a broad, cross-cultural corpus representing over 2 h of speech, singing, and nonverbal vocalizations. We show that, while speech is relatively low-pitched and tonal with mostly regular phonation, singing and especially nonverbal vocalizations vary enormously in pitch and often display harsh-sounding, irregular phonation owing to nonlinear phenomena.

Italian is sexy and German is rough, but how about Páez or Tamil? Are there universal phonesthetic judgments based purely on the sound of a language, or are preferences attributable to language-external factors such as familiarity and cultural stereotypes? We collected 2,125 recordings of 228 languages from 43 language families, including 5 to 11 speakers of each language to control for personal vocal attractiveness, and asked 820 native speakers of English, Chinese, or Semitic languages to indicate how much they liked these languages. We found a strong preference for languages perceived as familiar, even when they were misidentified, a variety of cultural-geographical biases, and a preference for breathy female voices. The scores by English, Chinese, and Semitic speakers were weakly correlated, indicating some cross-cultural concordance in phonesthetic judgments. Overall, however, there was little consensus between raters about which languages sounded more beautiful, and average scores per language remained within ±2% after accounting for confounds related to familiarity and the voice quality of individual speakers.

Acoustic correlates of physical effort are still poorly understood, even though effort is vocally communicated in a variety of contexts with crucial fitness consequences, including both confrontational and reproductive social interactions. In this study, 33 lay participants spoke during a brief but intense isometric hold (L-sit), first without any voice-related instructions, and then while either concealing their effort or imitating it without actually performing the exercise. Listeners in two perceptual experiments then rated 383 recordings on perceived level of effort (n = 39 listeners) or categorized them as relaxed speech, actual effort, pretended effort, or concealed effort (n = 102 listeners).

While nonlinear phenomena (NLP) are widely reported in animal vocalizations, often causing perceptual harshness and roughness, their communicative function remains debated. Several hypotheses have been put forward: attention-grabbing, communication of distress, exaggeration of body size and dominance. Here, we use state-of-the-art sound synthesis to investigate how NLP affect the perception of puppy whines by human listeners.

When producing intimidating aggressive vocalizations, humans and other animals often extend their vocal tracts to lower their voice resonance frequencies (formants) and thus sound big. Is acoustic size exaggeration more effective when the vocal tract is extended before, or during, the vocalization, and how do listeners interpret within-call changes in apparent vocal tract length? We compared perceptual effects of static and dynamic formant scaling in aggressive human speech and nonverbal vocalizations. Acoustic manipulations corresponded to elongating or shortening the vocal tract either around (Experiment 1) or from (Experiment 2) its resting position.

Vocal tract elongation, which uniformly lowers vocal tract resonances (formant frequencies) in animal vocalizations, has evolved independently in several vertebrate groups as a means for vocalizers to exaggerate their apparent body size. Here, we propose that smaller speech-like articulatory movements that alter only individual formants can serve a similar yet less energetically costly size-exaggerating function. To test this, we examine whether uneven formant spacing alters the perceived body size of vocalizers in synthesized human vowels and animal calls.
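
To illustrate the proposed contrast (a hypothetical sketch, not the study's stimulus code): a uniform shift of all formants changes the mean formant spacing, and hence the apparent VTL under the uniform-tube approximation, whereas shifting a single formant can leave the mean spacing, and thus the conventional VTL estimate, unchanged:

```r
# Apparent VTL from mean formant spacing (uniform-tube approximation)
vtl_from_spacing <- function(f, c = 35000) c / (2 * mean(diff(f)))

f_neutral <- c(500, 1500, 2500, 3500)  # evenly spaced formants
f_uniform <- f_neutral * 0.9           # all formants lowered by 10%
f_uneven  <- c(500, 1200, 2500, 3500)  # only F2 lowered

vtl_from_spacing(f_neutral)  # 17.5 cm
vtl_from_spacing(f_uniform)  # ~19.4 cm: classic size exaggeration
vtl_from_spacing(f_uneven)   # 17.5 cm: mean spacing unchanged, yet the
                             # formant pattern no longer matches a neutral vowel
```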

The human voice is a primary tool for verbal and nonverbal communication. Studies on laughter emphasize a distinction between spontaneous laughter, which reflects a genuinely felt emotion, and volitional laughter, associated with more intentional communicative acts. Listeners can reliably differentiate the two.

A lion's roar, a dog's bark, an angry yell in a pub brawl: what do these vocalizations have in common? They all sound harsh due to nonlinear vocal phenomena (NLP): deviations from regular voice production, hypothesized to lower perceived voice pitch and thereby exaggerate the apparent body size of the vocalizer. To test this as-yet-uncorroborated hypothesis, we synthesized human nonverbal vocalizations, such as roars, groans and screams, with and without NLP (amplitude modulation, subharmonics and chaos). We then measured their effects on nearly 700 listeners' perceptions of three psychoacoustic (pitch, timbre, roughness) and three ecological (body size, formidability, aggression) characteristics.
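
For readers unfamiliar with these manipulations, the base-R sketch below (an illustration, not the study's actual synthesis pipeline) shows how amplitude modulation and subharmonics introduce NLP into a plain harmonic source; deterministic chaos requires a nonlinear source model and is omitted here:

```r
fs <- 16000                   # sampling rate, Hz
t  <- seq(0, 0.5, by = 1/fs)  # 500 ms time axis
f0 <- 150                     # fundamental frequency, Hz

# Harmonic source: first 10 harmonics with decaying amplitudes
voiced <- rowSums(sapply(1:10, function(h) h^(-2) * sin(2*pi*h*f0*t)))

# Subharmonics: added energy at f0/2 gives a period-doubled, rough quality
sub <- 0.4 * sin(2*pi*(f0/2)*t)

# Amplitude modulation at ~70 Hz adds sidebands around each harmonic
am <- 1 + 0.5 * sin(2*pi*70*t)

nlp <- (voiced + sub) * am
nlp <- nlp / max(abs(nlp))    # normalize to [-1, 1] for playback
```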

The ability to recognize the emotions of others is a crucial skill. In the visual modality, sensorimotor mechanisms provide an important route for emotion recognition. Perceiving facial expressions often evokes activity in facial muscles and in motor and somatosensory systems, and this activity relates to performance in emotion tasks.

Nonlinear vocal phenomena (NLPs) are commonly reported in animal calls and, increasingly, in human vocalizations. These perceptually harsh and chaotic voice features function to attract attention and convey urgency, but they may also signal aversive states. To test whether NLPs enhance the perception of negative affect or only signal high arousal, we added subharmonics, sidebands or deterministic chaos to 48 synthetic human nonverbal vocalizations of ambiguous valence: gasps of fright/surprise, moans of pain/pleasure, roars of frustration/achievement and screams of fear/delight.

To ensure that listeners pay attention and do not habituate, emotionally intense vocalizations may be under evolutionary pressure to exploit processing biases in the auditory system by maximising their bottom-up salience. This "salience code" hypothesis was tested using 128 human nonverbal vocalizations representing eight emotions: amusement, anger, disgust, effort, fear, pain, pleasure, and sadness. As expected, within each emotion category salience ratings derived from pairwise comparisons strongly correlated with perceived emotion intensity.

Prosodic features, such as intonation and voice intensity, have a well-documented role in communicating emotion, but less is known about the role of laryngeal voice quality in speech and particularly in nonverbal vocalizations such as laughs and moans. Potentially, however, variations in voice quality between tense and breathy may convey rich information about the speaker's physiological and affective state. In this study breathiness was manipulated in synthetic human nonverbal vocalizations by adjusting the relative strength of upper harmonics and aspiration noise.
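
As a rough illustration of that manipulation (a base-R sketch under the stated assumptions, not the stimulus-generation code used in the study), breathiness can be simulated by steepening the spectral rolloff of the harmonics while mixing in aspiration noise:

```r
fs <- 16000
t  <- seq(0, 0.5, by = 1/fs)
f0 <- 220
breathiness <- 0.7  # 0 = tense, 1 = maximally breathy

# Steeper rolloff as breathiness grows: upper harmonics fade
rolloff <- 2 + 4 * breathiness  # decay exponent per harmonic number
voiced  <- rowSums(sapply(1:15, function(h) h^(-rolloff) * sin(2*pi*h*f0*t)))

# Aspiration noise scales with breathiness
noise <- breathiness * rnorm(length(t), sd = 0.1)

breathy <- voiced + noise
breathy <- breathy / max(abs(breathy))  # normalize to [-1, 1]
```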

The ability to recognize emotions undergoes major developmental changes from infancy to adolescence, peaking in early adulthood, and declining with aging. A life span approach to emotion recognition is lacking in the auditory domain, and it remains unclear how the speaker's and listener's ages interact in the context of decoding vocal emotions. Here, we examined age-related differences in vocal emotion recognition from childhood until older adulthood and tested for a potential own-age bias in performance.

How do we perceive voices coming from different spatial locations, and how is this affected by emotion? The current study probed the interplay between space and emotion during voice perception. Thirty participants listened to nonverbal vocalizations coming from different locations around the head (left/right; front/back) and differing in valence (neutral, positive [amusement], or negative [anger]). They were instructed to identify the location of the vocalizations (Experiment 1) and to evaluate their emotional qualities (Experiment 2).

We report a series of 22 experiments in which the Implicit Association Test (IAT) was used to investigate cross-modal correspondences between visual (luminance, hue [R-G, B-Y], saturation) and acoustic (loudness, pitch, formants [F1, F2], spectral centroid, trill) dimensions. Colors were sampled from the perceptually accurate CIE-Lab space, and the complex, vowel-like sounds were created with a formant synthesizer capable of separately manipulating individual acoustic properties. In line with previous reports, the loudness and pitch of acoustic stimuli were associated with both the luminance and saturation of the presented colors.
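
For context on what sampling in CIE-Lab looks like in practice, here is a small R sketch (hypothetical, not the study's stimulus code) that varies luminance L* while holding the chromatic coordinates fixed, using the convertColor function from base R's grDevices:

```r
# Colors differing only in luminance (L*), with a* and b* held constant
lab <- cbind(L = seq(20, 80, by = 15),  # luminance steps
             a = 30,                    # fixed red-green coordinate
             b = 0)                     # fixed blue-yellow coordinate

srgb <- convertColor(lab, from = "Lab", to = "sRGB")
srgb <- pmin(pmax(srgb, 0), 1)  # clamp tiny out-of-gamut excursions
cols <- rgb(srgb[, 1], srgb[, 2], srgb[, 3])  # hex strings for any R graphics device
```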

Voice synthesis is a useful method for investigating the communicative role of different acoustic features. Although many text-to-speech systems are available, researchers of human nonverbal vocalizations and bioacousticians may profit from a dedicated, simple tool for synthesizing and manipulating natural-sounding vocalizations. Soundgen (https://CRAN.R-project.org/package=soundgen) is one such open-source R package.
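
A minimal illustration of the intended workflow (argument names quoted from memory of the package interface and worth checking against the current manual):

```r
# install.packages("soundgen")  # available on CRAN
library(soundgen)

# Synthesize a short vocalization with a rising-falling intonation contour;
# sylLen and pitch are assumed argument names -- verify in ?soundgen
s <- soundgen(sylLen = 400,              # syllable duration, ms
              pitch = c(180, 240, 160))  # pitch contour anchors, Hz

# playme(s)  # audition the result (playback helper shipped with the package)
```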

The ability to perceive the emotions of others is crucial for everyday social interactions. Important aspects of visual socioemotional processing, such as the recognition of facial expressions, are known to depend on largely automatic mechanisms. However, whether and how properties of automaticity extend to the auditory domain remains poorly understood.

The processing of emotional signals usually causes an increase in pupil size, and this effect has been largely attributed to autonomic arousal prompted by the stimuli. Additionally, changes in pupil size have been associated with decision making during non-emotional perceptual tasks. Therefore, in this study we investigated the relationship between pupil size fluctuations and the process of emotion recognition.

Recent research on human nonverbal vocalizations has led to considerable progress in our understanding of vocal communication of emotion. However, in contrast to studies of animal vocalizations, this research has focused mainly on the emotional interpretation of such signals. The repertoire of human nonverbal vocalizations as acoustic types, and the mapping between acoustic and emotional categories, thus remain underexplored.

There is a battle in the supermarket aisle, a battle between what the consumer wants and what the retailer and others want her to see, and subsequently to buy. Product packages and displays contain a number of features and attributes tailored to catch consumers' attention. These are what we call external factors, comprising visual saliency, the number of facings, and the placement of each product.

Most research on nonverbal emotional vocalizations is based on actor portrayals, but how similar are they to the vocalizations produced spontaneously in everyday life? Perceptual and acoustic differences have been discovered between spontaneous and volitional laughs, but little is known about other emotions. We compared 362 acted vocalizations from seven corpora with 427 authentic vocalizations using acoustic analysis, and 278 vocalizations (139 authentic and 139 acted) were also tested in a forced-choice authenticity detection task (N = 154 listeners). Target emotions were: achievement, amusement, anger, disgust, fear, pain, pleasure, and sadness.
