The spiral shape of the mammalian cochlea not only helps acoustic energy reach the apex of the cochlea, but also induces a radial pressure gradient that increases toward the outer wall. The resulting asymmetric loading of the cochlear partition boosts the sensitivity to low-frequency sounds. The mathematics and physics of the effect are explained using wave propagation and wave tracing approaches. Behavioral and morphometric data in both land and sea mammals are presented to support the theory.
Work done in collaboration with Daphne Manoussaki, Emilios Dimitriadis, and Darlene Ketten.
At least two types of fluid waves can be distinguished in the cochlea: compression waves and surface waves. Both play a part in cochlear mechanics. The part played by compression waves is usually neglected, but they are essential in explaining bone conduction. The measurements of von Bekesy on cadaver ears have been superseded by more accurate measurements in living and anesthetized animals. It was discovered that the frequency response of the basilar membrane is considerably sharper than previously thought, and depends very much on the physiological condition of the animal studied. The waves in the cochlea can be further divided into long and short waves; the region of the strongest response is the region where short waves prevail. In theories of cochlear mechanics an amplification mechanism has been conceived which enhances the response and increases the sharpness of tuning. The same mechanism, being of physiological origin and thus extremely vulnerable, is also the (main) site of cochlear nonlinearity. A model of the cochlea that includes these types of waves and the physiological amplification mechanism can quantitatively explain nearly all linear and nonlinear phenomena that the real cochlea exhibits. There remain a few problem areas in global cochlear mechanics, most of which have to do with otoacoustic emissions. In a study directed at the origin of Distortion Product Otoacoustic Emissions (DPOAEs), it has been discovered that waves in the fluid of the cochlea do not behave exactly as the theory predicts. Several explanations of this aberrant behavior have been put forward; these range from consideration of compression waves, via investigation of non-classical models, to a reconsideration of the phenomenon of coherent reflection. It is perhaps too early to attempt a synthesis of all sub-models, but it is certain that definite progress has been made.
The cells and structures of the organ of Corti are hypothesized to act in an electromechanical feedback system boosting the mechanical response of cochlear structures, such as the basilar membrane, to low-level acoustic input. A physiological model based on this hypothesis predicts key aspects of the electromechanical cochlear response to both acoustical and electrical stimulation. The model explicitly couples mechanical, electrical and fluidic domains, including a piezoelectric model of the OHC soma. A method for including hair bundle motility is presented along with preliminary results indicating a subsidiary role for hair bundle motility. The electrical stimulation and response of the cochlea have grown in importance. This is in part because electrical stimulation provides a means for interrogating the response of the cochlea and testing hypotheses of cochlear function. Further, electrical stimulation is the final output of cochlear prostheses. Finally, as we contemplate combined electrical and mechanical hearing prostheses for patients with partially functional cochleae, the interaction of injected electrical disturbances in the cochlear fluids with those arising from normally functioning cells becomes more central.
The purpose of this talk is to present a systems framework for interpreting effective or phenomenological models of the auditory periphery. The auditory system can be modelled with varying degrees of abstraction away from the underlying physiological processes, where the degree of complexity required in a given model depends on its intended purpose. The greater complexity of modelling a biophysical process in detail may not be necessary, and indeed may cloud the interpretability of results. Using phenomenological models, designed to incorporate particular dynamic features or phenomena of the real system, may allow a simplified framework for interpreting the role of those features in the auditory system as a whole.
It is well known that the mammalian cochlea is nonlinear, and that this nonlinearity manifests in the mechanical response of the basilar membrane (BM). For sinusoidal excitation the BM displays a compressive nonlinearity, conventionally described using an input-output level curve. This curve has a slope of 1 dB/dB at low levels and a slope m < 1 dB/dB at higher levels. Detailed biophysical models containing physiologically realistic active mechanisms and BM dynamics exist that can explain this simple representation of auditory compression. However, debate is still prevalent as to the exact mechanisms and the important anatomical and physiological features. By using simpler phenomenological models for compression, inspired by the more detailed biophysical models, one can obtain a useful framework to explain a variety of experimental physiological data, in particular BM compression. Two classes of nonlinear systems will be considered in this talk as models of BM compression: one class with a static power-law nonlinearity and one class with level-dependent properties (using either an automatic gain control or a Van der Pol oscillator). By carefully choosing their parameters, it will be shown that all models can produce level curves that are similar to those measured on the BM. In addition, links will be made with investigations of otoacoustic emission nonlinearity, and also with psychophysical measures of dynamic range compression and perceptual data, thus demonstrating the high degree of generality these simple phenomenological models can be made to have. The complementary nature of biophysical models to phenomenological and even more abstract psychophysical models will be discussed.
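The input-output level curve described above can be illustrated with a minimal sketch of a static "broken-stick" nonlinearity, which is one simple instance of the power-law class. The function name `level_curve_db` and the knee level and compressive slope values are hypothetical choices for illustration only; they are not parameters taken from the talk.

```python
import numpy as np

def level_curve_db(input_db, knee_db=30.0, slope=0.2):
    """Broken-stick level curve (illustrative assumption, not the talk's model).

    Below `knee_db` the output grows at 1 dB/dB; above it the output grows
    at `slope` dB/dB (slope < 1 gives compression).
    """
    input_db = np.asarray(input_db, dtype=float)
    return np.where(
        input_db <= knee_db,
        input_db,                                   # linear region, 1 dB/dB
        knee_db + slope * (input_db - knee_db),     # compressive region, m dB/dB
    )

# Evaluate the curve on a grid of input levels and estimate its local slope.
levels = np.arange(0.0, 101.0, 10.0)
out = level_curve_db(levels)
slopes = np.diff(out) / np.diff(levels)
print(slopes)
```

The printed slopes change from 1 dB/dB below the assumed knee to the assumed compressive slope above it. The level-dependent models mentioned in the abstract (automatic gain control, Van der Pol oscillator) would need a dynamic simulation and are not covered by this static sketch.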
Stapes vibration launches a traveling wave down the cochlear partition that peaks at frequency dependent locations along the cochlear spiral. The traveling wave and peaking occur in both healthy (active) and dead (passive) cochleae. However, in an active cochlea, at locations where in a passive cochlea the traveling wave exhibits a broadly tuned peak, the wave instead continues to grow and attains a relatively sharp and much higher peak a short distance apical of the passive peak place.
The physical basis for even these very basic observations of cochlear tuning remains uncertain. Organ of Corti mass, resistance, and longitudinal coupling have all been employed in cochlear models although their true nature is not known. I will explore their possible roles in passive and active cochlear mechanics, and discuss how they can be measured in the lab. Role of organ of Corti mass in frequency tuning: The basis of the passive peak is not certain, with some models employing significant organ of Corti mass, in which case the concept of local organ of Corti resonance is important, while other models get by with zero organ of Corti mass. We have performed measurements of traveling wave wavelength that are designed to probe the significance of organ of Corti mass.
Traveling wave resistance and the cochlear amplifier: Robust observations of basilar membrane response timing indicate that the cochlear amplifier works as a negative resistance that is large enough to overcome positive (normal) resistance over a limited longitudinal extent. Within that extent more power flows into the traveling wave due to the amplifier than flows out due to damping. Thus, the important impedance that must be overcome by the amplifier is resistance. We are making direct measurements of organ of Corti frequency-dependent impedance to measure resistance.
Longitudinal coupling is generally not good for cochlear models as it works against well-established observations of cochlear mechanics, in particular the sharp apical drop-off in phase and amplitude. However, several models of active cochlear mechanics employ longitudinal coupling. Also, measurements of passive stiffness indicate a significant cellular component, which suggests the existence of significant longitudinal coupling. We are beginning studies to eliminate cells of the organ of Corti in order to look for changes in traveling wave wavelength that will help identify the role of longitudinal coupling in cochlear mechanics.
The stria vascularis of the inner ear contains a complicated network of transport proteins that transport potassium into the endolymph and establish a large positive endocochlear potential required for normal function of the inner ear. A computational model of ion transport in the stria vascularis was constructed based on the biophysical properties of individual ion channels and transporters. The outermost layer of marginal cells was found to be capable of sustaining considerable potassium flux into the endolymph but not capable of making a direct contribution to the endocochlear potential. An expanded model revealed that the inclusion of the channels and transporters expressed in the intermediate and basal cells are sufficient to generate the endocochlear potential. A particularly interesting prediction is the sensitivity of strial function to the potassium concentration in the intrastrial space. The model can predict the results of altering expression levels of distinct ion transporters and channels. This can be used to understand the effects of genetic mutations and drug interactions, including loop diuretic ototoxicity and genetic deafness due to potassium and chloride transport deficiencies, such as Jervell and Lange-Nielsen syndrome and Bartter's syndrome, type IV. Such simulations demonstrate the utility of compartmental modeling to investigate the role of ion homeostasis in inner ear physiology and pathology.
Cochlear implants are in widespread use (>100,000 patients worldwide) and provide a level of hearing restoration that allows most recipients to converse by telephone. However, cochlear implants are not useful for patients with no remaining auditory nerve, so new prosthetic devices have been designed to stimulate the cochlear nucleus in the brainstem and the inferior colliculus in the midbrain, using both surface and penetrating electrodes. We will present psychophysical results and speech recognition results from cochlear implants and from surface and penetrating electrodes at the level of the cochlear nucleus and inferior colliculus. Surprisingly, many psychophysical measures of temporal, spectral and intensity resolution are similar across stimulation sites and electrode types. Excellent speech recognition and modulation detection are observed in cochlear implants and in some patients with stimulation of the cochlear nucleus, but not in patients who lost their auditory nerve from vestibular schwannomas. Quantitative comparison of results from electrical stimulation of the auditory system at different stages of neural processing, and across patients with different etiologies can provide insights into auditory processing mechanisms.
Except at the handful of sites explored by the inverse method, the characteristics---indeed, the very existence---of traveling-wave amplification in the mammalian cochlea remain largely unknown. Uncertainties are especially pronounced in the apex, where mechanical measurements lack the independent controls necessary for assessing damage to the preparation. At a functional level, the form and amplification of cochlear traveling waves are determined by quantities known as propagation and gain functions. The properties of these functions, and their variation along the length of the cochlea, are central to an understanding of cochlear mechanics. We outline a method for deriving propagation and gain functions from measurements of basilar-membrane (BM) mechanical transfer functions. By applying the method to indirect estimates of near-threshold BM responses obtained from (1) Wiener-kernel analysis of chinchilla auditory-nerve responses to noise (Recio-Spinoso et al. 2005; Temchin et al. 2005) and (2) zwuis analysis of cat auditory-nerve responses to complex tones (van der Heijden and Joris 2003; 2006), we derive and interpret propagation and gain functions throughout the cochlea in sensitive, undamaged preparations.
Outer hair cells are critical to the amplification and sharp frequency selectivity of the mammalian cochlea. The outer hair cell has a unique form of motility (electromotility) driven by changes in the cell's transmembrane potential. The major features of the electromotile cell are length changes, active force production, and electric charge transfer. We will discuss the modeling of these three interrelated phenomena at the molecular, cellular, and organ levels with a particular focus on high-frequency conditions. The membrane protein (motor) prestin is a crucial molecular component of active hearing. We will present a mathematical model describing the prestin-related transfer of an electric charge across a portion of the membrane under high-frequency conditions. We will also discuss how the outer hair cell can overcome the mechanical (viscous) and electrical (capacitive) high-frequency filtering in the cochlear environment and produce an active force significant for cochlear amplification.
Our focus is on physically based modeling. For basic three-dimensional models of the fluid-elastic waves in the cochlea, direct numerical computation requires many hours of supercomputer time. In contrast, the combination of asymptotic and numerical methods requires seconds on a small computer for a given frequency. For validation, several life-sized models of the human cochlea have been fabricated by micromachining. The direct measurements of response and the computations are in reasonable agreement, sufficient to justify the efficient computational procedure. The need for the full three-dimensional fluid model is clarified by the measurements of Olson, which show a rapid decay of the pressure with distance perpendicular to the basilar membrane. The calculations show a similar decay. Olson recently extended the measurement of the nonlinear distortion products, and the computations show qualitative agreement.
A more elaborate model includes what may be the most important cellular features of the OC. The model is multiscale, from ciliary tip links with diameter of a few nanometers to the basilar membrane with features on the scale of millimeters. The validation for the extended model is from measurements by Ulfendahl and colleagues with confocal microscopy of the details of the motion of the cross section of the OC. The full organ of Corti model is extended for the computation of high frequency, for which the longitudinal traveling waves are of significance.
Are cochlear traveling waves genuine waves? This is not a semantic issue. There is physics behind it. Waves carry energy. In a unidirectional wave the energy is propagating in a single direction. The energy flow can be visualized by varying (modulating) the intensity of the stimulus that drives the wave. These intensity fluctuations do not cause instantaneous variations in the intensity of the wave. Instead the fluctuations are propagated at a finite speed that need not match the phase velocity of the wave. Mathematically, the travel speed of intensity fluctuations is described by the group velocity. The resulting travel time to a given location is the group delay. Obviously, in a unidirectional wave the group delay will grow monotonically with distance. Other systems, such as an array of uncoupled resonators, generally lack this monotonic growth of group delay.
To analyze how group delay varies along the cochlea - and to test whether it obeys the monotonic growth demanded by a unidirectional traveling wave - one needs to know how phase varies with stimulus frequency and with cochlear location. We derived these phase patterns in the apex of the cochlea from our auditory nerve measurements, and analyzed them in terms of group delays. Joint work with Philip X. Joris.
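As a rough illustration of the analysis described above, group delay can be estimated from phase-versus-frequency data as the negative slope of unwrapped phase, tau = -(1/(2*pi)) * dphi/df, and the delays at successive locations can then be checked for monotonic growth. The sketch below uses synthetic data; the function names and the numerical values are hypothetical and are not the authors' analysis code or measurements.

```python
import numpy as np

def group_delay_s(freq_hz, phase_rad):
    """Group delay in seconds: negative slope of unwrapped phase vs frequency."""
    phase = np.unwrap(np.asarray(phase_rad, dtype=float))
    return -np.gradient(phase, np.asarray(freq_hz, dtype=float)) / (2.0 * np.pi)

def grows_monotonically(delays, tol=0.0):
    """True if delays never decrease (within tol) along the given ordering."""
    return bool(np.all(np.diff(np.asarray(delays, dtype=float)) >= -tol))

# Synthetic check: a pure 2 ms delay has phase = -2*pi*f*tau.
freq = np.arange(100.0, 1001.0, 10.0)
phase = -2.0 * np.pi * freq * 0.002
delay = group_delay_s(freq, phase)

# Hypothetical delays at three locations ordered from base to apex.
print(grows_monotonically([0.001, 0.002, 0.004]))   # monotonic ordering
print(grows_monotonically([0.001, 0.003, 0.002]))   # violates monotonic growth
```

Phase unwrapping only works when the frequency sampling is dense enough that successive phase steps stay below pi, which is an assumption of this sketch.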
The auditory pathway from sound to perception is a multi-level information processing system. It is modeled as a transform with uniform and fine frequency resolution at low frequencies, yet nonuniform and coarse frequency resolution towards higher frequencies. Such a transform is built from the discrete Fourier transform and ear characteristics and is called the auditory transform. The transform is invertible perceptually. Inversion from perception to sounds is non-unique and involves optimization. Likewise, non-unique statistical inversion arises in blind source separation problems. Recent frequency and time domain methods will be presented to illustrate various inversions and resulting separations. Joint work with Jie Liu, Yingyong Qi and Fang-Gang Zeng.
Short talk: A fundamental problem in auditory cortex is how to determine a neuron's receptive field. In previous work spectrotemporal receptive fields (STRFs), which are calculated through the spike triggered average (STA), have been used successfully to determine the modulation preferences and stimulus selectivity properties of auditory cortex neurons. While informative, STRFs may be biased by stimulus correlations and they do not characterize neural sensitivity to multiple stimulus dimensions. In this study we overcame these limitations by using a model in which a neuron is selective for two dimensions in a high dimensional stimulus space. To derive the model, single neuron responses were recorded in response to a dynamic moving ripple stimulus in the primary auditory cortex (AI) of the cat. Each relevant dimension was then determined by maximizing the mutual information between the neural response and the projection of the stimulus onto directions in the stimulus space. This process removes the effects of stimulus correlations from the estimates of the dimensions. After the relevant dimensions were determined we calculated the nonlinear, memory-less input-output function that relates spiking probability to the stimulus projection. For all neurons we found that the nonlinearities of the STA and the first relevant dimension were monotonic and highly correlated. The nonlinearity of the second relevant dimension was usually symmetric. When the nonlinearities of the spike triggered average and the first dimension were plotted against depth the layers that received thalamic input had the most asymmetric nonlinearities. The two-dimensional nonlinearity for the first and second relevant dimensions also varied with layer, with the most separable nonlinearities in layers that receive thalamic input. 
This implies that the processing by the two dimensions may be dissociated in thalamic input layers though this approximation is not appropriate at further positions in the AI microcircuit. These results argue for a hierarchical model of spectrotemporal processing in the AI microcircuit.
Work done in collaboration with Tatyana Sharpee and Christoph E. Schreiner.
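For readers unfamiliar with the spike triggered average (STA) mentioned in the abstract above, a minimal sketch follows. It averages the stimulus segments that precede each spike. The synthetic white-noise stimulus and the toy neuron are assumptions made for illustration; the study itself used dynamic moving ripple stimuli, and the abstract notes that a plain STA can be biased by stimulus correlations, which this sketch does not correct for.

```python
import numpy as np

def spike_triggered_average(stimulus, spikes, n_lags):
    """Average stimulus window preceding each spike.

    stimulus: array (n_freq, n_time), e.g. a spectrogram.
    spikes:   array (n_time,), spike count per time bin.
    Returns an array (n_freq, n_lags); the last column is the spike bin itself.
    """
    stimulus = np.asarray(stimulus, dtype=float)
    spikes = np.asarray(spikes, dtype=float)
    sta = np.zeros((stimulus.shape[0], n_lags))
    total = 0.0
    for t in range(n_lags - 1, stimulus.shape[1]):
        if spikes[t] > 0:
            # Weight each window by the number of spikes in the bin.
            sta += spikes[t] * stimulus[:, t - n_lags + 1 : t + 1]
            total += spikes[t]
    return sta / total if total > 0 else sta

# Toy neuron (hypothetical): fires when channel 1 exceeded 1.0 one bin earlier.
rng = np.random.default_rng(0)
stim = rng.standard_normal((3, 200))
spk = np.zeros(200)
spk[1:] = (stim[1, :-1] > 1.0).astype(float)

sta = spike_triggered_average(stim, spk, n_lags=5)
print(sta.shape)
```

In this toy case the STA entry for channel 1 at a lag of one bin is large, because every spike was preceded by a value above the firing threshold in that channel.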
Short talk: The long latencies associated with mammalian otoacoustic emissions (OAEs) are generally attributed to delays arising from basilar-membrane (BM) traveling waves. If traveling waves are responsible for OAE latencies, one might expect significantly shorter OAE latencies in species lacking a tuned or flexible BM. To test this hypothesis, we examine stimulus frequency otoacoustic emissions (SFOAEs) evoked using low-level stimuli in a wide range of species including human, cat, guinea pig, chicken, gecko, and frog. SFOAE phase gradients imply emission latencies of 1 ms or longer in all species, a delay significantly longer than can be accounted for by middle-ear transmission and energy propagation via fluid compression. Therefore, basilar-membrane traveling waves are not necessary for significant OAE latencies. To explain the long latencies, we hypothesize that OAE latencies reflect delays associated with mechanical tuning. To test this hypothesis, we compare SFOAE latencies with ANF-based measures of tuning sharpness and find a correlation in all species except the frog. Our results suggest that in most species OAE latencies reflect the presence of mechanical frequency selectivity that may or may not be associated with traveling waves. [work performed with C.A. Shera and D.M. Freeman]
Short talk: A statistical generative model for the human speech process has been developed that embeds a substantially richer structure than the Hidden Markov Model (HMM) currently in predominant use for automatic speech recognition. This switching dynamic-system model generalizes and integrates the HMM and the piece-wise stationary nonlinear dynamic system (state-space) model. Depending on the level and the nature of the switching in the model design, various key properties of the speech dynamics can be naturally represented in the model. Such properties include the temporal structure of the speech acoustics, its causal articulatory movements, and the control of such movements by the multidimensional targets correlated with the phonological (symbolic) units of speech in terms of overlapping articulatory features.
One main challenge of using the multi-level switching dynamic-system model for speech recognition is the computationally intractable inference (decoding with confidence measure) on the posterior probabilities of the hidden states. This in turn makes optimal parameter learning (training) computationally intractable. Several versions of BayesNets have been devised with detailed dependency implementation specified to represent the switching dynamic-system model of speech. We discuss the variational technique developed for general Bayesian networks as an efficient approximate algorithm for the decoding and learning problems. Some common operations for estimating the switching times of phonological states are shared between the variational technique and the human auditory function that uses neural transient responses to detect temporal landmarks associated with phonological features. This suggests that variational-style learning may be related to human speech perception under an encoding-decoding theory of speech communication, which highlights the critical role of modeling articulatory dynamics for speech recognition and which forms a main motivation for the switching dynamic-system model for speech articulation and acoustics.
Short talk: Periodic patterns in natural sounds are an important acoustic attribute that contributes to rhythm and pitch perception. Although numerous studies have examined the neuronal representation of periodic stimuli, the mechanisms responsible for encoding the shape of a stimulus envelope concurrently with periodic information are not well understood. Traditionally, it is assumed that temporal patterns in acoustic signals are represented by either the average neuronal discharge rate or temporal synchrony to the sound envelope. Compelling evidence for a pure rate or synchrony neuronal code, however, is lacking. Here we demonstrate that neurons in the auditory midbrain of cats employ two complementary mechanisms that enable them to efficiently encode temporal periodicity and envelope shape information. We recorded single unit activity in the central nucleus of the inferior colliculus (ICC) and compared neuronal responses to periodic noise bursts and sinusoidally modulated noise. We develop a shuffled correlation technique that allows us to systematically characterize the temporal periodicity response pattern for onset and sustained responses. Neurons with sustained responses faithfully encode the envelope shape at low modulation rates but deteriorate and fail to account for timing and envelope information at high rates. In contrast, onset neuronal responses accurately entrain to the stimulus repetition and provide a means of encoding repetition information at rates exceeding 1000 Hz. These results argue against conventional rate or synchrony based codes and provide two independent but complementary mechanisms by which ICC neurons simultaneously encode envelope shape and repetition information in complex sounds. (supported by NIDCD R01DC006397-01A1)
Short talk: The candidacy for cochlear implantation of patients with increasing amounts of residual hearing as well as bimodal and bilateral stimulation put a higher emphasis on an optimal frequency-to-place map, either within one cochlea or between cochleae. Several attempts to use psychophysical methods, such as pitch ranking or pitch scaling, to determine this map on an individual basis failed in most (non-musically trained) patients. The present study used a computational model to integrate recent histological information on the tonotopical organization and the course of the primary auditory nerve fibers with patients' CT data into an individualized frequency map. With this model, pre-operative planning of electrode insertion was performed. It turned out that the excitation site along the fiber is not only dependent on the electrode position, but to a large extent also on each individual's anatomy. The actually achieved electrode location was modeled on the basis of the post-operative CT scan. In this way, the computational model allowed for prediction of the induced pitch percept per electrode contact, calculation of threshold profiles, a detailed analysis of current spread and the spread of excitation. These predictions will be compared with actual data for eight patients, implanted with a HiRes90k. They were fitted with a physiologically correct frequency-to-place map as based on the model predictions, illustrating that the modeling work now has direct clinical implications. Joint work with Randy K. Kalkman, David M.T. Dekker and Jeroen J. Briaire.
Short talk: While hair cells are the mechanoreceptor cells in the ear, reverse transduction in these cells, which provides feedback to the sensory process, is shown to be essential for the sensitivity and frequency selectivity of the ear. One such reverse transduction occurs in hair bundles and is known as fast adaptation. Another reverse transduction, in the cell body of outer hair cells, is called electromotility. Previously we examined the effectiveness of electromotility by comparing it with viscous drag due to shear motion in the gap between the reticular lamina and the tectorial membrane that is associated with basilar membrane vibration (Ospeck et al., Biophys. J. 2003). It showed that electromotility can counteract viscous drag up to about 10 kHz without any enhancing mechanism. Using a similar method, here we attempt to evaluate the effectiveness of fast adaptation by estimating the mechanical work it does in response to steady sinusoidal stimulation with small amplitudes and then comparing the work with the viscous loss at the gap. We found that "twitch," which is re-closure of the transducer channel due to Ca entry, leads to a gain in the mechanical energy, whereas "release," which is relaxation due to Ca entry, does not. Our calculation leads to a frequency limit, up to which fast adaptation can counteract the viscous drag. The limiting frequency that we estimated for twitch was about 100 Hz, quite low compared with the auditory frequency range of mammals. However, the limiting frequency that we obtained for the avian ear is higher than its auditory frequency range (~2 kHz), indicating that we can explain the auditory range of the avian ear, which depends on fast adaptation alone. These results are therefore consistent with the assumption that the reverse transduction in the mammalian ear is primarily due to electromotility.
Work done in collaboration with B. Sul.
Short talk: Audio compression technologies have made it easy to store digital audio files. However, digital files could then be easily distributed without respecting the copyright. Therefore, in recent years, digital watermarking has been proposed as a technique to fight against piracy. How well does it work? In this short (and informal) presentation, I'd like to compare two technologies -- audio compression and audio watermarking -- in the light of "dualities". First, in information theoretic terms, compression is a "source coding" problem and watermarking is a "channel coding" problem. Nevertheless, when the two technologies are developed, they satisfy similar psycho-acoustical constraints. Finally, watermarking versus compression can be seen as part of a game. In this game, one can still debate whether it is advantageous to go first.
Poster: The absorption of small molecules can change membrane curvature and affect numerous biological processes. In the red blood cell, the amphiphilic compounds salicylate and chlorpromazine visibly alter membrane curvature and result in crenation and cup formation, respectively. However, in cell types in which the membrane is tightly anchored to the cytoskeleton, changes in membrane curvature are below the resolution of the light microscope. One such example is the cochlear outer hair cell (OHC), which displays axial deformations in response to changes in transmembrane potential. The OHC motor protein prestin is sensitive to salicylate and chlorpromazine, but the mechanism by which these drugs exert their toxic effects is unknown. We have extended the application of fluorescence polarization microscopy (FPM), a technique for measuring the orientation of fluorescent membrane markers, to the cylindrical OHC. Our steady state model for the orientation of di-8-ANEPPS, a voltage-sensitive membrane probe, predicts that the absorption transition dipole moment of the molecule is oriented at 27 degrees with respect to the plane of the membrane. Following treatment with salicylate or chlorpromazine, orientation changes for di-8-ANEPPS are consistent with subtle changes in plasma membrane curvature. These results demonstrate the sensitivity of FPM to nanoscale changes in membrane architecture, and suggest that salicylate and chlorpromazine may, in part, mediate outer hair cell electromotility through modulation of the membrane curvature strain.
Work done in collaboration with Jennifer N. Greeson.
Short talk: Despite a focus on the central auditory pathway, the auditory periphery continues to be fascinating. Personal observations from the periphery raise a few unresolved questions with a mathematical flavor, as follows: