Performance Analysis of Human Hearing 3D Direction Estimation and
Biologically Inspired Exploitation of Multipath Near the Sensor
Background
The ability of humans to estimate the horizontal direction (azimuth) of a sound source is based mainly on the time difference of arrival at the two ears. However, far less is known about how two ears can be used to find the elevation angle. Experimental research indicates that humans rely strongly on sound distortions caused by their external ears (pinnae) to determine the elevation angle of a sound source. These distortions occur because the pinnae interact significantly with incoming sound waves (multipath effects). In addition to the pinnae, the human torso, shoulders, and head also diffract the incoming sound waves. Collectively, these propagation effects are termed the head-related transfer function (HRTF).
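As a simple illustration of the azimuth cue (not taken from our work), the classic far-field model relates the interaural time difference (ITD) to azimuth via ITD = (d/c) sin(azimuth); the ear spacing and speed of sound below are assumed illustrative values:

```python
import numpy as np

# Illustrative sketch, assuming the far-field ITD model
#   ITD = (d / c) * sin(azimuth),
# with d the distance between the ears and c the speed of sound.
SPEED_OF_SOUND = 343.0   # m/s (assumed, air at ~20 deg C)
EAR_DISTANCE = 0.18      # m (assumed head width)

def azimuth_from_itd(itd_seconds):
    """Invert the far-field ITD model to recover azimuth in degrees."""
    s = np.clip(SPEED_OF_SOUND * itd_seconds / EAR_DISTANCE, -1.0, 1.0)
    return np.degrees(np.arcsin(s))

# A source 30 degrees to the side produces an ITD of about 0.26 ms:
itd = (EAR_DISTANCE / SPEED_OF_SOUND) * np.sin(np.radians(30.0))
print(azimuth_from_itd(itd))  # recovers 30.0
```

Note that this cue alone is ambiguous in elevation: all directions on a "cone of confusion" share the same ITD, which is why the elevation-dependent pinna distortions matter.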

Fig. 1: Pinna (external ear) reflections of sound for different elevations [source].

Our Research
 In this project, we analyzed the accuracy of human hearing in finding the direction of sound sources in three dimensions (3D), i.e., azimuth and elevation, using a statistical approach [1].
 We converted the empirical HRTF into a parametric statistical measurement model that incorporates the diffractions associated with the head shape and the multipath reflections related to the pinnae.
 We performed a statistical performance analysis of this model. We computed the asymptotic frequency-domain Cramér-Rao bound (CRB) on the error of the 3D direction estimates (elevation and azimuth angles) and the mean-square angular error lower bound (MSAE_CR) at different values of the azimuth and elevation angles.
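The flavor of such a bound can be conveyed with a hypothetical toy model (this is not the paper's frequency-domain derivation): each sensor observes a direction-dependent delay corrupted by Gaussian noise, and the CRB is the inverse Fisher information. All positions and noise levels below are assumed for illustration:

```python
import numpy as np

# Toy delay model (hypothetical, for illustration only): a sensor at
# position x_m observes a delay tau_m = x_m * sin(theta) / c plus
# Gaussian noise of standard deviation sigma. The CRB on theta is the
# inverse Fisher information:
#   FIM(theta) = (1 / sigma^2) * sum_m (d tau_m / d theta)^2
def crb_azimuth(sensor_x, theta_rad, sigma, c=343.0):
    dtau_dtheta = np.asarray(sensor_x) * np.cos(theta_rad) / c
    fim = np.sum(dtau_dtheta**2) / sigma**2
    return 1.0 / fim  # lower bound on the variance of theta (rad^2)

sensors = [-0.09, 0.09]                  # two "ears", 18 cm apart (assumed)
bound = crb_azimuth(sensors, np.radians(0.0), sigma=10e-6)
print(np.degrees(np.sqrt(bound)))        # RMS angular error bound, degrees
```

Even this toy model reproduces finding (ii) below: the delay sensitivity scales with cos(theta), so the bound grows as the source moves from the front toward the sides.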
 Our analytical results are shown in Figs. 2 and 3. From Fig. 2, we observe the following: (i) the accuracy of direction estimation is symmetric about the vertical median plane (azimuth = 0 deg.) but not about the visuo-aural plane (elevation = 0 deg.); (ii) the accuracy is better in front of the head than at the sides; (iii) the accuracy improves progressively toward positive elevations; and (iv) the accuracy is slightly worse at certain mid-elevation angles, indicated by the patches of dark and light blue.
 From Fig. 3, it is evident that the estimation accuracy improves as the bandwidth of the source spectrum widens; that is, frequency diversity improves direction estimation.
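The bandwidth effect can be sketched with a hypothetical frequency-domain toy model (illustrative numbers, not the paper's model): the phase of a delayed signal at frequency f is 2*pi*f*tau(theta), so the Fisher information for the direction accumulates f^2 across the occupied band, and spreading the same number of frequency samples over a wider band tightens the bound:

```python
import numpy as np

# Hypothetical sketch: per-frequency phase measurements with noise of
# standard deviation sigma_phase. Since phase = 2*pi*f*tau(theta), the
# Fisher information for theta sums (2*pi*f * d tau/d theta)^2 over the
# band, so a wider band gives a smaller (tighter) CRB.
def crb_theta(freqs_hz, theta_rad, sigma_phase, d=0.18, c=343.0):
    dtau = (d / c) * np.cos(theta_rad)       # delay sensitivity to theta
    fim = np.sum((2 * np.pi * freqs_hz * dtau) ** 2) / sigma_phase**2
    return 1.0 / fim

narrow = np.linspace(900, 1100, 21)   # 0.2 kHz band centered at 1 kHz
wide = np.linspace(100, 1900, 21)     # 1.8 kHz band, same center and bin count
theta = np.radians(10.0)
print(crb_theta(wide, theta, 0.1) < crb_theta(narrow, theta, 0.1))  # True
```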
 We then proposed a man-made sensing system (e.g., sonar or radar) inspired by the effects of the outer ear on human hearing. Namely, we proposed exploiting close-range multipath reflections from parts of the platform or vehicle on which the sensor is mounted. We showed how these close-range multipath reflections improve the performance of the system by increasing the effective array aperture. Our numerical results indicated that localization accuracy improves when multiple reflectors are employed and a wider source spectrum is used [2].
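The aperture-widening effect can be illustrated with a hypothetical delay-based CRB (not the paper's model): an ideal close-range reflector is modeled as one extra "virtual" sensor, and adding it strictly tightens the bound. All positions and noise levels are assumed:

```python
import numpy as np

# Toy delay-model CRB (hypothetical): sensors at positions x_m observe
# delays tau_m = x_m * sin(theta) / c plus Gaussian noise (std sigma);
# the bound is sigma^2 / sum_m (d tau_m / d theta)^2.
def crb_azimuth(sensor_x, theta_rad, sigma, c=343.0):
    dtau = np.asarray(sensor_x) * np.cos(theta_rad) / c
    return sigma**2 / np.sum(dtau**2)

direct_only = [-0.09, 0.09]         # physical sensor pair (assumed layout)
# An ideal close-range reflector acts like an extra virtual sensor
# farther out, enlarging the effective array aperture:
with_reflector = [-0.09, 0.09, 0.30]

theta = np.radians(20.0)
print(crb_azimuth(with_reflector, theta, 1e-5)
      < crb_azimuth(direct_only, theta, 1e-5))   # True: the bound tightens
```

The design intuition: the virtual sensor adds a positive term to the Fisher information, so the variance bound can only decrease.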

Fig. 2: Square root of the mean-square angular error lower bound on source direction estimation at different azimuth and elevation angles in the frontal hemisphere of the head.
Left: frontside view. Right: backside view.
Fig. 3: Square root of the Cramér-Rao bound on (left) elevation angle and (right) azimuth angle estimation as a function of the normalized bandwidth of the source spectrum.

References

[1] S. Sen and A. Nehorai, "Performance analysis of 3D direction estimation based on head-related transfer function," IEEE Trans. on Audio, Speech and Language Processing, vol. 17, pp. 607–613, May 2009.

[2] S. Sen and A. Nehorai, "Exploiting close-to-the-sensor multipath reflections using a human-hearing-inspired model," IEEE Trans. on Signal Processing, vol. 57, pp. 803–808, Feb. 2009.
