Auditory, Speech & Language Processing

Human listeners can perceptually segregate one sound source from an acoustic mixture, for example a single voice from the other voices and music at a busy cocktail party. How can we engineer "machine listening" systems that achieve the same perceptual feat?

For computers and humans to interact successfully using natural language, linguistic content must be extracted from both speech and text signals. The AI group's research approach combines algorithmic development for machine learning with insights from psychology, acoustics, and linguistics to find new ways of extracting speech from audio (computational audition, as described above), word transcripts from speech (automatic speech recognition), and linguistic meaning from words (natural language processing).