White Papers

Shared Resources for Robust Speech-to-Text Technology

Overview
This paper describes ongoing efforts at the Linguistic Data Consortium to create shared resources for improved speech-to-text (STT) technology. Under the DARPA EARS program, technology providers are charged with creating STT systems whose outputs are substantially richer and much more accurate than is currently possible. These aggressive program goals motivate new approaches to corpus creation and distribution. EARS participants require multilingual broadcast and telephone speech data, transcripts, and annotations at a much higher volume than in any previous program. New distribution methods also provide for efficient deployment of needed resources to participating research sites and enable eventual publication to a wider community of language researchers.

Further White Paper Details
Publisher: University of Pennsylvania
File Format: PDF
Date Published: March 2005
Format: White Papers
Topics: N/A
