Reuters Transcribed Subset

Abstract

This data set was created by selecting 20 files from each of the 10 largest classes in the Reuters-21578 collection. The files were read aloud by three Indian speakers, and an automatic speech recognition (ASR) system was used to generate the transcripts.
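
The selection step described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the procedure the creators actually used: the abstract does not say how the 20 files per class were chosen, so random sampling with a fixed seed is assumed here, and the function name `select_subset` and the `(doc_id, class_label)` input format are hypothetical.

```python
import random
from collections import defaultdict


def select_subset(labeled_docs, n_classes=10, per_class=20, seed=0):
    """Pick `per_class` documents from each of the `n_classes` largest classes.

    `labeled_docs` is an iterable of (doc_id, class_label) pairs.
    Returns a dict mapping class label -> list of selected doc ids.
    """
    # Group document ids by their class label.
    by_class = defaultdict(list)
    for doc_id, label in labeled_docs:
        by_class[label].append(doc_id)

    # Rank classes by size, largest first; ties are broken by label
    # so that the result is deterministic.
    largest = sorted(by_class, key=lambda c: (-len(by_class[c]), c))[:n_classes]

    # ASSUMPTION: files are drawn at random; the source does not specify this.
    rng = random.Random(seed)
    return {
        c: rng.sample(by_class[c], min(per_class, len(by_class[c])))
        for c in largest
    }
```

With the default parameters this yields at most 10 x 20 = 200 selected files, provided each of the ten classes contains at least 20 documents.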

Information files:

Data files:


The UCI KDD Archive
Information and Computer Science
University of California, Irvine
Irvine, CA 92697-3425
Last modified: 10 August 2007