This dataset was created by selecting 20 files from each of the 10 largest classes in the Reuters-21578 collection, 200 files in total. The files were read aloud by three Indian speakers, and an Automatic Speech Recognition (ASR) system was used to generate the transcripts.
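As a rough illustration of the selection step described above, the following Python sketch picks a fixed number of documents from each of the largest classes. How the files were actually chosen (randomly or otherwise) is not stated here, so the random sampling, the seed, the function name `sample_largest_classes`, and the `(doc_id, label)` input format are all assumptions made for illustration. It also assumes each document carries a single class label.

```python
import random
from collections import defaultdict


def sample_largest_classes(labeled_docs, n_classes=10, per_class=20, seed=0):
    """Pick `per_class` documents from each of the `n_classes` largest classes.

    `labeled_docs` is an iterable of (doc_id, label) pairs (hypothetical format).
    Returns a dict mapping each selected label to a list of chosen doc ids.
    """
    # Group document ids by their class label.
    by_class = defaultdict(list)
    for doc_id, label in labeled_docs:
        by_class[label].append(doc_id)

    # Rank classes by size, largest first, and keep the top n_classes.
    largest = sorted(by_class, key=lambda c: len(by_class[c]), reverse=True)[:n_classes]

    # Sample without replacement; the fixed seed makes the selection repeatable.
    # Random sampling is an assumption, not the documented selection method.
    rng = random.Random(seed)
    return {
        label: rng.sample(by_class[label], min(per_class, len(by_class[label])))
        for label in largest
    }
```

With the default arguments and at least 20 documents in each of the 10 largest classes, this returns 10 classes of 20 documents each, matching the 200 files described above.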
Information files:
Data files: