Center for Machine Learning and Intelligent Systems

NSF Research Award Abstracts 1990-2003 Data Set
Download: Data Folder, Data Set Description

Abstract: This data set consists of (a) 129,000 abstracts describing NSF awards for basic research, (b) bag-of-word data files extracted from the abstracts, (c) a list of words used for indexing the bag-of-word

Data Set Characteristics: Text
Number of Instances: 129000
Area: N/A
Attribute Characteristics: N/A
Number of Attributes: N/A
Date Donated: 2003-11-18
Associated Tasks: N/A
Missing Values? N/A
Number of Web Hits: 34586


Source:

Original Owner and Donor

Abstracts provided by:

Michael J. Pazzani
ICS Department, School of Computer Science, UCI, Irvine CA, 92697, USA
pazzani '@' ics.uci.edu

Bag-of-word data provided by:

Amnon Meyers
ICS Department, School of Computer Science, UCI, Irvine CA, 92697, USA
ameyers '@' ics.uci.edu


Data Set Information:

The abstracts, one per file, were furnished by the NSF (National Science Foundation). A sample abstract is shown in the next section.

The bag-of-word data was produced by automatically processing the abstracts with a text analyzer called NSFAbst, built using VisualText. Most output fields are very accurate, but author names were not always extracted correctly from the Investigator: field, because that field varies widely.

The word list came from a separate process, and may not include all the words of interest in the abstracts.
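
To illustrate how a bag-of-word representation is indexed by a word list, here is a minimal Python sketch. It is not the NSFAbst/VisualText process and does not reflect the format of the distributed data files. The function name `bag_of_words`, the three-word vocabulary, and the sample sentence are all hypothetical, chosen only for illustration.

```python
import re
from collections import Counter


def bag_of_words(text, vocabulary):
    """Map each vocabulary index to the number of times that word occurs in text.

    Words that are not in the vocabulary are silently dropped, which is why
    a word list built by a separate process can miss words of interest.
    """
    # Hypothetical tokenization: lowercase the text and keep alphabetic runs only.
    tokens = re.findall(r"[a-z]+", text.lower())
    # Position of each word in the word list serves as its index.
    index = {word: i for i, word in enumerate(vocabulary)}
    counts = Counter(index[t] for t in tokens if t in index)
    return dict(counts)


# Illustrative inputs (not taken from the data set).
vocabulary = ["research", "award", "data"]
sample = "This award supports basic research. The research uses survey data."
bag = bag_of_words(sample, vocabulary)
```

Here `bag` maps word-list indices to counts. Words such as "survey" that are absent from the word list do not appear in the result at all.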


Attribute Information:

N/A


Relevant Papers:

N/A



Citation Request:

Please refer to the Machine Learning Repository's citation policy

