NSF Research Award Abstracts 1990-2003 Data Set
Abstract: This data set consists of (a) 129,000 abstracts describing NSF awards for basic research, (b) bag-of-word data files extracted from the abstracts, (c) a list of words used for indexing the bag-of-word
Data Set Characteristics: Text
Number of Instances: 129000
Area: N/A
Attribute Characteristics: N/A
Number of Attributes: N/A
Date Donated: 2003-11-18
Associated Tasks: N/A
Missing Values? N/A
Number of Web Hits: 55039
Source:
Original Owner and Donor
Abstracts provided by:
Michael J. Pazzani
ICS Department, School of Computer Science, UCI, Irvine CA, 92697, USA
pazzani '@' ics.uci.edu
Bag-of-word data provided by:
Amnon Meyers
ICS Department, School of Computer Science, UCI, Irvine CA, 92697, USA
ameyers '@' ics.uci.edu
Data Set Information:
The abstracts, one per file, were furnished by the NSF (National Science Foundation). A sample abstract is shown in the next section.
The bag-of-word data was produced by automatically processing the abstracts with a text analyzer called NSFAbst, built using VisualText. Most output fields are very accurate, but author names were not always extracted correctly from the Investigator: field, because the contents of that field vary widely.
The word list was produced by a separate process and may not include every word of interest in the abstracts.
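To illustrate how a word list used for indexing relates to bag-of-word counts, here is a minimal Python sketch. It is not the NSFAbst analyzer and does not read the data set's actual file formats, which are not described here. The function name `bag_of_words`, the tokenization rule, the 1-based indexing, and the sample word list and text are all assumptions made for illustration.

```python
import re
from collections import Counter


def bag_of_words(text, vocabulary):
    """Count word-list hits in `text`, keyed by each word's position in the list.

    Returns (counts, skipped): `counts` maps a 1-based word index to its
    frequency, and `skipped` holds tokens that are not in the word list.
    """
    # Assumed tokenization: lowercase runs of ASCII letters.
    tokens = re.findall(r"[a-z]+", text.lower())
    # Assumed indexing: words are numbered from 1 in word-list order.
    index = {word: i for i, word in enumerate(vocabulary, start=1)}

    counts = Counter()
    skipped = set()
    for token in tokens:
        if token in index:
            counts[index[token]] += 1
        else:
            # Words missing from the list are dropped from the counts.
            skipped.add(token)
    return dict(counts), skipped


# Hypothetical word list and abstract text, for illustration only.
vocabulary = ["research", "protein", "model"]
abstract = "This research will model protein folding. The model is new."

counts, skipped = bag_of_words(abstract, vocabulary)
print(counts)
print(sorted(skipped))
```

The `skipped` set shows the practical effect of an incomplete word list: any token that is absent from the list never appears in the counts, so it is worth checking which words were dropped before relying on the bag-of-word files alone.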
Attribute Information:
N/A
Relevant Papers:
N/A
Citation Request:
Please refer to the Machine Learning Repository's citation policy.