1. Reuters Transcribed Subset: This dataset is created by reading out 200 files from the 10 largest Reuters
classes and using an Automatic Speech Recognition system to create
corresponding transcriptions. 2. Farm Ads: This data was collected from text ads found on twelve websites that deal with various farm animal related topics. The binary labels are based on whether or not the content owner approves of the ad. 3. Japanese Credit Screening: Includes domain theory (generated by talking to Japanese domain experts); data in Lisp 4. Cargo 2000 Freight Tracking and Tracing: Sanitized and anonymized Cargo 2000 (C2K) airfreight tracking and tracing events, covering five months of business execution (3,942 process instances, 7,932 transport legs, 56,082 activities). 5. Machine Learning based ZZAlpha Ltd. Stock Recommendations 2012-2014: The data here are the ZZAlpha® machine learning recommendations made for various US traded stock portfolios the morning of each day during the 3 year period Jan 1, 2012 - Dec 31, 2014. |