Anonymous Microsoft Web Data

Donated on 10/31/1998

Log of anonymous users of www.microsoft.com. The task is to predict which areas of the web site a user visited, based on data on the other areas that user visited.

Dataset Characteristics

-

Subject Area

Computer Science

Associated Tasks

Recommender-Systems

Feature Type

Categorical

# Instances

37711

# Features

294

Dataset Information

Additional Information

We created the data by sampling and processing the www.microsoft.com logs. The data records the use of www.microsoft.com by about 38,000 anonymous, randomly selected users. For each user, the data lists all the areas of the web site (Vroots) that the user visited in a one-week timeframe in February 1998. Users are identified only by a sequential number, for example, User #14988, User #14989, etc. The file contains no personally identifiable information. The 294 Vroots are identified by their title (e.g. "NetShow for PowerPoint") and URL (e.g. "/stream").
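The per-user structure described above can be sketched in Python. This is a minimal sketch, not the creators' tooling: it assumes the distributed file is comma-separated with "A" lines declaring a Vroot (id, title, URL), "C" lines starting a user, and "V" lines listing a Vroot that user visited. That record layout is not stated on this page, and the function name parse_records, the Vroot ids, and the "Hypothetical Area" entry are illustrative only.

```python
import csv
from collections import defaultdict

def parse_records(lines):
    """Parse assumed A/C/V records into (vroots, visits).

    vroots: vroot id -> (title, url)
    visits: user id  -> set of vroot ids that user visited
    """
    vroots = {}
    visits = defaultdict(set)
    current_user = None
    for row in csv.reader(lines):
        if not row:
            continue
        kind = row[0]
        if kind == "A":
            # Assumed layout: A,<vroot id>,<ignored>,<title>,<url>
            vroots[int(row[1])] = (row[3], row[4])
        elif kind == "C":
            # Assumed layout: C,<quoted user id>,<user id>
            current_user = int(row[2])
            visits[current_user]  # register the user even if no visits follow
        elif kind == "V" and current_user is not None:
            # Assumed layout: V,<vroot id>,<ignored>
            visits[current_user].add(int(row[1]))
    return vroots, dict(visits)

# Tiny hand-written sample; ids and the second Vroot are made up.
sample = [
    'A,1277,1,"NetShow for PowerPoint","/stream"',
    'A,1000,1,"Hypothetical Area","/example"',
    'C,"14988",14988',
    'V,1277,1',
    'V,1000,1',
    'C,"14989",14989',
    'V,1277,1',
]
vroots, visits = parse_records(sample)
```

Keeping each user's visits as a set reflects that the data lists which areas were visited, not how often.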

Has Missing Values?

No

Variable Information

Each attribute is an area ("Vroot") of the www.microsoft.com web site. The dataset records which Vroots each user visited in a one-week timeframe in February 1998.
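Since each attribute is a Vroot, one user can be viewed as a row of 0/1 indicators, and the prediction task can be approached by counting how often two Vroots are visited by the same user. The following is a minimal sketch of that idea on made-up data; the co-visitation scoring is one simple baseline chosen for illustration, not the method used by the dataset creators, and the names indicator_row, covisit_counts, and recommend are hypothetical.

```python
from collections import Counter
from itertools import permutations

def indicator_row(vroot_set, all_vroots):
    """Encode one user's visits as a 0/1 list, one entry per Vroot."""
    return [1 if v in vroot_set else 0 for v in all_vroots]

def covisit_counts(visits):
    """For each ordered pair (a, b), count users who visited both a and b."""
    counts = Counter()
    for vroot_set in visits.values():
        for a, b in permutations(vroot_set, 2):
            counts[(a, b)] += 1
    return counts

def recommend(known, counts, all_vroots, top_n=3):
    """Rank unvisited Vroots by total co-visits with the known Vroots."""
    scores = {}
    for candidate in all_vroots:
        if candidate in known:
            continue
        score = sum(counts[(seen, candidate)] for seen in known)
        if score > 0:
            scores[candidate] = score
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda v: (-scores[v], v))[:top_n]

# Made-up visits: user id -> set of Vroot URLs (only "/stream" is from the page).
visits = {
    1: {"/stream", "/a", "/b"},
    2: {"/stream", "/a"},
    3: {"/a", "/b"},
    4: {"/c"},
}
all_vroots = sorted(set().union(*visits.values()))
counts = covisit_counts(visits)
print(recommend({"/stream"}, counts, all_vroots))  # prints ['/a', '/b']
```

With 294 Vroots the pair table stays small, so a plain Counter is enough for a baseline like this.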

Papers Citing this Dataset

A one‐step method for modelling longitudinal data with differential equations

By Yueqin Hu, Raymond Treinen. 2019

Published in The British Journal of Mathematical and Statistical Psychology.

Genetic Algorithm with an Improved Initial Population Technique for Automatic Clustering of Low-Dimensional Data

By Xiangbing Zhou, Fang Miao, Hongjiang Ma. 2018

Published in Information.

Robust auto-weighted multi-view subspace clustering with common subspace representation matrix

By Wenzhang Zhuge, Chenping Hou, Yuanyuan Jiao, Jia Yue, Hong Tao, Dongyun Yi. 2017

Published in PLoS ONE.

FLAME: A Fast Large-scale Almost Matching Exactly Approach to Causal Inference

By Tianyu Wang, Marco Morucci, M. Awan, Yameng Liu, Sudeepa Roy, Cynthia Rudin, Alexander Volfovsky. 2017

Published in arXiv.



Keywords

health

Creators

Jack Breese

David Heckerman

Carl Kadie
