
KASANDR
Donated on 5/15/2017
KASANDR is a novel, publicly available collection for recommendation systems that records the behavior of customers of the European leader in e-Commerce advertising, Kelkoo.
Dataset Characteristics
Multivariate
Subject Area
Other
Associated Tasks
Other
Feature Type
Integer
# Instances
17764280
# Features
2158859
Dataset Information
Additional Information
We created this dataset by sampling and processing the www.kelkoo.com logs. It records the offers that were clicked by (or shown to) users of www.kelkoo.com and its partners in Germany, together with meta-information about these users and offers. The objective is to predict whether a given user will click on a given offer.
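To make the stated objective concrete, here is a minimal sketch of a per-offer click-rate baseline. It is not the dataset creators' method; it is only an illustrative popularity baseline. The function names, the `prior_weight` smoothing parameter, and the `(userid, offerid, click)` tuple layout are assumptions introduced for this example.

```python
from collections import defaultdict


def fit_offer_click_rates(interactions, prior_weight=1.0):
    """Estimate a smoothed click rate per offer.

    `interactions` is an iterable of (userid, offerid, click) tuples, where
    click is 0 or 1. This tuple layout is an assumption for illustration.
    """
    shown = defaultdict(int)
    clicked = defaultdict(int)
    total_shown = 0
    total_clicked = 0
    for _user, offer, click in interactions:
        shown[offer] += 1
        clicked[offer] += click
        total_shown += 1
        total_clicked += click

    # Global click rate, used both as the smoothing prior and as the
    # fallback prediction for offers never seen during fitting.
    global_rate = total_clicked / total_shown if total_shown else 0.0

    rates = {
        offer: (clicked[offer] + prior_weight * global_rate)
        / (shown[offer] + prior_weight)
        for offer in shown
    }
    return rates, global_rate


def predict_click_probability(rates, global_rate, offer):
    """Return the smoothed rate for a known offer, else the global rate."""
    return rates.get(offer, global_rate)
```

A baseline of this kind ignores the user entirely, so it is mainly useful as a reference point when evaluating models that do use the user and offer meta-information.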
Has Missing Values?
No
Variable Information
Columns: userid, offerid, countrycode, category, merchant, utcdate, implicit-feedback

1. train_de.csv (3.14 GB)
Instances: 15,844,718
Attributes: 2,299,713
userid: Categorical, 291,485
offerid: Categorical, 2,158,859
countrycode: Categorical, 1 (de - Germany)
category: Integer, 271
merchant: Integer, 703
utcdate: Timestamp, 2016-06-01 02:00:17.0 to 2016-06-14 23:52:51.0
implicit feedback (click): Binary, 0 or 1

2. test_de.csv (381.3 MB)
Instances: 1,919,562
Attributes: 2,299,713
userid: Categorical, 278,293
offerid: Categorical, 380,803
countrycode: Categorical, 1
category: Integer, 267
merchant: Integer, 738
utcdate: Timestamp, 2016-06-14 23:52:51.0 to 2016-07-01 01:59:36.0
implicit feedback (click): Binary, 0 or 1
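Because train_de.csv is several gigabytes, it can help to stream it row by row rather than load it whole. The sketch below counts rows, clicks, distinct users and offers, and the observed time range. It assumes the file has a header row using the column names listed above, that the label column is named `implicit-feedback`, and that fields are comma-separated; none of this is confirmed by the page, so adjust `delimiter` and `LABEL` after inspecting the real file. The `SAMPLE` rows are made up for demonstration and are not taken from the dataset.

```python
import csv
import io
from datetime import datetime

LABEL = "implicit-feedback"  # assumed header name for the click column

# Made-up rows in the assumed layout, for demonstration only.
SAMPLE = (
    "userid,offerid,countrycode,category,merchant,utcdate,implicit-feedback\n"
    "u1,o1,de,10,3,2016-06-01 02:00:17.0,1\n"
    "u1,o2,de,10,3,2016-06-01 02:05:00.0,0\n"
    "u2,o1,de,11,4,2016-06-02 10:00:00.0,0\n"
)


def parse_utcdate(value):
    """Parse timestamps of the form '2016-06-01 02:00:17.0'."""
    return datetime.strptime(value, "%Y-%m-%d %H:%M:%S.%f")


def summarize(lines, delimiter=","):
    """Stream rows from a file-like object and collect basic statistics."""
    reader = csv.DictReader(lines, delimiter=delimiter)
    rows = 0
    clicks = 0
    users = set()
    offers = set()
    first = None
    last = None
    for record in reader:
        rows += 1
        clicks += int(record[LABEL])
        users.add(record["userid"])
        offers.add(record["offerid"])
        ts = parse_utcdate(record["utcdate"])
        if first is None or ts < first:
            first = ts
        if last is None or ts > last:
            last = ts
    return {
        "rows": rows,
        "clicks": clicks,
        "users": len(users),
        "offers": len(offers),
        "first": first,
        "last": last,
    }


if __name__ == "__main__":
    print(summarize(io.StringIO(SAMPLE)))
```

For the real file, the same function could be called as `summarize(open("train_de.csv", newline=""))`, provided the layout assumptions hold. The resulting counts can then be compared against the figures listed above.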
Dataset Files
| File | Size | 
|---|---|
| de.tar.bz2 | 900.5 MB | 
    pip install ucimlrepo

    from ucimlrepo import fetch_ucirepo

    # fetch dataset
    kasandr = fetch_ucirepo(id=385)

    # data (as pandas dataframes)
    X = kasandr.data.features
    y = kasandr.data.targets

    # metadata
    print(kasandr.metadata)

    # variable information
    print(kasandr.variables)
Sidana, S., Laclau, C., & Amini, M. (2017). KASANDR [Dataset]. UCI Machine Learning Repository. https://doi.org/10.24432/C5PK7M.
Creators
Sumit Sidana
Charlotte Laclau
Massih-Reza Amini
DOI
10.24432/C5PK7M
License
This dataset is licensed under a Creative Commons Attribution 4.0 International (CC BY 4.0) license.
This allows for the sharing and adaptation of the datasets for any purpose, provided that the appropriate credit is given.