Website Phishing

Donated on 11/1/2016

Dataset Characteristics

Multivariate

Subject Area

Computer Science

Associated Tasks

Classification

Feature Type

Integer

# Instances

1353

# Features

9

Dataset Information

Additional Information

The phishing problem is considered a vital issue in the e-commerce industry, especially in e-banking, given the number of online transactions involving payments. We have identified different features related to legitimate and phishy websites and collected 1353 different websites from different sources. Phishing websites were collected from the Phishtank data archive (www.phishtank.com), which is a free community site where users can submit, verify, track and share phishing data. The legitimate websites were collected from the Yahoo and Starting Point directories using a web script developed in PHP. The PHP script was plugged into a browser, and we collected 548 legitimate websites out of the 1353 websites. There are 702 phishing URLs and 103 suspicious URLs. When a website is considered SUSPICIOUS, it can be either phishy or legitimate, meaning the website has some legitimate and some phishy features.
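As a quick sanity check on the figures above, the following sketch adds up the three class counts and derives each class's share of the dataset. The dictionary keys and the function name `class_shares` are illustrative choices, not part of the dataset.

```python
# Class counts exactly as stated in the dataset description.
CLASS_COUNTS = {"legitimate": 548, "phishing": 702, "suspicious": 103}


def class_shares(counts):
    """Return the total number of websites and each class's fraction of it."""
    total = sum(counts.values())
    shares = {name: n / total for name, n in counts.items()}
    return total, shares


total, shares = class_shares(CLASS_COUNTS)
print(f"total websites: {total}")
for name, share in shares.items():
    print(f"{name}: {share:.1%}")
```

The counts sum to the 1353 instances reported for the dataset, with phishing at about 51.9%, legitimate at about 40.5%, and suspicious at about 7.6%, so the classes are noticeably imbalanced.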

Has Missing Values?

No

Introductory Paper

Phishing detection based Associative Classification data mining

By Neda Abdelhamid, A. Ayesh, F. Thabtah. 2014

Published in Expert Systems with Applications

Variables Table

Variable Name      Role     Type     Description  Units  Missing Values
SFH                Feature  Integer                      no
popUpWindow        Feature  Integer                      no
SSLfinal_State     Feature  Integer                      no
Request_URL        Feature  Integer                      no
URL_of_Anchor      Feature  Integer                      no
web_traffic        Feature  Integer                      no
URL_Length         Feature  Integer                      no
age_of_domain      Feature  Integer                      no
having_IP_Address  Feature  Integer                      no
Result             Target   Integer                      no
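
The variables listed above can be turned into a small schema for checking records. The sketch below is a minimal illustration under several assumptions that this page does not confirm: that each record is a comma-separated line, that the columns appear in the order of the table, and that every value is one of -1, 0 or 1. The helper `parse_row` and the sample line are hypothetical, not taken from the dataset.

```python
# Column names from the variables table; the order is an assumption.
COLUMNS = [
    "SFH", "popUpWindow", "SSLfinal_State", "Request_URL", "URL_of_Anchor",
    "web_traffic", "URL_Length", "age_of_domain", "having_IP_Address", "Result",
]

# Assumed value range, based on the 1 / 0 / -1 coding described on this page.
ALLOWED_VALUES = {-1, 0, 1}


def parse_row(line):
    """Parse one comma-separated record into a dict keyed by column name."""
    values = [int(field) for field in line.strip().split(",")]
    if len(values) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} fields, got {len(values)}")
    unexpected = [v for v in values if v not in ALLOWED_VALUES]
    if unexpected:
        raise ValueError(f"values outside {sorted(ALLOWED_VALUES)}: {unexpected}")
    return dict(zip(COLUMNS, values))


# Invented sample line, for illustration only.
row = parse_row("1,-1,1,-1,-1,1,1,1,0,0")
print(row["SFH"], row["Result"])
```

Validating on read keeps a mismatched column order or a stray value from passing silently into a classifier.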


Additional Variable Information

The collected features are: URL Anchor, Request URL, SFH, URL Length, Having '@', Prefix/Suffix, IP, Sub Domain, Web traffic, Domain age, and Class. The collected features hold the categorical values "Legitimate", "Suspicious" and "Phishy"; these values have been replaced with the numerical values 1, 0 and -1, respectively. Details of each feature are given in the introductory paper listed above.
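
The numeric coding described above can be written as a pair of lookup tables. This is a minimal sketch: the names `LABEL_TO_CODE`, `CODE_TO_LABEL`, `encode` and `decode` are illustrative, while the mapping itself (Legitimate to 1, Suspicious to 0, Phishy to -1) comes from the description.

```python
# Mapping stated in the dataset description.
LABEL_TO_CODE = {"Legitimate": 1, "Suspicious": 0, "Phishy": -1}

# Inverse mapping, for turning numeric codes back into readable labels.
CODE_TO_LABEL = {code: label for label, code in LABEL_TO_CODE.items()}


def encode(labels):
    """Convert categorical labels to their numeric codes."""
    return [LABEL_TO_CODE[label] for label in labels]


def decode(codes):
    """Convert numeric codes back to categorical labels."""
    return [CODE_TO_LABEL[code] for code in codes]


codes = encode(["Legitimate", "Suspicious", "Phishy"])
print(codes)
print(decode(codes))
```

Decoding is mainly useful for reporting, for example when labelling the rows and columns of a confusion matrix with class names instead of -1, 0 and 1.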

Keywords

phishing

Creators

Neda Abdelhamid
