Audit Data

Donated on 7/13/2018

One year of exhaustive, non-confidential data on firms, covering 2015 to 2016, was collected from the Auditor Office of India to build a predictor for classifying suspicious firms.

Dataset Characteristics

Multivariate

Subject Area

Business

Associated Tasks

Classification

Feature Type

Real

# Instances

777

# Features

-

Dataset Information

Additional Information

The goal of the research is to help auditors by building a classification model that can predict fraudulent firms on the basis of present and historical risk factors. The sectors and their firm counts are: Irrigation (114), Public Health (77), Buildings and Roads (82), Forest (70), Corporate (47), Animal Husbandry (95), Communication (1), Electrical (4), Land (5), Science and Technology (3), Tourism (1), Fisheries (41), Industries (37), Agriculture (200).
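As a quick consistency check, the sector counts listed above can be summed and compared with the stated number of instances. The sketch below uses only the figures given in this description; the variable names are illustrative.

```python
# Sector counts exactly as listed in the dataset description.
sector_counts = {
    "Irrigation": 114,
    "Public Health": 77,
    "Buildings and Roads": 82,
    "Forest": 70,
    "Corporate": 47,
    "Animal Husbandry": 95,
    "Communication": 1,
    "Electrical": 4,
    "Land": 5,
    "Science and Technology": 3,
    "Tourism": 1,
    "Fisheries": 41,
    "Industries": 37,
    "Agriculture": 200,
}

# Total number of firms across all sectors.
total_firms = sum(sector_counts.values())

# Sector contributing the most firms.
largest_sector = max(sector_counts, key=sector_counts.get)

print(total_firms)     # 777, matching "# Instances"
print(largest_sector)  # Agriculture
```

The counts are heavily skewed: a few sectors contribute most of the firms, while several contribute fewer than five each.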

Has Missing Values?

Yes

Variables Table

Variable Name, Role, Type, Description, Units, Missing Values
no
no
no
no
no
no
no
no
no
no

0 to 10 of 18

Additional Variable Information

Many risk factors are examined from various sources, such as past records of the audit office, audit paras, environmental conditions reports, firm reputation summaries, ongoing issues reports, profit-value records, loss-value records, and follow-up reports. After in-depth interviews with the auditors, the important risk factors are evaluated and their probability of existence is calculated from present and past records.

Dataset Files

File (Size)
audit_data/audit_risk.csv (79.3 KB)
audit_data/trial.csv (39 KB)
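Since the dataset is reported to contain missing values, a first step after downloading is usually to count empty cells per column. The following is a minimal sketch using only the Python standard library. The helper `count_missing` and the column names in the in-memory sample are hypothetical, because the variable names are not listed on this page.

```python
import csv
import io


def count_missing(csv_file):
    """Return (row_count, {column: number of empty cells}) for an open CSV file."""
    reader = csv.DictReader(csv_file)
    missing = {name: 0 for name in reader.fieldnames or []}
    row_count = 0
    for row in reader:
        row_count += 1
        for name in missing:
            value = row.get(name)
            # Treat absent or whitespace-only cells as missing.
            if value is None or value.strip() == "":
                missing[name] += 1
    return row_count, missing


# Tiny in-memory stand-in for a real file; these column names are hypothetical.
sample = io.StringIO("sector_score,amount,risk\n3.89,4.18,1\n2.72,,0\n")
rows, missing = count_missing(sample)
print(rows, missing)
```

To run the same check on the downloaded data, pass an opened file instead of the sample, for example `open("audit_data/audit_risk.csv", newline="")`.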


Creators

Nishtha Hooda
