Legal Case Reports

Donated on 10/18/2012

A textual corpus of 4000 legal cases for automatic summarization and citation analysis. For each document we collect catchphrases, citation sentences, citation catchphrases and citation classes.

Dataset Characteristics

Text

Subject Area

Other

Associated Tasks

Classification

Feature Type

# Instances

# Features

Dataset Information

Additional Information

This dataset contains Australian legal cases from the Federal Court of Australia (FCA). The cases were downloaded from AustLII (http://www.austlii.edu.au). We included all cases from the years 2006, 2007, 2008 and 2009. We built the dataset to experiment with automatic summarization and citation analysis. For each document we collected catchphrases, citation sentences, citation catchphrases, and citation classes. Catchphrases are found in the document itself; we used them as the gold standard for our summarization experiments. Citation sentences are found in later cases that cite the present case; we use them for summarization. Citation catchphrases are the catchphrases (where available) of both later cases that cite the present case and older cases cited by the present case. Citation classes are given in the document and indicate the type of treatment given to the cases cited by the present case.

Has Missing Values?

Dataset Files

File	Size
corpus/citations_class/06_1234.xml	6.2 MB
corpus/fulltext/07_1062.xml	2.9 MB
corpus/citations_class/06_1112.xml	2.1 MB
corpus/fulltext/08_498.xml	1.5 MB
corpus/citations_class/06_1663.xml	1.1 MB
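
The file names listed above suggest a simple way to index the corpus before opening any XML. The following is a minimal sketch, not part of the dataset itself: it assumes that every file follows the `NN_NNNN.xml` pattern seen in the listing and that the two-digit prefix is the year of the case (06 for 2006, and so on). The function names `parse_case_path` and `count_by_kind_and_year` are hypothetical helpers introduced only for illustration.

```python
from collections import Counter
from pathlib import PurePosixPath


def parse_case_path(path):
    """Split a path such as 'corpus/fulltext/07_1062.xml' into its parts.

    Assumption: the two-digit prefix before the underscore is the year
    of the case (06 -> 2006); the page does not state this explicitly.
    """
    p = PurePosixPath(path)
    prefix, _, number = p.stem.partition("_")
    return {
        "kind": p.parent.name,        # 'fulltext' or 'citations_class'
        "year": 2000 + int(prefix),   # assumed meaning of the prefix
        "number": int(number),        # case number within that year
    }


def count_by_kind_and_year(paths):
    """Count files per (kind, year) pair."""
    return Counter(
        (case["kind"], case["year"]) for case in map(parse_case_path, paths)
    )


# The five paths shown in the file listing.
sample_paths = [
    "corpus/citations_class/06_1234.xml",
    "corpus/fulltext/07_1062.xml",
    "corpus/citations_class/06_1112.xml",
    "corpus/fulltext/08_498.xml",
    "corpus/citations_class/06_1663.xml",
]

counts = count_by_kind_and_year(sample_paths)
for (kind, year), n in sorted(counts.items()):
    print(kind, year, n)
```

On an extracted copy of the archive, the same helpers could be applied to the paths returned by `pathlib.Path("corpus").rglob("*.xml")`, which would make it easy to check whether each full-text file has a matching citation-class file.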

Showing 5 of 10536 files

Download (80.8 MB)

Creators

Filippo Galgani

DOI

10.24432/C5ZS41

License

This dataset is licensed under a Creative Commons Attribution 4.0 International (CC BY 4.0) license.

This allows for the sharing and adaptation of the datasets for any purpose, provided that the appropriate credit is given.