Cancer SAGE

These datasets were derived from the publicly available PKDD 2005 Discovery Challenge datasets for the study of cancer genomic data.

The original dataset was preprocessed by filtering and selecting genes, producing several datasets based on the biological information available on the GEO web site and in the SAGE map data repository.

Data Files

File                Description
Description         Short description of the Cancer SAGE data.
Unfiltered dataset  Dataset containing Cancer SAGE data for all 27,679 tags and 90 biological conditions.
Medium dataset      Dataset containing Cancer SAGE data for 822 filtered tags and 74 biological conditions. The archive contains results as flows generated with the Clementine data mining software.
Small dataset       Dataset containing Cancer SAGE data for 516 filtered tags and 74 biological conditions. The archive contains results as flows generated with the Clementine data mining software.

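
After downloading a dataset, it can be useful to confirm that its dimensions match the tag and condition counts listed above. The following is only a minimal sketch: the file layout (tab-separated, one header row of condition names, one row per tag with the tag in the first column) and the names read_tag_matrix, matrix_shape and EXPECTED_SHAPES are assumptions made for illustration, not a description of the actual archives. The toy data stands in for a real file.

```python
import csv
import io

# Documented dimensions (tags, biological conditions) from the listing above.
EXPECTED_SHAPES = {
    "unfiltered": (27679, 90),
    "medium": (822, 74),
    "small": (516, 74),
}


def read_tag_matrix(handle):
    """Read a tag-by-condition count matrix from an open text handle.

    ASSUMPTION: the layout is hypothetical (tab-separated, header row of
    condition names, first column holds the tag). The real files may differ.
    """
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    conditions = header[1:]  # skip the label of the tag column
    counts = {}
    for row in reader:
        if not row:  # ignore blank lines
            continue
        counts[row[0]] = [int(value) for value in row[1:]]
    return conditions, counts


def matrix_shape(conditions, counts):
    """Return (number of tags, number of conditions)."""
    return len(counts), len(conditions)


# Toy stand-in for a downloaded file: 2 tags, 2 conditions.
toy = io.StringIO(
    "tag\tcond_A\tcond_B\n"
    "AAAAAAAAAA\t3\t0\n"
    "CCCCCCCCCC\t1\t7\n"
)
conditions, counts = read_tag_matrix(toy)
shape = matrix_shape(conditions, counts)
print(shape)
```

With a real file, the returned shape could be compared against the matching entry of EXPECTED_SHAPES, for example EXPECTED_SHAPES["medium"] for the medium dataset, once the loader has been adapted to the archive's actual format.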
Reference

Exploratory analysis of cancer SAGE data, Ricardo Martinez, Richard Christen, Claude Pasquier and Nicolas Pasquier, in: Discovery Challenge of the PKDD international conference on Principles of Knowledge Discovery in Databases, 2005.