These datasets were constructed from the publicly available PKDD 2005 Discovery Challenge datasets to study cancer genomic data.
The original dataset was preprocessed by filtering and selecting genes, producing several datasets based on the biological information available on the GEO website and in the SAGE map data repository.
|Description||Short description of the cancer SAGE data.|
|Unfiltered dataset||Dataset containing cancer SAGE data for the complete set of 27679 tags and 90 biological conditions.|
|Medium dataset||Dataset containing cancer SAGE data for 822 filtered tags and 74 biological conditions. The archive also contains results as flows generated with the Clementine data mining software.|
|Small dataset||Dataset containing cancer SAGE data for 516 filtered tags and 74 biological conditions. The archive also contains results as flows generated with the Clementine data mining software.|
Ricardo Martinez, Richard Christen, Claude Pasquier and Nicolas Pasquier. Exploratory analysis of cancer SAGE data. In: Discovery Challenge of the PKDD international conference on Principles of Knowledge Discovery in Databases, 2005.