Mining for Mutually Exclusive Gene Expressions

Datasets

The links below provide the datasets used in our research on mining for mutually exclusive gene expressions. We currently use two real SAGE (Serial Analysis of Gene Expression) datasets in our studies. The first consists of 90 SAGE libraries and 27679 tags. The second is a reduced dataset of 74 SAGE libraries and 822 tags. Both datasets were provided by Dr Olivier Gandrillon’s team (Centre de Genetique Moleculaire et Cellulaire de Lyon, France) and were studied and presented at the ECML/PKDD Discovery Challenge Workshops in 2004 and 2005. The SAGE libraries in these datasets were prepared as of December 2002 [1]. They were collected from various human tissue types (colon, brain, ovary, etc.) and are labeled by cell state, either normal or cancerous.

Large SAGE Dataset (90x27679)

Small SAGE Dataset (74x822)


References

1. O. Gandrillon. "Guide to the gene expression data". In Proceedings of the ECML/PKDD Discovery Challenge Workshop, Pisa, Italy, 2004, pp. 116–120.

 
