Show simple item record

dc.contributor.author: Nosova, Ekaterina
dc.date.accessioned: 2011-11-17T13:15:55Z
dc.date.available: 2011-11-17T13:15:55Z
dc.date.issued: 2011-03-19
dc.identifier.uri: http://hdl.handle.net/10556/190
dc.description: 2009 - 2010 [en_US]
dc.description.abstract: The search for similarities in large data sets plays an important role in many scientific fields. It makes it possible to classify several types of data without explicit information about them. Unfortunately, experimental data contain noise and errors, so the main task of mathematicians is to find algorithms that analyze such data with maximal precision. In many cases researchers use methodologies such as clustering to classify data with respect to patterns or conditions. In the last few years, however, new analysis tools such as biclustering have been proposed and applied to many specific problems. My choice of biclustering methods is motivated by the accuracy of the results and by the possibility of finding not only rows or columns that provide a dataset partition, but also rows and columns together. In this work, two new biclustering algorithms are presented: the Combinatorial Biclustering Algorithm (CBA) and an improvement of the Possibilistic Biclustering Algorithm, called Biclustering by resampling. The first algorithm (which I call Combinatorial) is based on the direct definition of a bicluster, which makes it clear and very easy to understand. It allows the error of the biclusters to be controlled at each step, with the accepted error value and the dimensions of the desired biclusters specified from the beginning. A comparison with other known biclustering algorithms is shown. The second algorithm is an improvement of the Possibilistic Biclustering Algorithm (PBC). The PBC algorithm, proposed by M. Filippone et al., is based on the Possibilistic Clustering paradigm and finds one bicluster at a time, assigning a membership in the bicluster to each gene and each condition. PBC uses an objective function that maximizes the bicluster cardinality and minimizes a residual error, so the biclustering problem is treated as the optimization of a suitable functional.
This algorithm converges quickly and yields good-quality solutions. Unfortunately, PBC finds only one bicluster at a time. I propose an improved PBC algorithm based on data resampling, specifically bootstrap aggregation, and genetic algorithms. In this way I can find all the possible biclusters together, including overlapping solutions. I apply the algorithm to synthetic data and to the Yeast dataset and compare it with the original PBC method. [edited by the author] [en_US]
dc.language.iso: en [en_US]
dc.subject: Biclustering [en_US]
dc.title: Multi-biclustering solutions for classification and prediction problems [en_US]
dc.type: Doctoral Thesis [en_US]
dc.subject.miur: MAT/08 ANALISI NUMERICA [en_US]
dc.contributor.coordinatore: Longobardi, Patrizia [en_US]
dc.description.ciclo: IX n.s. [en_US]
dc.contributor.tutor: Paternoster, Beatrice [en_US]
dc.identifier.Dipartimento: Matematica [en_US]