E-learning in analysis of genomic and proteomic data

2.2.1.10. Meta-analysis of microarray experiments

Methods of meta-analysis of microarrays are implemented as (Table 2):

  • R-packages or R-scripts
  • executable files
  • WinBUGS code

R-software

Most of the methods are implemented in R, an open-source environment for statistical computing, described in Chapter 3, Software.

Executable files

Executable files are available for VennMapper and for the procedure that implements GSFLD.

VennMapper

First, load the input data file. Venn Mapper accepts only tab-delimited text files in the following format. The first row must contain the column identifiers. The first field names the gene identifier used and can be any alphanumeric value; this field is not used by the application. The remaining fields contain the log2 ratios for a particular gene in a particular experiment. Data can come from different microarray platforms, but they must first be merged into a single combined file in the format described above. The program can also handle missing values. The user must set a fold difference that is considered biologically important; it serves as the cutoff for identifying significant genes in each experiment. When all required inputs have been entered, the user clicks the GO! button, and the program reports its progress. When the program has finished and been closed, three output files, fact_x.x_numbers.txt, fact_x.x_zvalue.txt and fact_x.x_genes.txt, are created in the same folder as the input file, where x.x is replaced by the chosen fold difference. These files contain the number of common genes, the corresponding Z-statistics and the names of the genes, respectively. Existing output files are overwritten. Importantly, if any of these files is open in another program, e.g. Microsoft Excel, Venn Mapper will not be able to write to it.
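The input layout and the role of the fold cutoff can be illustrated with a minimal Python sketch. It is not Venn Mapper's own code: the column names, the example values, the function name significant_genes and the rule |log2 ratio| >= log2(fold) are assumptions made for illustration, and the sketch ignores the direction of regulation and does not compute the Z-statistic.

```python
import csv
import io
import math

# Hypothetical combined input in the layout described above: a header row,
# a gene-identifier column, then one log2-ratio column per experiment.
# An empty field stands for a missing value.
COMBINED = (
    "gene_id\texp_A\texp_B\n"
    "g1\t1.6\t1.2\n"
    "g2\t-1.3\t0.2\n"
    "g3\t2.1\t\n"
    "g4\t1.1\t1.9\n"
    "g5\t0.3\t-1.4\n"
)


def significant_genes(text, fold):
    """Return {experiment: set of genes with |log2 ratio| >= log2(fold)}."""
    cutoff = math.log2(fold)  # assumed cutoff rule, e.g. fold 2 -> 1.0
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    header = next(reader)
    hits = {name: set() for name in header[1:]}
    for row in reader:
        gene = row[0]
        for name, value in zip(header[1:], row[1:]):
            if value.strip() == "":
                continue  # missing value: the gene is skipped for this experiment
            if abs(float(value)) >= cutoff:
                hits[name].add(gene)
    return hits


hits = significant_genes(COMBINED, fold=2.0)
common = sorted(hits["exp_A"] & hits["exp_B"])
print(len(common), common)
```

The size of the overlap and the list of shared genes correspond loosely to what the numbers and genes output files report for each pair of experiments.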

Procedure GSFLD

The GSFLD procedure loads the input data and the input parameters (names of the input and output files) from text files and writes the output file to the same folder. The input file contains the number of samples, the number of genes and the transformed expression profiles: rows represent genes and the group identification, columns represent samples, and one column holds the gene names. The output file reports the number of genes, the number of classification errors and the gene identifiers. The data are transformed in GeneSpring (v6.1) (Conway, 2003), and the procedures are written in C. The procedures are available as supplementary material to Jiang et al. (2004).

 

WinBUGS software

Bayesian models use the WinBUGS software to estimate model parameters (Lunn et al., 2000). WinBUGS implements Markov chain Monte Carlo (MCMC) estimation with Gibbs sampling. It is available at www.mrc-bsu.cam.ac.uk/bugs/welcome.shtml. The model definition, input data and initial values of the parameters are loaded from text files; the rest of the analysis is carried out through a graphical interface.
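To show what Gibbs sampling does in principle, the following minimal Python sketch draws from the posterior of a normal model with unknown mean and precision. It is plain Python, not WinBUGS model code, and the data, the vague priors (mu0, tau0, a, b) and the function name gibbs_normal are assumptions chosen only for illustration.

```python
import random
import statistics


def gibbs_normal(y, n_iter=5000, burn_in=1000, seed=1,
                 mu0=0.0, tau0=1e-3, a=1e-3, b=1e-3):
    """Gibbs sampler for y_i ~ Normal(mu, 1/tau) with assumed priors
    mu ~ Normal(mu0, 1/tau0) and tau ~ Gamma(shape=a, rate=b)."""
    rng = random.Random(seed)
    n = len(y)
    total = sum(y)
    mu, tau = statistics.fmean(y), 1.0  # initial values
    draws = []
    for i in range(n_iter):
        # Full conditional of mu given tau: normal distribution
        prec = tau0 + n * tau
        mean = (tau0 * mu0 + tau * total) / prec
        mu = rng.gauss(mean, prec ** -0.5)
        # Full conditional of tau given mu: gamma distribution
        shape = a + n / 2
        rate = b + 0.5 * sum((v - mu) ** 2 for v in y)
        tau = rng.gammavariate(shape, 1.0 / rate)  # second argument is a scale
        if i >= burn_in:
            draws.append((mu, tau))  # keep draws only after burn-in
    return draws


y = [0.8, 1.1, 1.4, 0.9, 1.3, 1.0, 1.2, 0.7]  # hypothetical data
draws = gibbs_normal(y)
post_mu = statistics.fmean(m for m, _ in draws)
print(round(post_mu, 2))
```

Each iteration updates one parameter at a time from its full conditional distribution, which is the defining step of Gibbs sampling; the initial values and the data play the same role as the text files that WinBUGS loads.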


Table 2. Implementation of methods of meta-analysis of microarrays

Method                                  Implementation          Availability

SOGL                                    R-package OrderedList   www.bioconductor.org
Meta-profiling                          .exe                    Unknown
VennMapper                              .exe                    http://www.gatcplatform.nl//vennmapper/index.php
MAP-matches                             R scripts               On request
Fisher’s method of inverse chi-square   Unknown                 Unknown
Effect size modeling                    R-package GeneMeta      www.bioconductor.org
LASSO                                   R-package lasso2        www.bioconductor.org
GSRF                                    Unknown                 Unknown
GSFLD                                   .exe                    http://www.biomedcentral.com/content/supplementary/1471-2105-5-81-S4.zip
TSP-classifier                          Unknown                 Unknown
Bayesian models                         WinBUGS                 http://www.mrc-bsu.cam.ac.uk/bugs/welcome.shtml
Estimates of FDR                        R script                Unavailable
Two-stage ANOVA                         Unknown                 Unknown
Z-statistic                             R-package metaArray     www.bioconductor.org
Latent variable method                  R-package metaArray     www.bioconductor.org
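
Table 2 lists no known implementation for Fisher's inverse chi-square method, but the calculation itself is short. The following Python sketch assumes one independent, non-zero p-value per study for a given gene; the function name fisher_combined_p and the example values are hypothetical and are not taken from any of the tools listed above.

```python
import math


def fisher_combined_p(p_values):
    """Combine independent p-values with Fisher's inverse chi-square method.

    Returns the statistic X = -2 * sum(ln p_i) and its p-value under a
    chi-square distribution with 2k degrees of freedom (k = number of studies).
    """
    k = len(p_values)
    stat = -2.0 * sum(math.log(p) for p in p_values)  # requires every p > 0
    half = stat / 2.0
    # For an even number of degrees of freedom the chi-square upper tail has
    # a closed form: exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    term, tail = 1.0, 0.0
    for i in range(k):
        if i > 0:
            term *= half / i
        tail += term
    return stat, math.exp(-half) * tail


stat, p_combined = fisher_combined_p([0.05, 0.05])
print(round(stat, 3), round(p_combined, 4))
```

With a single study the combined p-value equals the input p-value, which is a convenient sanity check of the closed-form tail.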