2.2.1.10. Meta-analysis of microarray experiments
Methods of meta-analysis of microarrays are implemented as (Table 2):
- R-packages or R-scripts
- executable files
- WinBUGS code
R-software
Most of the methods are implemented in R, an open-source environment for statistical computing, described in chapter 3 (Software).
Executable files
Executable files are available for VennMapper and for the procedure that implements GSFLD.
VennMapper
First, the input data file must be loaded. Venn Mapper accepts only tab-delimited text files in the following format. The first row must contain the column identifiers. The first field describes the gene identifier used and can be any alphanumeric value; this field is not used in the application. The other fields contain the log2 ratios for a particular gene in a particular experiment. The data can come from different microarray platforms, but they have to be merged into a single file in the format described above. The program can also handle missing values.
The user then sets the fold difference that is considered biologically important. This fold difference serves as the cutoff for identifying significant genes in each experiment. Once all required inputs have been entered, the user clicks the GO! button, and the program reports its progress while it runs.
When the program has finished and been closed, three output files are created in the same folder as the input file: fact_x.x_numbers.txt, fact_x.x_zvalue.txt and fact_x.x_genes.txt, where x.x is replaced by the fold difference entered. These files contain the number of common genes, the corresponding Z-statistic and the names of the genes, respectively. Output files that already exist are overwritten. Importantly, if any of these files is open in another program, e.g. Microsoft Excel, Venn Mapper will not be able to write to it.
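The input layout and the role of the fold-difference cutoff can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Venn Mapper's own code: the example data, the column names (`expA`, `expB`, `expC`) and the function names are hypothetical, a gene is treated as significant when its absolute log2 ratio reaches log2 of the fold difference (without distinguishing up- from down-regulation), and the Z-statistic that Venn Mapper reports is not reproduced.

```python
import csv
import io
import math
from itertools import combinations

# Hypothetical example data in the layout described above: a header row,
# then one row per gene with log2 ratios; an empty field is a missing value.
EXAMPLE = (
    "GeneID\texpA\texpB\texpC\n"
    "g1\t1.5\t1.2\t-0.1\n"
    "g2\t-1.3\t\t-1.1\n"
    "g3\t0.2\t1.4\t1.6\n"
    "g4\t2.0\t1.1\t1.3\n"
)


def read_log2_ratios(handle):
    """Return {experiment: {gene: log2 ratio}} from a tab-delimited file."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    experiments = header[1:]  # the first header field only labels the gene ID
    data = {name: {} for name in experiments}
    for row in reader:
        if not row:
            continue
        gene = row[0]
        for name, field in zip(experiments, row[1:]):
            if field.strip():  # skip missing values
                data[name][gene] = float(field)
    return data


def significant_genes(ratios, fold):
    """Genes whose absolute log2 ratio reaches log2(fold) (an assumption)."""
    cutoff = math.log2(fold)
    return {gene for gene, ratio in ratios.items() if abs(ratio) >= cutoff}


def common_genes(data, fold):
    """Sorted list of shared significant genes for every pair of experiments."""
    selected = {name: significant_genes(r, fold) for name, r in data.items()}
    return {
        (a, b): sorted(selected[a] & selected[b])
        for a, b in combinations(sorted(selected), 2)
    }


overlaps = common_genes(read_log2_ratios(io.StringIO(EXAMPLE)), fold=2.0)
for pair, genes in overlaps.items():
    print(pair, len(genes), genes)
```

With a real data set, `io.StringIO(EXAMPLE)` would be replaced by an open file handle to the combined tab-delimited file.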
Procedure GSFLD
The GSFLD procedure loads the input data and the input parameters (the names of the input and output files) from text files and writes the output file to the same folder. The input file contains the number of samples, the number of genes and the transformed expression profiles: rows represent genes and the group identification, columns represent samples, and one column holds the gene names. The output file contains the number of genes, the number of classification errors and the gene identifiers. The data are transformed with the program GeneSpring (v6.1).
WinBUGS software
Bayesian models use the WinBUGS software to estimate model parameters (Lunn et al., 2000). WinBUGS implements Markov chain Monte Carlo (MCMC) estimation with Gibbs sampling. It is available at www.mrc-bsu.cam.ac.uk/bugs/welcome.shtml. The model definition, input data and initial parameter values are loaded from text files; the rest of the analysis is carried out through a graphical interface.
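To give an idea of what Gibbs sampling does, the following Python sketch draws from the posterior of a very simple model: observations that are normally distributed with unknown mean `mu` and precision `tau`. It is not WinBUGS code and not one of the meta-analysis models; the data, the priors (a vague normal prior on `mu`, a gamma prior on `tau`) and all names are assumptions chosen for illustration. Each iteration samples one parameter from its full conditional distribution given the current value of the other.

```python
import random


def gibbs_normal(y, n_iter=3000, burn_in=1000, seed=1,
                 mu0=0.0, tau0=1e-6, a=0.01, b=0.01):
    """Gibbs sampler for y[i] ~ Normal(mu, 1/tau).

    Illustrative priors (assumptions): mu ~ Normal(mu0, 1/tau0) and
    tau ~ Gamma(shape=a, rate=b).
    """
    rng = random.Random(seed)
    n = len(y)
    total = sum(y)
    mu, tau = total / n, 1.0  # initial values of the parameters
    draws = []
    for it in range(n_iter):
        # mu | tau, y is normal with precision tau0 + n * tau
        precision = tau0 + n * tau
        mean = (tau0 * mu0 + tau * total) / precision
        mu = rng.gauss(mean, precision ** -0.5)
        # tau | mu, y is gamma with shape a + n/2 and rate b + SS/2
        ss = sum((v - mu) ** 2 for v in y)
        rate = b + 0.5 * ss
        tau = rng.gammavariate(a + 0.5 * n, 1.0 / rate)  # 2nd argument is a scale
        if it >= burn_in:  # discard the burn-in iterations
            draws.append((mu, tau))
    return draws


y = [4.8, 5.1, 5.3, 4.9, 5.2, 5.0, 4.7, 5.4]  # hypothetical observations
draws = gibbs_normal(y)
posterior_mean_mu = sum(m for m, _ in draws) / len(draws)
print(round(posterior_mean_mu, 2))
```

The retained draws approximate the joint posterior of `mu` and `tau`; summaries such as the posterior mean are computed from them, in the same way that WinBUGS summarises its monitored parameters after sampling.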
Table 2. Implementation of methods of meta-analysis of microarrays
| Method | Implementation | Availability |
| --- | --- | --- |
| SOGL | R-package OrderedList | |
| Meta-profiling | .exe | Unknown |
| VennMapper | .exe | |
| MAP-matches | R scripts | On request |
| Fisher’s method of inverse chi-square | Unknown | Unknown |
| Effect size modeling | R-package GeneMeta | |
| LASSO | R-package lasso2 | |
| GSRF | Unknown | Unknown |
| GSFLD | .exe | http://www.biomedcentral.com/content/supplementary/1471-2105-5-81-S4.zip |
| TSP-classifier | Unknown | Unknown |
| Bayesian models | WinBUGS | |
| Estimates of FDR | R script | Unavailable |
| Two-stage ANOVA | Unknown | Unknown |
| Z-statistic | R-package metaArray | |
| Latent variable method | R-package metaArray | |