E-learning in analysis of genomic and proteomic data 1. Introduction 1.1 Recent challenges in modern genomic and proteomic analysis 1.1.2. Molecular biology techniques for genome and proteome analysis

The aim of molecular biology is to study the genome, transcriptome and proteome, and many laboratory techniques have been invented for this purpose. In general, these techniques can be divided into two major groups: conventional methods and high-throughput methods. Conventional methods are considered more precise, but they can analyze only a restricted number of molecules in one experiment. High-throughput methods, in contrast, can analyze thousands of molecules in one experiment (in one sample), but at the cost of relatively decreased accuracy. The standard procedure is therefore to use high-throughput methods as a scan for potential markers and to confirm a limited number of selected candidates by conventional methods.

PCR (polymerase chain reaction) and its variants (RT-PCR, real-time PCR, quantitative PCR…), FISH (fluorescence in-situ hybridization) and its variants (M-FISH, SKY…), CGH (comparative genomic hybridization), Western blot, Northern blot, Southern blot and gel electrophoresis are only some of the most basic methods that can be considered conventional. Almost every high-throughput method was developed on the basis of one of these methods: microarrays use the hybridization concept of Southern blotting, arrayCGH can be viewed as a microarray upgrade of CGH, 2-D gel electrophoresis is an upgrade of simple gel electrophoresis, and so on. As there is a lot of good literature describing conventional and high-density molecular biology techniques, we give only a brief description of some of them in the following.

 

1.1.2.2. High-density methods

1.1.2.2.1. Microarrays

Before we introduce the concept of microarrays, we define two terms that are used throughout the text. The probe is a known molecule designed to react with a specific molecule under investigation, and the target is the unknown molecule that is to be identified.

A microarray (a chip) is a high-throughput technology that allows simultaneous comparison of biomolecules (nucleic acids, proteins, organic compounds or even tissues) between different samples of interest. The main principle is the capture of labeled molecules of interest on a solid surface, in thousands of small areas called spots that are regularly ordered into rows and columns (an array).

Each spot contains a small amount of a so-called “probe”, a molecule designed to react with a specific molecule under investigation. The probes react with the corresponding labeled molecules of interest, forming complexes that bind the molecules of interest to the surface. These complexes are then detected by dedicated scanners, which capture the intensity of the signal; the intensity is later converted into a numeric value. The assumption is that the intensity of the signal on each spot corresponds to the amount of the molecule of interest bound to this spot.
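To make the last assumption concrete, the following minimal sketch compares two samples spot by spot, under the assumption that each sample has already been reduced to a grid of numeric spot intensities. The function name `log2_ratios`, the `floor` parameter and all intensity values are hypothetical and serve only as an illustration; real pipelines usually add background correction and normalization first.

```python
import math


def log2_ratios(sample_a, sample_b, floor=1.0):
    """Return per-spot log2(sample_a / sample_b) for two intensity grids.

    Both arguments are lists of rows, mirroring the row/column layout
    of the array. Intensities below `floor` are raised to `floor` so
    that the logarithm is always defined (an assumption of this sketch).
    """
    if len(sample_a) != len(sample_b):
        raise ValueError("grids must have the same number of rows")
    ratios = []
    for row_a, row_b in zip(sample_a, sample_b):
        if len(row_a) != len(row_b):
            raise ValueError("rows must have the same number of spots")
        ratios.append([
            math.log2(max(a, floor) / max(b, floor))
            for a, b in zip(row_a, row_b)
        ])
    return ratios


# Made-up intensities for a 2 x 3 array hybridized with two samples.
treated = [[800.0, 150.0, 1200.0], [400.0, 50.0, 640.0]]
control = [[400.0, 150.0, 300.0], [400.0, 200.0, 640.0]]

ratios = log2_ratios(treated, control)
for row in ratios:
    print(row)
```

A positive value indicates a stronger signal in the first sample, a negative value a stronger signal in the second, and zero an equal signal on that spot.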


Based on the type of biological material to be analysed, we distinguish different types of microarrays:
 

  1. DNA microarray - used for analysis of the structure of genes and their expression
  2. Protein microarray - used for the detection of proteins
  3. Tissue microarray - enables simultaneous analysis of tumour samples
  4. Transfection microarray (cell-based microarray) - a chip for analysis of gene function
  5. Chemical compound microarray - a collection of organic compounds used for detection of proteins binding to a specific chemical compound

Microarray analysis has become one of the most widely applied methods of molecular biology. In comparison with conventional techniques, it allows simultaneous comparison of thousands of molecules (genes, proteins…) in one experiment. It is a robust method, routinely used in both biological and medical research.
Microarrays have become an important tool in diagnostics, epidemiology and classification of diseases. In phylogenetic analyses they are a useful tool for genome analysis of species and allow the creation of phylogenetic trees. Because of this wide range of possibilities, microarrays are coming into use in almost all biological and medical domains.

1.1.2.2.2. Two dimensional (2-D) gel electrophoresis

Two-dimensional gel electrophoresis (2-DE) is a technology commonly used in molecular biology to analyze the abundance of proteins in a sample. It is based on classic electrophoresis, which is performed separately in two different directions. The aim of 2-DE is to separate the proteins across the gel according to their properties.

Principle
1. First, the proteins are extracted from the tissue of interest. One of the most important aspects of the experiment is the choice of the gel, which has to be porous enough to allow the proteins to move. Agarose and polyacrylamide gels are the most commonly used.
2. The sample is placed on the gel and, in the first direction, the proteins migrate along a pH gradient until each reaches its isoelectric point, the point where the overall charge on the protein is zero. The proteins are then made to migrate in the second, perpendicular direction, where they are separated according to their molecular weight.
3. Next, the gel is stained to detect the individual spots, which correspond to the sample proteins.
4. Once the spots are stained, an image of the gel is captured. The pixel intensity of a spot should correspond to the amount of protein in the spot, and dedicated image-analysis software is applied to quantify the pixel intensities of each spot. The spot pixel intensities, together with the spot coordinates and different quality measures of the spot, are stored in the raw data file.
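The quantification in the last step can be sketched as follows. This is a simplified illustration, not the method of any particular image-analysis package: it assumes a rectangular spot region, estimates the background as the median of all pixels outside that region, and uses the fraction of saturated pixels as an example quality measure. The function `quantify_spot`, the record keys and the pixel values are hypothetical.

```python
from statistics import median


def quantify_spot(image, top, left, height, width, saturation=255):
    """Return a raw-data record for one rectangular spot region.

    `image` is a list of pixel rows. The background is estimated as the
    median of all pixels outside the spot rectangle (a simplification).
    """
    inside, outside = [], []
    for r, row in enumerate(image):
        for c, value in enumerate(row):
            if top <= r < top + height and left <= c < left + width:
                inside.append(value)
            else:
                outside.append(value)
    if not inside:
        raise ValueError("spot rectangle lies outside the image")
    background = median(outside) if outside else 0
    # Spot volume: background-corrected sum of pixel intensities.
    volume = sum(value - background for value in inside)
    # Quality measure: share of pixels at or above the saturation level.
    saturated = sum(1 for value in inside if value >= saturation) / len(inside)
    return {
        "row": top,
        "col": left,
        "volume": volume,
        "background": background,
        "saturated_fraction": saturated,
    }


# Made-up 4 x 4 grayscale image with one 2 x 2 spot in the middle.
image = [
    [10, 10, 10, 10],
    [10, 90, 110, 10],
    [10, 100, 255, 10],
    [10, 10, 10, 10],
]

record = quantify_spot(image, top=1, left=1, height=2, width=2)
print(record)
```

A record of this kind, one per spot, is the sort of information that a raw data file would hold: coordinates, a quantified intensity and quality measures.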

A special type of 2-DE is 2-D Fluorescence Difference Gel Electrophoresis (DIGE). In this experiment, proteins are first labeled with a fluorescent dye. This makes it possible to run proteins from different samples on the same gel, distinguished by different fluorescent dyes. For instance, we can study three different samples on one gel, where one corresponds to the control group and two correspond to different treatment groups.

1.1.2.2.3. Mass spectrometry

Mass spectrometry is an analytical tool used for measuring the molecular mass of a sample. In proteomics it is used for the analysis of proteins, peptides and oligonucleotides. This technology can be considered high-density, as it produces information about hundreds to thousands of peptides for each sample.

A mass spectrometer does not actually measure the molecular mass directly, but rather the mass-to-charge ratio of the ions formed from the molecules.
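The relationship between molecular mass and the measured mass-to-charge ratio can be illustrated with a short sketch. It assumes that ions are formed by adding protons in positive-ion mode, which is only one of several possible ionization outcomes; the function names and the peptide mass are hypothetical.

```python
PROTON_MASS = 1.007276  # approximate mass of a proton in daltons


def mz_protonated(neutral_mass, charge):
    """m/z of a molecule carrying `charge` extra protons, [M + zH]z+."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass + charge * PROTON_MASS) / charge


def neutral_mass_from_mz(mz, charge):
    """Recover the neutral mass from an observed m/z and a known charge."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return charge * (mz - PROTON_MASS)


# Made-up peptide of 1500 Da observed in three charge states.
peptide_mass = 1500.0
series = {z: mz_protonated(peptide_mass, z) for z in (1, 2, 3)}
for z, mz in series.items():
    print(f"z = {z}: m/z = {mz:.4f}")
```

The same molecule therefore appears at different m/z values depending on its charge state, and the charge has to be known or inferred before the molecular mass can be calculated.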


1.1.2.2.4. Sequencing methods

Next generation sequencing

  • RNAseq

Further reading on RNAseq:
A review article on RNAseq:
http://papers.gersteinlab.org/e-print/rnaseq/preprint.pdf

Combining RNAseq and ChIPseq:

http://themindwobbles.wordpress.com/2008/09/01/transcriptome-input-and-output-combined-rnaseq-and-chipseq/

  • ChIPseq

Comparison of next generation sequencing technologies for transcriptome characterization
http://www.biomedcentral.com/content/pdf/1471-2164-10-347.pdf