The draft sequences of the human genome were the most remarkable achievements of the scientific community at the beginning of the third millennium. Ten years of unprecedented research by 16 leading molecular biology centers throughout the world, brought together by the International Human Genome Project (HGP) under the leadership of Francis S. Collins on one side, and by a private company, Celera Genomics, directed by Craig Venter, as their competitor on the other, have come to a logical end reached by compromise, which Francis Collins called “The End of the Beginning”.
Key words: Developmental cytogenetics, functional genomics, genome sequencing
Taking into account the profound contribution that the information already collected from genome sequences has made to the biological sciences and medical practice, it is worthwhile to devote this review to some of the most crucial questions in this still very active post-genomic field. Briefly, these basic questions are as follows:
1) What really does stand behind the “draft sequence” of the human genome?
2) Where are we now in actually understanding the human genome?
3) What are the new scientific successors to the human genome project?
4) What are the major impacts of the human genome project on medicine?
5) What are new options for developmental cytogenetics in the post-genomic era?
These and other questions dealing with the human genome are outlined more thoroughly in numerous recent publications. Of special importance is the detailed review of the HGP prepared by the International Human Genome Sequencing Consortium and published in Nature in 2001. This particular article is a milestone of the new era of post-genomic science.
1) What Really Does Stand Behind the “Draft Sequence” of the Human Genome?
Sequencing the genome means establishing the exact position of each of the four nucleotides (adenine, guanine, cytosine, and thymine) along the gigantic DNA molecule, which is 1.5-2 m in length and has a total nucleotide content of ~3 x 10^9 bp. The sequence is now 90% complete for the euchromatic regions of all chromosomes. By the middle of 2000, 20% of the human genome had been sequenced 10 or more times (estimated error rate of 1 per 1 x 10^6 bases), while 70% had been sequenced with much less precision (1 error per 1 x 10^3 bases), and ~10% had not been sequenced at all. According to more recent data, 91% of the human genome euchromatin has already been sequenced with a precision of 1 error per 1 x 10^4 bases. There is no doubt that by the year 2003, the 50th anniversary of the discovery of the DNA double helix by Watson and Crick, all of the human genome will have been sequenced with high precision. So far, the most precise sequences have been obtained for only two small human chromosomes, 21 and 22.
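The mid-2000 figures above can be turned into a rough count of expected base errors. The following is a minimal sketch, assuming a genome of exactly 3 x 10^9 bp and treating each quoted precision as a uniform error rate over its share of the genome; the class labels and the function name `expected_errors` are illustrative only and do not come from the HGP.

```python
# Rough estimate of miscalled bases implied by the mid-2000 coverage figures.
GENOME_BP = 3_000_000_000  # ~3 x 10^9 bp, as quoted in the text

# (percent of genome, bases per one expected error); None marks unsequenced DNA.
# The class labels are illustrative names, not terms used by the HGP.
COVERAGE_CLASSES = {
    "sequenced 10 or more times": (20, 1_000_000),
    "lower-precision draft": (70, 1_000),
    "not sequenced": (10, None),
}


def expected_errors(percent, bases_per_error, genome_bp=GENOME_BP):
    """Return the expected number of erroneous bases, or None if unsequenced."""
    if bases_per_error is None:
        return None
    region_bp = genome_bp * percent // 100  # size of this class in base pairs
    return region_bp // bases_per_error


summary = {
    label: expected_errors(percent, rate)
    for label, (percent, rate) in COVERAGE_CLASSES.items()
}

for label, errors in summary.items():
    print(label, "->", "n/a" if errors is None else f"{errors:,} expected errors")
```

Under these assumptions, the deeply sequenced fifth of the genome contributes about 600 expected errors, whereas the lower-precision 70% contributes about 2.1 million, which illustrates why the 2000 sequence was called a draft.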
Where Are We Now in Actually Understanding the Human Genome?
It has already been found that only 1.1-1.4% of the total DNA sequence actually encodes proteins. This comprises only 5% of the 28% of the DNA sequence that is transcribed into RNA. About 43-50% of total DNA consists of repeated sequences of various types, with 45% represented by four classes of so-called “parasitic DNA elements”, 3% by repeats of just a few bases, and 5% by duplications of large DNA segments.
The total number of genes in the human genome is estimated to be close to 31,000 (far fewer than previously expected), of which 22,000 genes were identified by the HGP Consortium and 26,000 were found (probably by expressed sequence tag (EST) techniques) by Celera Genomics.
According to other information, the human genome comprises 32,000 genes altogether (39,000 according to Celera Genomics), with 15,000 of these already known to some extent and 17,000 predicted. According to the Online Mendelian Inheritance in Man Catalogue (OMIM), about 10,000 human genes have already been mapped to individual chromosomes. For comparison with the human genome, about 6,000 genes have been identified in the yeast genome, 13,000 in the famous fruit fly, 18,000 in the worm C. elegans, and 26,000 in the plant Arabidopsis thaliana.
Thus, the human genome evolution cannot be attributed directly to the increase in gene number, but more probably to more complex intragenic variations. The differences between human, worm and fruit fly might be attributed mostly to their protein complexity (e.g., number of domains per protein, novel combinations of domains, etc.).
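The argument that gene number alone does not explain human complexity can be checked with simple ratios. This is a minimal sketch using the approximate gene counts quoted in this review (taking the 31,000 estimate for human); the dictionary and the function name `human_ratio` are illustrative assumptions.

```python
# Gene counts as quoted in this review (approximate estimates).
GENE_COUNTS = {
    "human": 31_000,
    "yeast": 6_000,
    "fruit fly": 13_000,
    "worm (C. elegans)": 18_000,
    "Arabidopsis thaliana": 26_000,
}


def human_ratio(counts, reference="human"):
    """Return reference gene count divided by each other species' count."""
    ref = counts[reference]
    return {
        species: ref / n
        for species, n in counts.items()
        if species != reference
    }


ratios = human_ratio(GENE_COUNTS)

# Print from the smallest ratio (most similar gene count) to the largest.
for species, ratio in sorted(ratios.items(), key=lambda item: item[1]):
    print(f"human / {species}: {ratio:.2f}x")
```

With these figures, the human gene count is less than twice that of the worm and only slightly above that of Arabidopsis, which is the point made in the paragraph above.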
Of special scientific and practical value is the discovery that 99.9% of the human genome sequenced so far is identical between individuals; the remaining 0.1% of basic intergenomic variation is represented by single nucleotide polymorphisms (SNPs). An SNP is encountered every 1,000-2,000 bp, which makes a total count of about 3.2 million. One-and-a-half million SNPs are located inside genes (meaningful SNPs) and 1.7 million outside genes (meaningless SNPs). Intense efforts have been made to develop a catalogue of human DNA variations. Special software programs and sophisticated new equipment enabling efficient screening of SNPs throughout the human genome have been proposed, and are already in use at the most advanced molecular centers.
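The SNP figures can be cross-checked against the genome size. The sketch below assumes a genome of exactly 3 x 10^9 bp; the function name `snp_count_range` and the variable names are illustrative only.

```python
GENOME_BP = 3_000_000_000  # ~3 x 10^9 bp, an approximation


def snp_count_range(genome_bp, min_spacing_bp, max_spacing_bp):
    """Return (low, high) SNP counts for one SNP every min..max base pairs."""
    return genome_bp // max_spacing_bp, genome_bp // min_spacing_bp


# One SNP every 1,000-2,000 bp, as stated in the text.
low, high = snp_count_range(GENOME_BP, 1_000, 2_000)

# Counts reported in the text.
SNPS_INSIDE_GENES = 1_500_000
SNPS_OUTSIDE_GENES = 1_700_000
reported_total = SNPS_INSIDE_GENES + SNPS_OUTSIDE_GENES

# 0.1% of the genome differing between individuals.
variable_bp = GENOME_BP // 1_000

print(f"expected from spacing: {low:,} to {high:,}")
print(f"reported total:        {reported_total:,}")
print(f"0.1% of the genome:    {variable_bp:,} bp")
```

The reported total of 3.2 million lies close to the upper end of the range implied by the spacing and to 0.1% of the genome, so the three figures agree within the precision of the approximate genome size.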
2) What are the New Scientific Successors to the Human Genome Project?
The HGP has already resulted in four main ramifications, including: i) Comparative Genomics; ii) Functional Genomics; and iii) Ethical, Legal and Social Implications (ELSI) [4,5].
i) Besides the “draft version” of the human genome, the genomes of over 600 species (mostly bacteria) have already been sequenced. These include yeast (1996), worm (1998) and fruit fly (1999). Sequencing of the mouse, rat, and zebrafish genomes is in progress, and sequencing of other vertebrate genomes (pig, dog, cow and chimpanzee) is being seriously considered. Comparative analysis of different genomes has great implications for studies of genome evolution as well as for gene identification. Comparison of sequences across different genomes is now considered the most powerful tool for identifying protein-coding DNA sequences. Next to Comparative Genomics is the Human Genome Diversity Project, directed mainly at ethnic, population, racial and inter-individual DNA variations. SNP analysis with powerful microchip technology opens new and very efficient avenues for these studies.
ii) Out of a total of 32,000 genes in the human genome, information about protein products is so far available for fewer than 6,000. The function of the remaining 26,000-27,000 gene products remains completely unknown. Analysis of the activity of gene products (RNA or proteins) constitutes a major goal of Functional Genomics (also called Proteomics). New technologies for studying the expression (transcription) of about 10,000 genes in one experiment have been developed. This makes it possible to investigate gene expression patterns (profiles) in different organs and tissues, and in normal or abnormal development at different stages of ontogenesis. The same large-scale analysis has already been applied not only to RNA but also to proteins: their structure, quantity, location, posttranslational modifications and protein-protein interaction patterns.
iii) The new “post-genomic era” is characterized by the penetration, dissemination and profound influence of genetics in different areas of social life. Examination of the ELSI of the HGP is an integral and very important component of the project. Crucial steps now concern effective federal legislation to outlaw the use of genetic data in the workplace and in obtaining health insurance. The ELSI program is responsible for elaborating a general strategy for providing adequate professional and public education in modern genetics, fair use of genetic information, and safe and efficient integration of genetic information into clinical practice.
Adaptation of different religious convictions to the novel data about the human genome and the feasible implication for human health benefits deserves especially careful attention.
3) What are the Major Impacts of the Human Genome Project on Medicine?
The major impact of the HGP on medical practice is largely the creation of Molecular Medicine: highly individualized medicine that uses DNA and its products (RNA, proteins) for the diagnosis, prevention and treatment of many disorders.
The other contributions of the HGP to modern medicine are also quite profound and include:
i) Elaboration of precise and efficient methods for the diagnosis and prediction and, in the near future, the treatment of inherited disorders.
ii) Identification of genes participating in the origin and manifestation of all monogenic and most common multi-factorial (multi-genic) diseases.
iii) Development of new, designed drugs based on genomic research.
iv) Predisposition gene studies and the birth of Predictive (Preventive) Medicine.
v) The creation of a solid scientific background for Pharmaco-genetics and Pharmaco-genomics.
vi) Elaboration of the scientific background for, and the first clinical trials in, gene therapy.
The forthcoming applications of the HGP to the major trends of Molecular Medicine in the near future should include:
a) Identification of genes and gene interactions within relevant gene nets for many common illnesses within the next 5-7 years.
b) Predictive genetic tests applicable to individuals with a family history of a particular disorder.
c) An individual DNA database (Gene-Pass), with special emphasis on its medical significance for pregnant women, people in special professions and sportsmen.
d) Genetic prediction of individuals at risk for diseases and of their responsiveness to drugs. Predictive genetic tests available for dozens of common diseases will enable everyone to learn more about their susceptibilities, and to reduce genetic risk by means of medical surveillance, lifestyle modification, diet or drug therapy. (This is expected to enter the medical mainstream in the next decade or so, but it deserves exhaustive study now.)
e) The pharmacogenomic approach to predicting drug responsiveness, which is becoming applicable to a number of drugs and disorders.
f) Widespread application of gene therapy approaches for the treatment of many common diseases, especially tumors.
In conclusion, it should be pointed out that the actual impact of the human genome sequence, even in its draft version, on human life and civilized mankind is difficult to overestimate. Although many positive consequences of this discovery are quite obvious, social attitudes to them are still very contradictory. Opposition to genetic medicine and new technology is steadily growing, and might become very serious in the future. Efforts to educate the public need to start now, so that more people are capable of understanding and explaining the potential benefits of molecular medicine. Substantially better understanding of modern genetics in the post-genomic era among a wide range of clinicians is an indispensable prerequisite for the wide implementation of genetic approaches in human life.
4) What Are New Options For Developmental Cytogenetics in the Post-Genomic Era?
There is little doubt that deciphering the genetic load of each chromosome provides new and powerful options for understanding the “philosophy” of chromosome organization, the reason why particular genes form specific clusters within chromosomes that remain stable throughout millions of years of evolution (so-called gene synteny), and the functional meaning of heterochromatin and of the whole chromosome as a discrete functional unit. Thus, thorough knowledge of the detailed nucleotide structure of chromosomes, accurate cytogenetic gene maps and detailed information on gene distribution within each chromosomal band provide a solid background for more comprehensive studies in the area of developmental cytogenetics. Early prenatal stages of human embryonic development provide ample opportunities for this sophisticated research. Besides precise knowledge of the human genome, there are other important advantages that favor prompt advancement in the developmental cytogenetics of humans. These include:
i) The ability to carry out detailed, complex analysis of a fetus bearing a chromosome imbalance before the death of the embryo in utero (detailed ultrasonography, tissue sampling for cytogenetic, molecular, biochemical analysis, etc.).
ii) Access to early artificially terminated embryos for detailed morphological studies.
iii) Knowledge of the precise genetic cargo of each chromosome or of any fragment that might facilitate comprehensive analysis of gene interactions and imbalance of transcription factors in embryonic malformation syndrome and prenatal death.
iv) Unique abilities to study gene expression profiles by means of sophisticated microchip and microarray techniques.
I am personally convinced that all earlier studies of developmental defects and fetal abnormalities in humans caused by chromosome imbalance should be thoroughly revised, or even disregarded altogether, because they were carried out on aborted fetuses with far advanced degenerative processes. Developmental cytogenetics is now gaining a powerful new impetus from both obstetric tissue-sampling techniques and genomics.
There are several urgent questions that should be addressed right now in the developmental cytogenetics of human beings:
a) What are the actual reasons for abnormal development and early death in human embryos with an imbalance of different chromosomes or their parts?