Genomics could help cure cancer
You were a professor of computer science at the University of Arizona, USA. How did you move to the field of genomics and bioinformatics?
In 1994, I read about Craig Venter (acknowledged as the guru of the shotgun sequencing approach) and his team at The Institute for Genomic Research (TIGR) mapping the genome of Haemophilus influenzae, a small 1.8 million base pair genome, using the 'whole genome shotgun DNA sequencing technique'. I got interested and wondered why the same technology couldn't be used for sequencing the much larger human genome.
A colleague and I approached the US-based, publicly funded Human Genome Project to sponsor the sequencing of the human genome using the shotgun technique. But it refused to back the project on the grounds that, among other things, it would be impossible to compute the large volumes of data generated in the process.
Later, I left the university to join Venter at Celera Genomics and began work on the Drosophila melanogaster (common fruit fly) genome. Once the fruit fly genome was assembled, it proved irrefutably that the mapping of the human genome was also possible using the same technique.
We went on to assemble the human genome in a record time of just nine months.
How is the shotgun sequencing technique different from the HGP technique?
In the shotgun technique, randomly sampled readings or fragments of the genome are reconstructed in the proper order to decipher the entire genome. The process is analogous to shredding a magazine to pieces, picking up 800 letters at a time from the shreds, 40 million times, and then stitching the magazine back together. The presence of repetitive DNA, found nearly a million times throughout the genome, quite like the presence of a million copies of the same letter in the shredded magazine, makes the process difficult.
On the other hand, the HGP method involves breaking down the human genome into larger fragments whose place in the genome is known, deciphering them and then rebuilding the genome by physical mapping.
The HGP method required a mammoth human effort: more than 1,000 people worked to decipher the human genome. At Celera, in contrast, we were able to map the genome with a staff of just a hundred in comparatively little time.
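The shredded-magazine analogy in this answer can be illustrated with a toy sketch in Python. It is not Celera's assembler, whose internals the interview does not describe; it is a simple greedy overlap merge, and the function names (`shred`, `overlap`, `greedy_assemble`) and the `min_len` threshold are assumptions made for illustration. To keep the example reproducible, the reads are taken as evenly spaced overlapping windows and then shuffled, rather than sampled at random as in real shotgun sequencing.

```python
import random


def shred(text, read_len, step):
    """Cut text into overlapping reads (evenly spaced here, not random)."""
    return [text[i:i + read_len] for i in range(0, len(text) - read_len + 1, step)]


def overlap(a, b, min_len):
    """Longest suffix of a that equals a prefix of b, or 0 if below min_len."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def drop_contained(reads):
    """Remove duplicate reads and reads that lie wholly inside another read."""
    unique = list(dict.fromkeys(reads))
    return [r for r in unique if not any(r != s and r in s for s in unique)]


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the two reads with the largest suffix-prefix overlap."""
    reads = drop_contained(reads)
    while len(reads) > 1:
        best_k, best_i, best_j = 0, -1, -1
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # nothing overlaps any more: leftovers stay separate pieces
        merged = reads[best_i] + reads[best_j][best_k:]
        rest = [r for idx, r in enumerate(reads) if idx not in (best_i, best_j)]
        reads = drop_contained(rest + [merged])
    return reads


if __name__ == "__main__":
    magazine = "abcdefghijklmnopqrstuvwxyz"  # toy text with no repeated letters
    reads = shred(magazine, read_len=8, step=3)
    random.Random(0).shuffle(reads)  # the assembler must not rely on read order
    print(greedy_assemble(reads, min_len=3))
```

The toy text contains no repeated letters, so every overlap is unambiguous. Once the text contains repeats longer than `min_len`, a read can overlap several unrelated reads equally well and a greedy merge can join the wrong pieces, which is the difficulty with repetitive DNA mentioned above.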
What is your specific task in the team at Celera?
The computations require specific software, which is not available off the shelf. Granger Sutton (earlier at TIGR and an expert in algorithms, now a colleague at Celera) and I were responsible for building this assembler software for sequencing the human genome. In fact, the assemblers developed now handle problems 1,000 times larger than can be handled by other existing techniques.
Any interesting findings from the human genome?
For one, there are fewer genes than we earlier believed. Earlier the estimates ranged from 60,000 to 150,000 genes. We have now found 17,764 genes and estimate a maximum of about 39,000 genes. We are genetically not as complex when compared with other species as we once believed. This means our complexity comes from, among other things, a complicated regulation of the way genes express themselves. Humans have more regulatory proteins than does the fruit fly. The genes get transcribed by the proteins which turn them on or off. That creates all the variety.
What has the recent mapping of the mouse genome added to our knowledge?
For one, it has proved beyond doubt that Celera's whole genome assembly process works. It will help to convince our remaining sceptics about the nature of the work we are doing. Genomes of the mouse and other distantly related species will assist us in accurately identifying the genes and the regulatory signals in the human genome, which in turn will herald a new era in this kind of research.
But why has Celera decided to sell it and not give it away free?
Celera does not take money from the public; it's a private company. It has to sell its products. We did give the human genome sequence to the public for nothing.
Will the genome not be out of the reach of many?
No, I don't believe so. In fact, the genome is priced at much less than other similar products in the market.
What about the price for developing nations, such as India?
Right now, the price may be steep for some centres of research in developing nations. But there is so much competition today. Others will come out with similar data, and with time the utility of the data will decline. In time, the same data that is costly today will be cheap enough for all to purchase, and it will be easily available in the public domain. That is the way markets work these days.
Don't you find it unethical?
It is not a matter of ethics. It is about whether you believe in capitalism or not. I do have a personal belief on the subject though.
What is your personal belief?
I will not share it here.
What about genetic modification of organisms? Do you think it should be allowed?
Genetic modification is not creating life but only engineering it. Scientists have only the ability to manipulate genes, not to create life. Moreover, engineering crops and animals could be one way of securing a sustainable world. I do hold a personal opinion on the issue, but I wouldn't want to speak about it. It's for the policymakers to discuss and decide upon.
As an expert scientist, shouldn't you be the one commenting on the issue?
I don't want to talk about it to the popular press. If the US Congress holds an enquiry and I am asked to appear before it, I'll certainly speak my mind.
What other genomes is Celera looking to decipher?
The other animals we are looking at are the horse and fish. We are certainly not thinking of the chimpanzee, as it is genetically similar to humans. Its genome is only about 1.3 per cent different from ours.
What are the other future areas of research in genomics?
To date, we have only sequenced the genes. The next immediate goal is to develop a nearly perfect annotation of the genes of the human genome and their regulatory signals, and to capture as many functions as possible. We have, till now, focussed on collecting data. The data is now to be analysed thoroughly. With time we will get to understand how the genes work. In the next 50 years diagnostics will emerge. Diseases like cancer could be detected earlier and therefore cured. Then, the next step would be the emergence of cures for some diseases like diabetes. Also, I think we know little about the evolution of species. A lot is to be learnt about it from genomics. It is personally very intriguing. We need to know how we evolved and how speciation took place at the genetic level.
And do you think India should forge its own technology in genomics?
Yes. As bioinformatics research will be moulded to cater to local needs, India too should carry out research independently. To study the diseases found commonly in the country, research should be carried out by individuals and research centres across India. This will demand that the country have its own technology base. India has tremendous human resources that can be used for data integration, which is intensive work.
In fact, India can take the lead in mapping relatively smaller genomes, like those of bacteria with two to five million base pairs. Indian researchers can also use basic techniques like microarray technology, which is relatively inexpensive.