A genome sequence is a long string written in a four-letter code: 3 billion letters in the case of the human genome. But what does it mean, and how is the code deciphered? Traditionally this is left to professional annotators, who use information from a number of sources (for example, information about similar genes in other organisms) to work out where a gene starts, where it stops, and what it does.
Even the “gold standard” of professional annotation is a remarkably slow process. The Public Library of Science (PLoS) is harnessing the power of the web to increase access to information and to facilitate discussion and the understanding of science. PLoS Biology is publishing information on an independent project pushing towards the same goals.
Andrew Su, John Huss III and colleagues describe their efforts to build a ‘Gene Wiki’: an online repository of information on human genes, stored within Wikipedia. There is a lot of potential information about any given gene: its name, its sequence, its position on a chromosome, the protein(s) it encodes, and the other gene(s) it interacts with, for example.
Presenting this information is known as ‘gene annotation.’ As the information may come from many different researchers working independently, it is vital that resources exist to gather the data together.
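To make the idea of gathering annotation concrete, here is a minimal sketch in Python of how the kinds of facts listed above might be held in one record, and how partial records from independent sources could be pooled. The `GeneAnnotation` class, its field names and the `merge` function are hypothetical illustrations, not part of the Gene Wiki or of any existing database, and the values are placeholders rather than real annotations.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class GeneAnnotation:
    """Hypothetical record for the facts an annotation might hold."""
    name: str
    sequence: Optional[str] = None
    chromosome: Optional[str] = None
    proteins: set = field(default_factory=set)
    interacts_with: set = field(default_factory=set)


def merge(records):
    """Combine partial annotations of one gene from independent sources."""
    names = {r.name for r in records}
    if len(names) != 1:
        raise ValueError("all records must describe the same gene")
    merged = GeneAnnotation(name=names.pop())
    for r in records:
        # Single-valued fields: keep the first non-empty value seen.
        merged.sequence = merged.sequence or r.sequence
        merged.chromosome = merged.chromosome or r.chromosome
        # Multi-valued fields: pool the contributions of every source.
        merged.proteins |= r.proteins
        merged.interacts_with |= r.interacts_with
    return merged


# Placeholder data, not real annotations.
lab_a = GeneAnnotation(name="GENE1", chromosome="7", proteins={"PROT1"})
lab_b = GeneAnnotation(name="GENE1", sequence="ATGC", interacts_with={"GENE2"})
combined = merge([lab_a, lab_b])
print(combined)
```

In this sketch a conflict between sources is settled by simply keeping the first value seen; a curated repository would need an explicit policy for such disagreements.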
Existing annotation repositories include Gene Portals and Model Organism Databases. However, the information stored in these is intended to be authoritative, which requires regular updates by designated experts and a formal presentation of the information.
The authors are confident that their stubs will seed the posting of more detailed information by scientists who come across them on Wikipedia, and they report that, so far, they seem to be succeeding: the total number of edits on mammalian gene pages has doubled.