A researcher at the Johns Hopkins
Institute of Genetic Medicine has led the effort to compile
the largest free resource of experimental information about
human proteins to date. Reporting in the
February issue of Nature Biotechnology, the research
team describes how all researchers around the
world can access this data and speed their own research.
"Advances in technology have made data generation much
easier, but processing it and
interpreting observations are now the major hurdles in
science," said Akhilesh Pandey, associate professor of
biological chemistry, pathology and
oncology and a member of the McKusick-Nathans
Institute of Genetic Medicine at Johns Hopkins.
"We've created a repository that incorporates
easy-to-use Web forms so that all researchers
can contribute and share data," said Pandey, who
coordinated this effort with scientists and software
developers at the Institute of Bioinformatics, a nonprofit
institute he founded in Bangalore, India, in
2002.
Like the online encyclopedia Wikipedia, Human
Proteinpedia allows any researcher to contribute
and edit his or her data as research progresses.
"Researchers will be able to quickly review what has
been discovered by others about their protein of interest,
speeding their own work," Pandey said.
Human Proteinpedia contains information on when and
where specific proteins are or are not expressed,
including in cells and tissues affected by diseases such as
cancers; how the proteins are modified; and
with which other proteins they interact. The repository
includes only experimental data and excludes
computer-generated predictions, which may not be borne out
by experiment. The current version of
Human Proteinpedia compiles data provided by more than 71
laboratories from all over the world and
contains entries for more than 15,230 human proteins.
"With the amount of proteomic data pouring in each
day, however, cataloging all of human
protein data by hand is a herculean task. So we're hoping
that the scientific community will come
together to contribute data generated in individual
laboratories," Pandey said. "This will not only
improve the quality of the data but also increase the pace
at which data is collected in a common
repository. We're excited about the enthusiasm and
involvement of the entire global proteomics
community and hope that we can work with companies like
Google and Microsoft that are interested in
enabling such data sharing and dissemination for biological
data."
The research was funded by the National Institutes of
Health Roadmap Initiative, the National
Heart, Lung, and Blood Institute, and internal funds from the
Institute of Bioinformatics in Bangalore,
India.