Coy C. Carpenter Library


Bioinformatics Resources

 

Databases

ArrayExpress

ArrayExpress is a public archive for transcriptomics data that stores MIAME- and MINSEQE-compliant data in accordance with MGED recommendations. The ArrayExpress Warehouse stores gene-indexed expression profiles from a curated subset of experiments in the archive.

BioCyc

BioCyc is a collection of 414 Pathway/Genome Databases. Each database in the BioCyc collection describes the genome and metabolic pathways of a single organism.

BodyMap

Databank of expression information of human and mouse genes.

BRENDA

BRENDA is the main collection of enzyme functional data available to the scientific community.

CMR

The Comprehensive Microbial Resource (CMR) is a website used to display information on all of the publicly available, complete prokaryotic genomes. Common data types across all genomes in the CMR make searches more meaningful, and cross-genome analysis highlights differences and similarities between the genomes.

dbGaP

The database of Genotypes and Phenotypes (dbGaP) was developed to archive and distribute the results of studies that have investigated the interaction of genotype and phenotype. Such studies include genome-wide association studies, medical sequencing, molecular diagnostic assays, and associations between genotype and non-clinical traits. dbGaP provides two levels of access, open and controlled, in order to allow broad release of non-sensitive data while providing oversight and investigator accountability for sensitive data sets involving personal health information. Summaries of studies, the contents of measured variables, and original study document text are generally available to the public, while access to individual-level data, including phenotypic data tables and genotypes, requires varying levels of authorization.

Ensembl Genome

Ensembl is a joint project between the EMBL-EBI and the Wellcome Trust Sanger Institute that aims to develop a system that maintains automatic annotation of large eukaryotic genomes. Access to all the software and data is free and without constraints of any kind. The project is primarily funded by the Wellcome Trust. It is a comprehensive source of stable annotation with confirmed gene predictions that have been integrated from external data sources. Ensembl annotates known genes and predicts new ones, with functional annotation from InterPro, OMIM, SAGE and gene families.

Entrez Gene

Entrez Gene is NCBI's database for gene-specific information.

GenBank

GenBank® is the NIH genetic sequence database, an annotated collection of all publicly available DNA sequences.

Genome

The Genome database provides views for a variety of genomes, complete chromosomes, sequence maps with contigs, and integrated genetic and physical maps. The database is organized in six major organism groups: Archaea, Bacteria, Eukaryotae, Viruses, Viroids, and Plasmids. It includes complete chromosomes, organelles and plasmids as well as draft genome assemblies.

Genome Browser

The Genome Browser was created by the Genome Bioinformatics Group of the University of California, Santa Cruz (UCSC).

GEO

Gene Expression Omnibus (GEO) is a gene expression/molecular abundance repository supporting MIAME-compliant data submissions, and a curated, online resource for gene expression data browsing, query, and retrieval.

KEGG

KEGG is a complete computer representation of the cell, the organism, and the biosphere, enabling computational prediction of higher-level complexity of cellular processes and organism behaviors from genomic and molecular information.

Harvester

Harvester crawls and crosslinks the following bioinformatic sites:
4DXp - AceView - BLAST - Biocompare - CDART - CDD - ensEMBL - Entrez - FishMap - Galaxy - UCSC GenomeBrowser - gfp-cDNA - Google-Scholar - gopubmed - Harvester42 - H-Inv - HomoloGene - iHOP - IPI - MapView - MGI - MINT - Mitocheck - OMIM - PolyMeta - PSORT II - RGD - SMART - SOSUI - STRING - TAIR - Unigene - UniprotKB - Wikipedia - WikiProtein

Mouse Genome

MGI is the international database resource for the laboratory mouse, providing integrated genetic, genomic, and biological data to facilitate the study of human health and disease.

PRIDE

PRIDE is a database of ESTs and gene expression profiles obtained mainly in the Plant Science Center, RIKEN. PRIDE contains information on gene expression profiles of Zinnia elegans, and will contain those of BY-2 and other organisms such as Lotus japonicus and Arabidopsis.

SNP

The Single Nucleotide Polymorphism database (dbSNP) is a public-domain archive for a broad collection of simple genetic polymorphisms.

UniProt

The mission of UniProt is to provide the scientific community with a comprehensive, high-quality and freely accessible resource of protein sequence and functional information.

   

WormBase

A database of the model organism Caenorhabditis elegans and related nematodes.

   

Tools

Bioconductor

Bioconductor is an open source and open development software project to provide tools for the analysis and comprehension of genomic data.

Discovery Studio Visualizer

With DS Visualizer, you can visualize and share molecular information in a clear and consistent way, and in a wide variety of industry-standard formats. You can also create high-quality graphics.

ExPASy

The ExPASy (Expert Protein Analysis System) proteomics server of the Swiss Institute of Bioinformatics (SIB) is dedicated to the analysis of protein sequences and structures as well as 2-D PAGE.

FirstGlance

FirstGlance is an easy way to look at the 3D structures of proteins, DNA, RNA, and their complexes. FirstGlance in Jmol can display major structural features of the molecule with one click each. One-click options display secondary structure, amino and carboxy (or 3' and 5') termini, composition (protein, DNA, RNA, ligands, and solvent), the distributions of hydrophobic, polar, and charged amino acids, salt bridges and cation-pi orbital interactions for amino acids. Non-standard residues and missing sidechains are flagged automatically.

GeneCards

GeneCards® is a searchable, integrated database of human genes that provides concise genomic, proteomic, transcriptomic, genetic and functional information on all known and predicted human genes. Information featured in GeneCards includes orthologies, disease relationships, mutations and SNPs, gene expression, gene function, pathways, protein-protein interactions, related drugs & compounds and direct links to cutting edge research reagents and tools such as antibodies, recombinant proteins, clones, expression assays and RNAi reagents.

GenePattern

GenePattern combines a powerful scientific workflow platform with more than 100 genomic analysis tools.

GenMAPP

GenMAPP is a free computer application designed to visualize gene expression and other genomic data on maps representing biological pathways and groupings of genes. Integrated with GenMAPP are programs to perform a global analysis of gene expression or genomic data in the context of hundreds of pathway MAPPs and thousands of Gene Ontology terms (MAPPFinder), import lists of genes/proteins to build new MAPPs (MAPPBuilder), and export archives of MAPPs and expression/genomic data to the web.

Ingenuity Pathways Analysis

IPA software is a web-based application with an expert-curated knowledge database that can assist researchers in modeling, analyzing and understanding complex biological pathways and networks.

LOCATE

LOCATE is a curated database that houses data describing the membrane organization and subcellular localization of proteins from the RIKEN FANTOM4 mouse and human protein sequence set. The membrane organization is predicted by the high-throughput, computational pipeline MemO. The subcellular locations were determined by a high-throughput, immunofluorescence-based assay and by manually reviewing peer-reviewed publications.

NCBI Structure

Cn3D is a helper application for your web browser that allows you to view 3-dimensional structures from NCBI's Entrez retrieval service. Cn3D runs on Windows, Macintosh, and Unix. Cn3D simultaneously displays structure, sequence, and alignment, and now has powerful annotation and alignment editing features.

SAM

SAM (Significance Analysis of Microarrays) is an Excel add-in that can be applied to data from oligo or cDNA arrays, SNP arrays, protein arrays, etc. It correlates expression data to clinical parameters including treatment, diagnosis categories, survival time, paired (before and after), quantitative (e.g. tumor volume) and one-class. Both parametric and non-parametric tests are offered. It also correlates expression data with time, to study time trends. The experimental units can fall into one or two classes, or be paired. Missing data are imputed automatically via a nearest-neighbor algorithm (better and faster in SAM version 2.0). An adjustable threshold determines the number of genes called significant. Data permutations provide an estimate of the False Discovery Rate for multiple testing. Gene lists are produced in Excel workbook form and are easily exportable into TreeView, Cluster or other software.
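
The permutation idea behind the False Discovery Rate estimate can be illustrated with a short Python sketch. This is a simplified illustration and not SAM's actual procedure: the score function, the small constant s0, the function names, and the toy expression values are all assumptions made for this example. Each gene is scored by the difference in class means divided by its pooled standard error plus s0. The class labels are then shuffled many times, and the median number of genes passing the threshold under shuffled labels is taken as the expected number of false calls.

```python
import random
import statistics


def score(values, labels, s0=0.1):
    """Difference in class means divided by (pooled standard error + s0)."""
    a = [v for v, lab in zip(values, labels) if lab == 0]
    b = [v for v, lab in zip(values, labels) if lab == 1]
    mean_a, mean_b = statistics.fmean(a), statistics.fmean(b)
    ss = sum((x - mean_a) ** 2 for x in a) + sum((x - mean_b) ** 2 for x in b)
    pooled_se = ((1 / len(a) + 1 / len(b)) * ss / (len(a) + len(b) - 2)) ** 0.5
    return (mean_b - mean_a) / (pooled_se + s0)


def estimate_fdr(data, labels, threshold, n_perm=200, seed=0):
    """Return (genes called significant, estimated FDR) for |score| >= threshold."""
    rng = random.Random(seed)
    called = [g for g, vals in data.items()
              if abs(score(vals, labels)) >= threshold]
    false_counts = []
    for _ in range(n_perm):
        shuffled = labels[:]
        rng.shuffle(shuffled)  # break any real link between labels and expression
        false_counts.append(
            sum(abs(score(vals, shuffled)) >= threshold for vals in data.values())
        )
    expected_false = statistics.median(false_counts)
    fdr = min(1.0, expected_false / len(called)) if called else 0.0
    return called, fdr


# Hypothetical toy data: four samples in class 0, four in class 1.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
data = {
    "geneA": [1.0, 1.2, 0.9, 1.1, 3.0, 3.2, 2.9, 3.1],  # clear shift between classes
    "geneB": [2.0, 2.1, 1.9, 2.2, 2.0, 2.1, 1.8, 2.2],  # no shift
    "geneC": [5.0, 4.8, 5.1, 5.2, 5.1, 4.9, 5.0, 5.3],  # no shift
}
called, fdr = estimate_fdr(data, labels, threshold=4.0)
print(called, fdr)
```

Raising the threshold calls fewer genes significant and usually lowers the estimated rate, which is the trade-off the adjustable threshold exposes.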

TM4

Normalized and filtered expression files can be analyzed using TIGR Multiexperiment Viewer (MeV). MeV is a versatile microarray data analysis tool, incorporating sophisticated algorithms for clustering, visualization, classification, statistical analysis and biological theme discovery. MeV can handle several input file formats. These include the “.mev” and “.tav” files generated by TIGR Spotfinder and TIGR MIDAS, and also Affymetrix® (“.txt”) and Genepix® (“.gpr”) files.

Vocabularies

Gene Ontology

The Gene Ontology project provides a controlled vocabulary to describe gene and gene product attributes in any organism.

Literature Searching Tools

Biological Abstracts

Includes citations and some abstracts from over 6500 international life sciences journals. Fields covered include: biology, botany, zoology, biotechnology, and environmental studies.

Google Patents

All patents available through Google Patent Search come from the United States Patent and Trademark Office (USPTO). Google Patent Search covers the entire collection of issued patents and millions of patent applications made available by the USPTO, from patents issued in the 1790s through those issued in the past few months. To date, the USPTO has made available approximately 7 million patents and over a million patent applications.

Google Scholar

Google Scholar searches peer-reviewed papers, theses, books, abstracts and articles, from academic publishers, professional societies, preprint repositories, universities and other scholarly organizations.

GOPubMed

Your keywords are submitted to PubMed and the resulting abstracts are classified using Gene Ontology and Medical Subject Headings (MeSH). MeSH is a hierarchical vocabulary covering biomedical and health-related topics. Gene Ontology is a hierarchical vocabulary for molecular biology covering cellular components, biological processes and molecular functions.

GraphPad Prism

GraphPad Prism is a combination of basic biostatistics, curve fitting and scientific graphing in one program. Designed for the practical scientist, Prism does not expect you to be a statistician. It guides you through each analysis - giving you as much help as you need - and tracks and organizes your work.

iHOP

A network of concurring genes and proteins extends through the scientific literature touching on phenotypes, pathologies and gene function. iHOP provides this network as a natural way of accessing millions of PubMed abstracts. By using genes and proteins as hyperlinks between sentences and abstracts, the information in PubMed can be converted into one navigable resource, bringing all advantages of the internet to scientific literature research.

NextBio

NextBio is a life science search engine that enables researchers and clinicians to access and understand the world's life sciences information. NextBio content includes pre-processed data from public resources such as NCBI GEO (Gene Expression Omnibus), ArrayExpress, SMD (Stanford Microarray Database), and many others. In addition, individual organizations and users contribute data to NextBio for the benefit of the entire scientific community. Users and organizations can also keep their data private and share it with a select group of individuals. NextBio currently supports any type of gene-centric data (gene expression, proteomics, siRNA screens, etc.) for human, mouse, rat, fly, worm and yeast. Support for monkey, plants and many other organisms is in development.

Scirus

Scirus searches over 480 million science-specific Web pages. It filters out non-scientific sites, finds peer-reviewed articles such as PDF and PostScript files, which are often invisible to other search engines, and searches web information, preprint servers, digital archives, repositories and patent and journal databases.

Web of Science

Contains three ISI Citation Databases (Sciences, Arts & Humanities, Social Sciences), which together index over 8,000 peer-reviewed journals. Provides bibliographic data, author abstracts, and cited references. Useful for searching databases for articles that cite a known author or work.