
Secondary Databases (PROSITE, PRINTS, BLOCKS)


Introduction

Biological databases are broadly classified into primary and secondary databases.
Primary databases store raw experimental data (e.g., nucleotide or protein sequences), whereas secondary databases contain derived information obtained by analyzing primary sequence data.
Secondary databases are mainly used to:
Identify protein families
Detect conserved motifs, patterns, and domains
Predict protein function
Study structure–function relationships
Examples of secondary databases include PROSITE, PRINTS, BLOCKS, and Pfam.


1. PROSITE Database

Definition
PROSITE is a secondary database that documents protein domains, families, and functional sites in the form of patterns and profiles.

Developed by

Swiss Institute of Bioinformatics (SIB)
Maintained along with UniProt

Principle
PROSITE is based on the idea that functionally important regions of proteins are conserved during evolution.
These conserved regions can be represented as:

1. Patterns (regular expressions)
2. Profiles (position-specific scoring matrices)


Components of PROSITE

Patterns
Short conserved motifs
Written as regular expressions
Useful for identifying active sites or binding sites
Example: Serine protease active site
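
As a rough illustration of how a PROSITE-style pattern maps onto a regular expression, the sketch below translates the common pattern syntax (residues separated by "-", x for any residue, [..] for allowed residues, {..} for excluded residues, (n) or (n,m) for repeats, < and > for the sequence ends) into Python regex syntax. The helper names prosite_to_regex and find_pattern are hypothetical and not part of any PROSITE tool, the test sequence is made up, and the N-glycosylation site pattern N-{P}-[ST]-{P} is used only because it is short. Rarer syntax, such as an end anchor inside square brackets, is not handled.

```python
import re

# One pattern element: a residue, [allowed], {excluded} or x,
# optionally followed by a repeat count (n) or range (n,m).
_ELEMENT = re.compile(r"(\[[A-Z]+\]|\{[A-Z]+\}|[A-Zx])(?:\((\d+)(?:,(\d+))?\))?")


def prosite_to_regex(pattern):
    """Translate a PROSITE-style pattern into Python regex syntax (simplified)."""
    parts = []
    for element in pattern.strip().rstrip(".").split("-"):
        prefix = suffix = ""
        if element.startswith("<"):   # '<' anchors the match at the N-terminus
            prefix, element = "^", element[1:]
        if element.endswith(">"):     # '>' anchors the match at the C-terminus
            suffix, element = "$", element[:-1]
        m = _ELEMENT.fullmatch(element)
        if m is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        core, low, high = m.groups()
        if core == "x":               # x means any amino acid
            core = "."
        elif core.startswith("{"):    # {P} means any residue except P
            core = "[^" + core[1:-1] + "]"
        if low:                       # (2) -> {2}, (2,4) -> {2,4}
            core += "{" + low + ("," + high if high else "") + "}"
        parts.append(prefix + core + suffix)
    return "".join(parts)


def find_pattern(pattern, sequence):
    """Return (start, matched_text) for every match, including overlapping ones."""
    regex = re.compile("(?=(" + prosite_to_regex(pattern) + "))")
    return [(m.start(), m.group(1)) for m in regex.finditer(sequence)]


print(prosite_to_regex("N-{P}-[ST]-{P}."))
print(find_pattern("N-{P}-[ST]-{P}.", "MKNGSALNPTA"))
```

The lookahead wrapper in find_pattern is a design choice so that overlapping sites are all reported; a plain re.finditer would skip a site that begins inside a previous match.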

Profiles
More sensitive than patterns
Can detect distant homologs
Assign a position-specific score to each amino acid, reflecting how likely it is to occur at that position in the family
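
To make the idea of position-specific scoring concrete, here is a minimal sketch that slides a tiny weight matrix along a sequence and reports the best-scoring window. The matrix PROFILE, its scores, and both test sequences are invented for illustration and are not taken from any PROSITE entry; real PROSITE profiles also carry gap penalties and calibrated cut-offs, which this sketch leaves out.

```python
# Hypothetical 4-column weight matrix; all scores are made up for illustration.
PROFILE = [
    {"N": 5, "D": 2},          # column 1: N preferred, D tolerated
    {"P": -4},                 # column 2: P penalised
    {"S": 4, "T": 4, "C": 1},  # column 3: S/T preferred, C tolerated
    {"P": -4},                 # column 4: P penalised
]
DEFAULT_SCORE = 0  # score for residues not listed in a column


def window_score(window):
    """Sum the per-column scores for one window of the sequence."""
    return sum(col.get(res, DEFAULT_SCORE) for col, res in zip(PROFILE, window))


def best_hit(sequence):
    """Return (score, start) of the highest-scoring window."""
    width = len(PROFILE)
    hits = [(window_score(sequence[i:i + width]), i)
            for i in range(len(sequence) - width + 1)]
    return max(hits)


print(best_hit("MKNGSALNPTA"))  # sequence containing a canonical N-x-S site
print(best_hit("MKDGCAL"))      # diverged sequence: D and C in place of N and S
```

The second sequence still receives a positive score at the same position even though a strict pattern such as N-{P}-[ST]-{P} would reject it outright, which is the sense in which profiles can pick up more distant relatives.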

Documentation (PROSITE entries)
Each entry includes:
Description of the protein family/domain
Biological function
References
Links to UniProt


Applications

Protein function prediction
Identification of catalytic and binding sites
Annotation of newly sequenced proteins
Detection of protein families

Advantages
High specificity
Well-curated and annotated
Easy interpretation


Limitations

Patterns may miss distant homologs, giving false negatives
Very short patterns can match unrelated sequences by chance, giving false positives

2. PRINTS Database
Definition
PRINTS is a secondary protein database that identifies protein families using fingerprints, which are groups of conserved motifs.

Developed by

University of Manchester, UK


Principle
Unlike PROSITE, which uses single motifs, PRINTS uses multiple conserved motifs (fingerprints) to characterize a protein family.
A protein is considered a member of a family only if it matches most or all motifs in the fingerprint.
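
The membership rule above can be sketched as follows. This is a simplified illustration under stated assumptions: the motifs are written as regular expressions and matched in order along the sequence, the 0.75 cut-off is arbitrary, and the motif names, motif strings, and sequences are hypothetical. PRINTS itself scores aligned motifs rather than regular expressions, so this shows only the "most or all motifs" logic.

```python
import re


def match_fingerprint(sequence, motifs, min_fraction=0.75):
    """Match motifs in order; return (matched_names, is_member)."""
    position = 0
    matched = []
    for name, regex in motifs:
        # Search only downstream of the previous hit so motif order is respected.
        hit = re.compile(regex).search(sequence, position)
        if hit:
            matched.append(name)
            position = hit.end()
    is_member = len(matched) / len(motifs) >= min_fraction
    return matched, is_member


# Hypothetical four-motif fingerprint (illustrative only).
FINGERPRINT = [
    ("motif1", "G.G..G"),
    ("motif2", "HRD"),
    ("motif3", "DFG"),
    ("motif4", "APE"),
]

print(match_fingerprint("MAGSGKFGTTHRDLKAADFGLSWWAPEMM", FINGERPRINT))
print(match_fingerprint("MAGSGKFGTTHRDLKAAWWW", FINGERPRINT))
```

The second sequence matches only two of the four motifs and is therefore rejected, even though each of those two motifs would count as a hit if it were searched for on its own.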


Structure of PRINTS

Each PRINTS entry consists of:
A set of conserved motifs
Alignment of sequences
Functional annotation
Cross-references to other databases


Key Features
Fingerprints improve accuracy because a match must be supported by several motifs
Fewer false-positive matches than single-motif methods
Useful for family-level classification


Applications

Identification of protein superfamilies
Functional annotation of proteins
Evolutionary studies
Validation of protein family membership

Advantages

High reliability due to multiple motifs
Better discrimination between closely related families

Limitations

Less sensitive to very divergent sequences
Smaller coverage compared to some databases

3. BLOCKS Database
Definition

BLOCKS is a database of conserved regions (blocks) in protein families, represented as ungapped multiple sequence alignments.

Developed by
Fred Hutchinson Cancer Research Center, USA


Principle

A block is a conserved region found in multiple proteins, without insertions or deletions.
These blocks represent functionally or structurally important regions of proteins.


Characteristics
Originally built from protein families documented in PROSITE
Focuses on local conserved regions
Uses position-specific scoring matrices (PSSMs)
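
As a minimal sketch of how an ungapped block can be turned into a position-specific scoring matrix, the code below counts residues in each column, adds a pseudocount, and converts the frequencies to log-odds scores. The uniform background frequency (1/20), the simple add-one pseudocount, the function names, and the example block are all assumptions made for illustration; the BLOCKS database itself uses sequence weighting and more elaborate pseudocounts.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def block_to_pssm(block, pseudocount=1.0):
    """Build a log-odds PSSM (one dict per column) from an ungapped block."""
    width = len(block[0])
    if any(len(segment) != width for segment in block):
        raise ValueError("a block is ungapped: all segments must have equal length")
    background = 1 / len(AMINO_ACIDS)  # assumed uniform background frequency
    total = len(block) + pseudocount * len(AMINO_ACIDS)
    pssm = []
    for col in range(width):
        counts = Counter(segment[col] for segment in block)
        pssm.append({
            aa: math.log2(((counts[aa] + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        })
    return pssm


def score_segment(pssm, segment):
    """Score a segment of the same width as the block."""
    return sum(column[aa] for column, aa in zip(pssm, segment))


# Hypothetical five-residue block from four sequences (illustrative only).
BLOCK = ["GDSGG", "GDSGS", "GDAGG", "GESGG"]
pssm = block_to_pssm(BLOCK)
print(round(score_segment(pssm, "GDSGG"), 2))
print(round(score_segment(pssm, "WWWWW"), 2))
```

Because a block has no insertions or deletions, every column lines up directly with one position of the query segment, so scoring needs no gap handling.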

BLOCKS Format


Each entry contains:

Protein family name
Conserved block sequences
Alignment information
Scoring matrices


Applications

Detection of conserved motifs
Protein classification
Functional prediction
Sequence similarity searches

Advantages
Restricting searches to highly conserved regions improves accuracy
Ungapped alignments are easy to analyze


Limitations

Ignores variable regions
Limited coverage for novel proteins


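Comparison of PROSITE, PRINTS and BLOCKS

Representation: PROSITE uses patterns (regular expressions) and profiles; PRINTS uses fingerprints (groups of conserved motifs); BLOCKS uses ungapped multiple sequence alignments (blocks).
Developed at: PROSITE at the Swiss Institute of Bioinformatics; PRINTS at the University of Manchester, UK; BLOCKS at the Fred Hutchinson Cancer Research Center, USA.
Main strength: PROSITE is well curated and easy to interpret; PRINTS is highly reliable because several motifs must match; BLOCKS focuses on highly conserved regions that are easy to analyze.
Main limitation: PROSITE patterns may miss distant homologs; PRINTS is less sensitive to very divergent sequences; BLOCKS ignores variable regions.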




Importance of Secondary Databases

Help in functional annotation of proteins
Aid in genome annotation projects
Support comparative genomics and evolutionary studies
Essential tools in bioinformatics and proteomics


Conclusion

Secondary databases such as PROSITE, PRINTS and BLOCKS play a crucial role in understanding protein structure and function. By analyzing conserved motifs and domains, these databases help in accurate protein classification, functional prediction, and evolutionary analysis, making them indispensable tools in modern bioinformatics.


