BLAST tutorial

Post-BLAST Analysis
PSI-BLAST

Position-Specific Iterated (PSI) BLAST is a program based on the BLAST 2.0 algorithm that is designed to detect weak relationships between the query and database sequences that standard BLAST searches may miss. Its added sensitivity over regular BLAST comes from the use of a profile that is constructed automatically from a multiple alignment of the highest scoring hits in the initial BLAST search. The profile is generated by calculating position-specific scores for every position in the alignment. Highly conserved positions receive high scores, and weakly conserved positions receive scores near zero. The profile is then used to perform additional BLAST searches (called iterations), and the results of each iteration are used to refine the profile.
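
The idea of position-specific scoring can be illustrated with a minimal Python sketch. It is not the PSI-BLAST implementation: it assumes an ungapped toy alignment, uniform background frequencies and a flat pseudocount, and it omits the sequence weighting that a real profile uses. The names build_profile and score_sequence and the example sequences are hypothetical.

```python
import math
from collections import Counter

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"
BACKGROUND = 1.0 / len(ALPHABET)  # assumption: uniform background frequencies
PSEUDOCOUNT = 1.0                 # assumption: flat pseudocount per residue


def build_profile(aligned_hits):
    """Return one {residue: log-odds score} dict per alignment column."""
    profile = []
    for column in zip(*aligned_hits):
        counts = Counter(column)
        total = len(column) + PSEUDOCOUNT * len(ALPHABET)
        profile.append({
            aa: math.log2(((counts[aa] + PSEUDOCOUNT) / total) / BACKGROUND)
            for aa in ALPHABET
        })
    return profile


def score_sequence(profile, seq):
    """Sum the position-specific scores of seq, one residue per column."""
    return sum(column[aa] for column, aa in zip(profile, seq))


# Hypothetical ungapped alignment of top-scoring hits (made-up data).
aligned_hits = ["MKVLA", "MKILA", "MRVLG", "MKVLS"]
profile = build_profile(aligned_hits)
```

In this toy profile the invariant first column gives M the highest score, the mostly conserved K in the second column scores lower, and residues never observed in a column score slightly below zero.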

PSI-BLAST analysis is useful both for identifying the distant members of a protein family, whose relationship is not recognizable by straight sequence comparison, and also for deducing the function of hypothetical proteins that are unannotated in the database.

Visit the PSI-BLAST Tutorial for further guidance.

  • A PSI-BLAST query is identical to a BLAST query, except that the user also specifies the expectation (E) value cut-off for including a match in the first and subsequent iterations. The user can override the E value cut-off on a case-by-case basis if a hit of interest has an E value worse (higher) than the threshold.
  • The initial PSI-BLAST search uses the same matrix options available for Gapped BLAST, since it is a Gapped BLAST search.
  • Each iteration of the search uses a position-specific substitution matrix built from the search results of the previous iteration.
  • The user can continue to search iteratively until satisfied that no new matches will be identified. The point at which no new hits are identified by additional searches is known as "convergence".
  • In addition to the tutorial, try the following resources for insight into PSI-BLAST: (1) Altschul, S.F. and Koonin, E.V. 1998. Iterated profile searches with PSI-BLAST - a tool for discovery in protein databases. Trends in Biochemical Sciences. 23(11):444-7; and (2) Altschul, S.F. The Statistics of Sequence Similarity Scores.
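
The iterate-until-convergence procedure described above can be sketched as a simple loop. This is only an illustration: the search callable is a hypothetical stand-in for one round of profile searching, toy_search returns made-up E values, and the default cut-off of 0.005 is an arbitrary illustrative value. Manual overriding of the cut-off by the user is not modeled.

```python
def iterate_to_convergence(search, query, e_cutoff=0.005, max_rounds=20):
    """Repeat the search until a round adds no new hits below the cut-off.

    search(query, included) is a hypothetical callable that returns
    {hit_id: e_value} for a profile built from the included hits.
    """
    included = set()
    for round_no in range(1, max_rounds + 1):
        hits = search(query, frozenset(included))
        passing = {hit for hit, e_value in hits.items() if e_value <= e_cutoff}
        new_hits = passing - included
        if not new_hits:
            return included, round_no  # convergence: nothing new this round
        included |= new_hits
    return included, max_rounds


def toy_search(query, included):
    """Made-up search: seqC only becomes detectable once seqB is included."""
    hits = {"seqA": 1e-30, "seqB": 1e-8, "seqD": 0.5}
    if "seqB" in included:
        hits["seqC"] = 1e-4
    return hits


members, rounds = iterate_to_convergence(toy_search, "MKVLA")
```

With the toy data, the first round includes seqA and seqB, the second adds seqC, and the third finds nothing new, so the search converges in round 3. seqD never passes the cut-off.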

  • Superfamily analysis with PSI-BLAST
  • Frequently no single query will pull out all family members. It is therefore best to put some thought into the choice of query or queries.
  • One approach is to perform an exhaustive iterative search using all known members of the superfamily.
  • A second approach is to choose as the query a small number of members whose BLAST hits are most diverse (e.g. forming the greatest number of distinct clusters).
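
One way to approximate the second approach is a greedy selection that repeatedly picks the candidate whose BLAST hits add the most sequences not yet covered. This is a swapped-in heuristic (greedy set cover), not a method prescribed by the tutorial, and it measures diversity by new hits rather than by clusters. The function pick_queries and the hit sets are hypothetical.

```python
def pick_queries(hits_by_query, max_queries):
    """Greedily choose queries whose hit sets cover the most new sequences."""
    chosen, covered = [], set()
    for _ in range(max_queries):
        remaining = [q for q in hits_by_query if q not in chosen]
        if not remaining:
            break
        # Candidate that contributes the largest number of uncovered hits.
        best = max(remaining, key=lambda q: len(hits_by_query[q] - covered))
        if not hits_by_query[best] - covered:
            break  # no candidate adds anything new
        chosen.append(best)
        covered |= hits_by_query[best]
    return chosen, covered


# Made-up BLAST hit sets for four candidate queries.
hits_by_query = {
    "P1": {"a", "b", "c"},
    "P2": {"b", "c"},
    "P3": {"d", "e"},
    "P4": {"e"},
}
chosen, covered = pick_queries(hits_by_query, max_queries=2)
```

Here P1 is chosen first because it has the most hits, and P3 second because its hits do not overlap with those of P1.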

  • Functional analysis with PSI-BLAST
  • Annotated homologs of the sequence of interest may be identified in other organisms.
  • If such homologs have not already been identified using a BLAST search, it is likely that the new relatives identified using PSI-BLAST will possess, at best, weak similarity and/or subtle relationship to the query. These relationships may become apparent using profile based searching methods.
  • The true probability of any such relationship occurring by chance is defined by the E value corresponding to the first time the sequence appears in the search. Later E values for that hit are not meaningful.
  • Verification of the relationship between the hit and the query may be accomplished by doing the PSI-BLAST analysis in reverse; that is, looking for the query among the relatives of the hit of interest. The E value of that alignment is expected to be significant.
  • Where homologs are not identified, PSI-BLAST may nevertheless provide information about the function of a protein by identifying domains of known biochemical function as partial length alignments to the query.
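
Because only the E value from a hit's first appearance is meaningful, it can help to record that value while iterating. A minimal sketch, assuming the results of each iteration are available as a dictionary of hit identifiers and E values (the function name and data are hypothetical):

```python
def first_appearance_e_values(rounds):
    """Map each hit to (iteration, E value) for the round it first appears in.

    rounds is a list of {hit_id: e_value} dicts, one per iteration, in order.
    """
    first = {}
    for iteration, hits in enumerate(rounds, start=1):
        for hit_id, e_value in hits.items():
            # setdefault keeps the earliest entry and ignores later rounds.
            first.setdefault(hit_id, (iteration, e_value))
    return first


# Made-up results: hitA improves in round 2, but only its round 1 value counts.
rounds = [
    {"hitA": 1e-20},
    {"hitA": 1e-50, "hitB": 1e-3},
]
first = first_appearance_e_values(rounds)
```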


  • Multiple Alignment
    An alignment of three or more sequences with gaps (spaces) inserted into the sequences to optimize the alignment of residues with common structural positions or common ancestry.
  • Roughly define the family of interest using BLAST or PSI-BLAST search algorithms, and compile a non-redundant file of the sequences belonging to the family of interest in FASTA format.
  • Refine the list of family members by discarding any that differ markedly in size, or by trimming those with extraneous sequence unrelated to the query.
  • Choose a multiple alignment program. Clustal W is a widely used and reasonably robust alignment program that is free and available for Macintosh, Windows and UNIX platforms.
  • The program compares each sequence in pairwise fashion to every other sequence in the collection, and constructs a distance matrix and phylogenetic tree based on the relatedness of each pair of sequences. The alignment is constructed using the tree as a guide, with the most highly related sequences being aligned first. The alignment is adjusted as additional, less related sequences are added. Gaps are added as needed, with attention paid to predicted secondary structure constraints.
  • The final alignment should be inspected to make sure it is logical. The alignment can subsequently be edited, trimmed and annotated.
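
The guide-tree step can be illustrated with a small sketch that computes pairwise distances and derives the order in which sequences would be joined. It is a simplification under stated assumptions: the toy sequences are already of equal length, distance is the fraction of mismatched positions, and clusters are merged by average linkage. Clustal W itself derives distances from pairwise alignments and builds its tree differently. All names and data here are hypothetical.

```python
from itertools import combinations

# Made-up, equal-length sequences so that positions can be compared directly.
seqs = {
    "s1": "ACDEFGHIKL",
    "s2": "ACDEFGHIKV",
    "s3": "ACDQFGHLKV",
    "s4": "WCDQYGHLRV",
}


def p_distance(a, b):
    """Fraction of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def cluster_distance(c1, c2):
    """Average pairwise distance between the members of two clusters."""
    total = sum(p_distance(seqs[a], seqs[b]) for a in c1 for b in c2)
    return total / (len(c1) * len(c2))


def guide_order(names):
    """Return the merge order, most closely related clusters first."""
    clusters = [(name,) for name in names]
    order = []
    while len(clusters) > 1:
        c1, c2 = min(combinations(clusters, 2),
                     key=lambda pair: cluster_distance(*pair))
        order.append((c1, c2))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return order


order = guide_order(seqs)
```

With these sequences, s1 and s2 are joined first, s3 is added next, and the most divergent sequence, s4, is added last.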


  • Motif searching with PHI-BLAST
    A new service called Pattern Hit Initiated BLAST (PHI-BLAST), which searches for particular patterns in protein queries, is now available in Version 2.0 of the BLAST program suite.
  • PHI-BLAST expects as input a protein query sequence and a pattern contained in that sequence.
  • PHI-BLAST searches the specified database for other protein sequences that also contain the input pattern and have significant similarity to the query sequence in the vicinity of the pattern occurrences.
  • Statistical significance is reported using E-values as for other forms of BLAST, but the statistical method for computing the E-values is different.
  • PHI-BLAST is integrated with Position-Specific Iterated BLAST (PSI-BLAST), so that the results of a PHI-BLAST query can be used to initiate one or more rounds of PSI-BLAST searching.
  • PHI-BLAST is under development and may change substantially over time.
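
The pattern-matching half of this procedure can be sketched in Python, assuming the pattern is written in a PROSITE-style syntax (elements separated by "-", x for any residue, [..] for allowed residues, {..} for excluded residues, and (n) or (n,m) for repeats). The sketch only finds sequences that contain the pattern. It does not compute the similarity around the pattern or the E values that PHI-BLAST reports. The function name, pattern and sequences are hypothetical.

```python
import re

ELEMENT = re.compile(
    r"(x|[A-Z]|\[[A-Z]+\]|\{[A-Z]+\})(?:\((\d+)(?:,(\d+))?\))?"
)


def prosite_to_regex(pattern):
    """Translate a PROSITE-style pattern into a compiled regular expression."""
    parts = []
    for element in pattern.strip(".").split("-"):
        match = ELEMENT.fullmatch(element)
        if match is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        core, low, high = match.groups()
        if core == "x":
            token = "."                       # any residue
        elif core.startswith("{"):
            token = "[^" + core[1:-1] + "]"   # any residue except these
        else:
            token = core                      # literal residue or [..] set
        if low:
            token += "{" + low + ("," + high if high else "") + "}"
        parts.append(token)
    return re.compile("".join(parts))


# Made-up pattern and sequences for illustration.
regex = prosite_to_regex("[LIVMF]-G-E-x-[GAS]-[LIVM]-x(5,11)-R")
database = {
    "query": "AAMGEKGLAAAAAARTT",
    "seq2": "LGEWALCCCCCRK",
    "seq3": "MGEKGLAAR",  # spacer before R is too short to match
}
hits = {}
for name, seq in database.items():
    found = regex.search(seq)
    if found:
        hits[name] = found.start()  # 0-based start of the pattern occurrence
```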
    Revised June 2, 2000
