Blastn equivalents of MegaBLAST options

MegaBLAST is a legacy sequence alignment tool optimized for rapidly aligning long nucleotide sequences that differ only slightly. It is part of the NCBI C Toolkit, and its last version was included in the NCBI BLAST+ 2.2.26 package released in March 2012. In subsequent NCBI BLAST+ releases, MegaBLAST was replaced by the blastn tool.

Although blastn and MegaBLAST implement nearly the same alignment algorithm, their command-line options differ. Here we list the blastn equivalents of MegaBLAST options.

In the tables below, MegaBLAST options are grouped by the categories used in the blastn help. The full lists of blastn and MegaBLAST options can be obtained from their help messages or from the following links: MegaBLAST options and NCBI BLAST+ options.

Input query options

Both MegaBLAST and blastn read query sequences from input files in the FASTA format.

MegaBLAST  blastn      Description
-i         -query      Input (query) file name.
-L         -query_loc  Location on the query sequence.
-S         -strand     Query strand(s) to search against database.
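
The table pairs option names, but an option's values may also need converting. As a minimal sketch, assuming that MegaBLAST's -S takes the numeric codes 1 (top strand), 2 (bottom strand) and 3 (both), and that blastn's -strand takes the keywords plus, minus and both, the value could be translated as follows. The function name translate_strand is hypothetical; check both help messages before relying on the codes.

```python
# Assumed value mapping: MegaBLAST -S code -> blastn -strand keyword.
STRAND_CODES = {"1": "plus", "2": "minus", "3": "both"}


def translate_strand(code):
    """Return the blastn arguments equivalent to MegaBLAST's '-S <code>'."""
    key = str(code)
    if key not in STRAND_CODES:
        raise ValueError(f"unexpected MegaBLAST strand code: {code!r}")
    return ["-strand", STRAND_CODES[key]]


print(translate_strand(1))  # prints ['-strand', 'plus']
```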

General search options

Note that MegaBLAST and blastn use different sequence database formats. MegaBLAST requires a database created by the formatdb tool from the NCBI C Toolkit, whereas a database for blastn must be created with the makeblastdb tool from the NCBI BLAST+ package.
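
To illustrate the difference, here is a small sketch that builds both database commands for a nucleotide FASTA file. The file and database names (ref.fasta, refdb) are placeholders, and the helper names are hypothetical. It assumes the usual formatdb options (-i, -p F for nucleotide input, -n) and makeblastdb options (-in, -dbtype nucl, -out).

```python
def formatdb_command(fasta, name):
    """Legacy NCBI C Toolkit command; -p F marks the input as nucleotide."""
    return ["formatdb", "-i", fasta, "-p", "F", "-n", name]


def makeblastdb_command(fasta, name):
    """NCBI BLAST+ command for a nucleotide database."""
    return ["makeblastdb", "-in", fasta, "-dbtype", "nucl", "-out", name]


# "ref.fasta" and "refdb" are placeholder names.
print(" ".join(formatdb_command("ref.fasta", "refdb")))
print(" ".join(makeblastdb_command("ref.fasta", "refdb")))
```

The commands are only printed here; they could be passed to subprocess.run once the corresponding tools are installed.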

MegaBLAST  blastn      Description
-d         -db         BLAST database name.
-o         -out        Output file name.
-e         -evalue     BLAST expectation value (E-value) threshold.
-W         -word_size  Word size (length of best perfect match).
-G         -gapopen    Cost to open a gap.
-E         -gapextend  Cost to extend a gap.
-q         -penalty    Penalty for a nucleotide mismatch.
-r         -reward     Reward for a nucleotide match.
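
For options like these, translating a legacy command is mostly a matter of renaming. The sketch below is table-driven and covers only -i and the general search options listed above. It assumes the arguments come as option/value pairs and that the values can be passed through unchanged. The function name translate_general and the file names are hypothetical.

```python
# MegaBLAST option -> blastn option, taken from the tables above.
GENERAL_OPTIONS = {
    "-i": "-query",
    "-d": "-db",
    "-o": "-out",
    "-e": "-evalue",
    "-W": "-word_size",
    "-G": "-gapopen",
    "-E": "-gapextend",
    "-q": "-penalty",
    "-r": "-reward",
}


def translate_general(args):
    """Rename options in a flat [option, value, option, value, ...] list."""
    if len(args) % 2:
        raise ValueError("expected option/value pairs")
    out = []
    for opt, value in zip(args[0::2], args[1::2]):
        if opt not in GENERAL_OPTIONS:
            raise KeyError(f"no blastn equivalent known for {opt}")
        out += [GENERAL_OPTIONS[opt], value]
    return out


# Placeholder file and database names.
legacy = "-i q.fa -d refdb -e 1e-5 -W 28 -q -3 -r 1 -o hits.txt".split()
print("blastn", " ".join(translate_general(legacy)))
```

Reading the arguments in pairs keeps negative values such as the mismatch penalty from being mistaken for options.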

Formatting options

MegaBLAST uses a pair of options (-m and -D) to specify its output format; blastn uses a single -outfmt option.

MegaBLAST  blastn             Description
-m, -D     -outfmt            Output format for alignments.
-I         -show_gis          Show NCBI GIs (GenInfo Identifiers) in output.
-v         -num_descriptions  Number of database sequences to show one-line descriptions for.
-b         -num_alignments    Number of database sequences to show alignments for.
-T         -html              Produce HTML output.
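
The numeric format codes do not all coincide between the two tools. The following sketch assumes that MegaBLAST runs with its traditional output (-D 2), so that -m selects the layout, and that the legacy codes 0 (pairwise), 7 (XML), 8 (tabular) and 9 (tabular with comment lines) correspond to blastn -outfmt 0, 5, 6 and 7. It also assumes that -T and -I are T/F switches in MegaBLAST while -html and -show_gis are plain flags in blastn. The mapping is deliberately partial and should be verified against both help messages; translate_format is a hypothetical name.

```python
# Assumed correspondence between legacy -m codes and blastn -outfmt codes:
# pairwise, XML, tabular, tabular with comment lines.
M_TO_OUTFMT = {0: 0, 7: 5, 8: 6, 9: 7}


def translate_format(m=0, html=False, show_gis=False):
    """Build blastn formatting arguments from MegaBLAST-style settings."""
    if m not in M_TO_OUTFMT:
        raise ValueError(f"no mapping assumed for -m {m}")
    args = ["-outfmt", str(M_TO_OUTFMT[m])]
    if html:
        args.append("-html")      # MegaBLAST: -T T
    if show_gis:
        args.append("-show_gis")  # MegaBLAST: -I T
    return args


print(translate_format(m=8))  # prints ['-outfmt', '6']
```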

Query filtering options

MegaBLAST provides four filters that can be specified in the -F option:

  • D – the DUST algorithm for filtering low-complexity regions;
  • R – human-specific repeats;
  • V – vector screen;
  • L – low-complexity sequences (equivalent to D).

Of these filters, blastn supports only DUST.

MegaBLAST  blastn                         Description
-F "D"     -dust yes                      Filter low-complexity sequences in query with DUST.
-F "m D"   -dust yes -soft_masking true   Consider filtered low-complexity regions as soft masks.
-U         -lcase_masking                 Use lower-case filtering of query FASTA sequences.
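
The -F string is the one option whose value has to be parsed rather than renamed. A minimal sketch of that translation follows, using only the cases described above: D and L map to DUST, a leading m adds soft masking, and R and V are rejected because blastn has no equivalent. It additionally assumes that -F "F" disables filtering and corresponds to -dust no. The function name translate_filter is hypothetical.

```python
def translate_filter(value):
    """Translate a MegaBLAST -F string into blastn arguments."""
    tokens = value.split()
    if tokens == ["F"]:
        # Assumption: "F" turns filtering off in MegaBLAST.
        return ["-dust", "no"]
    soft = tokens[:1] == ["m"]           # leading "m" requests soft masking
    names = tokens[1:] if soft else tokens
    if len(names) != 1 or names[0] not in ("D", "L"):
        raise ValueError(f"no blastn equivalent for -F {value!r}")
    args = ["-dust", "yes"]
    if soft:
        args += ["-soft_masking", "true"]
    return args


print(translate_filter("m D"))  # prints ['-dust', 'yes', '-soft_masking', 'true']
```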

Restrict search or results

MegaBLAST  blastn          Description
-l         -gilist         Restrict search of database to list of GIs.
-p         -perc_identity  Percent identity threshold.

Discontiguous MegaBLAST options

MegaBLAST  blastn            Description
-N         -template_type    Discontiguous MegaBLAST template type.
-t         -template_length  Discontiguous MegaBLAST word template length.

Statistical options

MegaBLAST  blastn     Description
-z         -dbsize    Effective length of BLAST database.
-Y         -searchsp  Effective length of search space.
-H         -max_hsps  Maximum number of High-scoring Segment Pairs (HSPs) to save per database sequence.

Extension and miscellaneous options

MegaBLAST  blastn            Description
-y         -xdrop_ungap      X-dropoff value for ungapped extensions.
-X         -xdrop_gap        X-dropoff value for preliminary gapped extensions.
-Z         -xdrop_gap_final  X-dropoff value for the final gapped alignment.
-n         -no_greedy        Use non-greedy dynamic programming extension for affine gap scores.
-a         -num_threads      Number of parallel threads to use.

MegaBLAST options missing in blastn

Option  Description
-O      ASN.1 SeqAlign file; must be used in conjunction with the -D 2 option.
-J      Believe the query defline.
-M      Maximal total length of queries for a single search.
-P      Maximum number of positions for a hash value.
-s      Minimal hit score to report.
-Q      Masked query output; must be used in conjunction with the -D 2 option.
-f      Show full IDs in the output (default: only GIs or accessions).
-R      Report the log information at the end of output.
-A      Multiple hits window size.
-g      Make discontiguous MegaBLAST generate words for every base of the database.
-V      Force use of the legacy BLAST engine.
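
When porting an existing MegaBLAST command, it can help to flag these options before attempting a translation. The sketch below simply scans an argument list for the options listed above. It is naive in that it inspects every token, so a value that happens to look like one of these options would also be reported. The function name find_unsupported and the sample arguments are hypothetical.

```python
# MegaBLAST options with no blastn counterpart, from the table above.
UNSUPPORTED = {
    "-O": "ASN.1 SeqAlign file",
    "-J": "believe the query defline",
    "-M": "maximal total length of queries for a single search",
    "-P": "maximum number of positions for a hash value",
    "-s": "minimal hit score to report",
    "-Q": "masked query output",
    "-f": "show full IDs in the output",
    "-R": "report the log information at the end of output",
    "-A": "multiple hits window size",
    "-g": "generate words for every base of the database",
    "-V": "force use of the legacy BLAST engine",
}


def find_unsupported(args):
    """Return the options in args that blastn cannot express, in order."""
    return [arg for arg in args if arg in UNSUPPORTED]


# Placeholder command line.
legacy = "-i q.fa -d refdb -s 50 -f T".split()
for opt in find_unsupported(legacy):
    print(f"{opt}: {UNSUPPORTED[opt]} (no blastn equivalent)")
```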
