Restoring models in protein structure files by Swiss PDB Viewer

Despite its name, Swiss PDB Viewer implements a number of features besides visualization of protein molecules. One such feature is side chain reconstruction for protein structures that contain only backbone atoms. However, Swiss PDB Viewer does not write model records to its output PDB files, which may cause problems with other PDB-processing programs.

In this post, we present a Python script that adds proper model records to a PDB file produced by Swiss PDB Viewer.

Command-line options of Isaac Variant Caller

Isaac Variant Caller implements a fast variant-calling algorithm and can be considered an alternative to the GATK or samtools variant callers. Unfortunately, it seems to have no manual that describes its command-line options.

Here we list the Isaac Variant Caller command-line options, obtained from its source code, which is publicly available on GitHub.

Filtering noise in LASTZ dot plots

LASTZ, a whole-genome alignment tool, provides an option to produce a dot plot file of the obtained pairwise alignments. Such a file can be visualized in R using its plot function or from the command line using this R script. However, LASTZ dot plots often contain noise that originates from repetitive elements, even when both genomes were repeat-masked before alignment.

For example, the dot plot below shows the pairwise alignments between chromosome 1 sequences of the human genome (the GRCh38.p2 assembly) and the chimpanzee genome (the Pan_troglodytes-2.1.4 assembly). Both sequences were masked with RepeatMasker before alignment; LASTZ was launched with the following parameters.

lastz hs_ref_GRCh38.p2_chr1.mfa \
    ptr_ref_Pan_troglodytes-2.1.4_chr1.mfa \
    --nogapped --notransition --step=20 --ambiguous=iupac \
    --format=rdotplot --output=human_chimp_chr1.rdotplot

LASTZ alignments between chromosome 1 sequences of the human and chimpanzee genomes.
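One simple way to reduce this kind of noise is to drop short alignment segments from the rdotplot file before plotting. The sketch below is only an illustration and is not necessarily the filtering method used in this post. It assumes the usual rdotplot layout: a header line with the two sequence names, followed by segments given as a start row and an end row and terminated by an "NA NA" row. The function names and the length threshold are hypothetical.

```python
def read_segments(lines):
    """Parse rdotplot lines into a header and a list of segments.

    Each segment is a list of (target, query) coordinate pairs.
    Assumes the first non-empty line is the header with sequence names.
    """
    rows = [line.rstrip("\n") for line in lines if line.strip()]
    header = rows[0]
    segments, current = [], []
    for row in rows[1:]:
        x, y = row.split()
        if x == "NA":
            # An "NA NA" row closes the current segment.
            if current:
                segments.append(current)
                current = []
        else:
            current.append((int(x), int(y)))
    if current:
        segments.append(current)
    return header, segments


def filter_segments(segments, min_length):
    """Keep segments spanning at least min_length target positions."""
    kept = []
    for segment in segments:
        # Span between the first and last point, counted inclusively.
        length = abs(segment[-1][0] - segment[0][0]) + 1
        if length >= min_length:
            kept.append(segment)
    return kept


def format_rdotplot(header, segments):
    """Serialize segments back to rdotplot text."""
    out = [header]
    for segment in segments:
        out.extend(f"{x}\t{y}" for x, y in segment)
        out.append("NA\tNA")
    return "\n".join(out) + "\n"
```

To use it, read human_chimp_chr1.rdotplot line by line, pass the lines to read_segments, filter with a threshold chosen by inspecting the plot, and write the result of format_rdotplot to a new file. A length cutoff removes short spurious matches but cannot tell repeat-derived hits from genuine short alignments.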

Compiling PHAST under OS X Yosemite or higher

The PHAST (PHylogenetic Analysis with Space/Time models) package implements a number of methods related to comparative and evolutionary genomics. PHAST depends on the LAPACK library and, when compiled under OS X, uses the version built into the system. However, the compilation of PHAST under OS X Yosemite or later stops with the following error:

fatal error: 'vecLib/clapack.h' file not found

The compilation fails because the vecLib framework, which had been deprecated in earlier OS X versions, was removed starting from OS X Yosemite. The Accelerate framework bundled with OS X should be used instead of vecLib. To switch to it, modify the files include/external_libs.h and src/make-include.mk as follows.

include/external_libs.h:

-#include <vecLib/clapack.h>
+#include <Accelerate/Accelerate.h>

src/make-include.mk:

-LIBS = -lphast -framework vecLib -lc -lm
+LIBS = -lphast -framework Accelerate -lc -lm

In addition, the FSHIFT macro in the file src/util/clean_genes.c should be renamed (e.g., to FRAMESHIFT) because the Accelerate framework defines a macro with the same name but a different meaning.

The changes described above are included in my fork of the original PHAST repository on GitHub: https://github.com/gtamazian/phast.
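For readers who prefer to patch their own checkout, the three edits can be sketched as plain text substitutions in Python. This is a minimal illustration rather than the method used to prepare the fork. The function names are hypothetical, and the whole-word rename assumes that FSHIFT appears in src/util/clean_genes.c only as a standalone identifier that should be renamed.

```python
import re
from pathlib import Path


def patch_external_libs(text):
    """Switch the LAPACK header from vecLib to Accelerate."""
    return text.replace("#include <vecLib/clapack.h>",
                        "#include <Accelerate/Accelerate.h>")


def patch_make_include(text):
    """Link against the Accelerate framework instead of vecLib."""
    return text.replace("-framework vecLib", "-framework Accelerate")


def rename_fshift(text, new_name="FRAMESHIFT"):
    """Rename FSHIFT where it occurs as a whole word.

    Longer identifiers that merely contain FSHIFT are left untouched.
    """
    return re.sub(r"\bFSHIFT\b", new_name, text)


def patch_file(path, patch):
    """Apply a text-patching function to a file in place."""
    path = Path(path)
    path.write_text(patch(path.read_text()))
```

From the root of the PHAST source tree, one would call patch_file("include/external_libs.h", patch_external_libs), patch_file("src/make-include.mk", patch_make_include), and patch_file("src/util/clean_genes.c", rename_fshift). Reviewing the resulting diff before compiling is advisable.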

Creating GIF animations of protein molecules with PyMOL

PyMOL is an open-source molecular visualization system useful for producing high-quality figures of protein structures. Besides static figures, PyMOL can also generate animations with the mpng command, which writes movie frames to separate files. However, mpng provides no options to customize the produced images. Here we describe an approach to get a customized looped animation in PyMOL and present a Python script implementing it. The script is based on Maximilian Ebert’s solution from the PyMOL mailing list.
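As a rough illustration of the general idea, and not the script presented in this post or Maximilian Ebert’s solution, frames can be rendered one at a time with PyMOL's png command, which accepts width, height, dpi and ray options, while rotating the view with turn between frames. The hypothetical helper below only generates the text of such a PyMOL command script. The frame count, rotation axis, image size and file prefix are all assumed values.

```python
def rotation_script(n_frames=36, axis="y", width=640, height=480,
                    dpi=150, prefix="frame"):
    """Build PyMOL commands that render one full turn as PNG frames.

    Each frame is ray-traced with the png command, then the view is
    rotated by 360 / n_frames degrees around the given axis.
    """
    if n_frames < 1:
        raise ValueError("n_frames must be positive")
    angle = 360.0 / n_frames
    commands = []
    for i in range(n_frames):
        commands.append(
            f"png {prefix}{i:04d}.png, width={width}, "
            f"height={height}, dpi={dpi}, ray=1"
        )
        commands.append(f"turn {axis}, {angle:g}")
    return "\n".join(commands) + "\n"
```

The returned text can be saved to a file such as frames.pml (a hypothetical name) and run in batch mode with pymol -cq structure.pdb frames.pml. The numbered PNG files can then be combined into a looped GIF with an external tool, for example ImageMagick: convert -delay 10 -loop 0 frame*.png animation.gif.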

Blastn equivalents of MegaBLAST options

MegaBLAST is a legacy sequence alignment tool optimized for rapid processing of long but slightly different nucleotide sequences. It is a part of the NCBI C Toolkit, whose last version was included in the NCBI BLAST+ 2.2.26 package released in March 2012. In subsequent NCBI BLAST+ releases, MegaBLAST was replaced with the blastn tool.

Although blastn and MegaBLAST implement nearly the same alignment algorithm, their command-line options differ. Here we describe the blastn equivalents of MegaBLAST options.
