Converting an AGP file to the BED format

The AGP format is used to describe the assembly structure in the NCBI Genome database. Since AGP is a plain-text tabular format that specifies the positions of smaller sequence objects on larger ones (e.g., contigs on scaffolds), AGP files can be converted to the BED format for further processing.
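
As a rough illustration of the idea, the sketch below maps AGP component lines to BED6 lines. It is a minimal example and not the post's script: the function names `agp_line_to_bed` and `agp_to_bed` are hypothetical, gap lines (component types `N` and `U`) are simply skipped, and any orientation other than `+` or `-` is written as `.`.

```python
def agp_line_to_bed(line):
    """Return a BED6 line for one AGP component line, or None for comments and gaps."""
    if line.startswith("#") or not line.strip():
        return None
    fields = line.rstrip("\n").split("\t")
    obj, obj_beg, obj_end, _part_number, comp_type = fields[:5]
    if comp_type in ("N", "U"):
        # Gap lines carry no component ID, so this sketch drops them (an assumption).
        return None
    comp_id, orientation = fields[5], fields[8]
    strand = orientation if orientation in ("+", "-") else "."
    # AGP coordinates are 1-based and inclusive; BED starts are 0-based, ends exclusive.
    return "\t".join([obj, str(int(obj_beg) - 1), obj_end, comp_id, "0", strand])


def agp_to_bed(lines):
    """Yield BED lines for every component line in an iterable of AGP lines."""
    for line in lines:
        bed = agp_line_to_bed(line)
        if bed is not None:
            yield bed
```

Only the start coordinate changes in the conversion, because a 1-based inclusive end and a 0-based exclusive end are the same number.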


Combining a large number of VCF files

The bcftools and vcftools packages provide routines for merging or concatenating multiple VCF files. However, specifying a large number of input VCF files may cause the run to fail, because the operating system limits how many files a process can keep open at once. This problem can be overcome by combining the files iteratively: first, pairs of the original VCF files are processed, then pairs of the resulting files, and so on until a single VCF file remains.

Here we describe an iterative scheme for merging or concatenating VCF files using bcftools and GNU parallel and present a Python script that implements it.
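
The pairwise scheme can be sketched as follows. This is a minimal illustration under stated assumptions, not the post's script: `reduce_pairwise` and `bcftools_merge_pair` are hypothetical names, the intermediate file naming is invented, the pairs are processed sequentially rather than with GNU parallel, and the inputs are assumed to be bgzip-compressed and indexed as `bcftools merge` requires.

```python
import subprocess


def reduce_pairwise(items, combine):
    """Combine items in pairs, round by round, until a single item is left."""
    items = list(items)
    if not items:
        raise ValueError("no input files")
    round_no = 0
    while len(items) > 1:
        round_no += 1
        next_items = []
        for i in range(0, len(items) - 1, 2):
            next_items.append(combine(items[i], items[i + 1], round_no, i // 2))
        if len(items) % 2:
            # An unpaired last item is carried over to the next round unchanged.
            next_items.append(items[-1])
        items = next_items
    return items[0]


def bcftools_merge_pair(left, right, round_no, index):
    """Merge two indexed VCF files with bcftools and index the result."""
    out = f"merged.r{round_no}.{index}.vcf.gz"  # hypothetical naming scheme
    subprocess.run(["bcftools", "merge", "-O", "z", "-o", out, left, right], check=True)
    subprocess.run(["bcftools", "index", out], check=True)
    return out
```

Calling `reduce_pairwise(vcf_paths, bcftools_merge_pair)` would then open only two input files per bcftools invocation, at the cost of roughly log2(n) rounds of intermediate files.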


Restoring models in protein structure files by Swiss PDB Viewer

Despite its name, Swiss PDB Viewer implements a number of features besides visualization of protein molecules. One such feature is side-chain reconstruction for protein structures that contain only backbone atoms. However, Swiss PDB Viewer does not write model records to its output PDB files, which may cause problems with other PDB-processing programs.

In this post, we present a Python script that adds proper model records to a PDB file produced by Swiss PDB Viewer.
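
To give a feel for what such a fix involves, here is a minimal sketch that wraps coordinate blocks in MODEL/ENDMDL records. It is not the post's script, and it rests on an assumption that is not confirmed here: that consecutive models in the file are separated by END lines. The function name `add_model_records` is hypothetical, and real Swiss PDB Viewer output may need different handling.

```python
COORD_RECORDS = ("ATOM", "HETATM")


def add_model_records(lines):
    """Wrap each END-delimited coordinate block in MODEL/ENDMDL records."""
    out, block, serial = [], [], 0

    def flush():
        nonlocal serial
        if block:
            serial += 1
            # The model serial number is right-justified in columns 11-14.
            out.append(f"MODEL     {serial:>4}")
            out.extend(block)
            out.append("ENDMDL")
            block.clear()

    for line in lines:
        line = line.rstrip("\n")
        if line.strip() == "END":
            # Assumption: END marks the boundary between two models.
            flush()
        elif block or line.startswith(COORD_RECORDS):
            block.append(line)
        else:
            # Header lines before the first coordinates stay outside any model.
            out.append(line)
    flush()
    out.append("END")
    return out
```

The result is a list of lines that can be written back to a file with one trailing END record.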


Command-line options of Isaac Variant Caller

Isaac Variant Caller implements a fast variant-calling algorithm and can be considered an alternative to the GATK or samtools variant callers. Unfortunately, it seems to have no manual describing its command-line options.

Here we list the Isaac Variant Caller command-line options, obtained from its source code, which is publicly available on GitHub.
