Adding rs numbers to VCF file

The VCF format provides a fixed field for a variant ID. It is recommended to use IDs from the NCBI dbSNP database (so-called rs numbers) for variants that have already been described there. Here we describe how to add rs numbers to a custom VCF file using the bcftools package.

Step 1. Obtain dbSNP VCF file

To add rs numbers to a VCF file, we need the dbSNP VCF file that contains those numbers. The file can be downloaded from the NCBI FTP server as described here.

For example, the VCF file of all human variants from the dbSNP build 147 on the GRCh37.p13 assembly can be obtained at the following location: ftp://ftp.ncbi.nlm.nih.gov/snp/organisms/human_9606_b147_GRCh37p13/VCF/All_20160408.vcf.gz.
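As a minimal sketch of this step, the snippet below downloads the file named above and builds a tabix index for it, since bcftools annotate expects the annotation file to be bgzip-compressed and indexed. The helper name fetch_dbsnp, the use of wget, and the choice to build the index locally instead of downloading one are assumptions made for illustration.

```shell
# URL quoted in the text above (dbSNP build 147, GRCh37.p13).
DBSNP_URL="ftp://ftp.ncbi.nlm.nih.gov/snp/organisms/human_9606_b147_GRCh37p13/VCF/All_20160408.vcf.gz"
# Local file name: everything after the last slash of the URL.
DBSNP_VCF="${DBSNP_URL##*/}"

# fetch_dbsnp is a hypothetical helper, not part of bcftools.
fetch_dbsnp() {
  # Download the file unless a non-empty local copy already exists.
  [ -s "$DBSNP_VCF" ] || wget "$DBSNP_URL" || return 1
  # Build the .tbi index locally unless one is already present.
  [ -s "$DBSNP_VCF.tbi" ] || tabix -p vcf "$DBSNP_VCF"
}

# The download is large, so the call is left commented out here.
# fetch_dbsnp
```

Running fetch_dbsnp a second time does nothing if both the VCF file and its index are already in the working directory.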

Step 2. (Optional) Remove existing IDs from VCF file

You may skip this step if you would like to preserve existing IDs in your VCF file. Otherwise, the existing variant IDs can be removed from the VCF file using the bcftools annotate tool with the --remove option.


bcftools annotate --output file.noids.vcf.gz --output-type z \
  --remove ID file.vcf.gz
tabix -p vcf file.noids.vcf.gz

Note that we use the --output-type option to produce a compressed (bgzipped) VCF file and apply tabix to index it for the next step.

Step 3. Add rs numbers from dbSNP VCF file

Finally, we use bcftools annotate with the --columns option to add the rs numbers to the VCF file.


bcftools annotate --annotations All_20160408.vcf.gz --columns ID \
  --output file.rsnum.vcf.gz --output-type z file.noids.vcf.gz
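To check the result, one option is to count how many records now carry an rs number. The sketch below uses a small awk helper instead of bcftools itself; the name id_summary is hypothetical, and it assumes an uncompressed VCF on standard input with the ID in the third tab-separated column, as the VCF format defines.

```shell
# Summarise the ID column (column 3) of an uncompressed VCF read from stdin.
# id_summary is a hypothetical helper name, not part of bcftools.
id_summary() {
  awk -F '\t' '
    /^#/        { next }              # skip header lines
    $3 ~ /^rs/  { rs++; next }        # ID starts with "rs"
    $3 == "."   { missing++; next }   # no ID assigned
                { other++ }           # any other identifier
    END { printf "rs=%d missing=%d other=%d\n", rs, missing, other }
  '
}

# Usage: decompress the annotated file and pipe it into the helper.
# gzip -dc file.rsnum.vcf.gz | id_summary
```

Running the same helper on file.noids.vcf.gz from Step 2 should report every record as missing, which gives a quick before-and-after comparison.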

2 thoughts on “Adding rs numbers to VCF file”

  1. When I try to use this command, it shows “[W::bcf_hdr_check_sanity] GL should be declared as Number=G”. How do I fix this?

    • It looks like there is something wrong with the header of your VCF files. You can extract the header using `bcftools view --header-only`, fix it in any text editor, and replace headers in the VCF files using `bcftools reheader`.
