Loading new genomes
Selecting “New” from the main menu takes you to the new project page, where you can import two genomes to compare.
Sequence files
You can add the top and bottom genome files with one of:
dragging and dropping from your file browser into the box
typing the full path to the file in the box
typing an accession in the box. This can be a GenBank/RefSeq sequence accession, or an assembly accession starting with
GCA_ or GCF_.
Ziplign will first check if what you put in the box is a file on your computer.
If it is not, then it checks whether it “looks like” an accession. This
means that it starts with GCA_, GCF_, AC_, NC_, NG_,
NT_, NW_, NZ_, or that it starts with two letters followed by at
least six digits and then anything else, for example CP039850.1.
If it looks like an accession, Ziplign will try to download
the sequence and annotation. The accession is not validated beyond this
pattern check, so providing one that does not exist will result in an error
when the download is attempted. The check is intentionally permissive (two
letters plus at least six digits, followed by anything) so as not to rule out
real accessions, because it is not trivial to specify exactly what counts as
a real accession.
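The “looks like an accession” rule described above can be illustrated with a short Python sketch. This is not Ziplign's own code: the function name looks_like_accession is hypothetical, and accepting lower-case letters and trimming surrounding whitespace are assumptions made for the example.

```python
import re

# Prefixes listed in the help text above.
KNOWN_PREFIXES = ("GCA_", "GCF_", "AC_", "NC_", "NG_", "NT_", "NW_", "NZ_")

# Two letters, then at least six digits, then anything else.
# Accepting lower-case letters is an assumption of this sketch.
TWO_LETTERS_SIX_DIGITS = re.compile(r"[A-Za-z]{2}[0-9]{6,}")


def looks_like_accession(text: str) -> bool:
    """Return True if text matches the permissive accession pattern."""
    text = text.strip()  # assumption: ignore stray whitespace
    if text.startswith(KNOWN_PREFIXES):
        return True
    # re.match anchors at the start only, so any suffix is allowed.
    return TWO_LETTERS_SIX_DIGITS.match(text) is not None
```

With this sketch, CP039850.1 and anything beginning with GCF_ pass the check, while a name such as genome.fasta does not. Passing the check says nothing about whether the accession really exists.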
Ziplign automatically detects the format (and any compression) of each sequence file based on its contents, not the name of the file.
Ziplign supports these file formats:
FASTA
FASTQ (loading sequencing reads is NOT recommended! The assumption is that you want to view contigs)
GFF3, as long as the sequence(s) are included in the file as well as the annotation. Ziplign will show the annotation features.
GenBank. Ziplign will use the sequence, but for now has partial support for the annotation. It will load the genes only, and no other features.
EMBL. Ziplign will use the sequence, but for now has partial support for the annotation. It will load the genes only, and no other features.
Ziplign can read uncompressed files, and files compressed with gzip, bzip2 and xz.
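As an illustration of detecting format and compression from contents rather than from the file name, here is a minimal Python sketch. It is not how Ziplign is known to do it: the function names are hypothetical, and the first-line cues used for each sequence format are simplifying assumptions. The gzip, bzip2 and xz signatures are the standard leading bytes of those formats.

```python
import bz2
import gzip
import lzma

# Standard leading "magic" bytes of each compression format.
MAGIC = [
    (b"\x1f\x8b", "gzip"),
    (b"BZh", "bzip2"),
    (b"\xfd7zXZ\x00", "xz"),
]
OPENERS = {"gzip": gzip.open, "bzip2": bz2.open, "xz": lzma.open, "none": open}


def detect_compression(head: bytes) -> str:
    """Guess the compression from the first few bytes of a file."""
    for magic, name in MAGIC:
        if head.startswith(magic):
            return name
    return "none"


def detect_format(first_line: str) -> str:
    """Guess the sequence format from the first line (assumed cues)."""
    if first_line.startswith(">"):
        return "FASTA"
    if first_line.startswith("@"):
        return "FASTQ"
    if first_line.startswith("##gff-version"):
        return "GFF3"
    if first_line.startswith("LOCUS"):
        return "GenBank"
    if first_line.startswith("ID "):
        return "EMBL"
    return "unknown"


def sniff(path):
    """Return (compression, format) for the file at path."""
    with open(path, "rb") as handle:
        compression = detect_compression(handle.read(6))
    # Reopen through the matching decompressor and read one text line.
    with OPENERS[compression](path, "rt") as handle:
        return compression, detect_format(handle.readline())
```

Because only the contents are inspected, a gzipped FASTA file saved as, say, contigs.txt would still be reported as ("gzip", "FASTA") by this sketch.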
BLAST options
Ziplign runs blastn to compare the genomes. You can add options to the
blastn call. Please note that these options are NOT sanity checked and
are simply added onto the end of the blastn call.
Please do not use any options relating to file input or output, since they will
cause Ziplign to crash. Specifically, Ziplign already uses the options -db,
-query, -out, and -outfmt. Do not try to change them.
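Since the extra options are not checked, it can help to check them yourself before pasting them into the box. The Python sketch below is one way to do that. It is not part of Ziplign: the function names are hypothetical, and the -outfmt value of 6 is an assumption used only to make the example command complete.

```python
import shlex

# Options the help text says Ziplign already sets itself.
RESERVED = {"-db", "-query", "-out", "-outfmt"}


def reserved_options_used(extra_options: str) -> list[str]:
    """Return any reserved options found in the extra options string."""
    return [tok for tok in shlex.split(extra_options) if tok in RESERVED]


def build_blastn_command(query, db, out, extra_options="-evalue 0.1"):
    """Build a blastn argument list with the extra options appended last."""
    clash = reserved_options_used(extra_options)
    if clash:
        raise ValueError(f"options already set by Ziplign: {clash}")
    # "-outfmt 6" is an assumption; the format Ziplign uses is not documented here.
    base = ["blastn", "-query", query, "-db", db, "-out", out, "-outfmt", "6"]
    return base + shlex.split(extra_options)
```

For example, reserved_options_used("-evalue 0.1 -out hits.tsv") returns ["-out"], which signals that the string should not be used as is.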
The default in the BLAST options box is -evalue 0.1,
to remove obviously short/unlikely matches. This is different from the
BLASTN default, which is a much more permissive e-value cutoff of 10.
If you want the BLASTN default instead, then delete the text from the box, so
that it is empty and the default BLAST settings are used.
It is beyond the scope of this help to go into the details of the extra
BLASTN options. Run blastn -help in a terminal to see the full help.
Start processing
Once the genome files are provided, the start button will change from disabled to enabled. Press it and Ziplign will then:
check the input genome files exist
import the genomes
make a BLAST index of the bottom genome
run BLAST using the top genome as the query against the bottom genome database
import the BLAST results
switch to the genome comparison view
Progress is reported as text at the bottom of the window. If anything goes wrong, error messages will appear there.
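The indexing and search steps above correspond roughly to two BLAST+ commands, sketched below in Python. This is a sketch under stated assumptions and not Ziplign's actual implementation: the function name, the file names bottom_db and hits.tsv, and the -outfmt value are hypothetical, and it assumes both genomes are already available as FASTA files. The commands are only built here, not run.

```python
from pathlib import Path


def plan_comparison(top, bottom, workdir, extra_options=()):
    """Return the two commands: index the bottom genome, then query with the top."""
    workdir = Path(workdir)
    db = workdir / "bottom_db"   # hypothetical database name
    hits = workdir / "hits.tsv"  # hypothetical results file
    make_index = [
        "makeblastdb", "-in", str(bottom), "-dbtype", "nucl", "-out", str(db),
    ]
    run_blast = [
        "blastn", "-query", str(top), "-db", str(db),
        "-out", str(hits), "-outfmt", "6",  # output format is an assumption
        *extra_options,
    ]
    return [make_index, run_blast]
```

Each returned list could be passed to subprocess.run in order. The top genome is always the query and the bottom genome is always the database, matching the steps listed above.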