Seq2Logo 2.0 - DTU Health Tech (2025)

Sequence logo generator


Seq2Logo is a web-based sequence logo generation method for construction and visualization of amino acid binding motifs and sequence profiles, including sequence weighting, pseudo counts and two-sided representation of amino acid enrichment and depletion.

Note that Seq2Logo by default includes a pseudo count correction for low counts. This means that the amino acid frequencies displayed in the sequence logos are corrected for a low number of observations using a Blosum amino acid similarity matrix. To turn this feature off, the Weight on prior must be set to zero.

CITATIONS

For publication of results, please cite:

  • Seq2Logo: a method for construction and visualization of amino acid binding motifs and sequence profiles including sequence weighting, pseudo counts and two-sided representation of amino acid enrichment and depletion. Martin Christen Frolund Thomsen; Morten Nielsen, Nucleic Acids Research 2012; 40 (W1): W281-W287.

NOTE

For big submissions, please keep in mind that the computation time grows much faster than linearly: judging from the timings below, it roughly quadruples each time the number of sequences doubles. E.g.
A job of 10000 sequences with a sequence length of 38 takes about 20 seconds.
A job of 20000 sequences with a sequence length of 38 takes about 80 seconds.
A job of 40000 sequences with a sequence length of 38 takes about 6 minutes.
A job of 80000 sequences with a sequence length of 38 takes about 24 minutes.
etc.
If you submit too large an alignment, the job might not finish within the server's time limit of 2 hours. To get results from such large submissions, you can download a local version of Seq2Logo and run it on your own machine.
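
The quoted timings grow about fourfold for every doubling of the number of sequences, which is what quadratic scaling would give. The sketch below extrapolates from the 10000-sequence data point under that assumption, to judge whether a job is likely to fit within the 2 hour limit. The function names are hypothetical and not part of Seq2Logo, and the estimate only applies to sequences of length 38.

```python
# Rough runtime estimate, ASSUMING time grows with the square of the number
# of sequences (fitted by eye to the timings quoted above, length 38 only).
REFERENCE_SEQUENCES = 10_000         # reference job size from the note
REFERENCE_SECONDS = 20.0             # quoted runtime for the reference job
SERVER_LIMIT_SECONDS = 2 * 60 * 60   # server time limit of 2 hours


def estimate_runtime_seconds(n_sequences: int) -> float:
    """Extrapolate the runtime quadratically from the reference data point."""
    scale = n_sequences / REFERENCE_SEQUENCES
    return REFERENCE_SECONDS * scale ** 2


def fits_server_limit(n_sequences: int) -> bool:
    """True if the estimated runtime stays within the 2 hour limit."""
    return estimate_runtime_seconds(n_sequences) <= SERVER_LIMIT_SECONDS


if __name__ == "__main__":
    for n in (10_000, 20_000, 40_000, 80_000, 160_000, 320_000):
        print(n, round(estimate_runtime_seconds(n)), fits_server_limit(n))
```

The estimate is slightly below the quoted figures for 40000 and 80000 sequences, so it should be read as an optimistic lower bound rather than a guarantee.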

Usage Instructions

The user interface of Seq2Logo is split into three parts: submission, graphical layout and advanced settings.

Submission

In the submission part the user can:

  1. Upload their alignment file, either by copy/paste or by choosing a local file.
  2. Specify the logo type, either Shannon, Kullback-Leibler, Weighted Kullback-Leibler, Probability Weighted Kullback-Leibler or PSSM-Logo.
  3. Choose which kind of sequence weighting should be used to reduce sequence redundancy.
  4. If the Hobohm algorithm is chosen, the user can also specify the similarity threshold above which two sequences are deemed similar (1 equals 100% identity; the default is 63%).
  5. Assign the weight on prior value that should be used to adjust for a small alignment file (recommended for datasets with fewer than 50 sequences).
  6. Type the unit of the Y-axis. (It is important to note that MSA and raw peptide input data will always be calculated as bit content. *)
  7. Choose additional output formats for the logo file.

* Shaner et al. give a good description of the information content (the bit content) in their paper 'Sequence Logos: A Powerful, Yet Simple, Tool'.
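
To illustrate how a similarity threshold can be used to reduce sequence redundancy, here is a minimal sketch of Hobohm-style clustering. It is a simplification under stated assumptions, not Seq2Logo's implementation: sequences are assumed to be pre-aligned and of equal length, identity is the fraction of matching positions, each sequence joins the first cluster whose representative it matches at or above the threshold, and each sequence is weighted by one over its cluster size. All function names are hypothetical.

```python
def identity(a: str, b: str) -> float:
    """Fraction of aligned positions at which two equal-length sequences match."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    if not a:
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cluster_weights(sequences: list[str], threshold: float = 0.63) -> list[float]:
    """Greedy clustering; returns one weight per input sequence.

    ASSUMPTION: weight = 1 / (size of the cluster the sequence falls in).
    """
    clusters: list[list[int]] = []  # each cluster holds indices into `sequences`
    for i, seq in enumerate(sequences):
        for cluster in clusters:
            representative = sequences[cluster[0]]
            if identity(seq, representative) >= threshold:
                cluster.append(i)
                break
        else:
            # No existing cluster is similar enough: start a new one.
            clusters.append([i])

    weights = [0.0] * len(sequences)
    for cluster in clusters:
        for i in cluster:
            weights[i] = 1.0 / len(cluster)
    return weights


if __name__ == "__main__":
    print(cluster_weights(["ACDEFGHIKL", "ACDEFGHIKV", "WWWWWWWWWW"]))
```

With the default threshold, the first two sequences (90% identical) share a cluster and each receive half the weight of the unrelated third sequence.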

NOTE: As stated above the paste field, Seq2Logo supports the following alignment formats: Peptides, Fasta, Clustal, weight matrices and frequency tables.

FORMAT DESCRIPTIONS:

The Peptide format is a file where each line is a new peptide sequence; only the amino acid and gap symbols are accepted.

The Fasta format is a file where '>' marks the header line, and all following lines compose the sequence belonging to the header. Only the amino acid and gap symbols are accepted in the sequence.
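
A minimal sketch of reading the Fasta format as described above may help when preparing input. It is not Seq2Logo's own parser. The accepted symbol set is an assumption: the 20 standard amino acid letters plus '-' as the gap symbol; the server may accept more.

```python
# ASSUMPTION: 20 standard amino acids and '-' as the only gap symbol.
ALLOWED_SYMBOLS = set("ACDEFGHIKLMNPQRSTVWY-")


def parse_fasta(text: str) -> dict[str, str]:
    """Return {header: sequence}; sequence lines after a header are joined."""
    records: dict[str, str] = {}
    header = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            header = line[1:].strip()
            records[header] = ""
        elif header is None:
            raise ValueError("sequence data found before the first '>' header")
        else:
            sequence = line.upper()
            unexpected = set(sequence) - ALLOWED_SYMBOLS
            if unexpected:
                raise ValueError(f"unexpected symbols: {sorted(unexpected)}")
            records[header] += sequence
    return records


if __name__ == "__main__":
    print(parse_fasta(">s1\nACD\nEF-\n>s2\nACDEFG\n"))
```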

The CLUSTAL format is a file where the data is separated into two or three columns: the first column contains the sequence name, the second column contains the sequence, and the optional third column contains the position number of the last amino acid.

The PSSM format is a file where the data is stored in a weight matrix. Seq2Logo accepts a few different variants of this format.
Common to all PSSM variants are the optional header line (starting with 'Last position-specific scoring matrix...') and the required amino acid header line (this can now contain other characters if the PSSM-logo is chosen).
Regarding the weights in the PSSM, only numbers (integers, floats and scientific notation) are allowed.

The Blast Matrix: Special format.

Simple Weight Matrix: This is the simplest of the weight matrices, with only the weights provided. (Note: These weights cannot be integers!)

Weight Matrix w/ position: This is the same as the simple matrix, but with the first column specifying the position. (Note: Integers allowed!)

Weight Matrix w/ position and consensus sequence: This is the same as the position matrix, but with an additional column specifying the consensus sequence. (Note: This extra column is not used by Seq2Logo; it is only allowed for convenience.)

Special Weight Matrix: This is a stripped-down version of the simple matrix that allows the user to specify symbols other than amino acids, e.g. gaps. (Note: This matrix can only be used with the PSSM-logo option, and it must contain at least 3 and at most 20 characters!)

The Frequency format is identical to a PSSM matrix, except that the weights/frequencies sum to 1.00 per position (up to 2% inaccuracy allowed) and, of course, no weight/frequency is negative.
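
The two conditions that distinguish a frequency table from a PSSM can be checked before submission. The sketch below is an illustration only: it assumes the 2% inaccuracy is an absolute tolerance of 0.02 on each row sum, which the text does not state explicitly, and the function names are hypothetical.

```python
def is_frequency_row(row: list[float], tolerance: float = 0.02) -> bool:
    """True if no value is negative and the row sums to 1.00 within tolerance.

    ASSUMPTION: "2% inaccuracy" is read as an absolute deviation of 0.02.
    """
    if any(value < 0 for value in row):
        return False
    return abs(sum(row) - 1.0) <= tolerance


def is_frequency_table(rows: list[list[float]], tolerance: float = 0.02) -> bool:
    """True if the table is non-empty and every position (row) qualifies."""
    return bool(rows) and all(is_frequency_row(row, tolerance) for row in rows)


if __name__ == "__main__":
    print(is_frequency_table([[0.5, 0.3, 0.2], [0.7, 0.2, 0.11]]))
    print(is_frequency_table([[0.6, 0.5, -0.1]]))
```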

Graphical Layout

In the graphical layout part the user can:

  1. Assign the number of stacks per line.
  2. Assign the number of lines per page.
  3. Set the resolution of the image. For convenience, a dropdown menu has been provided with some standard formats to choose from. *
  4. Assign a logo title. (This is optional.)
  5. Specify the layout of the graph. **
  6. Choose a coloring scheme from the list, or assign the colors of the individual amino acids manually.
*Feel free to send an email request if you want additional formats added.
**This field allows you to really customize your logo.

Advanced Settings

In the advanced settings part the user can:

  1. Set the minimum width for stacks with gaps. * **
  2. Set the position number of the first amino acid in the alignment.**
  3. Set the frequency with which the position numbers are shown on the X-axis. ***
  4. Set a segment range, if only a part of the full alignment is wanted. ****
  5. Set the Y-axis range. This option allows the user to manually set the Y-axis maximum and minimum values, which makes it easier to compare several logos with each other. *****
  6. Upload separate substitution frequency matrix.
  7. Upload separate Background frequency file (distribution of amino acids).
*If set to 1 there is no width adjustment of the stacks to show positions where gaps occur.
**This feature is meant for MSA and raw peptide formats only.
***If the value is set to 1, all position numbers are shown. If the value is left out or set to 0, the interval is determined automatically.
****Use the following format: "start-end", e.g. 5-56
*****Use the following format: "Ymin:Ymax", e.g. -4.32:4.32
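
The two range formats in the footnotes above can be validated before they are typed into the form. This is a minimal sketch; the function names are hypothetical, and the extra checks (start not after end, Ymin below Ymax) are assumptions about what a sensible range is rather than documented server rules.

```python
def parse_segment(text: str) -> tuple[int, int]:
    """Parse a segment range written as "start-end", e.g. "5-56"."""
    start_text, end_text = text.strip().split("-", 1)
    start, end = int(start_text), int(end_text)
    if start > end:
        raise ValueError("segment start must not exceed segment end")
    return start, end


def parse_y_range(text: str) -> tuple[float, float]:
    """Parse a Y-axis range written as "Ymin:Ymax", e.g. "-4.32:4.32"."""
    ymin_text, ymax_text = text.strip().split(":", 1)
    ymin, ymax = float(ymin_text), float(ymax_text)
    if ymin >= ymax:
        raise ValueError("Ymin must be smaller than Ymax")
    return ymin, ymax


if __name__ == "__main__":
    print(parse_segment("5-56"))
    print(parse_y_range("-4.32:4.32"))
```

The colon separator in the Y-axis format is what lets a negative minimum such as -4.32 be written without ambiguity.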

Implementation of easy access to Seq2Logo from other servers

Instructions are available on how to transfer alignment files easily from your program or webpage to Seq2Logo.

Output Format

DESCRIPTION

Once the Seq2Logo server has finished running the job you submitted, it will show an image file containing the logo. This logo describes the information content of the alignment file you submitted.

  1. The Y-axis describes the amount of information in bits*.
  2. The X-axis shows the position in the alignment.
  3. At each position there is a stack of symbols representing the amino acids. Large symbols represent frequently observed amino acids, big stacks represent conserved positions and small stacks represent variable positions.
  4. The chosen formats, plus the raw eps and the weight matrix, are downloadable through the links in the top left corner.
  5. Clicking the "show" link next to the "Warning!" sign shows a list of the warnings. This tells the user whether any problems occurred that might compromise the quality of the logo.
  6. Clicking the "show" link next to the "Settings:" sign shows a list of the user-specified settings that were used in the creation of the logo.

* You can rename the Y-axis unit to whatever you prefer, but for all logos except the PSSM-logo the true unit is the bit content.
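
To make the bit scale concrete, the sketch below computes the textbook Shannon information content of a single position and the resulting symbol heights. It is a simplified illustration, not Seq2Logo's implementation: it assumes a 20-letter amino acid alphabet with a uniform background and applies no sequence weighting or pseudo count correction. The function names are hypothetical.

```python
import math


def shannon_information(freqs: dict[str, float], alphabet_size: int = 20) -> float:
    """Information content in bits: log2(alphabet size) minus the entropy."""
    entropy = -sum(p * math.log2(p) for p in freqs.values() if p > 0)
    return math.log2(alphabet_size) - entropy


def symbol_heights(freqs: dict[str, float], alphabet_size: int = 20) -> dict[str, float]:
    """Height of each symbol: its frequency times the stack's information."""
    information = shannon_information(freqs, alphabet_size)
    return {symbol: p * information for symbol, p in freqs.items()}


if __name__ == "__main__":
    conserved = {"A": 1.0}
    mixed = {"A": 0.5, "G": 0.5}
    print(round(shannon_information(conserved), 2))
    print(round(shannon_information(mixed), 2))
    print(symbol_heights(mixed))
```

Under these assumptions a fully conserved position reaches log2(20), about 4.32 bits, which is the largest stack height such a logo can show.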


EXAMPLE OUTPUT

There are multiple logo types, which influence the visual output.

There are also a few methods which influence the visual output.

An alignment with gaps is handled by ignoring the gaps when calculating frequencies, and by shrinking the width of the stack according to the gap percentage.
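
The gap handling described above can be sketched for a single alignment column as follows. Two details are assumptions made for illustration: '-' is taken as the gap symbol, and the stack width is taken to shrink linearly with the gap fraction, bounded below by a minimum width in the spirit of the advanced setting (a minimum of 1 disables the adjustment). The function name is hypothetical.

```python
from collections import Counter


def column_profile(column: str, gap: str = "-", min_width: float = 0.0):
    """Return (frequencies ignoring gaps, relative stack width) for one column.

    ASSUMPTION: width = max(min_width, fraction of non-gap symbols).
    """
    residues = [symbol for symbol in column if symbol != gap]
    counts = Counter(residues)
    total = len(residues)
    # Frequencies are computed over non-gap symbols only.
    freqs = {symbol: n / total for symbol, n in counts.items()} if total else {}
    non_gap_fraction = total / len(column) if column else 0.0
    width = max(min_width, non_gap_fraction)
    return freqs, width


if __name__ == "__main__":
    print(column_profile("AAG-"))
    print(column_profile("AAG-", min_width=1.0))
```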

The Shannon Logo


Article Abstract


Seq2Logo: A method for construction and visualization of amino acid binding motifs and sequence profiles including sequence weighting, pseudo counts and two-sided representation of amino acid enrichment and depletion.

Martin Christen Frolund Thomsen and Morten Nielsen, DTU Health Tech, Technical University of Denmark, DK-2800 Kgs Lyngby, Denmark.

Seq2Logo is a web-based sequence logo generator. Sequence logos are a graphical representation of the information content stored in a multiple sequence alignment (MSA) and provide a compact and highly intuitive representation of the position-specific amino acid composition of binding motifs, active sites, etc. in biological sequences. Accurate generation of sequence logos is often compromised by sequence redundancy and low number of observations. Moreover, most methods available for sequence logo generation focus on displaying the position-specific enrichment of amino acids, discarding the equally valuable information related to amino acid depletion. Seq2Logo aims at resolving these issues allowing the user to include sequence weighting to correct for data redundancy, pseudo counts to correct for low number of observations and different logotype representations each capturing different aspects related to amino acid enrichment and depletion. Besides allowing input in the format of peptides and MSA, Seq2Logo accepts input as Blast sequence profiles, providing easy access for non-expert end-users to characterize and identify functionally conserved/variable amino acids in any given protein of interest. The output from the server is a sequence logo and a PSSM. Seq2Logo is available at http://www.cbs.dtu.dk/biotools/Seq2Logo.


Software Downloads

  • Version 2.1
    • all
  • Version 2.0
    • all

GETTING HELP

If you need help regarding technical issues (e.g. errors or missing results) contact Technical Support. Please include the name of the service and version (e.g. NetPhos-4.0) and the options you have selected. If the error occurs after the job has started running, please include the JOB ID (the long code that you see while the job is running).

If you have scientific questions (e.g. how the method works or how to interpret results), contact Correspondence.

