The subsequence search finds sequences that contain a given sequence fragment or sequence motif specified in the query sequence. This type of search is useful for filtering by sequence parts that are responsible for, e.g., structural stability or biological activity. The search takes into account only exact matches within the sequences to be filtered.
The full sequence search filters on exact, full-length sequence matches. The query sequence matches only if it contains all the monomer abbreviations of the entire target sequence, that is, the two sequences are identical over their full length.
Table 1. Subsequence and full sequence searches
Query sequence | Target sequence | Hit: full sequence | Hit: subsequence |
---|---|---|---|
HCAYKAMGNMAMCAQRTPY | HCAYKAMGNMAMCAQRTPY (exact match) | | |
HCAYKAMGNMAMCAQRTPY | RPHCAYKAMGNMAMCAQRTPYGS (submatch) | | |
HCAYKAMGNMAMCAQRTPY | HCAYKAMGNMAMC V QRTPY (mutation) | | |
HCAYKAMGNMAMCAQRTPY | HCAYKAMGNMAMCAQRT G PY (insertion) | | |
HCAYKAMGNMAMCAQRTPY | HCAYKAMGNMA MCAQRTPY (deletion) | | |
HCAYKAMGNMAMCAQRTPY | HCAYKAMGNMAMCAQRTPY PY (duplication) | | |
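The difference between the two exact search types can be illustrated with a minimal Python sketch. It assumes that sequences are plain strings of one-letter monomer codes; the function names `full_sequence_match` and `subsequence_match` are hypothetical and are not part of the product. It only illustrates the matching rules described above, not how the search is actually implemented.

```python
def full_sequence_match(query: str, target: str) -> bool:
    """Hit only when the target equals the query over its full length."""
    return query == target


def subsequence_match(query: str, target: str) -> bool:
    """Hit when the query occurs as one contiguous, exact run inside the target."""
    return query in target


# Sequences taken from Table 1 (assumed to be one-letter monomer codes).
QUERY = "HCAYKAMGNMAMCAQRTPY"
TARGETS = {
    "exact match": "HCAYKAMGNMAMCAQRTPY",
    "submatch": "RPHCAYKAMGNMAMCAQRTPYGS",
    "mutation": "HCAYKAMGNMAMCVQRTPY",
}

for label, target in TARGETS.items():
    print(label, full_sequence_match(QUERY, target), subsequence_match(QUERY, target))
```

In this sketch the exact match satisfies both functions, the submatch satisfies only `subsequence_match`, and the mutated sequence satisfies neither.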
The distance-based similarity search makes it possible to filter sequence variants that differ in neighbouring motifs, monomer composition, and length.
To perform the search, set the distance by providing a number. This value is the number of differences between the query sequence and a target sequence.
Table 2. Distance-based similarity search
Query sequence | Target sequence | Hit: distance is exactly 1 | Hit: distance is exactly 2 | Hit: distance is up to 5 |
---|---|---|---|---|
HCAYKAMGNMAMCAQRTPY | HCAYKAMGNMAMCAQRTPY (wild type) | | | |
HCAYKAMGNMAMCAQRTPY | HCAYKAMGNMAMC V QRTPY (mutation) | | | |
HCAYKAMGNMAMCAQRTPY | HCAYKAMGNMAMCAQRTPY K (insertion) | | | |
HCAYKAMGNMAMCAQRTPY | HCAYKAMGNMA MCAQRTPY (deletion) | | | |
HCAYKAMGNMAMCAQRTPY | HCAYKAMGNMAMCAQRTPY PY (duplication) | | | |
HCAYKAMGNMAMCAQRTPY | HC W KAMGNMAMC V QRTPY (mutation) | | | |
HCAYKAMGNMAMCAQRTPY | HCAY P KAMGNMAMCAQRTPY K (insertion) | | | |
HCAYKAMGNMAMCAQRTPY | HCAYKAMGNMAMCAQRTPY QRTPY (duplication) | | | |
HCAYKAMGNMAMCAQRTPY | HCAYK HCAYK AMGNMAMCAQRTPY (duplication) | | | |
HCAYKAMGNMAMCAQRTPY | M HCAYK YK AMG Q MAMCAQRTP Y (all types of changes) | | | |
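The text defines the distance only as "the number of differences" and does not name a metric. The sketch below assumes the Levenshtein (edit) distance, which counts single-monomer substitutions, insertions, and deletions; the actual metric used by the search may differ. The names `edit_distance` and `distance_search` are hypothetical, and sequences are again assumed to be strings of one-letter monomer codes.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance (assumed metric): the minimum number of
    single-monomer substitutions, insertions, and deletions turning a into b."""
    previous = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, monomer_a in enumerate(a, start=1):
        current = [i]
        for j, monomer_b in enumerate(b, start=1):
            cost = 0 if monomer_a == monomer_b else 1
            current.append(min(
                previous[j] + 1,         # delete monomer_a
                current[j - 1] + 1,      # insert monomer_b
                previous[j - 1] + cost,  # substitute, or keep on a match
            ))
        previous = current
    return previous[-1]


def distance_search(query: str, targets: list[str], distance: int,
                    exact: bool = True) -> list[str]:
    """Return targets whose distance to the query is exactly `distance`
    (exact=True) or at most `distance` (exact=False)."""
    hits = []
    for target in targets:
        d = edit_distance(query, target)
        if (d == distance) if exact else (d <= distance):
            hits.append(target)
    return hits


QUERY = "HCAYKAMGNMAMCAQRTPY"
TARGETS = [
    "HCAYKAMGNMAMCAQRTPY",       # wild type
    "HCAYKAMGNMAMCVQRTPY",       # one substitution
    "HCAYKAMGNMAMCAQRTPYK",      # one appended monomer
    "HCAYKAMGNMAMCAQRTPYPY",     # two appended monomers
    "HCAYKAMGNMAMCAQRTPYQRTPY",  # five appended monomers
]

print(distance_search(QUERY, TARGETS, 1))
print(distance_search(QUERY, TARGETS, 5, exact=False))
```

Under this assumed metric, the "exactly 1" search returns the substituted and the one-monomer-longer variants, while the "up to 5" search returns all five targets, including the wild type at distance 0.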