Parallel and Distributed Processing Symposium, International
Download PDF

Abstract

Algorithm development for finding quasiperiodic regions in sequences is at the core of many problems arising in biological sequence analysis. We solve an important problem in this area. Let {\rm A} be an alphabet of size \eta and {\rm A}^\iota denote the set of sequences of length \iota over {\rm A}. Given a sequence S=s_1s_2\cdots s\iota \in {\rm A}^\iota, a positive integer p is called a period of S if s_i=s_{i+p} for 1\le i\le \iota -p. S is called p-periodic if it has a minimum period p. Let \Omega _\iota (p) denote the set of p-periodic sequences in {\rm A}^\iota. A natural measure of "nearness to p-periodcity" for S is the average Hamming distance to the nearest p-periodic sequence: D(S)=\min _{{\rm T}\in \Omega \iota (p)}D(S,T). If T is a sequence \in \Omega \iota (p) such that D(S,T)=D(S), then T is called a nearest p-periodic sequence of S and S is called p-quasiperiodic associated with the score D(S). This paper develops an efficient algorithm for finding a nearest p-periodic sequence of S by means of its modulo-p \beta =(\underbrace {q+1,\cdots ,q+1_r, \underbrace q,\cdots ,q}_{p-r}, where \iota =\alpha _1+\alpha _2+\cdots +\alpha _n is a partition of \iota and q is the quotient and r is the remainder when \iota is devided by p. This paper shows that there exists a sequence in {\rm A}^\iota whose modulo-p incidence matrix has row sum vector \alpha and column sum vector \beta.
Like what you’re reading?
Already a member?
Get this article FREE with a new membership!