Abstract
Next Generation Sequencing (NGS) techniques now produce huge amounts of sequence data every day. Analyzing these data requires efficient methods that can handle large data volumes. A Self-Organizing Map (SOM) that uses the frequencies of N-tuples can categorize a set of DNA sequences through unsupervised learning. In this paper, an SOM that uses the correlation coefficient among nucleotides is proposed, and its performance is examined in experiments on mapping the genome sequences of several species.