Montag, 31. August 2009

Mappable Map

Back from vacation. This three liner creates a mappable map of the genome. It takes about a day on our new 16 core machine that I wanted to stress a little bit. splitter is of course from emboss. bowtie is used for quick exact matching.

for fasta in $(ls ${FASTADIR}*.fa);
do splitter $fasta raw::stdout -size 36 -overlap 35 -auto | $BOWTIE -r -v 0 -m 1 -p 16 $BOWTIEINDEX - | cut -f 1 | sort -n | awk -v fa=$(basename $fasta .fa) 'BEGIN{chr="chr" substr(fa,2)}{ if(NR==1){firstpos=$1;lastpos=$1};if($1 > lastpos + 1){ print chr "\t" firstpos "\t" lastpos; firstpos = $1 } lastpos = $1 }END{ print chr "\t" firstpos  "\t" lastpos}' > $(basename $fasta .fa).bed;

Keine Kommentare:

Kommentar veröffentlichen