What are fuzznuc and fuzzpro?

Here is an excerpt from an exchange on the EMBOSS mailing list that may be useful to us:

Subject: Re: [EMBOSS] question about 'fuzznuc' and 'fuzzpro'
> I know I can give a pattern like 'ACCGGT' and search against a file which contains multiple sequences. Is there a way I can specify 
> a 'pattern file' which contains multiple patterns that I want to search for instead of just one pattern each time? For example, I have
> a fileA which contains multiple DNA sequences. I want to create a fileB which contains 20 patterns and search each of them
> against the sequences in fileA. We are in the transition from GCG to EMBOSS, and the program 'findpatterns' in GCG can do this.
> But I couldn't find a corresponding EMBOSS program that does the same thing.

New in EMBOSS 4.0.0, contributed by Henrikki Almusa of Medicel in Helsinki.

fuzznuc (and fuzzpro and fuzztran) can now read a file of patterns using the command-line syntax:
fuzznuc @patternfile

You can also use @patternfile in response to the prompt for a pattern.
Here is an example pattern file with FASTA-style IDs and mismatch counts for each pattern:
>pat1
cggccctaaccctagcccta
>pat2 <mismatch=1>
cg(2)c(3)taac
cctagc(3)ta
>pat3
cggc{2,4}taac{2,5}

Here is a file with just the second pattern and no name (it will default to pattern1):
cg(2)c(3)taac
cctagc(3)ta
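
To make the layout of the two example files concrete, here is a minimal Python sketch that reads the same format. It is not the parser EMBOSS uses. It assumes that pattern lines under one header are joined into a single pattern, and that an unnamed pattern is labelled with a default base name plus its position in the file. The function name and the parameters default_name and default_mismatch are hypothetical.

```python
import re

# Matches the optional "<mismatch=N>" tag on a ">name" header line.
_MISMATCH = re.compile(r"<mismatch=(\d+)>")

# The first example pattern file from the message above.
EXAMPLE_FILE = """\
>pat1
cggccctaaccctagcccta
>pat2 <mismatch=1>
cg(2)c(3)taac
cctagc(3)ta
>pat3
cggc{2,4}taac{2,5}
"""


def parse_pattern_file(text, default_name="pattern", default_mismatch=0):
    """Return a list of (name, mismatch, pattern) tuples.

    Assumption: consecutive pattern lines belong to one pattern and are
    concatenated; EMBOSS itself may treat edge cases differently.
    """
    records = []    # each entry: [name or None, mismatch, list of pattern lines]
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
            match = _MISMATCH.search(header)
            mismatch = int(match.group(1)) if match else default_mismatch
            name = _MISMATCH.sub("", header).strip() or None
            current = [name, mismatch, []]
            records.append(current)
        else:
            if current is None:
                # Pattern lines with no header: an unnamed pattern.
                current = [None, default_mismatch, []]
                records.append(current)
            current[2].append(line)
    # Unnamed patterns get the default base name plus their 1-based position.
    return [
        (name or f"{default_name}{index}", mismatch, "".join(parts))
        for index, (name, mismatch, parts) in enumerate(records, start=1)
    ]
```

Under these assumptions, parse_pattern_file(EXAMPLE_FILE) yields three records, with pat2 carrying a mismatch count of 1 and the joined pattern cg(2)c(3)taaccctagc(3)ta. The headerless file yields a single record named pattern1.
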

You can set a default name with -pname and a default mismatch count with -pmismatch.
I note that we could document this better in the fuzz* program manual entries. We will do so for the 4.1 release.
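
As an illustration of how the @patternfile syntax and the -pname and -pmismatch options might be combined in a script, the Python sketch below only assembles an argument list; it does not run EMBOSS. The -sequence, -pattern and -outfile qualifiers and all file names are assumptions not taken from the message above, so check them against the fuzznuc documentation for your installed version. The helper name fuzznuc_command is hypothetical.

```python
def fuzznuc_command(sequence_file, pattern_file, outfile,
                    pname=None, pmismatch=None):
    """Build an argument list for a fuzznuc run that reads patterns from a file.

    Assumption: -sequence, -pattern and -outfile are the relevant qualifiers;
    only @patternfile, -pname and -pmismatch come from the mailing-list reply.
    """
    cmd = [
        "fuzznuc",
        "-sequence", sequence_file,
        "-pattern", "@" + pattern_file,   # '@' marks a file of patterns
        "-outfile", outfile,
    ]
    if pname is not None:
        cmd += ["-pname", pname]                # default name for unnamed patterns
    if pmismatch is not None:
        cmd += ["-pmismatch", str(pmismatch)]   # default mismatch count
    return cmd


# Hypothetical file names, following the fileA / fileB wording of the question.
example = fuzznuc_command("fileA.fasta", "fileB.pat", "hits.fuzznuc",
                          pname="site", pmismatch=1)
```

With EMBOSS installed, the resulting list could be passed to subprocess.run(example, check=True). Passing a list rather than a shell string avoids quoting problems with the @ prefix.
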