Hi all,
I'm working in the field of bioinformatics, where I study a large number
of variations (N > 10E6) along the human genome among individuals affected
by a genetic disease. (*)
My data looks like this (much simplified!):
#CHROMOSOME  POSITION  GENE         PROPERTIES       SAMPLE1  SAMPLE2  SAMPLE3
chr1         987       GENE1,GENE3  score=1          null     A/A      A/G
chr1         988       GENE3        score=4;id=989   A/G      null     A/A
chr1         1988      null         score=4;id=989   C/G      null     C/A
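To make the layout concrete, one of the rows above could be read in plain Java roughly as follows. This is only a sketch: VariationRow and its fields are names made up for illustration, and it assumes whitespace-separated columns with the literal "null" marking a missing value.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** One row of the simplified table; all names here are illustrative. */
final class VariationRow {
    final String chromosome;
    final int position;
    final List<String> genes;              // empty when the column is "null"
    final Map<String, String> properties;  // e.g. score=4;id=989
    final List<String> genotypes;          // one per sample, null when missing

    VariationRow(String chromosome, int position, List<String> genes,
                 Map<String, String> properties, List<String> genotypes) {
        this.chromosome = chromosome;
        this.position = position;
        this.genes = genes;
        this.properties = properties;
        this.genotypes = genotypes;
    }

    /** Parses one whitespace-separated data line (not the '#' header). */
    static VariationRow parse(String line) {
        String[] cols = line.trim().split("\\s+");
        if (cols.length < 4) {
            throw new IllegalArgumentException("too few columns: " + line);
        }
        List<String> genes = "null".equals(cols[2])
                ? new ArrayList<String>()
                : Arrays.asList(cols[2].split(","));
        Map<String, String> props = new LinkedHashMap<>();
        if (!"null".equals(cols[3])) {
            for (String kv : cols[3].split(";")) {
                String[] pair = kv.split("=", 2);
                props.put(pair[0], pair.length > 1 ? pair[1] : "");
            }
        }
        // Remaining columns are the per-sample genotypes.
        List<String> genotypes = new ArrayList<>();
        for (int i = 4; i < cols.length; i++) {
            genotypes.add("null".equals(cols[i]) ? null : cols[i]);
        }
        return new VariationRow(cols[0], Integer.parseInt(cols[1]),
                genes, props, genotypes);
    }
}

class ParseDemo {
    public static void main(String[] args) {
        VariationRow row =
                VariationRow.parse("chr1 988 GENE3 score=4;id=989 A/G null A/A");
        System.out.println(row.chromosome + ":" + row.position + " "
                + row.genes + " " + row.properties + " " + row.genotypes);
    }
}
```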
People in my lab have to filter those variations using different strategies.
They then extract the overlapping genes and filter those genes to find
one or more genes that could explain the disease.
Currently they use knime.org to run those filters
( http://www.myexperiment.org/workflows/2320.html ).
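The "extract the overlapping genes, then filter the genes" step could be sketched in plain Java like this. The names GeneCounter, countHits and candidates are hypothetical, and the criterion used here (keep genes hit by at least minHits retained variations) is only an assumed example; the real strategies differ.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

class GeneCounter {
    /** Counts, per gene name, how many retained variations overlap it. */
    static Map<String, Integer> countHits(List<List<String>> genesPerVariation) {
        Map<String, Integer> hits = new TreeMap<>();
        for (List<String> genes : genesPerVariation) {
            for (String gene : genes) {
                hits.merge(gene, 1, Integer::sum);
            }
        }
        return hits;
    }

    /** Keeps genes hit at least minHits times (an assumed criterion). */
    static List<String> candidates(Map<String, Integer> hits, int minHits) {
        return hits.entrySet().stream()
                .filter(e -> e.getValue() >= minHits)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Gene columns of the variations that survived the filters.
        List<List<String>> retained = Arrays.asList(
                Arrays.asList("GENE1", "GENE3"),
                Arrays.asList("GENE3"));
        Map<String, Integer> hits = countHits(retained);
        System.out.println(hits + " -> " + candidates(hits, 2));
    }
}
```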
But I've got the feeling that drools could be used to filter those
variations.
Users would 'just' have to write some rules, run the engine and get the
result.
Rules would be removed/switched to get the result for another strategy.
Questions:
1) Is Drools a good choice here?
2) How should I model my data?
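If I use this Java model (getters only):
import java.util.List;

interface Gene {
    // variations overlapping this gene
    List<Variation> getVariations();
}
interface Variation {
    String getChromosome();
    int getPosition();
    // genes overlapping this position (may be empty)
    List<Gene> getGenes();
    // one genotype per sample
    List<Genotype> getGenotypes();
}
interface Genotype {
    Variation getVariation();
    Sample getSample();
    // e.g. "A/G"
    String getDNA();
}
interface Sample {
    // placeholder, details omitted
}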
It will create a large graph. Can Drools support it?
3) Where in the documentation can I find how to easily set the state of my
objects in order to filter them? E.g.:
variations ->[filter1]-> [filter2]----------------->[filter5] -> result
                         `>[filter3]->[filter4]--/
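For comparison, one way to read the diagram above in plain Java (not Drools) is as a composed predicate: filter1, then either filter2 or filter3 followed by filter4, then filter5. This is a rough sketch under that reading of the branch; Var, FilterPipeline and the five example filters are all made up for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

/** Minimal stand-in for a variation; the fields are illustrative only. */
final class Var {
    final String chromosome;
    final int position;
    final int score;

    Var(String chromosome, int position, int score) {
        this.chromosome = chromosome;
        this.position = position;
        this.score = score;
    }
}

class FilterPipeline {
    /** filter1, then (filter2 OR filter3-then-filter4), then filter5. */
    static <T> Predicate<T> branched(Predicate<T> f1, Predicate<T> f2,
                                     Predicate<T> f3, Predicate<T> f4,
                                     Predicate<T> f5) {
        Predicate<T> upper = f2;          // upper branch of the diagram
        Predicate<T> lower = f3.and(f4);  // lower branch of the diagram
        return f1.and(upper.or(lower)).and(f5);
    }

    /** Returns the items accepted by the pipeline, in input order. */
    static <T> List<T> run(List<T> input, Predicate<T> pipeline) {
        return input.stream().filter(pipeline).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // All five filters below are made-up examples.
        Predicate<Var> onChr1 = v -> v.chromosome.equals("chr1");
        Predicate<Var> highScore = v -> v.score >= 4;
        Predicate<Var> lowPosition = v -> v.position < 1000;
        Predicate<Var> anyScore = v -> v.score >= 1;
        Predicate<Var> notAt988 = v -> v.position != 988;

        Predicate<Var> strategy =
                branched(onChr1, highScore, lowPosition, anyScore, notAt988);
        List<Var> input = Arrays.asList(
                new Var("chr1", 987, 1),
                new Var("chr1", 988, 4),
                new Var("chr1", 1988, 4));
        for (Var v : run(input, strategy)) {
            System.out.println(v.chromosome + ":" + v.position);
        }
    }
}
```

Switching to another strategy then means passing a different set of predicates to branched, which is the part I hope rules could express more conveniently.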
Thank you,
Pierre
(*) http://www.1000genomes.org/wiki/Analysis/Variant%20Call%20Format/vcf-vari...
--
View this message in context:
http://drools.46999.n3.nabble.com/Newbie-is-my-data-model-suitable-for-dr...
Sent from the Drools: User forum mailing list archive at Nabble.com.