A multi-objective optimisation approach to the design of experiment in de novo assembly projects


Genomics projects are characterised by complex biological pipelines and high sequencing costs. In particular, de novo assembly projects must go through data production, assembly, and validation of results. Mistakes made in the first (and most expensive) step can therefore be detected only at a very late stage, with serious consequences. Our goal is to design a pipeline that provides users with the optimal input for the sequencing experiments within a de novo assembly project. We present a new approach, based on multi-objective optimisation, that aims to transform the design of genomics experiments from a set of ‘best practices’ into an algorithmically controlled procedure. We implemented our model with modeFRONTIER, and we show how our method can be used to infer the final quality of a whole-genome assembly project from the results obtained on a small but representative sample. © 2012 IEEE.

Proceedings - International Workshop on Database and Expert Systems Applications, DEXA