Participating scientist (partner university: Justus-Liebig-Universität Gießen)
The Bioinformatics Resource Facility (BRF) operates a complex and highly specialized hardware and software infrastructure that forms the basis for the academic and scientific activities within CeBiTec. One of the BRF’s key tasks is the structured acquisition and storage of experimental data and, to the greatest possible extent, the automated processing of that data. This is achieved with a compute cluster that accesses the required data on storage and database systems via high-performance network links. In addition to a flexible general-purpose compute cluster, the BRF operates several special-purpose computer systems to cope with the ever-increasing requirements of complex bioinformatics analyses. Among these are systems capable of running hardware-accelerated BLAST searches and several GPU systems used for short-sequence mapping. To further enhance our sequence data analysis capabilities, we recently agreed to work with Active Motif to develop accelerated sequence comparison algorithms for the TimeLogic DeCypher® high-performance biocomputing platform, which is based on Field Programmable Gate Arrays (FPGAs). Within the BMBF-funded ENHANCE project, typical bioinformatics problems such as sequence search and fast sequence alignment are defined and analyzed, and new developments for heterogeneous compute architectures are integrated into practical bioinformatics applications.
The huge amounts of data acquired from PolyOmics technologies can only be handled with intensive bioinformatics support that provides adequate data management, efficient data analysis algorithms, and user-friendly software applications. Forward-looking projects within the computational genomics group focus on the development of bioinformatics analysis workflows for rapid and parallel data interpretation, including hardware-accelerated software implementations. Conveyor is our novel workflow-processing engine for rapidly deploying new bioinformatics data analysis pipelines, using either the CeBiTec compute cluster for distributed computing or multi-core servers for local execution. With this workflow-based approach, analyses can be designed in an intuitive graphical designer application. A number of ready-to-use processing steps for bioinformatics analyses already exist, with a focus on sequence analysis and sequence annotation. Using the Conveyor2Go tool, an existing workflow can easily be converted into a standalone application. The program SARUMAN is our first GPU-based approach; it boosts the performance of complete and exact short read mapping against reference genomes. It requires no server-class hardware and runs on any desktop PC equipped with an NVIDIA graphics adapter. As a result, various scientific applications can benefit from the parallel computing power provided by the graphics adapters found in many current PCs. The results of the read mapping can subsequently be examined and analyzed in depth with our interactive visualization software VAMP. These software tools establish the basis for efficient, parallel, and data-driven processing, and they will be further improved and extended in the future.
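To make the notion of "complete and exact" short read mapping more concrete, the following is a minimal CPU-only sketch in Python. It is not SARUMAN's implementation and does not use a GPU. It assumes a simplified model in which only mismatches (no insertions or deletions) are allowed, and it uses a generic k-mer index with a pigeonhole filter: if a read may contain at most `max_mismatches` errors, at least one of `max_mismatches + 1` non-overlapping seeds must match the reference exactly. All function and parameter names (`build_index`, `map_read`, `k`, `max_mismatches`) are hypothetical and chosen for illustration only.

```python
from collections import defaultdict


def build_index(reference, k):
    """Map every k-mer of the reference to the list of its start positions."""
    index = defaultdict(list)
    for pos in range(len(reference) - k + 1):
        index[reference[pos:pos + k]].append(pos)
    return index


def hamming(a, b):
    """Count mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def map_read(read, reference, index, k, max_mismatches):
    """Return every (start, mismatches) hit with at most max_mismatches.

    Assumption: mismatches only, no indels. The pigeonhole filter needs
    max_mismatches + 1 non-overlapping seeds of length k inside the read.
    """
    if len(read) < k * (max_mismatches + 1):
        raise ValueError("read too short for the chosen k and error bound")

    # Filter step: collect candidate start positions from exact seed hits.
    candidates = set()
    for i in range(max_mismatches + 1):
        offset = i * k
        seed = read[offset:offset + k]
        for pos in index.get(seed, ()):
            start = pos - offset
            if 0 <= start <= len(reference) - len(read):
                candidates.add(start)

    # Verification step: keep candidates within the mismatch bound.
    hits = []
    for start in sorted(candidates):
        distance = hamming(read, reference[start:start + len(read)])
        if distance <= max_mismatches:
            hits.append((start, distance))
    return hits


if __name__ == "__main__":
    reference = "ACGTACGTTAGCCGATACGT"
    index = build_index(reference, k=4)
    print(map_read("ACGTTAGC", reference, index, k=4, max_mismatches=1))
```

In this sketch, every candidate is verified independently of the others, which is the kind of work that lends itself to parallel execution on a graphics adapter. How the actual tool distributes this work is not described here.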