Audience and requirements:
Data from experiments, simulations and statistical processes must be
appropriately analyzed before results can be determined and conclusions
drawn. This lecture discusses techniques to test, analyze and interpret
data, complemented by applications and practical implementations on
selected data sets.
The lecture is designed to be accessible to advanced undergraduate
students and graduate students who are familiar with basic
mathematical formulas, such as those taught in computer science,
biotechnology and physics. In particular, the course is aimed at students
of computer science in the subjects Bioinformatics and Applied Computer
Science in the Natural Sciences (Naturwissenschaftliche Informatik) and
at students of biology in the subjects Biotechnology and
Systems Biology. Students are invited to contribute seminar talks on
selected topics.
Content:
Data analysis is the process of systematically applying statistical
and logical techniques to describe, summarize, visualize and compare
data. It has become an important subject in many scientific branches,
and its correct use is essential for the evaluation and
presentation of scientific results. Many approaches are based on
a model. Model selection is an integral and critical part of data
analysis; it arises whenever a data-based choice among competing
models has to be made.
The lecture starts with a brief discussion of those elementary
statistical concepts that provide the necessary foundations for more
specialized expertise in any area of statistical data analysis. These
include the basic concepts of probability calculus, various
distributions and simple concepts of multivariate statistics.
Subsequently, more advanced techniques are discussed in
detail, including moments of distributions, the variance-covariance
matrix, statistical inference, various types of density estimators,
method of least squares and other fitting procedures, discriminant
analysis tools, classification rules, clustering algorithms, tools
for pattern recognition, regression techniques, factor analysis and
finally approaches for time-series and incomplete data analysis.
Some modern approaches to exploratory data analysis, learning of data
structures, Bayesian statistical modeling and knowledge mining are
left as suggested seminar topics. Applications and practical examples
are provided in the MATLAB programming environment and the R statistics
software, both within the lecture and as possible exercises.
For more information, including a comprehensive list of suggested
literature, please visit the webpage of the lecture at
www.degenhard.org
Frequency | Weekday | Time | Format / Place | Period |
---|---|---|---|---|
weekly | Wed | 14-16 | D01-249 | 03.04.-14.07.2006 |
Degree programme/academic programme | Validity | Variant | Subdivision | Status | Semester | LP (credit points) | Notes |
---|---|---|---|---|---|---|---|
Biologie / Bachelor | (Enrollment until summer semester 2011) | Core subject | Individual supplementary area | Compulsory elective | 4. | 3 | Profiles A1 and A2 |
Naturwissenschaftliche Informatik / Bachelor | (Enrollment until summer semester 2011) | | | Elective | 6. | | |
Naturwissenschaftliche Informatik / Diplom | (Enrollment until summer semester 2004) | | General Hauptstudium | | Hauptstudium | | |
Physik / Diplom | (Enrollment until summer semester 2008) | | For the in-depth compulsory elective subject | | Hauptstudium | | |
Systems Biology of Brain and Behaviour / Master | (Enrollment until summer semester 2012) | | Module 6 | Compulsory elective | 2. | | Possible as part of extension module A |
Umweltwissenschaften / Bachelor | (Enrollment until summer semester 2011) | Core and minor subject | Individual supplementary area | Elective | 4. | 3 | |