Module CS3130-KP08

Nonstandard Databases and Data Mining (NDBDM)


1 Semester
Frequency of offer:

not available anymore
Credit points:

Course of studies, specific field and terms:
  • Bachelor Computer Science 2019 (optional subject), major subject informatics, Arbitrary semester
  • Bachelor Computer Science 2019 (compulsory), Canonical Specialization Web and Data Science, 5th semester
  • Bachelor Medical Informatics 2019 (optional subject), computer science, 4th to 6th semester
  • Bachelor Media Informatics 2014 (optional subject), computer science, 5th or 6th semester
  • Bachelor IT-Security 2016 (optional subject), computer science, Arbitrary semester
  • Bachelor Computer Science 2016 (optional subject), major subject informatics, Arbitrary semester
  • Bachelor Computer Science 2016 (compulsory), Canonical Specialization Web and Data Science, 5th semester
Classes and lectures:
  • Nonstandard Databases and Data Mining (exercise, 2 SWS)
  • Nonstandard Databases and Data Mining (lecture, 4 SWS)
Workload:
  • 90 Hours in-classroom work
  • 110 Hours private studies
  • 40 Hours exam preparation
Contents of teaching:
  • Semi-structured data models (JSON, XML) and full text queries
  • Information Retrieval
  • Multidimensional index structures
  • Clustering
  • Embedding techniques
  • first-n, top-k, and skyline queries
  • Probabilistic databases, query answering, query transformation, safe query plans, top-k queries (Monte Carlo simulation, Karp-Luby method, multi-simulation), open-world assumption
  • Probabilistic modeling, Bayesian networks, query answering algorithms, learning methods for models
  • Temporal databases and the relational model
  • Probabilistic Temporal Databases
  • SQL: new developments (e.g. JSON structures and arrays), time series (e.g. TimescaleDB)
  • Stream databases, principles of window-oriented incremental processing
  • Approximation techniques for stream data processing, stream mining
  • Probabilistic spatiotemporal databases and stream data processing systems: queries and index structures, spatiotemporal data mining, probabilistic skylines
  • From NoSQL to NewSQL databases, graph data in SQL, CAP theorem, CALM theorem, blockchain databases
Qualification goals/competencies to be acquired:
  • Knowledge: Students can name the main features of standard databases and, in addition, can explain which nonstandard database models emerge if certain features are dropped. Students can describe the main ideas behind nonstandard databases presented in the course by explaining the main features of respective query languages (syntax and semantics) as well as the most important implementation techniques used for their practical realization.
  • Skills: Students can apply query languages for the nonstandard data models introduced in the course to retrieve desired structures from sample datasets in order to satisfy human information needs. Students can represent data in the relational data model using encoding techniques presented in the course, so that they can demonstrate how new formalisms relate to or can be implemented in SQL (SQL:2011). If an SQL transformation cannot be found, students can explain and apply dedicated algorithms for query answering. Students can demonstrate how index structures help answer queries quickly by showing how they are built, updated, and exploited for query answering. Participants can derive query answers by evaluating queries step by step and by deriving optimized query execution plans.
  • Social skills: Students work in teams on assignments and are encouraged to present their solutions to other students in short presentations (in lab classes). In addition, independence is fostered by pointing students to query evaluation engines for the various formalisms presented in the lecture, so that they become familiar with data models and query languages through self-directed work.
Grading through:
  • written exam
Responsible for this module:
Literature:
  • S. Abiteboul, P. Buneman, D. Suciu: Data on the Web - From Relations to Semistructured Data and XML - Morgan-Kaufmann, 1999
  • Ch. Aggarwal: Data Mining - The Textbook - Springer, 2015
  • S. Chakravarthy, Q. Jiang: Stream Data Processing - A Quality of Service Perspective - Springer, 2009
  • J. Leskovec, A. Rajaraman: Mining of Massive Datasets - Cambridge University Press, 2012
  • P. Revesz: Introduction to Databases: From Biological to Spatio-Temporal - Springer 2010
  • P. Rigaux, M. Scholl, A. Voisard: Spatial Databases With Applications to GIS - Morgan-Kaufmann, 2001
  • D. Suciu, D. Olteanu, Chr. Re, Chr. Koch: Probabilistic Databases - Morgan & Claypool, 2011
Language:
  • offered only in German

Admission requirements for taking the module:
- None (the competences of the modules mentioned under "requires" are needed for this module, but are not a formal prerequisite).

Admission requirements for participation in module examination(s):
- Successful completion of exercise assignments as specified at the beginning of the semester.

Module Exam(s):
- CS3130-L1: Non-Standard Databases and Data Mining, written exam, 90min, 100% of module grade.

Former name of the module: Algorithmic Data Analysis

Last change: