• Fundamentals of modern Database Management Systems (DBMSs): storage, indexing, query optimization, transaction processing, concurrency and recovery. Row layout (SQLite) vs. column file layout (Parquet/PAX/ORC/Delta Lake, Pandas/Arrow, Spatial/GeoParquet, DuckDB) vs. time-series DBs (LSM/TSM-based/Avro/Kafka). Vector databases: embeddings and similarity search using Hierarchical Navigable Small World (HNSW) graphs, with Chroma/DuckDB examples; LLMs, RAG and vector databases (L)
• Fundamentals of Distributed DBMSs, Web Databases and Cloud Databases (NoSQL / NewSQL): semi-structured data management (XML/JSON, XPath and XQuery), document data-stores (e.g., CouchDB, MongoDB, RavenDB), key-value data-stores (e.g., BerkeleyDB, Memcached), introduction to cloud computing (NFS, GFS/Hadoop HDFS, replication/consistency principles), big-data processing/analytic frameworks (Apache MapReduce/Pig, Spark/Shark), column-stores (e.g., Google's Bigtable, Apache HBase, Apache Cassandra), graph databases (e.g., Twitter's FlockDB) and an overview of NewSQL (Google's Spanner/F1).
• Spatio-temporal data management (trajectories, privacy, analytics) and index structures (e.g., R-Trees, Grid Files), as well as other selected and advanced topics, including: embedded databases (SQLite), sensor / smartphone / crowd data management, energy-aware data management, flash storage, stream data management, etc. The last part of the course will feature both invited talks from external speakers and student presentations.
