Big Data
This section groups the big data content by capability first and by technology second.
- Fundamentals covers architecture, ingestion, transformation, modeling, serving, and tool selection.
- Storage and formats covers HDFS, data lake architecture, Delta Lake, Avro, and Parquet.
- Processing engines covers Spark, Flink, and Beam.
- Streaming and messaging covers Kafka and ZooKeeper.
- Lakehouse covers Apache Iceberg.
- Query and serving covers Presto and Redis.
- Governance covers data governance concepts and tools.
- Legacy keeps older Hadoop-era notes separate from the main learning path.