The Internals of Apache Spark

Notes about the core of Apache Spark, taken while exploring the lowest depths of this amazing piece of software (towards its mastery)

Last updated 2 months ago


The Internals of Apache Kafka

On The Topic of Apache Kafka

Last updated 3 months ago


The Internals of Spark Structured Streaming

Notes about the internals of Spark Structured Streaming

Last updated 7 months ago


The Internals of Spark SQL

Notes about the internals of Spark SQL (the Apache Spark module for structured queries)

Last updated 2 months ago


Apache Spark - Best Practices and Tuning

This is a collection of notes about Apache Spark's best practices. The notes aim to help me design and develop better programs with Apache Spark.

Last updated 2 years ago


The Internals of Kafka Streams

GitBook about Kafka Streams, a library for developing distributed applications that process record streams, with Apache Kafka as the data storage for input and output records.

Last updated 10 months ago