The Internals of Apache Spark

Notes about the core of Apache Spark, taken while exploring the lowest depths of this amazing piece of software (towards its mastery)

Last updated 22 days ago


Computational and Inferential Thinking

The Foundations of Data Science

Last updated 6 months ago


The Internals of Spark Structured Streaming

Notes about the internals of Spark Structured Streaming

Last updated 2 months ago


The Internals of Spark SQL

Notes about the internals of Spark SQL (the Apache Spark module for structured queries)

Last updated 10 days ago


Apache Spark - Best Practices and Tuning

This is a collection of notes about Apache Spark's best practices. The notes aim to help me design and develop better programs with Apache Spark.

Last updated 2 years ago


Hadoop and Kerberos: The Madness Beyond the Gate

Hadoop and Kerberos: The details. If you don't use Hadoop, or don't want to know about the darkness that is Kerberos, leave this book alone; it will only damage your mind.

Last updated 10 months ago