Mastering Apache Spark


candythinking (@candythinking) started discussion #133

a year ago · 3 comments


On the page *Partitions and Partitioning*, the text reads:

> By default, a partition is created for each HDFS partition, which by default is 64 MB (from Spark’s Programming Guide).

The default is 128 MB.

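To illustrate why the block-size default matters here: each HDFS block becomes one input partition, so the partition count for a file follows directly from the block size. A minimal sketch in plain Python arithmetic (the function name `default_partitions` is illustrative, not a Spark API):

```python
import math

def default_partitions(file_size_bytes: int, block_size_bytes: int) -> int:
    # One input partition per HDFS block, the behaviour the quoted sentence describes.
    return math.ceil(file_size_bytes / block_size_bytes)

one_gib = 1024 ** 3
# With the old 64 MB block size, a 1 GiB file yields 16 partitions.
print(default_partitions(one_gib, 64 * 1024 * 1024))   # → 16
# With the 128 MB default (Hadoop 2.x), the same file yields 8 partitions.
print(default_partitions(one_gib, 128 * 1024 * 1024))  # → 8
```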
ryan-factual @ryan-factual commented a year ago

In Apache Hadoop the default block size is **64 MB**, and in Cloudera Hadoop the default is **128 MB**.

Ulul @ululh commented a year ago

Hi, from what's still traceable, 128 MB has been the default for *Apache* Hadoop since at least 2.4.1, back in 2014 (though I think it was 0.23/2.0 that introduced it, long before that).

https://hadoop.apache.org/docs/r2.4.1/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml (dfs.blocksize)
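For reference, the `dfs.blocksize` property from the linked hdfs-default.xml can be overridden per cluster in `hdfs-site.xml`; a minimal sketch, showing the 134217728-byte (128 MB) default value documented there:

```
<!-- hdfs-site.xml: HDFS block size; 134217728 bytes = 128 MB is the Hadoop 2.x default -->
<property>
  <name>dfs.blocksize</name>
  <value>134217728</value>
</property>
```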

Ulul @ululh commented a year ago

BTW, the Spark Programming Guide that the link points to also states 128 MB.


