Mastering Apache Spark

Updated a month ago

pavel-lysyanskyi (@pavel-lysyanskyi) started discussion #99

2 years ago · 0 comments


The difference between cache and persist operations is purely syntactic. cache is a synonym of persist or persist(MEMORY_ONLY), i.e. cache is merely persist with the default storage level MEMORY_ONLY.

Caching and Persistence

As of Spark 2.0.1, the default storage level for persist is MEMORY_ONLY_SER, and the same applies to cache().
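
The default level can be checked directly on whichever Spark version is in use. The following is only a minimal sketch: it assumes Spark core is on the classpath, uses the Scala RDD API in local mode, and the names DefaultStorageLevelCheck and defaultLevels are made up for illustration. It prints whatever level the running version assigns, so the result may differ across versions and language bindings.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.storage.StorageLevel

// Hypothetical helper object, not part of Spark or of the book.
object DefaultStorageLevelCheck {

  // Returns the storage levels assigned by cache() and by persist()
  // when persist() is called without an explicit level.
  def defaultLevels(sc: SparkContext): (StorageLevel, StorageLevel) = {
    val cached = sc.parallelize(1 to 10).cache()
    val persisted = sc.parallelize(1 to 10).persist()
    (cached.getStorageLevel, persisted.getStorageLevel)
  }

  def main(args: Array[String]): Unit = {
    // Assumption: a local master is good enough for this check.
    val conf = new SparkConf()
      .setMaster("local[*]")
      .setAppName("default-storage-level-check")
    val sc = new SparkContext(conf)
    try {
      val (cacheLevel, persistLevel) = defaultLevels(sc)
      println(s"cache():    ${cacheLevel.description}")
      println(s"persist():  ${persistLevel.description}")
      println(s"same level: ${cacheLevel == persistLevel}")
    } finally {
      sc.stop()
    }
  }
}
```

Running this once per Spark version (and separately for the Dataset or Python APIs, if those are in use) shows whether the defaults actually differ from what the page states.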
