Spark's API was inspired by Scala's collection API, which has those same methods (filter, map, flatMap, reduce, etc.). When invoked on a Scala collection, they run locally and eagerly, returning a new collection. With Spark, the same-looking methods are invoked through the RDD API and return a transformed RDD; the actual computation is deferred until an action is called, and then it runs on the Spark cluster.
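
Here's a minimal sketch of the contrast (assuming a local Spark session for illustration, with spark-core on the classpath; the app name and master setting are just placeholders):

```scala
import org.apache.spark.sql.SparkSession

object CollectionsVsRdd {
  def main(args: Array[String]): Unit = {
    // Local Scala collection: filter/map run eagerly, right here in this JVM
    val nums = List(1, 2, 3, 4, 5)
    val localResult = nums.filter(_ % 2 == 0).map(_ * 10) // List(20, 40)

    val spark = SparkSession.builder()
      .appName("collections-vs-rdd")
      .master("local[*]") // assumption: local mode, just for this sketch
      .getOrCreate()

    // Spark RDD: the same-looking calls only build a lazy, distributed plan
    val rdd = spark.sparkContext.parallelize(nums)
    val transformed = rdd.filter(_ % 2 == 0).map(_ * 10) // nothing runs yet

    // An action (collect) triggers the job on the cluster (or local executor)
    val distributedResult = transformed.collect() // Array(20, 40)

    println(localResult)
    println(distributedResult.mkString(", "))
    spark.stop()
  }
}
```

The code reads almost identically in both halves, which is exactly the point: the RDD API mirrors the collection API, but the RDD version describes a computation to be executed across the cluster rather than performing it immediately.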