What dependency does Apache Spark have on Hadoop?
Theo van Kraay
posted 1 week ago
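None, in the sense that Spark does not need a Hadoop cluster to run. Spark can read from HDFS and be scheduled by YARN, but that support exists for compatibility with existing Hadoop deployments, since Hadoop was the 'big thing' in distributed processing at the time. Neither is required. Using its own standalone cluster manager (or local mode), Spark can work against any storage it has a connector for, such as a local or shared file system or an object store... it doesn't have to be HDFS. The one caveat: Spark still uses Hadoop's client libraries to talk to storage, which is why the standard downloads are labelled 'pre-built for Apache Hadoop'. That is a library dependency, not a requirement for a running Hadoop installation.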
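
To make that concrete, here is a minimal PySpark sketch that runs with no Hadoop cluster (no HDFS, no YARN) anywhere in the picture. It assumes PySpark and a Java runtime are installed; the helper names (`write_sample_file`, `count_lines`), the app name, and the sample text are made up for illustration, and the standalone master URL in the comment uses a placeholder host.

```python
import os
import tempfile
from pathlib import Path

# "local[*]" runs Spark inside this one process, using all available cores.
# For Spark's standalone cluster manager you would pass something like
# "spark://<master-host>:7077" instead (placeholder host, not a real address).
MASTER_URL = "local[*]"


def write_sample_file() -> str:
    """Write a small text file to local disk and return a file:// URI for it."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    with os.fdopen(fd, "w") as handle:
        handle.write("spark without hadoop\nspark on local disk\n")
    return Path(path).as_uri()


def count_lines(uri: str) -> int:
    """Count the lines of a text file with Spark, reading from plain local storage."""
    # Imported lazily so the helper above can be used without PySpark installed.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .master(MASTER_URL)
        .appName("no-hadoop-cluster")  # hypothetical app name
        .getOrCreate()
    )
    try:
        # The file:// scheme tells Spark to read from the local file system.
        return spark.read.text(uri).count()
    finally:
        spark.stop()
```

Calling `count_lines(write_sample_file())` reads the file straight from local disk. Swapping the URI scheme (for example to an object store your Spark build has a connector for) changes the storage layer without bringing Hadoop services into it.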