Scala has been the preferred language for Spark, since Spark itself was built in Scala. If we use Scala for Spark, it is easier to do functional programming using functions, the code is concise, and it is faster. The RDDs are slower with non-JVM languages. Now we have other options apart from RDDs, namely DataFrames and Datasets. So if we are using DataFrames/Datasets, is Scala still the preferred language? Thanks.
AFAIK, Spark is still implemented in Scala, so the Scala APIs are usually delivered first and are the most complete. Spark SQL, DataFrames, Datasets, etc. have been in Spark for a couple of years now. There is no reason to switch between languages for different Spark libraries if the library you need is available in the language you are using.
Thanks. Yes, whether it is DataFrames, Datasets, or RDDs, programming in Scala is beneficial. I think this is because of 1) faster execution, 2) ease of coding (e.g. when using functions), and 3) concise code.
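To illustrate the "functions + concise code" point, here is a minimal sketch of the typed Dataset API in Scala (the `Person` case class and the local `SparkSession` setup are just illustrative assumptions, not from the thread):

```scala
import org.apache.spark.sql.SparkSession

object DatasetExample {
  // Hypothetical domain type; a case class gets an Encoder automatically
  case class Person(name: String, age: Int)

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder
      .appName("dataset-example")
      .master("local[*]") // local mode just for this sketch
      .getOrCreate()
    import spark.implicits._

    // Typed Dataset: transformations take plain Scala functions,
    // and the compiler checks the field accesses (p.age, _.name)
    val people = Seq(Person("Ann", 34), Person("Bob", 17)).toDS()
    val adultNames = people.filter(p => p.age >= 18).map(_.name)

    adultNames.show()
    spark.stop()
  }
}
```

The `filter` and `map` calls here take ordinary Scala lambdas over a typed case class, which is the compile-time safety and conciseness the Scala API offers over stringly-typed column expressions in other language bindings.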