
Is Scala still the preferred language for Spark when using Datasets and DataFrames instead of RDDs?

 
Ranch Foreman
Posts: 2343
12
Scala has been the preferred language for Spark, and Spark itself was built using Scala. With Scala, functional programming using functions is easier, the code is concise, and it runs faster. RDDs are slower with non-JVM languages. Now we have options other than RDDs, namely DataFrames and Datasets. So if we are using DataFrames/Datasets, is Scala still the preferred language? Thanks.
 
Ranch Hand
Posts: 32
3
AFAIK, Spark is still implemented in Scala, so the Scala APIs are usually delivered first and are the most complete. Spark SQL, DataFrames, Datasets, etc. have been in Spark for a couple of years now. There is no reason to switch between languages for different Spark libraries if the library you need is available in the language you are using.

https://coderanch.com/t/733230/open-source/Spark-Action-Pros-Cons-language

 
Monica Shiralkar
Ranch Foreman
Posts: 2343
12
Thanks. Yes, whether it is DataFrames, Datasets, or RDDs, programming in Scala is beneficial, I think because of 1) faster execution, 2) ease of coding (e.g. when using functions), and 3) concise code.
 