Joe Harry wrote: You are in a sense correct that it only teaches the basics of R and not about data analysis. The weekly assignments were focused on reading and interpreting some CSV files which was pretty easy. The course is purely a beginners material.
Actually I don't think it was much help to beginners either. When I did it there were a lot of people who had never programmed before, who seemed to be having real problems, and a lot of people who seemed to be attempting the course for the second time, which suggests the course is failing to get a lot of people up to speed first time around. The choice of topics was bizarre, e.g. we looked at lexical scoping and memoisation, but we never touched R's excellent graphing tools. For experienced developers like you and me, this is not a problem, but the course is not aimed at experienced developers. The course tutor does not come from a CS background, and I don't think he has ever learned programming in any structured fashion, which is probably why he doesn't seem able to teach it in a structured fashion either. Being fluent in a language does not necessarily mean you are able to teach that language, whether the language is R, Java or French.
To be honest I learned more R by spending an hour or so on Try R and reading some online tutorials. But YMMV of course!
Another particular issue with R is that unlike most other programming languages, it was largely invented and developed by maths people, not CS people. So it looks a bit odd compared to other programming languages, and can seem a bit random in its structure and libraries. It's a fully featured programming language with powerful libraries, but its real strengths are in statistics and maths. I know people who've written web applications in R, but I think that was mainly because they didn't know any other programming languages! If you want to make best use of R, I think you also need to know a bit about statistics and data analysis, so you can appreciate what R provides. That's why I think learning R alongside some basic statistics might be helpful for people who may not have much of a stats background, for example. And if you already have good stats knowledge and have already programmed before, then I'd say go and get a good book on R and learn it yourself instead.
As for R and Big Data, that seems to be kind of a mixed bag. The basic R installation is not much use for big data, because it has to load all its data into memory. However, there are libraries to help with this, e.g. the ff libraries to process data on disk, and various other options for parallel processing etc. There are R interfaces to things like Hadoop, and Oracle is pushing some of its own R interfaces. I haven't really had a chance to look at these yet, as I've decided to use Python, Hive or Spark for the relatively straightforward data analysis I need to do at work. These tools are more flexible and (in the case of Spark) potentially more powerful for our purposes, although some of my statistician colleagues are looking at R and ff.
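To make the "load everything into memory" point a bit more concrete, here is a rough Python sketch of the general alternative: reading a CSV one row at a time and keeping only a running total. This is not how ff works internally and it is not R; it just illustrates the idea of processing data without holding the whole file in memory. The function name streaming_mean and the sample data are made up for the example.

```python
import csv
import io


def streaming_mean(lines, column):
    """Compute the mean of one numeric CSV column, one row at a time.

    `lines` is any iterable of text lines (e.g. an open file handle),
    so only the current row is held in memory, not the whole data set.
    """
    total = 0.0
    count = 0
    for row in csv.DictReader(lines):
        total += float(row[column])
        count += 1
    # Return NaN for an empty input rather than dividing by zero.
    return total / count if count else float("nan")


# Hypothetical sample data; in practice `lines` would be something like
# open("big_file.csv", newline="") instead of an in-memory string.
sample = io.StringIO("id,value\n1,10\n2,20\n3,30\n")
print(streaming_mean(sample, "value"))
```

The trade-off is that you can only do analyses that work in a single pass (or a few passes) like this, which is roughly why dedicated libraries and tools like Hive or Spark exist for anything more involved.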
And coming back to statistical knowledge, this is actually important when doing data analysis on big data, because you need to have an understanding of how to choose the right tools and techniques to suit your data, and whether you can draw particular conclusions based on that data analysis.