Pete Letkeman wrote:For this path I suggest that you try directly for the Java OCA 808 exam and then the Java OCP 809 exam.
I suggest that you take a few moments to read some of the stories found here: https://coderanch.com/wiki/659980/Ocajp-Wall-Fame
By going over these stories you can see how others prepared and what material they used.
This could save you time, money and frustration.
chris webster wrote:Hadoop is all about distributing your data and your processing across multiple cheap machines. The data is replicated so that there are, for example, 3 copies of each block of data, with different copies on different machines. If you have more nodes than replicas, e.g. 3 replicas across 6 nodes, then on average each node contains only half of the total original data volume. Hadoop knows where your data is replicated, so it can decide to process different subsets of your data on different nodes at the same time. This is how Hadoop allows you to exploit the power of distributed processing.
If you only have two nodes, and your replication factor is 2 or more, then each node contains all your data anyway, so Hadoop cannot decide how to break up the processing in this way. And if you only have one node, then nothing is distributed at all.
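The arithmetic behind those examples can be sketched in a few lines of Python. This is only a back-of-the-envelope model, not anything taken from Hadoop itself: the function name average_fraction_per_node is made up for illustration, and it assumes blocks are spread evenly and that a node never holds more than one copy of the same block.

```python
from fractions import Fraction


def average_fraction_per_node(nodes: int, replication: int) -> Fraction:
    """Average share of the original data volume held by each node.

    Hypothetical helper for illustration only. Assumes an even spread of
    blocks and at most one copy of any block per node.
    """
    if nodes < 1 or replication < 1:
        raise ValueError("nodes and replication must both be at least 1")
    # A block cannot have more distinct copies than there are nodes.
    copies_per_block = min(replication, nodes)
    return Fraction(copies_per_block, nodes)


if __name__ == "__main__":
    # (nodes, replication factor) pairs matching the cases discussed above.
    for nodes, replication in [(6, 3), (2, 2), (1, 3)]:
        share = average_fraction_per_node(nodes, replication)
        print(f"{nodes} node(s), replication {replication}: "
              f"each node holds {share} of the data on average")
```

With 3 replicas across 6 nodes the share comes out at one half, while with two nodes and a replication factor of 2 (or with a single node) every node ends up holding the whole data set, which is the situation described above.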