The start-dfs.sh command, as the name suggests, starts the components necessary for
HDFS: the NameNode to manage the filesystem and a single DataNode to hold data.
It also starts the SecondaryNameNode, an availability aid that we'll discuss in a later chapter.
After starting these components, we use the JDK's jps utility to see which Java processes are
running, and, as the output looks good, we then use Hadoop's dfs utility to list the root of
the HDFS filesystem.
After this, we use start-mapred.sh to start the MapReduce components—this time the
JobTracker and a single TaskTracker—and then use jps again to verify the result.
There is also a combined start-all.sh script that we'll use at a later stage, but in the early
days a two-stage startup makes it easier to verify the cluster configuration.
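The jps check after each stage can also be scripted rather than read by eye. The following is a minimal sketch, not part of Hadoop itself: missing_daemons is a hypothetical helper name, and it assumes jps prints one process per line as a process ID followed by the class name.

```shell
#!/bin/sh
# missing_daemons (hypothetical helper, not a Hadoop command):
# reads jps-style output ("<pid> <ClassName>" per line) on stdin and prints
# each expected daemon name, given as an argument, that is not listed.
# Returns 0 when every expected daemon is present, 1 otherwise.
missing_daemons() {
    listing=$(cat)
    rc=0
    for name in "$@"; do
        # Compare against the whole second field so that "NameNode"
        # does not match "SecondaryNameNode".
        if ! printf '%s\n' "$listing" |
            awk -v n="$name" '$2 == n { found = 1 } END { exit !found }'; then
            printf '%s\n' "$name"
            rc=1
        fi
    done
    return "$rc"
}
```

After start-dfs.sh, running jps | missing_daemons NameNode DataNode SecondaryNameNode should print nothing if all three HDFS processes are up; after start-mapred.sh, the same check with JobTracker and TaskTracker covers the MapReduce stage.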