CDH Hadoop-related Linux Shark Spark Study & Research

Shark 0.9.1 Snapshot Working with CDH 4.6.0 Now

By @sskaje

Shark 0.9.x (0.9.0 and 0.9.1) is still in pre-release:

The previously supplied 0.9.0 prebuilt package for Hadoop 2 / CDH 4.5.0, shark-0.9.0-hadoop2-bin, does not really work with CDH 4.5.0, so I tried compiling it myself. See "Build Shark 0.9 for CDH 4" and "Install Spark/Shark on CDH 4".

Some days later, 0.9.1 came out, and the patched Hive has been uploaded to the Maven repo, so it can now be placed in lib_managed.

I have 5 nodes for my CDH 4 test, hadoop1 through hadoop5, with HDFS NameNode HA on hadoop5 and hadoop4 under the nameservice nameservice1.
The Spark master is located on hadoop5, with workers on all nodes.
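
For reference, the addresses that Spark and Shark clients would use against a cluster like this can be sketched as follows. This is only an illustration: the host names and nameservice come from the description above, while port 7077 is an assumption (the Spark standalone default, not stated in the post) and the helper names are hypothetical. With NameNode HA, clients address HDFS by the logical nameservice rather than by a single NameNode host.

```python
# Sketch of the connection addresses for the test cluster described above.
# Host names and the nameservice are from the post; the port is an assumption.
NAMESERVICE = "nameservice1"            # logical HA name, not a host
SPARK_MASTER_HOST = "hadoop5"
SPARK_MASTER_PORT = 7077                # assumption: Spark standalone default port
NODES = [f"hadoop{i}" for i in range(1, 6)]   # hadoop1 .. hadoop5


def hdfs_uri(path: str = "/") -> str:
    """Build an HDFS URI that goes through the HA nameservice."""
    if not path.startswith("/"):
        path = "/" + path
    return f"hdfs://{NAMESERVICE}{path}"


def spark_master_url() -> str:
    """Build the standalone master URL that workers and clients connect to."""
    return f"spark://{SPARK_MASTER_HOST}:{SPARK_MASTER_PORT}"


if __name__ == "__main__":
    print(spark_master_url())
    print(hdfs_uri("/user/hive/warehouse"))   # example path only
```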

CDH Hadoop-related HDFS Study & Research Notes

CDH 4 HA Related Problems

By @sskaje


I ran into the following error while trying to enable automatic failover for HDFS in Cloudera Manager, after HA had already been enabled: 'Failed to initialize High Availability state in ZooKeeper. This might be because a ZNode for this nameservice is already created. Either remove the ZNode, or to reuse the ZNode skip this step and simply start the NameNodes and Failover Controllers. To retry, use the "Initialize High Availability state in ZooKeeper" command available as a Failover Controller action.'
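
To make "remove the ZNode" concrete, here is a small sketch of where that ZNode would live. It assumes the Hadoop default parent ZNode `/hadoop-ha` (the default of `ha.zookeeper.parent-znode`); a cluster configured differently would use another parent. The helper names are hypothetical, and the commands it prints are ZooKeeper 3.4 CLI commands to be typed into a ZooKeeper shell, not something this script runs.

```python
# Sketch: locate the HA ZNode for a nameservice and list the ZooKeeper CLI
# commands one might use to inspect or remove it. Nothing is executed here.
DEFAULT_PARENT_ZNODE = "/hadoop-ha"   # assumption: Hadoop default parent ZNode


def ha_znode_path(nameservice: str, parent_znode: str = DEFAULT_PARENT_ZNODE) -> str:
    """Return the ZNode path that holds failover state for a nameservice."""
    return f"{parent_znode.rstrip('/')}/{nameservice}"


def zk_cleanup_commands(nameservice: str, parent_znode: str = DEFAULT_PARENT_ZNODE) -> list:
    """Return ZooKeeper shell commands to check for, then remove, the ZNode."""
    znode = ha_znode_path(nameservice, parent_znode)
    return [
        f"ls {parent_znode}",   # see whether the nameservice ZNode exists
        f"rmr {znode}",         # recursively delete it (ZooKeeper 3.4 CLI)
    ]


if __name__ == "__main__":
    for command in zk_cleanup_commands("nameservice1"):
        print(command)
```

Removing the ZNode discards the existing failover state, so this only makes sense when the intent is to re-run the "Initialize High Availability state in ZooKeeper" action (the command-line counterpart is `hdfs zkfc -formatZK`). Otherwise the message's other suggestion, reusing the ZNode and simply starting the NameNodes and Failover Controllers, is the less invasive route.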

Error logs:
