Fix Alternatives for Cloudera Manager + CDH

Earlier post: Fix Hadoop Conf Alternatives for CDH5

I upgraded from CM + CDH 4 to Cloudera Manager + CDH 5.0.0 beta 1 and beta 2, then downgraded and deleted them. Afterwards I found that many alternatives were still installed on my small cluster, which kept my later installations of CM + CDH 4 and CM + CDH 5 from working well, all because of the dirty uninstallation of the CM + CDH 5 betas.

To fix these alternatives, I wrote a Python script that reads the default alternative configurations, checks all currently installed alternatives, deletes broken links, and installs the defaults with a lowered priority, so that we can use ‘Deploy Client Configuration’ in CM to set up the correct ones.


Tested only under CentOS 6.
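
The steps described above can be illustrated with a minimal Python sketch. This is not the script from the post, only an approximation of the idea under stated assumptions: it assumes the active alternative links live in /etc/alternatives (the usual place on CentOS/RHEL), it only builds `alternatives --install` command lines instead of running them, and the function names and the hadoop-conf example values are hypothetical, chosen for illustration.

```python
import os

# Assumption: the usual location of alternative links on CentOS/RHEL.
ALTERNATIVES_DIR = "/etc/alternatives"


def find_broken_links(directory=ALTERNATIVES_DIR):
    """Return (name, target) pairs for symlinks whose target is missing."""
    broken = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        # islink() is true for any symlink; exists() follows the link,
        # so it is false when the target has been removed.
        if os.path.islink(path) and not os.path.exists(path):
            broken.append((name, os.readlink(path)))
    return broken


def install_command(link, name, target, priority):
    """Build an `alternatives --install` command line without running it."""
    return ["alternatives", "--install", link, name, target, str(priority)]


if __name__ == "__main__":
    if os.path.isdir(ALTERNATIVES_DIR):
        for name, target in find_broken_links():
            print("broken: %s -> %s" % (name, target))
    # Illustrative values only: re-register a default with a low priority,
    # so a configuration deployed later with a higher priority wins.
    print(" ".join(install_command(
        "/etc/hadoop/conf", "hadoop-conf", "/etc/hadoop/conf.empty", 10)))
```

Printing the commands rather than executing them makes it possible to review what would change before touching a live cluster.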

Fix Alternatives for Cloudera Manager + CDH by @sskaje:

Shark 0.9.1 Snapshot Working with CDH 4.6.0 Now

The Shark 0.9.x releases (0.9.0, 0.9.1) are still pre-release:

The previously supplied 0.9.0 prebuilt package for Hadoop 2 / CDH 4.5.0, shark-0.9.0-hadoop2-bin, does not really work with CDH 4.5.0, so I tried compiling it myself; see Build Shark 0.9 for CDH 4 and Install Spark/Shark on CDH 4.

Some days later, 0.9.1 is out, and the patched Hive has been uploaded to the Maven repo, so it can be put in lib_managed now.

I have 5 nodes for my CDH 4 test, hadoop1 – hadoop5, with HDFS NameNode HA on hadoop5 + hadoop4 under the nameservice nameservice1.
The Spark master is located on hadoop5, and workers run on all nodes.
Continue reading “Shark 0.9.1 Snapshot Working with CDH 4.6.0 Now” »

Shark 0.9.1 Snapshot Working with CDH 4.6.0 Now by @sskaje:

CDH 4 HA Related Problems


I got ‘Failed to initialize High Availability state in ZooKeeper. This might be because a ZNode for this nameservice is already created. Either remove the ZNode, or to reuse the ZNode skip this step and simply start the NameNodes and Failover Controllers. To retry, use the “Initialize High Availability state in ZooKeeper” command available as a Failover Controller action.’ when I was trying to enable automatic failover for HDFS in Cloudera Manager after HA had been enabled.

Error logs:

Continue reading “CDH 4 HA Related Problems” »

CDH 4 HA Related Problems by @sskaje:

Build Shark 0.9 for CDH 4

Cloudera provides a parcel of the latest Apache Spark (0.9) for Cloudera Manager, which is incompatible with old versions of Shark (0.8.1, 0.8.0 or earlier). The official release/pre-release of Shark 0.9.0 for CDH 4 is still not available for download, so building from source might be an option.

Shark’s wiki: Build Shark From Source Code
This page is a little bit old but still useful.

1 Install CDH 4 + Spark

Install from parcels, follow Install Spark/Shark on CDH 4

2 Install git

Continue reading “Build Shark 0.9 for CDH 4” »

Build Shark 0.9 for CDH 4 by @sskaje:

Install Spark/Shark on CDH 4

CDH 4 is the current stable version of Cloudera's Distribution of Hadoop.
Apache Spark is a fast and general engine for large-scale data processing.
Shark is a Hive-compatible query engine based on Spark.

Cloudera provides an official parcel for Apache Spark, and you can also get it from my mirror (only if you’re on CentOS/RHEL 6 x86_64): Cloudera Mirror.


CentOS 6.4 x86_64, host names hadoop1-hadoop5.
Cloudera Manager 4.8.1
CDH 4.5.0
Continue reading “Install Spark/Shark on CDH 4” »

Install Spark/Shark on CDH 4 by @sskaje:
