Categories
CDH, Hadoop-related, Linux, Shark, Spark, Study and research

Shark 0.9.1 Snapshot Working with CDH 4.6.0 Now

By @sskaje
Link: https://sskaje.me/2014/03/shark-0-9-1-snapshot-working-cdh-4-6-0/

The Shark 0.9.x releases (0.9.0, 0.9.1) are still pre-releases: https://github.com/amplab/shark/releases

The previously supplied 0.9.0 prebuilt package for Hadoop 2 / CDH 4.5.0, shark-0.9.0-hadoop2-bin, does not really work with CDH 4.5.0, so I tried compiling it myself (see Build Shark 0.9 for CDH 4 and Install Spark/Shark on CDH 4).

A few days later, 0.9.1 appeared on the branch-0.9 branch (https://github.com/amplab/shark/tree/branch-0.9). The patched Hive has been uploaded to the Maven repository, so it can now be placed in lib_managed.

I have 5 nodes for my CDH 4 test, hadoop1 through hadoop5, with HDFS NameNode HA on hadoop5 and hadoop4 under the nameservice nameservice1.
The Spark master is located on hadoop5, with workers on all nodes.
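
As a rough illustration of the layout above, here is a minimal Python sketch that records the cluster and derives the two addresses a client would typically need. The host names and nameservice come from the post; the port 7077 (Spark's standalone default), the helper names, and the example path are assumptions for illustration only, not values taken from the actual configuration.

```python
# Hypothetical description of the test cluster. Host names and the
# nameservice come from the post; everything else is an assumption.
NODES = [f"hadoop{i}" for i in range(1, 6)]   # hadoop1 .. hadoop5
NAMENODES = ["hadoop5", "hadoop4"]            # HDFS NameNode HA pair
NAMESERVICE = "nameservice1"                  # logical HA nameservice
SPARK_MASTER_HOST = "hadoop5"
SPARK_MASTER_PORT = 7077                      # assumed: standalone default port


def spark_master_url(host=SPARK_MASTER_HOST, port=SPARK_MASTER_PORT):
    """Return the standalone master URL in spark://host:port form."""
    return f"spark://{host}:{port}"


def hdfs_uri(path="/"):
    """Address HDFS through the nameservice, not an individual NameNode."""
    return f"hdfs://{NAMESERVICE}{path}"


if __name__ == "__main__":
    print(spark_master_url())
    # Illustrative path only; the real warehouse location may differ.
    print(hdfs_uri("/user/hive/warehouse"))
```

Addressing HDFS by nameservice rather than by a single NameNode host means the URI stays valid when the active NameNode fails over between hadoop5 and hadoop4.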

Categories
CDH, Hadoop-related, Linux, Shark, Spark, Study and research

Build Shark 0.9 for CDH 4

By @sskaje
Link: https://sskaje.me/2014/02/build-shark-for-cdh-4/

Cloudera provides a parcel of the latest Apache Spark (0.9) for Cloudera Manager, but it is incompatible with older versions of Shark (0.8.1, 0.8.0 or earlier). An official release or pre-release of Shark 0.9.0 for CDH 4 is not yet available for download, so building from source may be an option.

Shark’s wiki: Build Shark From Source Code
This page is a little bit old but still useful.

1 Install CDH 4 + Spark

Install from parcels, following Install Spark/Shark on CDH 4.

2 Install git

Categories
CDH, Hadoop-related, Linux, Shark, Spark

Install Spark/Shark on CDH 4

By @sskaje
Link: https://sskaje.me/2014/02/install-spark-shark-cdh-4/

CDH 4 is the current stable version of the Cloudera Distribution of Hadoop. http://cloudera.com/
Apache Spark is a fast and general engine for large-scale data processing. http://spark.incubator.apache.org/
Shark is a Hive-compatible query engine based on Spark. http://shark.cs.berkeley.edu/

Cloudera provides a parcel for Apache Spark. The official parcel is at http://archive.cloudera.com/spark/, and you can also get it from my mirror, Cloudera Mirror (only if you’re on CentOS/RHEL 6 x86_64).

Environment

CentOS 6.4 x86_64, host names hadoop1-hadoop5.
Cloudera Manager 4.8.1
CDH 4.5.0

