Install Spark/Shark on CDH 4

CDH 4 is the current stable version of the Cloudera Distribution of Hadoop. Apache Spark is a fast, general-purpose engine for large-scale data processing. Shark is a Hive-compatible query engine based on Spark. Cloudera provides an official parcel for Apache Spark, and you can get it from my …