Fix Hadoop Conf Alternatives for CDH5

I’m using CDH5. The upgrade from CDH4 failed, so I reinstalled CDH5 directly.
/etc/hadoop/conf is linked to /etc/hadoop/conf/conf.cloudera.mapreduce1/.
Deploy Client Configuration does not correct the link.

The way to fix it is to manually set a new path and remove the old one, like:

But the next time you run Deploy Client Configuration, it will corrupt the link again.
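
Before and after changing the alternative, it can help to see exactly where /etc/hadoop/conf points. The following is only a diagnostic sketch, not the fix from the post: the helper name `symlink_chain` is hypothetical, and it simply follows symlinks one hop at a time. On systems that use the alternatives mechanism, the chain typically passes through /etc/alternatives.

```python
import os


def symlink_chain(path, max_hops=16):
    """Return every path visited while following symlinks from `path`.

    Hypothetical helper for inspection only; it never modifies anything.
    """
    chain = [path]
    current = path
    for _ in range(max_hops):
        if not os.path.islink(current):
            break
        target = os.readlink(current)
        # Relative targets are resolved against the link's own directory.
        if not os.path.isabs(target):
            target = os.path.join(os.path.dirname(current), target)
        current = os.path.normpath(target)
        chain.append(current)
    return chain


if __name__ == "__main__":
    # Print each hop, from the link itself down to the final target.
    for hop in symlink_chain("/etc/hadoop/conf"):
        print(hop)
```

Running it again after Deploy Client Configuration shows whether the link has been changed back.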

Continue reading “Fix Hadoop Conf Alternatives for CDH5” »

Fix Hadoop Conf Alternatives for CDH5 by @sskaje: https://sskaje.me/2014/02/fix-hadoop-conf-alternatives-cdh5/


Solving Small Files Problem on CDH4

This morning when I opened Cloudera Manager, it showed the NameNode server as ‘Concerning’, with a message like ‘The DataNode has xxx blocks. Warning threshold: 200,000 block(s).’.
I googled the message: the likely cause is too many files on HDFS. The default block size is 128MB on my CDH4, and every file, even one holding a single byte, takes up at least one block. Such a block counts toward the threshold, although it does not actually consume 128MB of disk.
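
To make the block arithmetic concrete, here is a minimal sketch that counts how many blocks a set of files is split into. The function names and the 1 KB file size are assumptions for illustration; only the 128MB block size comes from the text, and replication is ignored.

```python
BLOCK_SIZE = 128 * 1024 * 1024  # 128 MB, the block size mentioned in the post


def blocks_for_file(size_bytes, block_size=BLOCK_SIZE):
    """Number of blocks a file of `size_bytes` is split into."""
    if size_bytes == 0:
        return 0  # an empty file has no blocks
    # Integer ceiling division: any partial block still counts as one block.
    return -(-size_bytes // block_size)


def total_blocks(file_sizes, block_size=BLOCK_SIZE):
    """Total block count for an iterable of file sizes in bytes."""
    return sum(blocks_for_file(size, block_size) for size in file_sizes)


if __name__ == "__main__":
    small_files = [1024] * 170_000      # hypothetical: 170k files of 1 KB each
    merged_file = [sum(small_files)]    # the same bytes stored as one file
    print(total_blocks(small_files))    # one block per small file
    print(total_blocks(merged_file))    # far fewer blocks for the same data
```

The point of the comparison is that the block count follows the number of files, not the amount of data, when files are much smaller than a block.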

Then I used hdfs dfs -count to find the number of files in each directory on HDFS: about 70k files under /user/hdfs/.staging and 170k under a folder used by Flume-NG.
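
The output of hdfs dfs -count has four columns per path: directory count, file count, content size in bytes, and the path name. As a rough sketch, the snippet below parses such lines and ranks directories by file count. The sample lines are made up for illustration: /flume/events is a hypothetical path, and the directory counts and sizes are invented.

```python
def parse_count_line(line):
    """Parse one line of `hdfs dfs -count` output into a dict."""
    dirs, files, size, path = line.split(None, 3)
    return {
        "dirs": int(dirs),
        "files": int(files),
        "bytes": int(size),
        "path": path.strip(),
    }


def rank_by_files(lines):
    """Return parsed rows sorted by file count, largest first."""
    rows = [parse_count_line(line) for line in lines if line.strip()]
    return sorted(rows, key=lambda row: row["files"], reverse=True)


# Hypothetical sample output; real numbers and paths will differ.
SAMPLE = """\
          12        70000          350000000 /user/hdfs/.staging
           3       170000          174080000 /flume/events
"""

if __name__ == "__main__":
    for row in rank_by_files(SAMPLE.splitlines()):
        print(row["files"], row["path"])
```

Feeding it the real command output, one line per directory, makes the directories with the most small files stand out quickly.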

I’m collecting logs from syslog with Flume-NG on CDH4, sinking them to HDFS and MySQL (Infobright), and trying to analyse them with Hive. The HDFS part of the configuration looks like:

Continue reading “Solving Small Files Problem on CDH4” »

Solving Small Files Problem on CDH4 by @sskaje: https://sskaje.me/2013/12/solving-small-files-problem-cdh4/


Cloudera Archive Mirror Updated for CM5 & CDH5

Latest Updates @ https://sskaje.me/cloudera-mirror/

Cloudera just released its CDH 5 beta download here; this time they don’t use beta.cloudera.com as the repo for non-release products.

URL: http://cloudera.rst.im/ (RHEL/CentOS 6 only)

You may download the Cloudera Manager 4 installer from http://cloudera.rst.im/cm4/installer/latest/
and use http://cloudera.rst.im/cm4/redhat/6/x86_64/cm/4.7.2/ as your yum repo for CM 4.7.2.

Or download the Cloudera Manager 5 installer from http://cloudera.rst.im/cm5/installer/latest/
and use http://cloudera.rst.im/cm4/redhat/6/x86_64/cm/5/ as your yum repo for CM 5.
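
Using one of these URLs as a yum repo means placing a .repo file under /etc/yum.repos.d/. The sketch below renders such a file as text. The repo id, the display name, and the gpgcheck=0 default are assumptions; the post does not say whether the mirror serves a GPG key.

```python
def render_repo(repo_id, name, baseurl, gpgcheck=False):
    """Render the text of a minimal yum .repo stanza."""
    lines = [
        f"[{repo_id}]",
        f"name={name}",
        f"baseurl={baseurl}",
        "enabled=1",
        # Assumption: signature checking is off unless explicitly requested.
        f"gpgcheck={1 if gpgcheck else 0}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical repo id and name; the base URL is the CM 4.7.2 one above.
    print(render_repo(
        "cloudera-manager",
        "Cloudera Manager (mirror)",
        "http://cloudera.rst.im/cm4/redhat/6/x86_64/cm/4.7.2/",
    ))
```

The printed text can be saved as, for example, /etc/yum.repos.d/cloudera-manager.repo (the file name is likewise an assumption).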

Continue reading “Cloudera Archive Mirror Updated for CM5 & CDH5” »

Cloudera Archive Mirror Updated for CM5 & CDH5 by @sskaje: https://sskaje.me/2013/10/cloudera-archive-mirror-updated-for-cm5-cdh5/


Cloudera Archive Mirror for RHEL/CentOS 6

Latest Updates @ https://sskaje.me/cloudera-mirror/

Cloudera does not provide any mirror sites for downloading its parcels/packages, but fortunately directory listing is enabled on archive.cloudera.com, so I just created my own!

URL: http://cloudera.rst.im/

You may download the CM installer from http://cloudera.rst.im/cm4/installer/latest/

And use http://cloudera.rst.im/cm4/redhat/6/x86_64/cm/4.7.0/ as your yum repo for CM 4.7.0. (No ‘latest’ stuff this time :P)
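
Because the directory listing is enabled, a mirror script can discover what to fetch by reading the links on each listing page. The following is only a sketch of that one step, not the script behind this mirror: it assumes an Apache-style index page, and the sample HTML and the base URL path are illustrative.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for key, value in attrs:
                if key == "href" and value:
                    self.hrefs.append(value)


def listing_entries(html, base_url):
    """Return absolute URLs of the files and subdirectories in a listing."""
    collector = LinkCollector()
    collector.feed(html)
    entries = []
    for href in collector.hrefs:
        # Skip column-sort links, parent-directory links and external links.
        if href.startswith(("?", "/", "../")) or "://" in href:
            continue
        entries.append(urljoin(base_url, href))
    return entries


# Illustrative listing page; a real page will contain more markup.
SAMPLE_HTML = """
<a href="?C=N;O=D">Name</a>
<a href="/">Parent Directory</a>
<a href="installer/">installer/</a>
<a href="redhat/">redhat/</a>
"""

if __name__ == "__main__":
    for url in listing_entries(SAMPLE_HTML, "http://archive.cloudera.com/cm4/"):
        print(url)
```

Entries ending in a slash are subdirectories, so a full mirror would repeat the same step for each of them.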

Continue reading “Cloudera Archive Mirror for RHEL/CentOS 6” »

Cloudera Archive Mirror for RHEL/CentOS 6 by @sskaje: https://sskaje.me/2013/09/cloudera-archive-mirror-for-rhelcentos-6/
