Newer Documentation for HttpFS(Hadoop HDFS over HTTP)

I was trying to get rsyslog v8 to communicate with Hadoop HDFS directly via omhdfs, but failed: officially, omhdfs does not work with rsyslog v8 at this time. UPDATE: see OmHTTPFS, Another Rsyslog HDFS Output Plugin. I was advised to use HttpFS when setting up Hue in CDH. HttpFS is an HTTP gateway …

Fix Hadoop Conf Alternatives for CDH5

I’m using CDH5; the upgrade from CDH4 failed, so I reinstalled directly. /etc/hadoop/conf is linked to /etc/hadoop/conf/conf.cloudera.mapreduce1/, and Deploy Client Configuration does not correct it. The fix is to manually set a new path in the alternatives system and remove the old one.
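
The exact commands from the post are not preserved in this excerpt. As a minimal sketch of the kind of fix described, the snippet below only builds and prints the commands instead of running them. It assumes the alternatives group is named hadoop-conf and that the tool is update-alternatives (on RHEL-style systems the binary is usually alternatives); both example directories are placeholders to be replaced with whatever your system actually shows.

```python
import shlex

# Assumptions, not taken from the original post:
#   - the alternatives group for /etc/hadoop/conf is called "hadoop-conf"
#   - the example directories passed in below are placeholders
LINK = "/etc/hadoop/conf"
GROUP = "hadoop-conf"


def alternatives_fix(new_conf, old_conf, priority=100, tool="update-alternatives"):
    """Return the commands that register new_conf, select it, and drop old_conf."""
    return [
        [tool, "--install", LINK, GROUP, new_conf, str(priority)],
        [tool, "--set", GROUP, new_conf],
        [tool, "--remove", GROUP, old_conf],
    ]


if __name__ == "__main__":
    # Dry run: print the commands so they can be reviewed before running as root.
    for cmd in alternatives_fix(
        "/etc/hadoop/conf.cloudera.yarn",        # placeholder: directory you want
        "/etc/hadoop/conf.cloudera.mapreduce1",  # placeholder: stale directory
    ):
        print(shlex.join(cmd))
```

Printing rather than executing is deliberate: the commands change system-wide links, so it is safer to check them against the current state of the alternatives group first.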

But the next time you run Deploy Client Configuration, it will break the link again.


Infobright Enterprise Edition: Data Import and Data Modification Experiment

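I got a trial license for IEE and tried it out as a log storage and computation solution. There was no need to benchmark statistical queries: trying ICE is enough to get a feel for it, and it is noticeably faster than Hive in any case. What I mainly wanted to test here is INSERT / UPDATE / DELETE. The log pipeline in the test environment is rsyslog -> flume-ng -> IEE/HDFS. Writing to HDFS with Flume-ng's built-in HDFS Sink has always been stable: directories are split by day, scripts pre-create the directories and add the Hive partitions, and Hive is used for analysis. However, because statistics may be needed on the current day's data, hdfs.rollInterval is set fairly low (currently 2 minutes), so a large number of small files is produced every day and Hive processes them very slowly. For Flume-ng, I had someone write a simple plugin that loads data into MySQL: it adds a separate queue, splits each log entry into columns, and sends them to MySQL. The plugin requires database inserts to be batched through a prepared statement.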
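
The plugin itself is not shown in the post. As a rough illustration of the pattern it describes (split each log line into columns, then insert in batches through one parameterized statement), here is a minimal sketch. It uses Python's built-in sqlite3 as a stand-in for MySQL/IEE; the table name, column names, separator, and batch size are all hypothetical.

```python
import sqlite3


def split_line(line, sep="\t"):
    """Split one raw log line into a tuple of column values."""
    return tuple(line.rstrip("\n").split(sep))


def batch_insert(conn, rows, batch_size=1000):
    """Insert rows in batches, reusing a single parameterized statement."""
    # Hypothetical table and columns; sqlite3 stands in for MySQL/IEE here.
    sql = "INSERT INTO access_log (ts, host, message) VALUES (?, ?, ?)"
    inserted = 0
    for start in range(0, len(rows), batch_size):
        batch = rows[start:start + batch_size]
        conn.executemany(sql, batch)  # one statement, many parameter sets
        conn.commit()                 # one commit per batch
        inserted += len(batch)
    return inserted


def demo():
    """Load two sample lines into an in-memory table and return the row count."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE access_log (ts TEXT, host TEXT, message TEXT)")
    lines = [
        "2014-11-01 10:00:00\tweb1\tGET /\n",
        "2014-11-01 10:00:01\tweb2\tGET /a\n",
    ]
    rows = [split_line(line) for line in lines]
    batch_insert(conn, rows, batch_size=1)
    return conn.execute("SELECT COUNT(*) FROM access_log").fetchone()[0]
```

Committing once per batch rather than once per row is the point of the pattern: it keeps the number of round trips and transactions small when log volume is high.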

YARN NodeManager Failed to Start

After I upgraded CDH, one of my NodeManagers could not be brought up. NullPointerExceptions were found in the error log.

I tried deleting all ZooKeeper-related configs (you can find them in Manually Upgrade CDH 5.2 in CM 5, specifically the YARN part), but that did not work. I deleted the NodeManager instance and reinstalled it, with the same result. Many ‘Recovering application’ …

Manually Upgrade CDH 5.2 in CM 5

I was interrupted again while upgrading CDH.

HDFS

This time the NameNode did not start, so I had to bring it up and resume the upgrade process. I didn’t save any log of the NameNode’s error; I stopped all HDFS components, ran ‘Upgrade HDFS Metadata’, and then started HDFS.

YARN

Next, YARN. I started YARN, and then all …