Categories
CDH, Hadoop-related, Hive

Hive ODBC Connection ETIMEDOUT

By @sskaje
Link: https://sskaje.me/2015/03/hive-odbc-etimedout/

I was trying to connect Tableau to Cloudera Hadoop on Windows. Every attempt failed with an ETIMEDOUT error, and no other error message could be found.

To solve this, make sure HiveServer 2 is selected as the server type, and for Authentication Mechanism, choose Username and then type ‘hive’ in the Username field.
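For connections made without the ODBC dialog, the same two settings can be sketched as a DSN-less connection string. This is only an illustration: the key names (HiveServerType, AuthMech, UID) and the driver name are given as I recall them from the Cloudera Hive ODBC driver's install guide and should be checked against your driver version, and the host name and port are hypothetical.

```python
# Sketch only: builds an ODBC connection string, it does not open a connection.
# The driver name, host and port below are assumptions for illustration.

def hive_odbc_connection_string(host, port=10000, username="hive"):
    settings = {
        "Driver": "Cloudera ODBC Driver for Apache Hive",  # assumed driver name
        "Host": host,                                      # hypothetical host
        "Port": str(port),                                 # assumed HiveServer 2 port
        "HiveServerType": "2",  # select HiveServer 2
        "AuthMech": "2",        # User Name authentication
        "UID": username,        # the 'hive' user from the post
    }
    return ";".join(f"{key}={value}" for key, value in settings.items())


conn_str = hive_odbc_connection_string("hadoop1.example.com")
print(conn_str)
```

The resulting string could be passed to any ODBC client library that accepts connection strings.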


Categories
CDH, Hadoop-related, HDFS, Hive, Study & Research

Manually Upgrade CDH 5.2 in CM 5

By @sskaje
Link: https://sskaje.me/2014/10/manually-upgrade-cdh-5-2-cm-5/

I was interrupted again when upgrading CDH.

HDFS

This time the NameNodes did not start, so I had to bring them up and resume the upgrade process.
I didn’t save any log of the NN error. I stopped all HDFS components, ran ‘Upgrade HDFS Metadata’, then started HDFS.

YARN

Next, YARN.
I started YARN, and then all other services. Hive went down, then YARN.
I checked CM’s monitor:

I found that both instances of ResourceManager were ‘Standby’.

Here is what I found from /var/log/hadoop-yarn/hadoop-cmf-yarn-RESOURCEMANAGER-hadoop4.xxx.com.log.out

Google helps a lot: http://community.cloudera.com/t5/Cloudera-Manager-Installation/CDH-5-YARN-Resource-Manager-HA-deadlock-in-Kerberos-cluster/td-p/14396

In /opt/cloudera/parcels/CDH/lib/zookeeper/bin/zkCli.sh, run the commands one by one, because zkCli.sh does not have wildcard support.

Hive

I had guessed that Hive didn’t work because of YARN, but I was wrong.
I checked all Hive-related commands executed by CM:

So I stopped Hive and ran Update Hive Metastore NameNodes and Upgrade Hive Metastore Database Schema. Neither worked; both failed with the error message above.
I got more from logs:

The schemaTool output reminded me that I had manually upgraded the Hive metastore in February: Hive MetaStore Schema Upgrade Failed When Upgrading CDH5.
But this time, dbType should be postgres instead of derby. (Derby is not supported by Impala, which is why I switched to the PostgreSQL embedded in Cloudera Manager.)

I can’t find the terminal output, but when I ran:

I got output similar (in its first few lines only) to that in the blog post above, saying schemaTool was trying to connect to Derby.

I re-deployed Hive’s client configuration, checked /etc/hive/conf/hive-site.xml, and compared it with /var/run/cloudera-scm-agent/process/4525-hive-HIVEMETASTORE/hive-site.xml.
The XML under /etc points to the Hive metastore’s Thrift server, while the one under CM’s running folder specifies the exact database connection. schemaTool uses the /etc one.
So I replaced the /etc one with CM’s, and then re-ran upgradeSchema:

This gave the same error I saw in CM’s log: plpgsql does not exist. Fix this by:

You can find the password in the XML I mentioned above, or in a file like

If you get an error message saying OWNER_NAME or OWNER_TYPE already exists in table DBS, open /opt/cloudera/parcels/CDH/lib/hive/scripts/metastore/upgrade/postgres/016-HIVE-6386.postgres.sql and comment out or delete the two ALTER TABLE lines.
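The comparison of the two hive-site.xml files described earlier can be sketched in a few lines of Python. This is a minimal illustration, not what I actually ran: the XML snippets, host names and port are hypothetical stand-ins for the real files, and only the property names (hive.metastore.uris and the javax.jdo.option.* keys) are standard Hive settings.

```python
import xml.etree.ElementTree as ET


def read_hadoop_config(xml_text):
    """Return {name: value} for every <property> in a Hadoop-style config."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name is not None:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def metastore_settings(props):
    """Keep only the keys that show how the metastore is reached."""
    keys = (
        "hive.metastore.uris",
        "javax.jdo.option.ConnectionURL",
        "javax.jdo.option.ConnectionUserName",
    )
    return {key: props[key] for key in keys if key in props}


# Hypothetical contents standing in for the /etc copy (client configuration).
client_xml = """<configuration>
  <property><name>hive.metastore.uris</name><value>thrift://metastore.example.com:9083</value></property>
</configuration>"""

# Hypothetical contents standing in for the copy in CM's process folder.
server_xml = """<configuration>
  <property><name>javax.jdo.option.ConnectionURL</name><value>jdbc:postgresql://db.example.com:7432/hive</value></property>
  <property><name>javax.jdo.option.ConnectionUserName</name><value>hive</value></property>
</configuration>"""

client = metastore_settings(read_hadoop_config(client_xml))
server = metastore_settings(read_hadoop_config(server_xml))
print("client:", client)
print("server:", server)
```

If the file that schemaTool reads has no javax.jdo.option.ConnectionURL, that would be consistent with it falling back to Derby as described above.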


Categories
Hadoop-related, Hive, Impala, PHP, Study & Research

PHP ODBC Connect Cloudera Impala and Hive

By @sskaje
Link: https://sskaje.me/2014/07/php-odbc-connect-cloudera-impala-hive/

Environment

CentOS 5.5
PHP 5.3.10
(This article also works for PHP 5.3.3 on CentOS 6).

Dependencies

UnixODBC

UnixODBC can be installed from the yum repo.

I built unixODBC 2.3.2 from source and installed it to /usr/local/unixODBC.

ODBC Connectors

Cloudera offers ODBC libs for both Hive and Impala:
http://www.cloudera.com/content/support/en/downloads/connectors/impala/impala-odbc-v2-5-15.html
http://www.cloudera.com/content/support/en/downloads/connectors/hive/hive-odbc-v2-5-9.html

Follow the install guides at the URLs above; only wget and yum --nogpgcheck localinstall xxx.rpm are required.


Categories
Hadoop-related, Hive, Impala, PrestoDB, Study & Research

MySQL/Hive/Presto/Impala Transposition

By @sskaje
Link: https://sskaje.me/2014/02/mysql-hive-presto-impala-transposition/

Rows to Columns

Rows to Comma Separated String

MySQL

Use GROUP_CONCAT(). This function also works in Infobright.
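As a runnable illustration of the rows-to-string idea, here is a small sketch that uses Python's built-in sqlite3 module as a stand-in for MySQL. SQLite also provides a group_concat() aggregate with a comma as the default separator, but it is a different engine, and the user_tags table and its rows are hypothetical.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user_tags (user_id INTEGER, tag TEXT)")  # hypothetical table
conn.executemany(
    "INSERT INTO user_tags VALUES (?, ?)",
    [(1, "hive"), (1, "impala"), (2, "presto")],
)

# One output row per user_id, with that user's tags joined by commas.
rows = conn.execute(
    "SELECT user_id, GROUP_CONCAT(tag) FROM user_tags "
    "GROUP BY user_id ORDER BY user_id"
).fetchall()

for user_id, tags in rows:
    print(user_id, tags)
```

The order of the values inside each concatenated string is not guaranteed here; MySQL's GROUP_CONCAT() accepts an ORDER BY clause inside the call when a fixed order matters.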


Categories
Hadoop-related, HDFS, Hive, Study & Research, Projects & Research

Project: Merge small files on HDFS for Hive table

By @sskaje
Link: https://sskaje.me/2013/12/project-merge-small-files-hdfs-hive-table/

Introduction

Github: https://github.com/sskaje/hive_merge

This is a solution to the small files problem on HDFS, but for Hive tables only.

Here is why I wrote this project: Solving Small Files Problem on CDH4.

This script simply INSERTs the requested table/partition into a new table, lets Hive merge the data itself, then INSERTs it back with compression.
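The round trip described above can be sketched as a list of HiveQL statements. This is a simplified illustration and not the code in the hive_merge repository: it handles only an unpartitioned table, the table names are hypothetical, and the choice of settings (hive.merge.mapfiles, hive.merge.mapredfiles, hive.exec.compress.output) is my assumption about what such a script would turn on.

```python
def merge_statements(table, tmp_table):
    """Return HiveQL statements that rewrite `table` through `tmp_table`.

    Sketch for an unpartitioned table only; names are supplied by the caller.
    """
    return [
        # Ask Hive to merge small output files at the end of each job.
        "SET hive.merge.mapfiles=true",
        "SET hive.merge.mapredfiles=true",
        # Copy the data into a temporary table with the same schema.
        f"CREATE TABLE {tmp_table} LIKE {table}",
        f"INSERT OVERWRITE TABLE {tmp_table} SELECT * FROM {table}",
        # Write the merged data back, this time with compressed output.
        "SET hive.exec.compress.output=true",
        f"INSERT OVERWRITE TABLE {table} SELECT * FROM {tmp_table}",
        f"DROP TABLE {tmp_table}",
    ]


statements = merge_statements("logs", "logs_merge_tmp")  # hypothetical names
print(";\n".join(statements) + ";")
```

The printed statements could be saved to a file and run with the Hive CLI; a partitioned table would need a PARTITION clause and an explicit column list instead of SELECT *.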
