Impala Installation
1 Preparation
1.1 Create an impala user and grant it sudo privileges
chmod u+w /etc/sudoers
vi /etc/sudoers
impala ALL=(ALL) NOPASSWD: ALL
chmod u-w /etc/sudoers
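As an alternative to editing /etc/sudoers by hand, the same rule can be written to a separate file and syntax-checked before it is installed. The following is only a sketch under assumptions: the helper name write_sudoers_rule and the staging path /tmp/impala.sudoers are hypothetical, and it assumes /etc/sudoers on this system includes the /etc/sudoers.d directory.

```shell
# Hypothetical helper: write the NOPASSWD rule for one user to a staging file.
# Usage: write_sudoers_rule <user> <file>
write_sudoers_rule() {
    rule_user="$1"
    rule_file="$2"
    printf '%s ALL=(ALL) NOPASSWD: ALL\n' "$rule_user" > "$rule_file" || return 1
    # Rule files are conventionally read-only (mode 0440)
    chmod 0440 "$rule_file"
}

# Example (not run here; assumes /etc/sudoers.d is included by /etc/sudoers):
#   write_sudoers_rule impala /tmp/impala.sudoers
#   sudo visudo -cf /tmp/impala.sudoers
#   sudo install -o root -g root -m 0440 /tmp/impala.sudoers /etc/sudoers.d/impala
```

Checking the file with visudo before installing it avoids locking sudo out with a syntax error.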
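1.2 Prepare the RPM packages and JAR files
1.2.1 See Hadoop5.0.1Impala1.3.1\1.3.1_RPM in the attachment (the RPM packages Impala needs).
1.2.2 See Hadoop5.0.1Impala1.3.1\1.3.1Lib_ALL in the attachment (the JAR files Impala needs).
1.2.3 Place them, for example, in /home/impala/Hadoop5.0.1Impala1.3.1.
1.3 Service planning
Impala has three services: impala-server, impala-catalog, and impala-state-store.
impala-server must be installed on every Hadoop data node.
impala-catalog and impala-state-store can be installed on any one data node, or on another machine that has network access to the cluster.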
2 Installation
2.1 Install impala-server
sudo rpm -ivh bigtop-utils-0.7.0+cdh5.0.1+0-1.cdh5.0.1.p0.31.el6.noarch.rpm
sudo rpm -ivh --nodeps impala-1.3.1+cdh5.0.1+0-1.cdh5.0.1.p0.42.el6.x86_64.rpm
sudo rpm -ivh impala-server-1.3.1+cdh5.0.1+0-1.cdh5.0.1.p0.42.el6.x86_64.rpm
sudo rpm -ivh impala-shell-1.3.1+cdh5.0.1+0-1.cdh5.0.1.p0.42.el6.x86_64.rpm
2.2 Install impala-catalog and impala-state-store
sudo rpm -ivh bigtop-utils-0.7.0+cdh5.0.1+0-1.cdh5.0.1.p0.31.el6.noarch.rpm
sudo rpm -ivh impala-1.3.1+cdh5.0.1+0-1.cdh5.0.1.p0.42.el6.x86_64.rpm
sudo rpm -ivh impala-catalog-1.3.1+cdh5.0.1+0-1.cdh5.0.1.p0.42.el6.x86_64.rpm
sudo rpm -ivh impala-state-store-1.3.1+cdh5.0.1+0-1.cdh5.0.1.p0.42.el6.x86_64.rpm
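Because the packages have to be installed in the order listed, it can help to confirm that every RPM file is present before starting. The following is a minimal sketch: the helper name check_rpm_files is hypothetical, and the directory in the example assumes the attachment was unpacked under /home/impala/Hadoop5.0.1Impala1.3.1 as suggested in section 1.2.

```shell
# Hypothetical helper: report which of the expected RPM files are missing.
# Usage: check_rpm_files <dir> <file>...
# Returns 0 if every file exists, 1 otherwise.
check_rpm_files() {
    rpm_dir="$1"
    shift
    missing=0
    for rpm_file in "$@"; do
        if [ ! -f "$rpm_dir/$rpm_file" ]; then
            echo "missing: $rpm_dir/$rpm_file" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example for an impala-server node (directory is an assumption):
#   check_rpm_files /home/impala/Hadoop5.0.1Impala1.3.1/1.3.1_RPM \
#       bigtop-utils-0.7.0+cdh5.0.1+0-1.cdh5.0.1.p0.31.el6.noarch.rpm \
#       impala-1.3.1+cdh5.0.1+0-1.cdh5.0.1.p0.42.el6.x86_64.rpm \
#       impala-server-1.3.1+cdh5.0.1+0-1.cdh5.0.1.p0.42.el6.x86_64.rpm \
#       impala-shell-1.3.1+cdh5.0.1+0-1.cdh5.0.1.p0.42.el6.x86_64.rpm
```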
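3 Configuration
3.1 Check the native libraries
First delete the following native libraries from /usr/lib/impala/lib:
rm -rf /usr/lib/impala/lib/libhadoop.so.1.0.0
rm -rf /usr/lib/impala/lib/libhadoop.so
rm -rf /usr/lib/impala/lib/libhdfs.so.1.0.0
rm -rf /usr/lib/impala/lib/libhdfs.so
Then check whether the libhadoop and libhdfs files exist under $HADOOP_HOME/lib/native. If they do not, copy the following two files there; if they do, skip this step.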
Hadoop5.0.1Impala1.3.1\1.3.1_RPM\libhadoop.so.1.0.0
Hadoop5.0.1Impala1.3.1\1.3.1_RPM\libhdfs.so.0.0.0
Then create the symbolic links (run inside $HADOOP_HOME/lib/native):
ln -s libhadoop.so.1.0.0 libhadoop.so
ln -s libhdfs.so.0.0.0 libhdfs.so
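The check described above can be sketched as a small function. This is an illustration only; the helper name check_native_libs is hypothetical, and it tests just the two link names used in this section.

```shell
# Hypothetical helper: check that libhadoop.so and libhdfs.so resolve to real files.
# Usage: check_native_libs <dir>
check_native_libs() {
    native_dir="$1"
    for lib in libhadoop.so libhdfs.so; do
        # -e follows symbolic links, so a dangling link counts as missing
        if [ ! -e "$native_dir/$lib" ]; then
            echo "missing or dangling: $native_dir/$lib" >&2
            return 1
        fi
    done
    return 0
}

# Example:
#   check_native_libs "$HADOOP_HOME/lib/native"
```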
3.2 Copy the .so files into the Impala lib directory
sudo cp $HADOOP_HOME/lib/native/*.so* /usr/lib/impala/lib/
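3.3 Copy the third-party dependency JARs into the Impala lib directory
Before deleting anything, it is best to back the files up to another directory.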
sudo rm -rf /usr/lib/impala/lib/hive*.jar &&
sudo rm -rf /usr/lib/impala/lib/hbase*.jar &&
sudo rm -rf /usr/lib/impala/lib/hadoop*.jar &&
sudo rm -rf /usr/lib/impala/lib/sentry*.jar &&
sudo rm -rf /usr/lib/impala/lib/zookeeper*.jar &&
sudo rm -rf /usr/lib/impala/lib/avro*.jar &&
sudo rm -rf /usr/lib/impala/lib/parquet-hadoop-bundle.jar
sudo cp /home/impala/Hadoop5.0.1Impala1.3.1/1.3.1Lib_ALL/* /usr/lib/impala/lib/
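The backup recommended in this section can be combined with the deletion by moving the old JARs aside instead of removing them. The following is a sketch under assumptions: the helper name backup_jars and the backup path in the example are hypothetical, and the pattern list simply mirrors the rm commands in this section.

```shell
# Hypothetical helper: move the JARs that section 3.3 deletes into a backup directory.
# Usage: backup_jars <lib_dir> <backup_dir>
backup_jars() {
    lib_dir="$1"
    backup_dir="$2"
    mkdir -p "$backup_dir" || return 1
    for pattern in 'hive*.jar' 'hbase*.jar' 'hadoop*.jar' 'sentry*.jar' \
                   'zookeeper*.jar' 'avro*.jar' 'parquet-hadoop-bundle.jar'; do
        # $pattern is left unquoted on purpose so the shell expands the wildcard
        for jar in "$lib_dir"/$pattern; do
            # A pattern that matched nothing stays literal; skip it
            [ -e "$jar" ] || continue
            mv "$jar" "$backup_dir/" || return 1
        done
    done
    return 0
}

# Example (run from a root shell, since the commands in this section need sudo;
# the backup path is an assumption):
#   backup_jars /usr/lib/impala/lib /home/impala/impala_lib_backup
```

After this runs, the lib directory is in the same state as after the rm commands, and the new JARs can be copied in.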
3.4 Modify the Hadoop configuration
3.4.1 Create the directory for short-circuit reads and set its permissions
sudo mkdir -p /var/run/hadoop-hdfs &&
sudo chmod u+x /var/run/hadoop-hdfs &&
sudo chmod g+x /var/run/hadoop-hdfs &&
sudo chmod g+w /var/run/hadoop-hdfs &&
sudo chown -R impala:root /var/run/hadoop-hdfs &&
sudo usermod -a -G root impala
3.4.2 Add the following properties to Hadoop's hdfs-site.xml (inside the <configuration> element)
<property>
<name>dfs.client.read.shortcircuit</name>
<value>true</value>
</property>
<property>
<name>dfs.domain.socket.path</name>
<value>/var/run/hadoop-hdfs/dn._PORT</value>
</property>
<property>
<name>dfs.client.file-block-storage-locations.timeout</name>
<value>3000</value>
</property>
<property>
<name>dfs.datanode.hdfs-blocks-metadata.enabled</name>
<value>true</value>
</property>
<property>
<name>dfs.client.file-block-storage-locations.timeout.millis</name>
<value>10000</value>
</property>
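After editing, it is easy to miss one of the five properties. The following sketch lists any that are absent from a given file. The helper names has_property and missing_properties are hypothetical, and the check is a plain text match on the <name> element, not an XML parse, so it does not validate the values.

```shell
# Hypothetical helper: succeed if the file contains a <name> element for the property.
# Usage: has_property <file> <property-name>
has_property() {
    grep -qF "<name>$2</name>" "$1"
}

# Hypothetical helper: print each property from section 3.4.2 that the file lacks.
# Usage: missing_properties <file>
missing_properties() {
    conf_file="$1"
    for prop in dfs.client.read.shortcircuit \
                dfs.domain.socket.path \
                dfs.client.file-block-storage-locations.timeout \
                dfs.datanode.hdfs-blocks-metadata.enabled \
                dfs.client.file-block-storage-locations.timeout.millis; do
        has_property "$conf_file" "$prop" || echo "$prop"
    done
}

# Example:
#   missing_properties "$HADOOP_HOME/etc/hadoop/hdfs-site.xml"
```

An empty result means all five names are present in the file.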
3.4.3 Copy the configuration files to Impala
sudo cp $HADOOP_HOME/etc/hadoop/hdfs-site.xml /etc/impala/conf
sudo cp $HADOOP_HOME/etc/hadoop/core-site.xml /etc/impala/conf
sudo cp $HIVE_HOME/conf/hive-site.xml /etc/impala/conf
3.5 Set the Impala catalog and state store parameters
sudo vi /etc/default/impala
IMPALA_CATALOG_SERVICE_HOST=wxdb01
IMPALA_STATE_STORE_HOST=wxdb01
Note: set these to the hostname of the machine that runs the impala-catalog and impala-state-store services.
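When the same change has to be made on every node, editing the file in vi gets tedious. The following is a sketch of a scripted edit; the helper name set_default_var is hypothetical, it relies on GNU sed's -i option, and it assumes the value contains no '|', '&' or backslash characters (plain hostnames are fine).

```shell
# Hypothetical helper: set KEY=VALUE in a shell-style defaults file.
# Replaces an existing KEY= line, or appends one if the key is absent.
# Usage: set_default_var <file> <key> <value>
set_default_var() {
    def_file="$1"
    def_key="$2"
    def_value="$3"
    if grep -q "^${def_key}=" "$def_file"; then
        # In-place edit with GNU sed, using '|' as the delimiter
        sed -i "s|^${def_key}=.*|${def_key}=${def_value}|" "$def_file"
    else
        printf '%s=%s\n' "$def_key" "$def_value" >> "$def_file"
    fi
}

# Example (run from a root shell; wxdb01 is the hostname used in this section):
#   set_default_var /etc/default/impala IMPALA_CATALOG_SERVICE_HOST wxdb01
#   set_default_var /etc/default/impala IMPALA_STATE_STORE_HOST wxdb01
```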
3.6 Copy the JDBC driver for the Hive metastore
Example:
sudo cp $HIVE_HOME/lib/mysql-connector-java-5.1.26-bin.jar /var/lib/impala/
3.7 Start the services
3.7.1 Start impala-catalog and impala-state-store
sudo service impala-catalog start
sudo service impala-state-store start
sudo service impala-catalog status
sudo service impala-state-store status
3.7.2 Start impala-server
sudo service impala-server start
sudo service impala-server status
service impala-server status && service impala-catalog status && service impala-state-store status
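A chain joined with && stops at the first service that is down and says nothing about the rest. The following sketch checks each service in turn and reports all of them. The helper name check_services and the SERVICE_CMD override are hypothetical, and it assumes the init scripts return a non-zero exit status from "status" when the service is not running.

```shell
# Hypothetical helper: run "<cmd> <service> status" for each service name given
# and report every one, not just the first failure.
# SERVICE_CMD defaults to the "service" command.
# Usage: check_services <service>...
check_services() {
    failed=0
    for svc in "$@"; do
        if "${SERVICE_CMD:-service}" "$svc" status >/dev/null 2>&1; then
            echo "$svc: running"
        else
            echo "$svc: NOT running"
            failed=1
        fi
    done
    return "$failed"
}

# Example (on a machine that hosts all three services):
#   check_services impala-server impala-catalog impala-state-store
```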
4 Troubleshooting
4.1 Impala logs
Located in /var/log/impala.
4.2 Hadoop logs
Located in $HADOOP_HOME/logs.
4.3 HDFS permissions
Grant 777 permissions on /hive/warehouse in HDFS.
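The permission change can be made with the hdfs dfs -chmod command. The following is a sketch only: the helper name grant_warehouse_perms and the HDFS_CMD override are hypothetical, and applying the change recursively with -R is an assumption, since the note above only says to grant 777 on /hive/warehouse.

```shell
# Hypothetical helper: open up permissions on a warehouse directory in HDFS.
# HDFS_CMD defaults to the "hdfs" command. The -R (recursive) flag is an
# assumption; drop it to change only the directory itself.
# Usage: grant_warehouse_perms <hdfs-path>
grant_warehouse_perms() {
    "${HDFS_CMD:-hdfs}" dfs -chmod -R 777 "$1"
}

# Example (run as a user allowed to change the directory's permissions):
#   grant_warehouse_perms /hive/warehouse
```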
-----------------Impala installation issue log:
Unable to find Java. JAVA_HOME should be set in /etc/default/bigtop-utils
Solution: install the JDK from its RPM package:
rpm -ivh oracle-j2sdk1.7-1.7.0+update67-1.x86_64.rpm
The JDK is installed to: /usr/java/jdk1.7.0_67-cloudera
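If the error persists after the JDK is installed, JAVA_HOME can be set explicitly in /etc/default/bigtop-utils, as the error message suggests. The following sketch first confirms that the directory really contains a java binary. The helper name java_home_valid is hypothetical, and the example assumes /etc/default/bigtop-utils is a shell file that accepts an export line.

```shell
# Hypothetical helper: succeed if <dir>/bin/java exists and is executable.
# Usage: java_home_valid <dir>
java_home_valid() {
    [ -n "$1" ] && [ -x "$1/bin/java" ]
}

# Example (appends the setting only if the JDK path checks out):
#   java_home_valid /usr/java/jdk1.7.0_67-cloudera && \
#       echo 'export JAVA_HOME=/usr/java/jdk1.7.0_67-cloudera' | \
#       sudo tee -a /etc/default/bigtop-utils
```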