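In data governance, you often need to trace the lineage of tables or of individual columns. Put simply, when several tables are combined into one, lineage lets you trace each column of the result back to the specific table it came from, and later data changes can be followed visually in the lineage graph. This article walks through the steps for integrating Atlas with HiveServer2.
HiveServer2 is usually deployed with HA enabled, on at least two different servers. The steps are the same on each HiveServer2 host; the only thing to watch is that a keytab file is bound to its server's hostname, so the two servers have different keytabs.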
1. Check the existing HS2 service
[root@cth04 ~]# which hive
/usr/local/hive/bin/hive
[root@cth04 ~]# ps -ef | grep -i hiveserver2
hive 148110 1 0 Jun26 ? 00:43:35 /usr/jdk64/current/bin/java -Dproc_jar -Djava.net.preferIPv4Stack=true -Xloggc:/var/log/hive/hiveserver2-gc-%t.log -XX:+UseG1GC -XX:G1HeapRegionSize=16m -XX:MetaspaceSize=512m -XX:MaxMetaspaceSize=1024m -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCCause -XX:+UseGCLogFileRotation -
[root@cth04 ~]# lsof -i:10000
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
java 148110 hive 508u IPv4 459259 0t0 TCP *:ndmp (LISTEN)
---------- The HS2 service is started by the hive user, so switch to the hive user for the following steps
[root@cth04 ~]# su - hive
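The owner check above can also be scripted. The following is only a minimal sketch: the function name hs2_owner is hypothetical (it is not part of Hive or Atlas), and it assumes each input line has the form "USER ARGS...".

```shell
# hs2_owner: read "USER ARGS..." lines on stdin and print the user of the
# first process whose command line mentions HiveServer2.
# Lines for the grep/awk helpers themselves are skipped.
# The function name is hypothetical, chosen for this illustration.
hs2_owner() {
  awk 'tolower($0) ~ /hiveserver2/ && $2 !~ /^(grep|awk)$/ { print $1; exit }'
}
```

A possible invocation, assuming a procps-style ps, is: ps -e -o user= -o args= | hs2_owner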
2. Check the existing keytab configuration
[hive@cth04 ~]$ ll /etc/security/keytabs/
[hive@cth04 ~]$ kinit -kt /etc/security/keytabs/hive.keytab hive/cth04@xxxxxx.ECOM.CN
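Because each keytab is bound to its host, it can help to pick the principal out of the keytab instead of typing it by hand. A minimal sketch, assuming MIT Kerberos, whose klist -kt prints the principal as the last column of each entry; the helper name principal_for_host is hypothetical.

```shell
# principal_for_host HOST: read `klist -kt <keytab>` output on stdin and
# print the first principal whose host part is exactly HOST.
# Assumption: the principal is the last whitespace-separated field.
principal_for_host() {
  awk -v host="$1" 'index($NF, "/" host "@") > 0 { print $NF; exit }'
}
```

For example, klist -kt /etc/security/keytabs/hive.keytab | principal_for_host "$(hostname -s)" would print the principal to pass to kinit, assuming the principal uses the short hostname as in the transcript above; use hostname -f if your principals use fully qualified names.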
3. Complete JAR dependencies for integrating the Atlas hook with Hive
[hive@cth04 ~]$ which hive
/usr/local/hive/bin/hive
[hive@cth04 ~]$ cd /usr/local/hive
[hive@cth04 hive]$ ll
-r-------- 1 root root 10471435 Jul 4 19:38 apache-atlas-2.2.0-hive-hook.tar.gz
drwxr-xr-x 3 root root 4096 Jul 5 17:10 apache-atlas-hive-hook-2.2.0
drwxr-xr-x 3 root root 4096 Jun 26 21:08 bin
drwxr-xr-x 2 root root 4096 Jun 27 16:27 conf
drwxr-xr-x 3 root root 4096 Jul 4 19:37 hook
------- Location of the Hive hook JARs
[hive@cth04 hive]$ ll hook/hive/
total 36
drwxr-xr-x 2 root root 4096 Jul 4 19:37 atlas-hive-plugin-impl
-rw-r--r-- 1 root root 17512 Jul 4 19:37 atlas-plugin-classloader-2.2.0.jar
-rw-r--r-- 1 root root 11562 Jul 4 19:37 hive-bridge-shim-2.2.0.jar
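Before going further, the layout listed above can be checked with a few lines of shell. This is a sketch only; check_hook_dir is a hypothetical helper name, and the expected entries are taken from the listing above.

```shell
# check_hook_dir DIR: succeed only if DIR contains the entries shown in the
# listing above: the atlas-hive-plugin-impl directory plus the classloader
# and bridge-shim JARs (any version).
check_hook_dir() {
  dir=$1
  if [ ! -d "$dir/atlas-hive-plugin-impl" ]; then
    echo "missing: atlas-hive-plugin-impl" >&2
    return 1
  fi
  for pattern in 'atlas-plugin-classloader-*.jar' 'hive-bridge-shim-*.jar'; do
    # $pattern is left unquoted on purpose so the glob expands;
    # ls fails when nothing matches it.
    if ! ls "$dir"/$pattern >/dev/null 2>&1; then
      echo "missing: $pattern" >&2
      return 1
    fi
  done
}
```

Usage on this host would be: check_hook_dir /usr/local/hive/hook/hive && echo "hook directory looks complete"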
4. Integrate the Atlas hook with the Hive client and the HiveServer2 service
4.1 Check the Hive deployment path
[hive@cth04 ~]$ which hive
/usr/local/hive/bin/hive
[hive@cth04 ~]$ cd /usr/local/hive
[hive@cth04 hive]$ pwd
/usr/local/hive
4.2 Add the JARs to the environment by setting HIVE_AUX_JARS_PATH in hive-env.sh
[root@cth04 conf]# vim hive-env.sh
[root@cth04 conf]# cat /usr/local/hive/conf/hive-env.sh
#atlas hook
export HIVE_AUX_JARS_PATH=/usr/local/hive/hook/hive
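Since the same change has to be made on every HiveServer2 host, the edit can be made repeatable instead of using vim each time. A minimal sketch; ensure_aux_jars_path is a hypothetical helper, and it only guards against adding the identical line twice.

```shell
# ensure_aux_jars_path FILE DIR: append "export HIVE_AUX_JARS_PATH=DIR" to
# FILE unless that exact line is already present, so that running it again
# does not create duplicates.
# Note: it does not detect an existing export that uses a different value.
ensure_aux_jars_path() {
  line="export HIVE_AUX_JARS_PATH=$2"
  if ! grep -qxF -- "$line" "$1" 2>/dev/null; then
    printf '#atlas hook\n%s\n' "$line" >> "$1"
  fi
}
```

For example: ensure_aux_jars_path /usr/local/hive/conf/hive-env.sh /usr/local/hive/hook/hive. If hive-env.sh already exports HIVE_AUX_JARS_PATH with another value, merge the two by hand, because the later export would override the earlier one.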
4.3 Add the configuration that enables the hook
[root@cth04 conf]# cat hive-site.xml | grep -i -C1 atlas
</property>
<!-- Enable the Atlas hook for the Hive client and the HiveServer2 server -->
<property>
<name>hive.exec.post.hooks</name>
<value>org.apache.atlas.hive.hook.HiveHook</value>
</property>
<!-- Enable the Atlas hook for the Hive Metastore server -->
<property>
<name>hive.metastore.event.listeners</name>
<value>org.apache.atlas.hive.hook.HiveMetastoreHook</value>
</property>
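The two properties above can be verified on each host with a small grep-based check. This is a rough sketch, not an XML parser: it assumes the <value> element appears within two lines after its <name> element, as in the snippet above, and hive_site_has is a hypothetical helper name.

```shell
# hive_site_has NAME VALUE FILE: succeed if FILE contains a <name>NAME</name>
# line followed, within the next two lines, by text containing VALUE.
# VALUE is matched as a fixed substring, so it also matches when the class
# is one entry in a comma-separated list of hooks.
hive_site_has() {
  grep -A2 -F "<name>$1</name>" "$3" | grep -qF "$2"
}
```

For example: hive_site_has hive.exec.post.hooks org.apache.atlas.hive.hook.HiveHook /usr/local/hive/conf/hive-site.xml, and likewise for hive.metastore.event.listeners with org.apache.atlas.hive.hook.HiveMetastoreHook.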
4.4 Add the Atlas hook configuration (Kafka cluster connection and so on)
[root@cth04 conf]# cd /usr/local/hive/conf
[root@cth04 conf]# vim atlas-application.properties
Pay particular attention to the keytab-related setting atlas.jaas.KafkaClient.option.principal: it must match what is actually configured on this server.
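Because the principal differs per host, a quick consistency check after copying the file between servers can catch a stale value. A minimal sketch under stated assumptions: prop_value and jaas_matches_host are hypothetical helpers, the file uses plain key=value lines, and the principal has the form service/host@REALM as in step 2.

```shell
# prop_value FILE KEY: print the value of KEY from a key=value properties
# file, trimming spaces and tabs around key and value and skipping comments.
prop_value() {
  awk -v key="$2" '
    /^[ \t]*#/ { next }
    {
      eq = index($0, "=")
      if (eq == 0) next
      k = substr($0, 1, eq - 1)
      v = substr($0, eq + 1)
      gsub(/^[ \t]+|[ \t]+$/, "", k)
      gsub(/^[ \t]+|[ \t]+$/, "", v)
      if (k == key) { print v; exit }
    }' "$1"
}

# jaas_matches_host FILE HOST: succeed if the KafkaClient principal in FILE
# has HOST as its host part (service/HOST@REALM).
jaas_matches_host() {
  principal=$(prop_value "$1" atlas.jaas.KafkaClient.option.principal)
  case $principal in
    */"$2"@*) return 0 ;;
    *) echo "principal '$principal' does not name host $2" >&2; return 1 ;;
  esac
}
```

For example: jaas_matches_host /usr/local/hive/conf/atlas-application.properties "$(hostname -s)". The keytab path can be read the same way with prop_value and the key atlas.jaas.KafkaClient.option.keyTab; that key name follows the Atlas security documentation rather than the text above, so verify it against your own file.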