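I. Upgrade HDFS
Upload the installation packages and edit the host inventory
1. Prepare the Hadoop package hadoop-3.3.3.tar.gz, the ZooKeeper 3.7.1 package, the host inventory, and openssl1.1, then upload them to /home/op/cth3_it_3.3.3
2. Distribute and unpack hadoop-3.3.3.tar.gz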
ansible -b -i hosts/hosts_hadoop all -m unarchive -a "src=hadoop-3.3.3.tar.gz dest=/usr/local/ owner=root group=root"
3. Check the NameNode state
hdfs haadmin -ns ctyunns1 -getServiceState nn1
hdfs haadmin -ns ctyunns1 -getServiceState nn2
hdfs haadmin -ns ctyunns2 -getServiceState nn3
hdfs haadmin -ns ctyunns2 -getServiceState nn4
hdfs haadmin -ns ctyunns3 -getServiceState nn5
hdfs haadmin -ns ctyunns3 -getServiceState nn6
hdfs haadmin -ns ctyunns4 -getServiceState nn7
hdfs haadmin -ns ctyunns4 -getServiceState nn8
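The eight checks above can also be run as one loop. This is a minimal sketch, assuming the nameservice and NameNode IDs are exactly the ones listed in this runbook; the function name check_nn_states is hypothetical.

```shell
# Nameservice:NameNode pairs as listed in this runbook.
NN_PAIRS="ctyunns1:nn1 ctyunns1:nn2 ctyunns2:nn3 ctyunns2:nn4 ctyunns3:nn5 ctyunns3:nn6 ctyunns4:nn7 ctyunns4:nn8"

# Print "<nameservice> <namenode> <state>" for every pair.
# A NameNode that cannot be queried is reported as "unknown".
check_nn_states() {
    for pair in $NN_PAIRS; do
        ns=${pair%%:*}
        nn=${pair##*:}
        state=$(hdfs haadmin -ns "$ns" -getServiceState "$nn" 2>/dev/null) || state="unknown"
        echo "$ns $nn $state"
    done
}
```

Calling check_nn_states prints one line per NameNode, which makes it easy to confirm that each nameservice has one active and one standby before going further.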
4. Clean up expired edits (proceed with caution)
Run the following commands on the active NameNode of each nameservice
hdfs dfsadmin -safemode enter
hdfs dfsadmin -saveNamespace
mv /data01/hadoop/hdfs/namenode/current/edits_0000000000017* /tmp/data01_edits/
mv /data02/hadoop/hdfs/namenode/current/edits_0000000000017* /tmp/data02_edits/
hdfs dfsadmin -safemode leave
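Because moving edits files is risky, the mv steps above can be wrapped in a guard. The sketch below is an assumption-laden helper, not part of the original procedure: the names safemode_is_on and move_old_edits are hypothetical, and it only checks that `hdfs dfsadmin -safemode get` reports safe mode ON for at least one NameNode before touching any file. It also creates the backup directory, which the mv commands above expect to exist.

```shell
# Succeeds only if "hdfs dfsadmin -safemode get" reports safe mode ON.
safemode_is_on() {
    hdfs dfsadmin -safemode get 2>/dev/null | grep -q 'Safe mode is ON'
}

# Move edits files matching a prefix out of a NameNode directory.
# $1 = source "current" directory, $2 = backup directory, $3 = file-name prefix
move_old_edits() {
    src=$1; dest=$2; prefix=$3
    if ! safemode_is_on; then
        echo "safe mode is not ON; refusing to move edits" >&2
        return 1
    fi
    mkdir -p "$dest" || return 1
    for f in "$src"/"$prefix"*; do
        [ -e "$f" ] || continue      # the glob matched nothing
        mv "$f" "$dest"/ || return 1
    done
}
```

For the first directory above, the call would be: move_old_edits /data01/hadoop/hdfs/namenode/current /tmp/data01_edits edits_0000000000017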
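5. Upgrade the standby NameNode
Because the fsimage version changes, the HDFS NameNode must be upgraded with a rolling upgrade
Create the rollingUpgrade backup (rollback image)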
hdfs dfsadmin -rollingUpgrade prepare
hdfs dfsadmin -rollingUpgrade query
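The query command is normally re-run until the rollback image has been created. A minimal polling sketch follows; it assumes the query output contains the text "Proceed with rolling upgrade" once the image is ready, as described in the Hadoop rolling upgrade documentation. The function name and the default attempt and interval values are hypothetical.

```shell
# Poll "rollingUpgrade query" until the rollback image is reported ready.
# $1 = max attempts (default 30), $2 = seconds between attempts (default 10)
wait_for_rollback_image() {
    attempts=${1:-30}; interval=${2:-10}
    i=0
    while [ "$i" -lt "$attempts" ]; do
        if hdfs dfsadmin -rollingUpgrade query 2>/dev/null \
                | grep -q 'Proceed with rolling upgrade'; then
            return 0
        fi
        i=$((i + 1))
        sleep "$interval"
    done
    echo "rollback image not ready after $attempts attempts" >&2
    return 1
}
```

Running wait_for_rollback_image before stopping the standby NameNode avoids starting the upgrade while the backup is still being written.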
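On the standby NameNode: stop the service, repoint the symlink, then switch to the hdfs user and start the NameNode in the rollingUpgrade intermediate ("started") state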
systemctl stop hadoop-hdfs-namenode
unlink /usr/local/hadoop3
ln -s /usr/local/hadoop-3.3.3 /usr/local/hadoop3
ll /usr/local/
hdfs namenode -rollingUpgrade started
Fail over so that the active role moves from nn2 to nn1
hdfs haadmin -ns ctyunns1 -failover nn2 nn1
Stop the NameNode that has not been upgraded yet (now the standby), upgrade it, and start it again
systemctl stop hadoop-hdfs-namenode
unlink /usr/local/hadoop3
ln -s /usr/local/hadoop-3.3.3 /usr/local/hadoop3
ll /usr/local/
sudo -u hdfs hdfs namenode -rollingUpgrade started
6. Repeat this procedure on every nameservice in turn
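The unlink and ln -s pair recurs for every service in this runbook. It could be folded into one helper that refuses to act when the new directory is missing or when the link path is a real directory. This is only a sketch; the name switch_symlink is hypothetical, and ln -sfn is used in place of the separate unlink step.

```shell
# Repoint a symlink to a new target directory.
# $1 = new target directory, $2 = symlink path
switch_symlink() {
    target=$1; link=$2
    if [ ! -d "$target" ]; then
        echo "target $target does not exist" >&2
        return 1
    fi
    if [ -e "$link" ] && [ ! -L "$link" ]; then
        echo "$link exists and is not a symlink" >&2
        return 1
    fi
    # -n keeps ln from descending into the directory the old link points to
    ln -sfn "$target" "$link"
}
```

With the paths used here, the call would be: switch_symlink /usr/local/hadoop-3.3.3 /usr/local/hadoop3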
II. Upgrade the DataNodes (NodeManagers)
1. Run on one Hadoop node
cat /tmp/hosts_dn|while read line ;do hdfs dfsadmin -getDatanodeInfo $line:9867;done
cat /tmp/hosts_dn|while read line ;do hdfs dfsadmin -shutdownDatanode $line:9867 upgrade;done
ansible -b -i hosts_dn all -m shell -a "unlink /usr/local/hadoop3" -f 20
ansible -b -i hosts_dn all -m shell -a "ln -s /usr/local/hadoop-3.3.3 /usr/local/hadoop3" -f 20
ansible -i hosts_dn all -m shell -a "/bin/cp -a /usr/local/hadoop-3.2.1/etc/hadoop/* /usr/local/hadoop-3.3.3/etc/hadoop/"
ansible -i hosts_dn all -m shell -a "ls -l /usr/local/"
cat /tmp/hosts_dn|while read line ;do hdfs dfsadmin -getDatanodeInfo $line:9867;done
ansible -i hosts_dn all -m shell -a 'source /etc/profile;hadoop version|grep 3.2.1' -f 100
# Package upgraded; start the DataNode again on the corresponding nodes
ansible -b -i hosts_dn all -m shell -a "systemctl start hadoop-hdfs-datanode" -f 100
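The loop above sends the shutdown request to every DataNode back to back. A more cautious variant waits for each DataNode to stop answering before moving on. The sketch below assumes that `hdfs dfsadmin -getDatanodeInfo` exits non-zero once the DataNode process is gone; the function name and the polling defaults are hypothetical.

```shell
# Ask one DataNode to shut down for upgrade, then wait until it stops answering.
# $1 = host, $2 = IPC port (9867 in this runbook), $3 = max polls, $4 = poll interval (s)
shutdown_dn_and_wait() {
    addr="$1:${2:-9867}"; polls=${3:-30}; interval=${4:-5}
    hdfs dfsadmin -shutdownDatanode "$addr" upgrade || return 1
    i=0
    while [ "$i" -lt "$polls" ]; do
        # Assumption: getDatanodeInfo fails once the DataNode is down.
        if ! hdfs dfsadmin -getDatanodeInfo "$addr" >/dev/null 2>&1; then
            return 0
        fi
        i=$((i + 1))
        sleep "$interval"
    done
    echo "$addr is still running" >&2
    return 1
}
```

It can be driven from the same host list, for example: while read -r line; do shutdown_dn_and_wait "$line" 9867 || break; done < /tmp/hosts_dn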
2. After all DataNodes are upgraded, run on every active NameNode
hdfs dfsadmin -rollingUpgrade finalize
3. Upgrade the JournalNodes
systemctl stop hadoop-hdfs-journalnode
unlink /usr/local/hadoop3
ln -s /usr/local/hadoop-3.3.3 /usr/local/hadoop3
ll /usr/local/
systemctl start hadoop-hdfs-journalnode
III. Upgrade the ResourceManager
1. Upgrade the ResourceManager binaries
unlink /usr/local/hadoop3
ln -s /usr/local/hadoop-3.3.3 /usr/local/hadoop3
ll /usr/local/
systemctl restart hadoop-yarn-resourcemanager.service
2. Get the RM state
yarn rmadmin -getAllServiceState
Both RMs report standby at this point, so ZooKeeper needs to be upgraded
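The "both standby" condition can be detected mechanically by counting the lines of the state listing that end in "active". This is a small sketch, assuming each output line of `yarn rmadmin -getAllServiceState` ends with the state word; the function name count_active_rm is hypothetical.

```shell
# Read "yarn rmadmin -getAllServiceState" output on stdin and
# print how many ResourceManagers report "active".
count_active_rm() {
    awk '$NF == "active" { n++ } END { print n + 0 }'
}
```

Usage: yarn rmadmin -getAllServiceState | count_active_rm. A healthy HA pair should print 1; a result of 0 matches the situation described above.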
IV. Upgrade ZooKeeper to 3.5.6
1. Distribute and unpack the package
ansible -b -i hosts/hosts_zookeeper all -m unarchive -a "src=apache-zookeeper-3.5.6-bin.tar.gz dest=/usr/local/ owner=root group=root"
2. Replace ZooKeeper
ansible -b -i hosts/hosts_zookeeper all -m shell -a " unlink /usr/local/zookeeper"
ansible -b -i hosts/hosts_zookeeper all -m shell -a "ln -s /usr/local/apache-zookeeper-3.5.6-bin /usr/local/zookeeper"
3. Restore the ZooKeeper configuration
ansible -b -i hosts_zookeeper all -m shell -a "/bin/cp -a /usr/local/zookeeper-3.4.14_cth3_1.0.0/conf/* /usr/local/zookeeper/conf/"
4. Restart the ZooKeeper cluster
ansible -b -i hosts_zookeeper all -m shell -a " systemctl restart zookeeper"
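After the restart it helps to confirm that every server has rejoined the ensemble. One way is ZooKeeper's "srvr" four-letter command, which reports the server mode. The sketch below only parses that reply; the function name zk_mode is hypothetical, and the nc client and port 2181 in the usage line are assumptions about this environment.

```shell
# Read the reply to ZooKeeper's "srvr" command on stdin and print the
# server mode (leader, follower or standalone).
# Exit status is non-zero when the reply contains no "Mode:" line.
zk_mode() {
    awk '$1 == "Mode:" { print $2; found = 1 } END { exit !found }'
}
```

Usage on each ZooKeeper host: echo srvr | nc localhost 2181 | zk_mode. A healthy ensemble should show exactly one leader and the rest as followers.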
V. Upgrade and start all NodeManagers
1. The container-executor in 3.3.3 depends on libcrypto.so.1.1 (this shared library is part of openssl1.1 and can be obtained by building OpenSSL)
[root@paas-cl-eda-hadoopdn-4 ~]# ldd /usr/local/hadoop3/bin/container-executor
linux-vdso.so.1 => (0x00007fff065b0000)
libcrypto.so.1.1 => /lib64/libcrypto.so.1.1 (0x00007f9c3f1d5000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f9c3efb9000)
libc.so.6 => /lib64/libc.so.6 (0x00007f9c3ebec000)
libdl.so.2 => /lib64/libdl.so.2 (0x00007f9c3e9e8000)
/lib64/ld-linux-x86-64.so.2 (0x00007f9c3f883000)
2. Distribute the libcrypto.so.1.1 shared library
ansible -b -i hosts_all all -m copy -a "src=../openssl_lib_1.1.0/lib/libcrypto.so.1.1 dest=/lib64/"
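Once the library is copied, the ldd check from step 1 can be repeated and scanned for unresolved dependencies. This is a minimal sketch that reads ldd output and lists anything marked "not found"; the function name missing_libs is hypothetical.

```shell
# Read ldd output on stdin and print the names of libraries marked "not found".
# Exit status is 0 when nothing is missing, 1 otherwise.
missing_libs() {
    awk '/not found/ { print $1; miss = 1 } END { exit miss + 0 }'
}
```

Usage: ldd /usr/local/hadoop3/bin/container-executor | missing_libs. Empty output with exit status 0 suggests that libcrypto.so.1.1 now resolves on that node.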
VI. Restart the YARN services
ansible -b -i hosts_rm all -m shell -a " systemctl status hadoop-yarn-resourcemanager"
yarn rmadmin -getAllServiceState
ansible -b -i hosts_nm all -m shell -a "systemctl restart hadoop-yarn-nodemanager" -f 20
yarn node --list |sort
systemctl status hadoop-mapred-historyserver.service
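As a final check, the node listing can be reduced to a count of NodeManagers in the RUNNING state and compared with the size of the hosts_nm inventory. The sketch below assumes the second column of the listing is the node state; the function name count_running_nodes is hypothetical.

```shell
# Read the YARN node listing on stdin and print the number of
# nodes whose state column is RUNNING.
count_running_nodes() {
    awk '$2 == "RUNNING" { n++ } END { print n + 0 }'
}
```

Usage: yarn node -list | count_running_nodes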