- Reference: https://hadoop.apache.org/docs/r2.10.2/hadoop-project-dist/hadoop-hdfs/HdfsRollingUpgrade.html
- Software preparation
- #Prepare the new release
wget https://archive.apache.org/dist/hadoop/common/hadoop-2.10.2/hadoop-2.10.2.tar.gz
tar -zxvf hadoop-2.10.2.tar.gz -C ../program/
cp ${HADOOP_HOME}/etc/hadoop/*-site.xml ${DIR}/hadoop-2.10.2/etc/hadoop/
cp ${HADOOP_HOME}/etc/hadoop/slaves ${DIR}/hadoop-2.10.2/etc/hadoop/
#Copy the release to the other machines
scp -r ${ip}:${DIR}/hadoop-2.10.2 ${DIR}/hadoop-2.10.2
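- A sketch for distributing the unpacked release to every node in one go (assumes passwordless ssh and a hypothetical hosts file listing the other machines; a plain scp per host works just as well):
for ip in $(cat hosts); do
  scp -r ${DIR}/hadoop-2.10.2 ${ip}:${DIR}/
done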
- Prepare the rolling upgrade
- Run "hdfs dfsadmin -rollingUpgrade prepare" to create an fsimage for rollback.
Run "hdfs dfsadmin -rollingUpgrade query" to check the status of the rollback image. Wait and re-run the command until the "Proceed with rolling upgrade" message appears (see the polling sketch below).
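- A minimal polling sketch for the query step (assumes the English "Proceed with rolling upgrade" message of this release):
until hdfs dfsadmin -rollingUpgrade query | grep -q "Proceed with rolling upgrade"; do
  sleep 10
done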
- Upgrade the active and standby NameNodes
- Shut down and upgrade NN2. (If a DataNode runs on the same machine as the NameNode, stop it as well, because the environment variables are about to change.)
${HADOOP_HOME}/sbin/hadoop-daemon.sh stop namenode
- Start NN2 as standby with the "-rollingUpgrade started" option.
$HADOOP_HOME/sbin/hadoop-daemon.sh --config $HADOOP_CONF_DIR --script hdfs start namenode -rollingUpgrade started
- Fail over from NN1 to NN2 so that NN2 becomes active and NN1 becomes standby.
hdfs haadmin -failover nn1 nn2
- Shut down and upgrade NN1.
${HADOOP_HOME}/sbin/hadoop-daemon.sh stop namenode
- Start NN1 as standby with the "-rollingUpgrade started" option.
$HADOOP_HOME/sbin/hadoop-daemon.sh --config $HADOOP_CONF_DIR --script hdfs start namenode -rollingUpgrade started
- Notes
- Switch HADOOP_HOME before starting the new version
- vim ~/.bash_profile
export HADOOP_HOME=${DIR}/hadoop-2.10.1
change it to
export HADOOP_HOME=${DIR}/hadoop-2.10.2
source ~/.bash_profile
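- A one-shot sketch for flipping HADOOP_HOME on every node (assumes passwordless ssh, identical ~/.bash_profile contents on all hosts, and the same hypothetical hosts file as above):
for ip in $(cat hosts); do
  ssh ${ip} "sed -i 's#hadoop-2.10.1#hadoop-2.10.2#' ~/.bash_profile"
done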
- If you hit "no namenode to stop"
- Change the HADOOP_PID_DIR setting
- mkdir -p ~/hadoop-data/pids
vim ${HADOOP_HOME}/etc/hadoop/hadoop-env.sh and change the value of HADOOP_PID_DIR
export HADOOP_PID_DIR=~/hadoop-data/pids
vim ${HADOOP_HOME}/etc/hadoop/hdfs-site.xml
Rename dfs.ha.automatic-failover.enabled to dfs.ha.automatic-failover.enabled.${nameservice id}
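- Sketch: confirm the per-nameservice key is picked up (assumes the nameservice id is "mycluster"; substitute your own):
hdfs getconf -confKey dfs.ha.automatic-failover.enabled.mycluster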
- Use jps to list the existing Hadoop processes and kill them
jps | grep -E ' NameNode|NodeManager|DataNode|JobHistoryServer|Jps|JournalNode' | awk '{print $1}' | xargs kill
jps | grep -E ' NodeManager|JournalNode' | awk '{print $1}' | xargs kill
- Restart the Hadoop processes that were running before
#Start the HDFS NameNode with the following command on the designated node as hdfs:
$HADOOP_HOME/sbin/hadoop-daemon.sh --config $HADOOP_CONF_DIR --script hdfs start namenode -rollingUpgrade started
#Start a HDFS DataNode with the following command on each designated node as hdfs:
$HADOOP_HOME/sbin/hadoop-daemons.sh --config $HADOOP_CONF_DIR --script hdfs start datanode
#If etc/hadoop/slaves and ssh trusted access is configured (see Single Node Setup), all of the HDFS processes can be started with a utility script. As hdfs:
# $HADOOP_HOME/sbin/start-dfs.sh
#Start the YARN with the following command, run on the designated ResourceManager as yarn:
$HADOOP_HOME/sbin/yarn-daemon.sh --config $HADOOP_CONF_DIR start resourcemanager
#Run a script to start a NodeManager on each designated host as yarn:
$HADOOP_HOME/sbin/yarn-daemons.sh --config $HADOOP_CONF_DIR start nodemanager
#Start a standalone WebAppProxy server. Run on the WebAppProxy server as yarn. If multiple servers are used with load balancing it should be run on each of them:
$HADOOP_HOME/sbin/yarn-daemon.sh --config $HADOOP_CONF_DIR start proxyserver
#If etc/hadoop/slaves and ssh trusted access is configured (see Single Node Setup), all of the YARN processes can be started with a utility script. As yarn:
# $HADOOP_HOME/sbin/start-yarn.sh
#Start the MapReduce JobHistory Server with the following command, run on the designated server as mapred:
$HADOOP_HOME/sbin/mr-jobhistory-daemon.sh --config $HADOOP_CONF_DIR start historyserver
#Check the cluster state
hdfs haadmin -getAllServiceState
- If both NameNodes come up as standby, force one to become active
- hdfs haadmin -transitionToActive --forcemanual nn1
#${HADOOP_HOME}/bin/hdfs zkfc -formatZK
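- Sketch: verify the forced transition took effect (nn1/nn2 are the HA service ids used above):
hdfs haadmin -getServiceState nn1
hdfs haadmin -getServiceState nn2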
- Upgrade the DataNodes
- Pick a small subset of datanodes (for example, all the datanodes under one rack).
- Run "hdfs dfsadmin -shutdownDatanode <DATANODE_HOST:IPC_PORT> upgrade" to shut down one of the chosen datanodes.
Run "hdfs dfsadmin -getDatanodeInfo <DATANODE_HOST:IPC_PORT>" to check and wait for the datanode to shut down.
Upgrade and restart the datanode.
$HADOOP_HOME/sbin/hadoop-daemon.sh --config $HADOOP_CONF_DIR --script hdfs start datanode
Perform the steps above in parallel for all the datanodes in the chosen subset (see the per-node sketch below).
- Repeat until every datanode in the cluster has been upgraded.
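- A per-node sketch of the loop above, run on each chosen datanode host after its HADOOP_HOME points at 2.10.2 (assumes the 2.x default DataNode IPC port 50020; check dfs.datanode.ipc.address if it was changed):
dn_host=$(hostname -f)
hdfs dfsadmin -shutdownDatanode ${dn_host}:50020 upgrade
#wait until getDatanodeInfo fails, i.e. the datanode is really down
while hdfs dfsadmin -getDatanodeInfo ${dn_host}:50020 >/dev/null 2>&1; do
  sleep 5
done
#restart the datanode from the new release
$HADOOP_HOME/sbin/hadoop-daemon.sh --config $HADOOP_CONF_DIR --script hdfs start datanode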
- Finalize the rolling upgrade
- Run "hdfs dfsadmin -rollingUpgrade finalize" to finalize the rolling upgrade.
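- Sketch: finalize, then confirm the local client reports the new release (hadoop version only shows this node's client, not the daemons):
hdfs dfsadmin -rollingUpgrade finalize
hadoop version | head -1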
- Start the Hive metastore: $HIVE_HOME/bin/hive --service metastore &
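- A backgrounded variant with a quick check that the metastore's default thrift port (9083) is listening (a sketch; the log path is arbitrary):
nohup $HIVE_HOME/bin/hive --service metastore > ~/hive-metastore.log 2>&1 &
sleep 30
netstat -tln | grep 9083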