Deploying Hadoop 3.2.1 on CentOS 7

2023-12-12 08:01:54

1. Server Information

The deployment consists of three virtual machines built with VMware Workstation Pro.

The VM details are as follows:

Role      IP               OS           Software
master    192.168.67.151   CentOS 7.6   Hadoop 3.2.1
worker1   192.168.67.152   CentOS 7.6   Hadoop 3.2.1
worker2   192.168.67.153   CentOS 7.6   Hadoop 3.2.1

 

2. Environment Preparation

2.1 Download Hadoop from the official site
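
The original gives no download command; a minimal sketch, assuming the Apache archive mirror and the /data/hadoop/src working directory used later in this guide:

# Download the 3.2.1 release tarball into the source directory
mkdir -p /data/hadoop/src
cd /data/hadoop/src
wget https://archive.apache.org/dist/hadoop/common/hadoop-3.2.1/hadoop-3.2.1.tar.gz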

 

2.2 Disable SELinux and the firewall, set hostnames and /etc/hosts, and configure passwordless SSH

 

Disable SELinux (all 3 nodes)

# Edit the SELinux config and set SELINUX=disabled
vim /etc/selinux/config
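
Editing the config file only takes effect after a reboot; to also drop enforcement immediately in the running session:

# Put SELinux into permissive mode for the current boot
setenforce 0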

 

Disable the firewall (all 3 nodes)

# Stop the firewall
systemctl stop firewalld
# Keep it from starting at boot
systemctl disable firewalld
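
A quick check (not in the original steps) that the firewall is stopped and will stay off:

systemctl is-active firewalld    # expect "inactive"
systemctl is-enabled firewalld   # expect "disabled"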

 

Set the hostname (all 3 nodes)

# on the other nodes use worker1 and worker2 respectively
hostnamectl set-hostname master

 

Add host resolution entries to /etc/hosts (all 3 nodes)

192.168.67.151 master
192.168.67.152 worker1
192.168.67.153 worker2
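
One way to append these entries on each node (a sketch; editing /etc/hosts by hand works just as well):

cat >> /etc/hosts << 'EOF'
192.168.67.151 master
192.168.67.152 worker1
192.168.67.153 worker2
EOF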

 

Set up passwordless SSH

ssh-keygen                  # press Enter through the prompts to generate a key pair
ssh-copy-id root@worker1    # copy the public key to worker1 and worker2
ssh-copy-id root@worker2
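
A quick smoke test (not in the original) that key-based login works:

ssh root@worker1 hostname   # should print worker1 without a password prompt
ssh root@worker2 hostname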

 

3. Hadoop Installation

3.1 Install Java (all 3 nodes)

#!/bin/bash

echo "Installing JDK"
filePath=./jdk-8u191-linux-x64.tar.gz
installDir=/usr/local/java
mkdir -vp $installDir
# Optional: clear out a previous install first
#rm -rf /usr/local/java/jdk*
tar -zxvf $filePath -C $installDir

# Append JAVA_HOME and JRE_HOME to the system profile
cat >> /etc/profile << EOF

export JAVA_HOME=/usr/local/java/jdk1.8.0_191
export JRE_HOME=/usr/local/java/jdk1.8.0_191/jre

EOF

# Single quotes keep the variables literal so they expand when the profile is sourced
echo -e '\nexport CLASSPATH=$JRE_HOME/lib:$JAVA_HOME/lib:$JRE_HOME/lib/charsets.jar' >> /etc/profile
echo -e '\nexport PATH=$JAVA_HOME/bin:$JRE_HOME/bin:$PATH' >> /etc/profile

# Only affects the script's own shell; see the note below
source /etc/profile
chmod -R 755 $installDir

echo "JDK installation complete"

After the script finishes, run source /etc/profile in your login shell as well; the source inside the script only takes effect in the script's own subshell.
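
Then confirm the JDK is on the PATH:

source /etc/profile
java -version   # should report java version "1.8.0_191"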

 

3.2 Extract Hadoop and edit the configuration files
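
The original does not show the extraction itself; a sketch, assuming the tarball downloaded in step 2.1 sits in /data/hadoop/src:

cd /data/hadoop/src
tar -zxvf hadoop-3.2.1.tar.gz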

cd /data/hadoop/src/hadoop-3.2.1/etc/hadoop
vi hadoop-env.sh   # add JAVA_HOME
export JAVA_HOME=/usr/local/java/jdk1.8.0_191

vi yarn-env.sh     # add JAVA_HOME
export JAVA_HOME=/usr/local/java/jdk1.8.0_191

vi workers         # list the worker nodes
worker1
worker2

 

Edit core-site.xml

<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://master:9000</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/data/hadoop/tmp</value>
    </property>
</configuration>

 

Edit hdfs-site.xml

<configuration>
    <property>
        <name>dfs.namenode.secondary.http-address</name>
        <value>master:9001</value>
    </property>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:/data/hadoop/namenode</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:/data/hadoop/data</value>
    </property>
    <property>
        <name>dfs.replication</name>
        <value>2</value>
    </property>
</configuration>
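
The directories referenced in core-site.xml and hdfs-site.xml can be created up front on each node (the format and startup steps create most of them anyway, so this is just a precaution):

mkdir -p /data/hadoop/tmp /data/hadoop/namenode /data/hadoop/data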

 

Edit yarn-site.xml

<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>
    <property>
        <name>yarn.resourcemanager.address</name>
        <value>master:8032</value>
    </property>
    <property>
        <name>yarn.resourcemanager.scheduler.address</name>
        <value>master:8030</value>
    </property>
    <property>
        <name>yarn.resourcemanager.resource-tracker.address</name>
        <value>master:8035</value>
    </property>
    <property>
        <name>yarn.resourcemanager.admin.address</name>
        <value>master:8033</value>
    </property>
    <property>
        <name>yarn.resourcemanager.webapp.address</name>
        <value>master:8088</value>
    </property>
</configuration>

 

Edit the start/stop scripts and add the following; without these additions, Hadoop 3.2 reports errors when started as root.

cd /data/hadoop/src/hadoop-3.2.1/sbin
vi start-dfs.sh
# add the following
HDFS_DATANODE_USER=root
HDFS_DATANODE_SECURE_USER=hdfs
HDFS_NAMENODE_USER=root
HDFS_SECONDARYNAMENODE_USER=root

vi stop-dfs.sh
# add the following
HDFS_DATANODE_USER=root
HDFS_DATANODE_SECURE_USER=hdfs
HDFS_NAMENODE_USER=root
HDFS_SECONDARYNAMENODE_USER=root

vi start-yarn.sh
# add the following
YARN_RESOURCEMANAGER_USER=root
HADOOP_SECURE_DN_USER=yarn
YARN_NODEMANAGER_USER=root

vi stop-yarn.sh
# add the following
YARN_RESOURCEMANAGER_USER=root
HADOOP_SECURE_DN_USER=yarn
YARN_NODEMANAGER_USER=root

 

Add Hadoop environment variables

vi /etc/profile
export HADOOP_HOME=/data/hadoop/src/hadoop-3.2.1
export PATH=$PATH:$HADOOP_HOME/bin

Reload the environment:

source /etc/profile

 

Copy the Hadoop directory to the worker nodes

scp -r  /data/hadoop/   root@192.168.67.152:/data
scp -r  /data/hadoop/   root@192.168.67.153:/data
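
A quick sanity check (not in the original) that the copy landed:

ssh root@worker1 "ls /data/hadoop/src"
ssh root@worker2 "ls /data/hadoop/src"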

 

Initialize the cluster (format the NameNode, on master only)

hdfs namenode -format    # `hadoop namenode -format` also works but is deprecated in 3.x

 

Start the Hadoop cluster

/data/hadoop/src/hadoop-3.2.1/sbin/start-all.sh 
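
To verify the daemons came up (a typical check, not in the original steps): jps on master should show NameNode, SecondaryNameNode, and ResourceManager; on the workers, DataNode and NodeManager. The YARN web UI listens at http://master:8088 per yarn-site.xml above.

jps                     # on master
ssh root@worker1 jps    # on each worker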

 

Test the cluster

Upload a file to the cluster:

# upload
hadoop fs -put test02.txt /

# list files
hadoop fs -ls /

 

Run a Hadoop job

cd /data/hadoop/src/hadoop-3.2.1/share/hadoop/mapreduce/
hadoop jar ./hadoop-mapreduce-examples-3.2.1.jar pi 5 10
Number of Maps  = 5
Samples per Map = 10
2023-12-12 15:15:42,094 INFO sasl.SaslDataTransferClient: SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
Wrote input for Map #0
2023-12-12 15:15:42,238 INFO sasl.SaslDataTransferClient: SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
Wrote input for Map #1
2023-12-12 15:15:42,262 INFO sasl.SaslDataTransferClient: SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
Wrote input for Map #2
2023-12-12 15:15:42,690 INFO sasl.SaslDataTransferClient: SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
Wrote input for Map #3
2023-12-12 15:15:42,775 INFO sasl.SaslDataTransferClient: SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
Wrote input for Map #4
Starting Job
2023-12-12 15:15:42,845 INFO impl.MetricsConfig: Loaded properties from hadoop-metrics2.properties
2023-12-12 15:15:42,894 INFO impl.MetricsSystemImpl: Scheduled Metric snapshot period at 10 second(s).
2023-12-12 15:15:46,226 INFO mapred.LocalJobRunner: Finishing task: attempt_local427066532_0001_r_000000_0
2023-12-12 15:15:46,226 INFO mapred.LocalJobRunner: reduce task executor complete.
2023-12-12 15:15:46,425 INFO mapreduce.Job:  map 100% reduce 100%
2023-12-12 15:15:46,425 INFO mapreduce.Job: Job job_local427066532_0001 completed successfully
2023-12-12 15:15:46,434 INFO mapreduce.Job: Counters: 36
	File System Counters
		FILE: Number of bytes read=1913980
		FILE: Number of bytes written=5033552
		FILE: Number of read operations=0
		FILE: Number of large read operations=0
		FILE: Number of write operations=0
		HDFS: Number of bytes read=2360
		HDFS: Number of bytes written=3755
		HDFS: Number of read operations=89
		HDFS: Number of large read operations=0
		HDFS: Number of write operations=45
		HDFS: Number of bytes read erasure-coded=0
	Map-Reduce Framework
		Map input records=5
		Map output records=10
		Map output bytes=90
		Map output materialized bytes=140
		Input split bytes=715
		Combine input records=0
		Combine output records=0
		Reduce input groups=2
		Reduce shuffle bytes=140
		Reduce input records=10
		Reduce output records=0
		Spilled Records=20
		Shuffled Maps =5
		Failed Shuffles=0
		Merged Map outputs=5
		GC time elapsed (ms)=39
		Total committed heap usage (bytes)=3155165184
	Shuffle Errors
		BAD_ID=0
		CONNECTION=0
		IO_ERROR=0
		WRONG_LENGTH=0
		WRONG_MAP=0
		WRONG_REDUCE=0
	File Input Format Counters 
		Bytes Read=590
	File Output Format Counters 
		Bytes Written=97
Job Finished in 3.642 seconds
Estimated value of Pi is 3.28000000000000000000