A Sqoop job saves an import or export command as a named job that can be executed whenever needed. When Sqoop is not used on its own, jobs are usually managed by a scheduler. Jobs can be created, executed, deleted, and inspected.
This article walks through basic job usage in seven parts, in the order you would normally use them. The material is straightforward, so each part is kept brief.
1. Job syntax
$ sqoop job (generic-args) (job-args) [-- [subtool-name] (subtool-args)]
$ sqoop-job (generic-args) (job-args) [-- [subtool-name] (subtool-args)]
2. Creating a job
Create a job named testsqoopjob that imports data from an RDBMS table into HDFS. The following command creates a job that imports the person table from the MySQL test database into HDFS.
sqoop job --create testsqoopjob \
-- import --connect "jdbc:mysql://192.168.10.44:3306/test?useUnicode=true&characterEncoding=utf-8" --username root --password 888888 \
--target-dir /sqoop/testsqoopjob \
--table person --m 1
# Note: there must be a space between -- and import
3. Listing jobs
The --list argument lists the saved jobs. The following command shows the list of saved Sqoop jobs.
sqoop job --list
[root@server7 lib]# sqoop job --list
Warning: /opt/cloudera/parcels/CDH-6.2.1-1.cdh6.2.1.p0.1425774/bin/../lib/sqoop/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
22/11/09 10:58:02 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7-cdh6.2.1
22/11/09 10:58:03 INFO manager.SqlManager: Using default fetchSize of 1000
Available jobs:
testsqoopjob
4. Showing job details
The --show argument inspects a specific saved job and its details. The following command and sample output inspect the job named testsqoopjob.
sqoop job --show testsqoopjob
[root@server7 lib]# sqoop job --show testsqoopjob
Warning: /opt/cloudera/parcels/CDH-6.2.1-1.cdh6.2.1.p0.1425774/bin/../lib/sqoop/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
22/11/09 10:59:26 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7-cdh6.2.1
22/11/09 10:59:27 INFO manager.SqlManager: Using default fetchSize of 1000
Enter password:
Job: testsqoopjob
Tool: import
Options:
----------------------------
reset.onemapper = false
codegen.output.delimiters.enclose = 0
sqlconnection.metadata.transaction.isolation.level = 2
codegen.input.delimiters.escape = 0
parquet.configurator.implementation = HADOOP
codegen.auto.compile.dir = true
accumulo.batch.size = 10240000
codegen.input.delimiters.field = 0
accumulo.create.table = false
mainframe.ftp.buffersize = 32760
mainframe.input.dataset.type = p
enable.compression = false
skip.dist.cache = false
accumulo.max.latency = 5000
db.username = root
sqoop.throwOnError = false
db.clear.staging.table = false
codegen.input.delimiters.enclose = 0
hdfs.append.dir = false
import.direct.split.size = 0
hcatalog.drop.and.create.table = false
codegen.output.delimiters.record = 10
codegen.output.delimiters.field = 44
hdfs.target.dir = /sqoop/testsqoopjob
hbase.bulk.load.enabled = false
mapreduce.num.mappers = 1
export.new.update = UpdateOnly
db.require.password = true
hive.import = false
customtool.options.jsonmap = {}
hdfs.delete-target.dir = false
codegen.output.delimiters.enclose.required = false
direct.import = false
codegen.output.dir = .
hdfs.file.format = TextFile
hive.drop.delims = false
codegen.input.delimiters.record = 0
db.batch = false
codegen.delete.compile.dir = false
split.limit = null
hcatalog.create.table = false
hive.fail.table.exists = false
hive.overwrite.table = false
incremental.mode = None
temporary.dirRoot = _sqoop
verbose = false
hbase.null.incremental.mode = Ignore
mainframe.ftp.transfermode = ascii
import.max.inline.lob.size = 16777216
import.fetch.size = null
codegen.input.delimiters.enclose.required = false
relaxed.isolation = false
sqoop.oracle.escaping.disabled = true
db.table = person
hbase.create.table = false
codegen.compile.dir = /tmp/sqoop-root/compile/00b80a02ee498278d7f361500aba22ed
codegen.output.delimiters.escape = 0
db.connect.string = jdbc:mysql://192.168.10.44:3306/test?useUnicode=true&characterEncoding=utf-8
5. Executing a job
The --exec option executes a saved job. The following command executes the saved job named testsqoopjob.
sqoop job --exec testsqoopjob
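Options given after -- at exec time are passed to the job's underlying tool and override its saved values for that run only. A minimal sketch (the user name here is illustrative, not part of the saved job):
# Run the saved job as a different user and prompt interactively for the password
sqoop job --exec testsqoopjob -- --username someuser -P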
6. Executing a job without entering a password
If you create a job with the --password-file argument, you can avoid typing the MySQL password. If you use --password instead, Sqoop prints a warning and you must enter the password by hand every time the job is executed. Sqoop requires the password file to be stored on HDFS with its permissions set to 400 (a sketch of preparing the file follows the configuration below). Also check that sqoop-site.xml contains the following configuration:
<property>
    <name>sqoop.metastore.client.record.password</name>
    <value>true</value>
    <description>If true, allow saved passwords in the metastore.</description>
</property>
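A minimal sketch of preparing the password file used below (the path and password match the example; echo -n avoids a trailing newline, which Sqoop would otherwise treat as part of the password):
# Write the password locally without a trailing newline
echo -n "888888" > 10_44_mysqlpwd.txt
# Upload it to HDFS and restrict permissions to 400 as Sqoop expects
hadoop fs -mkdir -p /sqoop/password
hadoop fs -put 10_44_mysqlpwd.txt /sqoop/password/
hadoop fs -chmod 400 /sqoop/password/10_44_mysqlpwd.txt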
---------- Create the job
sqoop job --create testsqoopjob2 \
-- import --connect jdbc:mysql://192.168.10.44:3306/test --username root \
--password-file /sqoop/password/10_44_mysqlpwd.txt \
--target-dir /sqoop/testsqoopjob2 \
--table person --m 1
-------------- MySQL password file location: /sqoop/password/10_44_mysqlpwd.txt
[root@server7 sqoop]# hadoop fs -cat /sqoop/password/10_44_mysqlpwd.txt
888888
----------- Execute the job
sqoop job --exec testsqoopjob2
7. Deleting a job
sqoop job --delete testsqoopjob
sqoop job --delete testsqoopjob2
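After deleting, you can run the list command again to confirm the jobs are gone:
# The deleted jobs should no longer appear under "Available jobs:"
sqoop job --list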
That concludes this brief introduction to using Sqoop jobs.