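This post documents a working setup that uses Oozie for job scheduling and calls a Python script from within an action.
The workflow.xml configuration is as follows: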
<workflow-app xmlns="uri:oozie:workflow:0.4" name="full-demo">
<start to="python-node"/>
<!-- Python action -->
<action name="python-node">
<shell xmlns="uri:oozie:shell-action:0.2">
<job-tracker></job-tracker>
<name-node></name-node>
<configuration>
<property>
<name></name>
<value></value>
</property>
</configuration>
<exec>test.py</exec>
<file>test.py</file>
</shell>
<ok to="send-email"/>
<error to="fail"/>
</action>
<action name="send-email">
<email xmlns="uri:oozie:email-action:0.1">
<to></to>
<cc></cc>
<subject>Email notifications for </subject>
<body>The wf successfully completed.</body>
</email>
<ok to="end"/>
<error to="fail"/>
</action>
<kill name="fail">
<message>Python action failed, error message[]</message>
</kill>
<end name="end"/>
</workflow-app>
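The post does not show test.py itself, so the following is only a minimal sketch of what such a script could look like. It assumes the shell action runs the file directly (as `<exec>test.py</exec>` suggests), in which case a shebang line is most likely needed. The printed fields are purely illustrative.

```python
#!/usr/bin/env python
# Hypothetical test.py: the real script is not shown in this post.
from __future__ import print_function  # keeps the sketch valid on Python 2 and 3

import os
import sys


def main(argv):
    # Output on stdout/stderr should show up in the logs of the action's launcher job.
    print("python:", sys.version.split()[0])
    print("cwd:", os.getcwd())
    print("args:", argv[1:])
    # A non-zero exit code makes the shell action fail, so the workflow
    # follows the <error to="fail"/> transition instead of <ok .../>.
    return 0


if __name__ == "__main__":
    sys.exit(main(sys.argv))
```

Returning the exit code from `main` keeps the success and failure paths explicit, which maps directly onto the `ok` and `error` transitions of the action.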
The job.properties file is configured as follows:
nameNode=hdfs://test:8020
jobTracker=test:8050
queueName=default
examplesRoot=oozie
oozie.use.system.libpath=true
oozie.wf.application.path=/user/hdfs/
oozie.wf.rerun.failnodes=false
start=2016-09-01T01:34Z
end=2016-09-01T08:45Z
timezone=UTC
frequency=*/30 * * * *
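The `start`, `end` and `frequency` entries are normally consumed by a coordinator definition, which is not shown in this post. As a rough sanity check of what this window implies, the sketch below parses the three values and lists the instants matching `*/30` (minute 0 and minute 30 of every hour). It assumes an inclusive start and an exclusive end. The helper names `parse_properties` and `half_hour_ticks` are made up for this example.

```python
from datetime import datetime, timedelta, timezone


def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def half_hour_ticks(start, end):
    """Yield every :00 and :30 instant t with start <= t < end (assumed semantics)."""
    tick = start.replace(minute=0, second=0, microsecond=0)
    while tick < start:
        tick += timedelta(minutes=30)
    while tick < end:
        yield tick
        tick += timedelta(minutes=30)


# Values copied from the job.properties above.
PROPS = """\
start=2016-09-01T01:34Z
end=2016-09-01T08:45Z
frequency=*/30 * * * *
"""

FMT = "%Y-%m-%dT%H:%MZ"
props = parse_properties(PROPS)
start = datetime.strptime(props["start"], FMT).replace(tzinfo=timezone.utc)
end = datetime.strptime(props["end"], FMT).replace(tzinfo=timezone.utc)
ticks = list(half_hour_ticks(start, end))

# Prints: 14 02:00 08:30
print(len(ticks), ticks[0].strftime("%H:%M"), ticks[-1].strftime("%H:%M"))
```

Under these assumptions the first run would fall at 02:00 rather than at the configured start of 01:34, because the cron expression only matches minute 0 and minute 30.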
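One thing to note: test.py can use import or from ... import to load modules that ship with the Python installation, but importing my own .py files from test.py does not work. I have not found a solution to this problem yet.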
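One possible explanation for the import problem, not verified in this setup: files shipped with `<file>` are usually exposed in the container's working directory as symlinks, and Python derives the script directory on `sys.path` from the resolved location of test.py, which may not contain the other files. Under that assumption, a workaround to try is to list the helper module in its own `<file>` element and to add the working directory to `sys.path` at the top of test.py. The function name `add_to_sys_path` and the module name `my_helper` are hypothetical.

```python
import os
import sys


def add_to_sys_path(directory=None):
    """Put `directory` (default: the current working directory) first on sys.path."""
    directory = os.path.abspath(directory or os.getcwd())
    if directory not in sys.path:
        sys.path.insert(0, directory)
    return directory


# At the top of test.py, before importing any self-written module:
add_to_sys_path()
# import my_helper  # hypothetical module, shipped via an additional <file> element
```

If this does not help, printing `os.getcwd()`, `os.listdir(".")` and `sys.path` from test.py should at least show whether the helper file reached the container at all.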