Preparation

Create the user oozie on all nodes.

The Oozie server will be installed on namenode1.

The Oozie clients will be installed on r01edge and r02edge.

The clients reach the server through the OOZIE_URL variable:

export OOZIE_URL=http://namenode1:11000/oozie


Update core-site.xml on namenode1 by adding the following stanzas, then restart the cluster:

<property>
<name>hadoop.proxyuser.oozie.groups</name>
<value>*</value>
</property>

<property>
<name>hadoop.proxyuser.oozie.hosts</name>
<value>*</value>
</property>
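
Before restarting, the values can be read back from the file to confirm that both stanzas are present. The helper below is a minimal sketch and not part of Hadoop: get_property is a hypothetical function name, the CORE_SITE path is an assumption about where the configuration lives, and the parsing only handles files that keep one tag per line, as in the stanzas above.

```shell
#!/bin/sh
# Print the <value> that follows a given <name> in a Hadoop-style XML file.
# Assumes one tag per line, as in the stanzas above.
get_property() {
    file=$1
    name=$2
    awk -v want="<name>${name}</name>" '
        index($0, want) { found = 1; next }
        found && /<value>/ {
            sub(/.*<value>/, ""); sub(/<\/value>.*/, "")
            print; exit
        }
    ' "$file"
}

# Assumed location of the config file; adjust to the cluster layout.
CORE_SITE=${CORE_SITE:-/etc/hadoop/conf/core-site.xml}

if [ -r "$CORE_SITE" ]; then
    for key in hadoop.proxyuser.oozie.groups hadoop.proxyuser.oozie.hosts; do
        echo "$key = $(get_property "$CORE_SITE" "$key")"
    done
fi
```

On a node with the Hadoop client installed, hdfs getconf -confKey hadoop.proxyuser.oozie.hosts should report the same value.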


Compile Oozie from source

tar xzvf oozie-5.2.0.tar.gz

cd oozie-5.2.0

bin/mkdistro.sh -Phadoop-3 -Ptez -Puber -DskipTests

cp -r /home/oozie/oozie-5.2.0/distro/target/oozie-5.2.0-distro/oozie-5.2.0 /home/oozie/oozie


Create shared libraries

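cd /home/oozie/oozie

mkdir libext

Copy the PostgreSQL JDBC driver and the ExtJS archive for the web console into libext:

pscp postgresql-42.2.14.jar oozie@namenode1:/home/oozie/oozie/libext

pscp ext-2.2.zip oozie@namenode1:/home/oozie/oozie/libext

Create the shared library on HDFS:

oozie-setup.sh sharelib create -fs hdfs://namenode1:8020/user/oozie/share/lib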

log4j:WARN No appenders could be found for logger (org.apache.hadoop.util.Shell).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Found Hadoop that does not support Erasure Coding. Not taking any action.
the destination path for sharelib is: /user/oozie/share/lib/lib_20200621141415

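The server picks up the contents of libext when it is set up and started: the ExtJS archive provides the web console, and the jar is the PostgreSQL JDBC driver.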


Configuration

As root:

cp -r /home/oozie/oozie /app/oozie

chown -R oozie:oozie /app/oozie

As oozie:

cd /app/oozie/conf


vi oozie-site.xml

<configuration>

<property>
<name>oozie.service.JPAService.jdbc.driver</name>
<value>org.postgresql.Driver</value>
<description>
JDBC driver class.
</description>
</property>

<property>
<name>oozie.service.JPAService.jdbc.url</name>
<value>jdbc:postgresql://r02edge:5432/oozie</value>
<description>
JDBC URL.
</description>
</property>

<property>
<name>oozie.service.JPAService.jdbc.username</name>
<value>oozie</value>
<description>
DB user name.
</description>
</property>

<property>
<name>oozie.service.JPAService.jdbc.password</name>
<value>mYpAAssWrd</value>
<description>
DB user password.
</description>
</property>

</configuration>

Database selection

The Oozie database lives in PostgreSQL on r02edge, as set in the JDBC URL. On r02edge:

su - postgres

psql

Create a new user (the password must match oozie.service.JPAService.jdbc.password in oozie-site.xml):
postgres=# create role oozie with login password 'sOmePasWD' valid until 'infinity';

Create a database for the user:
postgres=# create database oozie with encoding='UTF8' owner=oozie connection limit=-1;
CREATE DATABASE

Grant all privileges:
postgres=# grant all privileges on database oozie to oozie;
GRANT
postgres=# \q
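
The interactive session above can also be scripted. The following is a sketch under assumptions: oozie_db_sql is a hypothetical helper name, it reuses the role name as the database name as these notes do, and it does not escape single quotes, so it is only suitable for passwords without them.

```shell
#!/bin/sh
# Emit the three SQL statements used above for a given role and password.
# The database is named after the role, as in these notes.
oozie_db_sql() {
    role=$1
    password=$2
    cat <<EOF
create role ${role} with login password '${password}' valid until 'infinity';
create database ${role} with encoding='UTF8' owner=${role} connection limit=-1;
grant all privileges on database ${role} to ${role};
EOF
}
```

As the postgres user, the output can be piped into psql, for example: oozie_db_sql oozie 'sOmePasWD' | psql -v ON_ERROR_STOP=1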



su - oozie

cd /app/oozie

Populate the database schema:

bin/ooziedb.sh create -sqlfile oozie.sql -run

........................

DONE
Creating composite indexes
DONE
Create OOZIE_SYS table
DONE

Oozie DB has been created for Oozie version '5.2.0'


The SQL commands have been written to: oozie.sql


Starting the server


bin/oozied.sh start

............................

inflating: /app/oozie/embedded-oozie-server/webapp/ext-2.2/air/src/Sound.js
creating: /app/oozie/embedded-oozie-server/webapp/ext-2.2/air/src/sql/
inflating: /app/oozie/embedded-oozie-server/webapp/ext-2.2/air/src/sql/AirConnection.js
inflating: /app/oozie/embedded-oozie-server/webapp/ext-2.2/air/src/sql/Connection.js
inflating: /app/oozie/embedded-oozie-server/webapp/ext-2.2/air/src/sql/Proxy.js
inflating: /app/oozie/oozie/embedded-oozie-server/webapp/ext-2.2/air/src/sql/Table.js
inflating: /app/oozie/oozie/embedded-oozie-server/webapp/ext-2.2/air/src/SystemMenu.js
inflating: /app/oozie/embedded-oozie-server/webapp/ext-2.2/air/src/SystemTray.js
inflating: /app/oozie/embedded-oozie-server/webapp/ext-2.2/adapter/yui/yui-utilities.js

INFO: Oozie is ready to be started

Setting up oozie DB

Validate DB Connection
DONE
DB schema exists

The SQL commands have been written to: /tmp/ooziedb-5847608831844515011.sql

Oozie server started - PID: 5077.

Check the server status:

bin/oozie admin -oozie http://namenode1:11000/oozie -status
log4j:WARN No appenders could be found for logger (org.apache.hadoop.security.authentication.client.KerberosAuthenticator).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
System mode: NORMAL


Check the web console:

http://namenode1:11000/oozie

Check the shared libraries available to Oozie:

oozie admin -shareliblist
log4j:WARN No appenders could be found for logger (org.apache.hadoop.security.authentication.client.KerberosAuthenticator).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
[Available ShareLib]
hive
distcp
git
mapreduce-streaming
spark
oozie
hcatalog
hive2
sqoop
pig

Updating shared libraries

  1. Unpack oozie-sharelib-5.2.0.tar.gz in /app/oozie.
  2. Go to /app/oozie/share/lib/spark and replace the libraries with the versions for Spark 3 and Scala 2.12.
  3. Go to /app/oozie/share/lib/hive2 and replace the Hive libraries with the versions for Hive 3.
  4. Upload the updated shared libraries to HDFS:

oozie-setup.sh sharelib create -fs hdfs://namenode1:8020/user/oozie/share/lib -locallib /app/oozie/share

Check that the Scala 2.12 versions are now installed in the Oozie shared library folder:

hdfs dfs -ls hdfs://namenode1:8020/user/oozie/share/lib/lib_20200715181359/spark

Make the running server switch to the new shared library:

oozie admin -sharelibupdate

[ShareLib update status]
sharelibDirOld = hdfs://namenode1:8020/user/oozie/share/lib/lib_20200715181359
host = http://0.0.0.0:11000/oozie
sharelibDirNew = hdfs://namenode1:8020/user/oozie/share/lib/lib_20200716114006
status = Successful
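
Each sharelib create run adds a new lib_<timestamp> directory, so it helps to know which one is the newest. The filter below is a minimal sketch: latest_sharelib is a hypothetical helper name, and it relies on the timestamps being fixed-width so that a plain text sort puts the newest directory last.

```shell
#!/bin/sh
# Read one path per line on stdin and print the newest lib_<timestamp> entry.
# Fixed-width timestamps sort correctly as plain text.
latest_sharelib() {
    grep -E '/lib_[0-9]+$' | sort | tail -n 1
}
```

Assuming the Hadoop version in use supports the -C (paths only) option of ls, it can be fed with: hdfs dfs -ls -C /user/oozie/share/lib | latest_sharelib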


Restart the server, then list the contents of the updated libraries:

oozied.sh stop

oozied.sh start

oozie admin -shareliblist spark

oozie admin -shareliblist hive2


Copy the client to r02edge into the directory /app/oozie.

Set the PATH:

export PATH=/app/oozie/bin:${PATH}

Set the URL:

export OOZIE_URL=http://namenode1:11000/oozie

Verify the connection to the Oozie server:

oozie admin -status

System mode: NORMAL
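
After a server restart the status check may need a few attempts before it reports NORMAL. The loop below is a minimal sketch of such a wait: oozie_mode and wait_for_normal are hypothetical helper names, OOZIE_CMD and POLL_SECONDS are illustrative variables, and the retry count and pause are arbitrary defaults.

```shell
#!/bin/sh
# Command used to talk to the server; overridable for testing.
OOZIE_CMD=${OOZIE_CMD:-oozie}

# Print the system mode (for example NORMAL); prints nothing if the
# server cannot be reached. Relies on OOZIE_URL being exported.
oozie_mode() {
    "$OOZIE_CMD" admin -status 2>/dev/null | sed -n 's/^System mode: *//p'
}

# Retry until the server reports NORMAL; returns 1 after the given
# number of attempts (default 12).
wait_for_normal() {
    tries=${1:-12}
    while [ "$tries" -gt 0 ]; do
        [ "$(oozie_mode)" = "NORMAL" ] && return 0
        tries=$((tries - 1))
        [ "$tries" -gt 0 ] && sleep "${POLL_SECONDS:-5}"
    done
    return 1
}
```

For example, wait_for_normal 20 && oozie admin -shareliblist only lists the libraries once the server answers.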

File system action

Test Oozie with a filesystem action:

mkdir -p projects/filetest

cd projects/filetest

[oozie@r02edge filetest]$ vi job.properties

nameNode=hdfs://namenode1:8020
jobTracker=namenode2.custom-built-apps.com:8032
bundle_root=${nameNode}/user/oozie/filetest
wf_root=${bundle_root}/workflows
oozie.wf.application.path=${wf_root}/filetest.xml
oozie.use.system.libpath=true


[oozie@r02edge filetest]$ vi run.sh

#!/bin/sh

oozie job -config job.properties -run

[oozie@r02edge filetest]$ chmod 755 run.sh

[oozie@r02edge filetest]$ mkdir workflows

[oozie@r02edge filetest]$ cd workflows

[oozie@r02edge workflows]$ vi filetest.xml

<workflow-app name="wfl_makefile" xmlns="uri:oozie:workflow:0.5">
  <start to="fsaction"/>
  <kill name="Kill">
    <message>Action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
  </kill>
  <action name="fsaction">
    <fs>
      <touchz path='${nameNode}/user/oozie/myfile' />
    </fs>
    <ok to="End"/>
    <error to="Kill" />
  </action>
  <end name="End"/>
</workflow-app>

[oozie@r02edge workflows]$ oozie validate filetest.xml

Valid workflow-app

cd $HOME/projects

hdfs dfs -put filetest

hdfs dfs -ls -R filetest

cd filetest

./run.sh

job: 0000000-200704150138282-oozie-oozi-W

[oozie@r02edge filetest]$ oozie job -info 0000000-200704150138282-oozie-oozi-W
log4j:WARN No appenders could be found for logger (org.apache.hadoop.security.authentication.client.KerberosAuthenticator).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Job ID : 0000000-200704150138282-oozie-oozi-W
------------------------------------------------------------------------------------------------------------------------------------
Workflow Name : wfl_makefile
App Path : hdfs://namenode1:8020/user/oozie/filetest/workflows/filetest.xml
Status : SUCCEEDED
Run : 0
User : oozie
Group : -
Created : 2020-07-04 20:07 GMT
Started : 2020-07-04 20:07 GMT
Last Modified : 2020-07-04 20:07 GMT
Ended : 2020-07-04 20:07 GMT
CoordAction ID: -

Actions
------------------------------------------------------------------------------------------------------------------------------------
ID Status Ext ID Ext Status Err Code
------------------------------------------------------------------------------------------------------------------------------------
0000000-200704150138282-oozie-oozi-W@:start: OK - OK -
------------------------------------------------------------------------------------------------------------------------------------
0000000-200704150138282-oozie-oozi-W@fsaction OK - OK -
------------------------------------------------------------------------------------------------------------------------------------
0000000-200704150138282-oozie-oozi-W@End OK - OK -
------------------------------------------------------------------------------------------------------------------------------------
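
Submitting a job and then checking it by hand can be combined in a small script. This is a minimal sketch built on the two commands shown above: extract_job_id, job_status and wait_for_job are hypothetical helper names, OOZIE_CMD and POLL_SECONDS are illustrative variables, and the parsing assumes the "job: <id>" and "Status : <value>" lines look as they do in the output above.

```shell
#!/bin/sh
# Command used to talk to the server; overridable for testing.
OOZIE_CMD=${OOZIE_CMD:-oozie}

# Extract the workflow ID from the "job: <id>" line printed on submission.
extract_job_id() {
    sed -n 's/^job: *//p'
}

# Print the Status field reported by "oozie job -info <id>".
job_status() {
    "$OOZIE_CMD" job -info "$1" 2>/dev/null | sed -n 's/^Status *: *//p' | head -n 1
}

# Poll while the workflow is in PREP or RUNNING, then print the status
# it ended up in. An empty status (server unreachable) also ends the loop.
wait_for_job() {
    while :; do
        status=$(job_status "$1")
        case $status in
            PREP|RUNNING) sleep "${POLL_SECONDS:-10}" ;;
            *) echo "$status"; return ;;
        esac
    done
}
```

A possible use from the filetest directory: id=$(./run.sh | extract_job_id) followed by wait_for_job "$id".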


hdfs dfs -ls 

...

myfile

....


Check the Oozie webconsole: http://namenode1:11000/oozie


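Shell action

Out of the box there is a problem: a shell action stays in the running state for 10 minutes before the workflow transitions to the next node.

Temporary solution: decrease the action checker delay (the default is 600 seconds, exactly 10 minutes) in oozie-site.xml and restart the server: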

vi oozie-site.xml

<property>
<name>oozie.service.ActionCheckerService.action.check.delay</name>
<value>30</value>
</property>


mkdir -p $HOME/projects/oozie/shell

cd $HOME/projects/oozie/shell

Create a job.properties file:

vi job.properties

nameNode=hdfs://namenode1:8020
jobTracker=namenode2.custom-built-apps.com:8032
bundle_root=${nameNode}/user/dataexplorer1/oozie/shell
wf_root=${bundle_root}/workflows
oozie.wf.application.path=${wf_root}/copyfile.xml
oozie.use.system.libpath=true
user.name=dataexplorer1
filename=zfile


mkdir workflows

cd workflows

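Create a shell script:

vi copyFile.sh

#!/bin/sh
# Copy the HDFS test file to the name given as the first argument.
filename=$1
# Return a non-zero code on failure so that the action goes to Kill.
hdfs dfs -cp /user/dataexplorer1/test "/user/dataexplorer1/${filename}" || exit 1
exit 0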


Create the workflow definition:

vi copyfile.xml


<workflow-app name="wfl_copyfile" xmlns="uri:oozie:workflow:0.5">
  <start to="shellCopy1"/>
  <kill name="Kill">
    <message>Action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
  </kill>

  <action name="shellCopy1">
    <shell xmlns="uri:oozie:shell-action:0.1">
      <job-tracker>${jobTracker}</job-tracker>
      <name-node>${nameNode}</name-node>
      <exec>copyFile.sh</exec>
      <argument>${filename}</argument>
      <env-var>HADOOP_USER_NAME=${wf:user()}</env-var>
      <file>${wf_root}/copyFile.sh#copyFile.sh</file>
      <capture-output/>
    </shell>
    <ok to="shellCopy2"/>
    <error to="Kill"/>
  </action>

  <action name="shellCopy2">
    <shell xmlns="uri:oozie:shell-action:0.1">
      <job-tracker>${jobTracker}</job-tracker>
      <name-node>${nameNode}</name-node>
      <exec>copyFile.sh</exec>
      <argument>${filename}_copy</argument>
      <env-var>HADOOP_USER_NAME=${wf:user()}</env-var>
      <file>${wf_root}/copyFile.sh#copyFile.sh</file>
      <capture-output/>
    </shell>
    <ok to="End"/>
    <error to="Kill"/>
  </action>

  <end name="End"/>
</workflow-app>


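Put the files into HDFS. The target directory oozie must already exist in the HDFS home directory, otherwise the local directory shell is uploaded under the name oozie:

cd $HOME/projects/oozie

hdfs dfs -mkdir -p oozie

hdfs dfs -put shell oozie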

Run the application:

cd $HOME/projects/oozie/shell

oozie job -config job.properties -run

job: 0000001-200716154123757-oozie-oozi-W

Check the progress:

[dataexplorer1@r02edge shell]$ oozie job -info 0000001-200716154123757-oozie-oozi-W
Job ID : 0000001-200716154123757-oozie-oozi-W
------------------------------------------------------------------------------------------------------------------------------------
Workflow Name : wfl_copyfile
App Path : hdfs://namenode1:8020/user/dataexplorer1/oozie/shell/workflows/copyfile.xml
Status : SUCCEEDED
Run : 0
User : dataexplorer1
Group : -
Created : 2020-07-16 19:43 GMT
Started : 2020-07-16 19:43 GMT
Last Modified : 2020-07-16 19:45 GMT
Ended : 2020-07-16 19:45 GMT
CoordAction ID: -

Actions
------------------------------------------------------------------------------------------------------------------------------------
ID Status Ext ID Ext Status Err Code
------------------------------------------------------------------------------------------------------------------------------------
0000001-200716154123757-oozie-oozi-W@:start: OK - OK -
------------------------------------------------------------------------------------------------------------------------------------
0000001-200716154123757-oozie-oozi-W@shellCopy1 OK application_1593903918800_0138SUCCEEDED -
------------------------------------------------------------------------------------------------------------------------------------
0000001-200716154123757-oozie-oozi-W@shellCopy2 OK application_1593903918800_0139SUCCEEDED -
------------------------------------------------------------------------------------------------------------------------------------
0000001-200716154123757-oozie-oozi-W@End OK - OK -
------------------------------------------------------------------------------------------------------------------------------------

[dataexplorer1@r02edge shell]$ hdfs dfs -ls

-rw-r--r-- 3 dataexplorer1 hadoop 83 2020-07-16 15:43 zfile
-rw-r--r-- 3 dataexplorer1 hadoop 83 2020-07-16 15:44 zfile_copy


Email action

mkdir -p $HOME/oozie/email/workflows

cd $HOME/oozie/email/workflows

vi email.xml

<workflow-app name="wf_gmail" xmlns="uri:oozie:workflow:0.1">
  <start to="sendEmail"/>
  <action name="sendEmail">
    <email xmlns="uri:oozie:email-action:0.1">
      <to>boris.alexandrov@hotmail.ca</to>
      <cc>boris.alexandrov@outlook.com</cc>
      <subject>Email notifications for ${wf:id()}</subject>
      <body>The wf ${wf:id()} successfully completed.</body>
    </email>
    <ok to="End"/>
    <error to="Kill"/>
  </action>
  <kill name="Kill">
    <message>Action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
  </kill>
  <end name="End"/>
</workflow-app>


cd ..

vi job.properties

nameNode=hdfs://namenode1:8020
jobTracker=namenode2.custom-built-apps.com:8032
bundle_root=${nameNode}/user/dataexplorer1/oozie/email
wf_root=${bundle_root}/workflows
oozie.wf.application.path=${wf_root}/email.xml
oozie.use.system.libpath=true
user.name=dataexplorer1

Put the files into HDFS from the parent directory of email:

cd ..

hdfs dfs -put email oozie




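Add the SMTP settings to oozie-site.xml on the Oozie server and restart it before running the workflow with oozie job -config job.properties -run: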

<property>
<name>oozie.email.smtp.host</name>
<value>smtpout.secureserver.net</value>
</property>

<property>
<name>oozie.email.smtp.port</name>
<value>587</value>
</property>

<property>
<name>oozie.email.from.address</name>
<value>info@custom-built-apps.com</value>
</property>

<property>
<name>oozie.email.smtp.auth</name>
<value>true</value>
</property>

<property>
<name>oozie.email.smtp.starttls.enable</name>
<value>true</value>
</property>

<property>
<name>oozie.email.smtp.username</name>
<value>info@custom-built-apps.com</value>
</property>
<property>
<name>oozie.email.smtp.password</name>
<value>Password</value>
</property>
