Oozie has a built-in email action, but it sometimes fails because the mail server is misconfigured or because the relay server cannot negotiate TLS. In that case the action reports the dreaded "cannot issue start TLS command" error and simply hangs.
First, we will try to configure Oozie to connect to the mail server by setting the following parameters in oozie-site.xml on the Oozie server:
oozie.email.smtp.host=smtp.gmail.com
oozie.email.smtp.port=587
oozie.email.from.address=your_address@gmail.com
oozie.email.smtp.auth=true
oozie.email.smtp.starttls.enable=true
oozie.email.smtp.username=your_address@gmail.com
oozie.email.smtp.password=YourPassWord
Then we can try a simple email action and if it works, the problem is resolved.
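When the email action hangs on the TLS handshake, it can help to check from the Oozie host whether the relay advertises STARTTLS at all. The following is a minimal sketch that uses only the standard smtplib module; the function name supports_starttls and the smtp_factory parameter are illustrative assumptions and are not part of Oozie.

```python
import smtplib


def supports_starttls(host, port, smtp_factory=smtplib.SMTP, timeout=10):
    """Return True if the server at host:port advertises the STARTTLS extension."""
    server = smtp_factory(host, port, timeout=timeout)
    try:
        server.ehlo()  # ask the server which ESMTP extensions it supports
        return server.has_extn("starttls")
    finally:
        server.quit()


# Example call, using the host and port from the settings above:
#   supports_starttls("smtp.gmail.com", 587)
```

If the function returns False, no combination of Oozie settings with starttls enabled is likely to work against that relay, and the programmatic route may be the quicker fix.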
If the email action still does not work after these changes, we will have to send the email programmatically.
The language of choice is Python.
1. We will create a program that reads the sendmail.ini settings file, generates the message body, and attaches files if needed. It then sends the email through the Gmail server to inform the user of an error in the workflow.
2. The Python scripts with the program modules will be archived into a zip file.
3. After testing from the command line, we will create workflow shell actions that call the Python modules from the archive.
Python program:
Please note that you will need to allow "less secure applications" in your Google account for this to work. It is also possible to use the Outlook server (smtp-mail.outlook.com).
Settings file sendmail.ini
RECIPIENTS:boris.alexandrov@outlook.com:boris.alexandrov@hotmail.ca:someone@gmail.com
CC:borisalexandrov264@gmail.com
SUBJECT:workflow failure notification
BODY:FILE:message.txt
USER:youruser@gmail.com
PASSWORD:your_gmail_pwd
SMTPHOST:smtp.gmail.com
PORT:587
ATTACHMENT:workflow.txt
iniFileReader.py
#!/usr/bin/python
# Author : Boris Alexandrov
# Date : February 2, 2017
# Purpose : the class parses the ini file for email application into series of lists
# : which will later be accessed downstream
##################################################################################
class iniFileReader:
    def __init__(self, filename):
        self.fileName = filename
        fp = open(self.fileName)
        self.lines = fp.read().splitlines()
        fp.close()
        # The settings are positional: line 0 is RECIPIENTS, line 1 is CC, and so on.
        # Each line is split on ':' and the leading key is discarded.
        self.recipients = self.lines[0].split(':')
        del self.recipients[0]
        self.ccrecipients = self.lines[1].split(':')
        del self.ccrecipients[0]
        line = self.lines[2].split(':')
        self.subject = line[1].rstrip()
        # BODY is either inline text or FILE:<name>, in which case the file is read
        line = self.lines[3].split(':')
        if line[1].rstrip() == 'FILE':
            filename = line[2].rstrip()
            fp1 = open(filename)
            self.body = fp1.read()
            fp1.close()
        else:
            self.body = line[1]
        line = self.lines[4].split(':')
        self.username = line[1].rstrip()
        line = self.lines[5].split(':')
        self.password = line[1].rstrip()
        line = self.lines[6].split(':')
        self.smtphost = line[1].rstrip()
        line = self.lines[7].split(':')
        self.port = line[1].rstrip()
        self.attachments = self.lines[8].split(':')
        del self.attachments[0]
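Because the reader above is positional, reordering the lines in sendmail.ini or putting a colon inside the subject changes what it reads. As a possible alternative, here is a minimal sketch of a key-based parser; the names parse_settings and LIST_KEYS are hypothetical and are not used elsewhere in this article.

```python
# Settings whose value is a colon-separated list of entries.
LIST_KEYS = {"RECIPIENTS", "CC", "ATTACHMENT"}


def parse_settings(lines):
    """Parse KEY:value lines into a dict, regardless of line order."""
    settings = {}
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        # Split only on the first colon so that values may contain colons.
        key, _, value = line.partition(":")
        if key in LIST_KEYS:
            # Drop empty entries, such as a bare "ATTACHMENT:" line.
            settings[key] = [item for item in value.split(":") if item]
        else:
            settings[key] = value
    return settings


example = parse_settings([
    "SUBJECT:alert: workflow failed",
    "RECIPIENTS:someone@gmail.com:other@example.com",
    "ATTACHMENT:",
])
print(example)
```

With this approach the BODY value would arrive as "FILE:message.txt", so the caller would still need to check for the FILE: prefix before reading the file.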
cbasendmail.py
#!/usr/bin/python
import sys
import smtplib
import mimetypes
from email.mime.multipart import MIMEMultipart
from email import encoders
from email.mime.text import MIMEText
from email.mime.image import MIMEImage
from email.mime.audio import MIMEAudio
from email.mime.base import MIMEBase
from iniFileReader import iniFileReader

def addAttachment(msg, fileToSend):
    # Guess the MIME type from the file name; unknown or compressed files
    # are sent as generic binary data
    ctype, encoding = mimetypes.guess_type(fileToSend)
    if ctype is None or encoding is not None:
        ctype = "application/octet-stream"
    maintype, subtype = ctype.split("/", 1)
    if maintype == "text":
        fp = open(fileToSend)
        attachment = MIMEText(fp.read(), _subtype=subtype)
        fp.close()
    elif maintype == "image":
        fp = open(fileToSend, "rb")
        attachment = MIMEImage(fp.read(), _subtype=subtype)
        fp.close()
    elif maintype == "audio":
        fp = open(fileToSend, "rb")
        attachment = MIMEAudio(fp.read(), _subtype=subtype)
        fp.close()
    else:
        fp = open(fileToSend, "rb")
        attachment = MIMEBase(maintype, subtype)
        attachment.set_payload(fp.read())
        fp.close()
        encoders.encode_base64(attachment)
    attachment.add_header("Content-Disposition", "attachment", filename=fileToSend)
    msg.attach(attachment)

fileName = 'sendmail.ini'
reader = iniFileReader(fileName)
# Ignore empty entries such as a bare "CC:" or "ATTACHMENT:" line
ccrecipients = [address for address in reader.ccrecipients if address]
attachments = [name for name in reader.attachments if name]
try:
    msg = MIMEMultipart()
    msg["From"] = reader.username
    msg["To"] = ",".join(reader.recipients)
    if ccrecipients:
        msg["Cc"] = ",".join(ccrecipients)
    msg["Subject"] = reader.subject
    # BODY
    body = MIMEText(reader.body)
    msg.attach(body)
    # ATTACHMENTS, if there are any
    for fileToSend in attachments:
        addAttachment(msg, fileToSend)
    # SMTP server info
    s = smtplib.SMTP(host=reader.smtphost, port=int(reader.port))
    s.starttls()
    s.login(reader.username, reader.password)
    # CC addresses must be passed to sendmail as well as set in the header
    s.sendmail(reader.username, reader.recipients + ccrecipients, msg.as_string())
    s.quit()
    print('Successfully sent email')
except smtplib.SMTPException:
    print('Error: unable to send email')
    sys.exit(1)  # a non-zero exit code lets the Oozie shell action fail
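The attachment handling hinges on what mimetypes.guess_type returns for a file name. The short sketch below isolates that decision so it can be tried without sending any mail; the helper name classify is an assumption made for illustration.

```python
import mimetypes


def classify(filename):
    """Return the (maintype, subtype) used to pick a MIME class for an attachment."""
    ctype, encoding = mimetypes.guess_type(filename)
    if ctype is None or encoding is not None:
        # Unknown extension or compressed file: treat it as generic binary.
        ctype = "application/octet-stream"
    maintype, subtype = ctype.split("/", 1)
    return maintype, subtype


for name in ("workflow.txt", "logs.tar.gz", "data.unknownext"):
    print(name, classify(name))
```

Only the maintype values text, image, and audio get a dedicated MIME class in cbasendmail.py; everything else is attached as base64-encoded binary data.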
genErrorMessage.py
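import sys
import subprocess
# Usage: genErrorMessage <workflow name> <HDFS path of the message file>
line = """Please be advised there has been a failure while running {0}""".format(sys.argv[1])
# Pipe the message into HDFS, overwriting the target file if it already exists
cmd = """echo '{0}'|hdfs dfs -put -f - {1}""".format(line, sys.argv[2])
returned_value = subprocess.call(cmd, shell=True) # returns the exit code in unix
sys.exit(returned_value) # pass a failed upload on to the Oozie shell action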
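Building the shell command with string formatting breaks if the workflow name ever contains a single quote. One way around this, sketched here under the assumption that hdfs dfs -put reads from standard input when given "-" as the source (as the script above already relies on), is to pass the arguments as a list and feed the text through stdin. The helper name pipe_text is hypothetical.

```python
import subprocess
import sys


def pipe_text(text, command):
    """Run command (a list of arguments) with text on its stdin; return the exit code."""
    proc = subprocess.Popen(command, stdin=subprocess.PIPE)
    proc.communicate(text.encode("utf-8"))
    return proc.returncode


# Hypothetical use inside genErrorMessage.py:
#   message = "Please be advised there has been a failure while running " + sys.argv[1]
#   sys.exit(pipe_text(message + "\n", ["hdfs", "dfs", "-put", "-f", "-", sys.argv[2]]))

# Demo with a stand-in command that succeeds only when it receives some input.
check = [sys.executable, "-c", "import sys; sys.exit(0 if sys.stdin.read() else 1)"]
print(pipe_text("hello\n", check))
```

No shell is involved, so the message text needs no quoting or escaping.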
Create a zip with the files:
mkdir workflows
cp genErrorMessage.py workflows
cp cbasendmail.py workflows
cp iniFileReader.py workflows
cd workflows
zip email.zip ./*
Test the Python script with the command-line interpreter:
echo "this is the message">message.txt
echo "this is the attachment">workflow.txt
Run the cbasendmail module from the command line:
PYTHONPATH=email.zip python -m cbasendmail
If everything is correct, you will receive an email.
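The command above works because Python can import modules directly from a zip archive that is listed on its module search path. The following self-contained sketch demonstrates the mechanism with a throwaway archive; the module name hello_mod and the archive name demo.zip are made up for the demonstration.

```python
import importlib
import os
import sys
import tempfile
import zipfile

# Build a throwaway archive containing a single module at its root.
workdir = tempfile.mkdtemp()
archive = os.path.join(workdir, "demo.zip")
with zipfile.ZipFile(archive, "w") as zf:
    zf.writestr("hello_mod.py", "GREETING = 'imported from zip'\n")

# Adding the archive to sys.path is what PYTHONPATH=email.zip does at startup.
sys.path.insert(0, archive)
hello_mod = importlib.import_module("hello_mod")
print(hello_mod.GREETING)
```

The modules have to sit at the root of the archive, which is why the zip is created from inside the workflows directory.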
Prepare the Oozie workflow
job.properties
nameNode=hdfs://nameservice1
jobTracker=datanode1.custom-built-apps.com:8032
oozie.use.system.libpath=true
username=dataexplorer1
workflow_name=workflow1
sparkMaster=yarn
bundle_root=${nameNode}/user/dataexplorer1/oozie/email
wfroot=${bundle_root}/workflows
oozie.wf.application.path=${wfroot}/sendErrorEmail.xml
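The properties reference each other through ${...} placeholders, so the final application path is only visible after substitution. The sketch below mimics that substitution to show what the path resolves to; it is an illustration written for this article, not Oozie's own resolution code, and the function name expand is an assumption.

```python
import re

_REF = re.compile(r"\$\{([^}]+)\}")


def expand(props):
    """Resolve ${name} references between properties (illustration only)."""
    resolved = dict(props)
    for _ in range(len(resolved)):  # enough passes for any non-cyclic chain
        changed = False
        for key, value in resolved.items():
            # Unknown names are left untouched.
            new_value = _REF.sub(lambda m: resolved.get(m.group(1), m.group(0)), value)
            if new_value != value:
                resolved[key] = new_value
                changed = True
        if not changed:
            break
    return resolved


props = {
    "nameNode": "hdfs://nameservice1",
    "bundle_root": "${nameNode}/user/dataexplorer1/oozie/email",
    "wfroot": "${bundle_root}/workflows",
    "oozie.wf.application.path": "${wfroot}/sendErrorEmail.xml",
}
print(expand(props)["oozie.wf.application.path"])
```

The printed path is the HDFS location where sendErrorEmail.xml has to exist when the job is submitted.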
Workflows:
sendErrorEmail.xml
<workflow-app name="wfl_sendErrorEmail" xmlns="uri:oozie:workflow:0.5">
    <start to="shell-email-generate"/>
    <action name="shell-email-generate">
        <shell xmlns="uri:oozie:shell-action:0.1">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <exec>python</exec>
            <argument>-m</argument>
            <argument>genErrorMessage</argument>
            <argument>${workflow_name}</argument>
            <argument>${wfroot}/message.txt</argument>
            <env-var>PYTHONPATH=pyEmail</env-var>
            <env-var>HADOOP_USER_NAME=${username}</env-var>
            <file>${wfroot}/email.zip#pyEmail</file>
            <file>${wfroot}/sendmail.ini#sendmail.ini</file>
            <file>${wfroot}/message.txt#message.txt</file>
            <file>${wfroot}/workflow.txt#workflow.txt</file>
            <capture-output/>
        </shell>
        <ok to="shell-email-send"/>
        <error to="Kill"/>
    </action>
    <action name="shell-email-send">
        <shell xmlns="uri:oozie:shell-action:0.1">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <exec>python</exec>
            <argument>-m</argument>
            <argument>cbasendmail</argument>
            <env-var>PYTHONPATH=pyEmail</env-var>
            <file>${wfroot}/email.zip#pyEmail</file>
            <file>${wfroot}/sendmail.ini#sendmail.ini</file>
            <file>${wfroot}/message.txt#message.txt</file>
            <file>${wfroot}/workflow.txt#workflow.txt</file>
            <capture-output/>
        </shell>
        <ok to="End"/>
        <error to="Kill"/>
    </action>
    <kill name="Kill">
        <message>Action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="End"/>
</workflow-app>
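A mistyped node name in an ok or error transition only shows up once the workflow is submitted. As a quick local sanity check, the sketch below parses a workflow definition and lists transition targets that do not match any node; the function name dangling_transitions is hypothetical, and the sample is a trimmed copy of the workflow above with the shell elements left out.

```python
import xml.etree.ElementTree as ET

NS = {"wf": "uri:oozie:workflow:0.5"}


def dangling_transitions(workflow_xml):
    """Return transition targets that do not match any node name in the workflow."""
    root = ET.fromstring(workflow_xml)
    # Actions, kill, and end nodes all carry a name attribute.
    names = {node.get("name") for node in root if node.get("name")}
    targets = [root.find("wf:start", NS).get("to")]
    for action in root.findall("wf:action", NS):
        targets.append(action.find("wf:ok", NS).get("to"))
        targets.append(action.find("wf:error", NS).get("to"))
    return sorted(set(targets) - names)


SAMPLE = """
<workflow-app name="wfl_sendErrorEmail" xmlns="uri:oozie:workflow:0.5">
    <start to="shell-email-generate"/>
    <action name="shell-email-generate">
        <ok to="shell-email-send"/>
        <error to="Kill"/>
    </action>
    <action name="shell-email-send">
        <ok to="End"/>
        <error to="Kill"/>
    </action>
    <kill name="Kill"><message>failed</message></kill>
    <end name="End"/>
</workflow-app>
"""
print(dangling_transitions(SAMPLE))
```

An empty list means every transition points at an existing node; this check does not replace validation against the Oozie schema.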
Put sendmail.ini into the workflows folder.
Put message.txt into the workflows folder (it can be empty).
Put workflow.txt into the workflows folder.
The directory structure:
email--
-job.properties
-workflows
--sendErrorEmail.xml
--email.zip
--workflow.txt
--message.txt
--sendmail.ini
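A missing file in this layout makes the shell action fail only at run time, so it may be worth checking the local folder before uploading it. This is a minimal sketch; the function name missing_files is an assumption, and the list of paths simply mirrors the structure above.

```python
import os

# Relative paths taken from the directory structure above.
EXPECTED = [
    "job.properties",
    "workflows/sendErrorEmail.xml",
    "workflows/email.zip",
    "workflows/workflow.txt",
    "workflows/message.txt",
    "workflows/sendmail.ini",
]


def missing_files(root, expected=EXPECTED):
    """Return the expected relative paths that are absent under root."""
    return [
        rel for rel in expected
        if not os.path.isfile(os.path.join(root, *rel.split("/")))
    ]


# Example call from the parent directory of the email folder:
#   print(missing_files("email"))
```

An empty list means the folder is complete and ready to be copied to HDFS.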
Deploy the folder to HDFS and submit the job:
hdfs dfs -put email /user/dataexplorer1/oozie/email
cd email
export OOZIE_URL=http://datanode1.custom-built-apps.com:11000/oozie
oozie job -config job.properties -run