The Custom Built Apps workflow application (cbaWorkflow) generates data pipelines. It addresses two issues frequently encountered when running data pipelines:
- Execution of Spark jars: developers otherwise have to resort to shell workarounds to submit their jars for execution.
- Starting all the nodes at the same time and waiting for their dependencies to be resolved: many nodes then consume system resources in a loop while waiting for their turn to execute.
A proprietary algorithm for starting the nodes in levels is an integral part of the cbaWorkflow application. Nodes that are not involved in the immediate execution are not started, which saves system resources.
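The general idea of starting nodes in levels can be illustrated with a minimal sketch. This is not the proprietary cbaWorkflow algorithm, which is not described here; it is a generic layered topological sort, and the function name `group_into_levels`, the dictionary layout and the node names are all hypothetical and chosen only for illustration. It assumes each node lists the nodes it depends on.

```python
def group_into_levels(dependencies):
    """Group pipeline nodes into levels.

    `dependencies` maps a node name to the set of node names it depends on
    (hypothetical input format). Level 0 holds nodes with no dependencies;
    every later level holds nodes whose dependencies all sit in earlier levels.
    """
    # Collect every node, including ones that only appear as a dependency.
    nodes = set(dependencies)
    for deps in dependencies.values():
        nodes.update(deps)

    # Work on a copy so the caller's data is left untouched.
    remaining = {node: set(dependencies.get(node, ())) for node in nodes}

    levels = []
    while remaining:
        # Nodes with no unresolved dependencies can start in this level.
        ready = sorted(node for node, deps in remaining.items() if not deps)
        if not ready:
            raise ValueError("dependency cycle detected")
        levels.append(ready)

        # Remove the started nodes and mark them as resolved for the rest.
        for node in ready:
            del remaining[node]
        for deps in remaining.values():
            deps.difference_update(ready)
    return levels


# Hypothetical four-node pipeline used only as an example.
pipeline = {
    "ingest": set(),
    "clean": {"ingest"},
    "enrich": {"ingest"},
    "report": {"clean", "enrich"},
}
levels = group_into_levels(pipeline)
print(levels)  # [['ingest'], ['clean', 'enrich'], ['report']]
```

With such a grouping, a scheduler would start only the nodes of the current level and move on to the next level once they finish, so nodes in later levels do not occupy resources while waiting.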
This document tracks the development, testing, deployment and documentation of the Custom Built Apps cbaWorkflow product.
- Development environment setup
- Hadoop cluster software setup
- Custom Built App Workflow Development
- Quality Assurance of cbaWorkflow
- Hadoop cluster hardware setup