The state-of-the-art data processing systems currently deployed in astronomy are designed to handle data volumes approximately two to three orders of magnitude smaller than those expected from the SKA. To tackle this challenge, we have developed the Data-Activated Flow Graph Engine, or DALiuGE for short (Liu is the Chinese word for `Flow'). DALiuGE aims to provide a distributed data management platform and a scalable pipeline execution environment to support continuous, time- and power-bounded, data-intensive processing that produces SKA science-ready products. DALiuGE differs from many existing processing frameworks in three respects. First, it allows data items to trigger events, which in turn activate the cascaded execution of parallel processing tasks. Second, it integrates data-lifecycle management into the data processing framework. Third, it explicitly decouples the logical view of a problem from its realisation or run-time deployment. This not only separates the concerns of different stakeholders such as telescope operators, pipeline developers, and astronomers but, more importantly, allows them to collectively optimise data processing at multiple levels in a harmonised way, while letting the framework optimise the generation of physical execution plans using resource and profiling information.
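To make the data-activated idea more concrete, the following is a minimal sketch of data items that fire a completion event and thereby activate the tasks waiting on them. It is not DALiuGE code: the names `DataItem`, `Task` and `run_demo`, the callback `on_input_completed` and the toy "calibrate" and "combine" functions are all assumptions made purely for illustration, and the real framework runs tasks in parallel across many nodes rather than in a single process.

```python
class DataItem:
    """Hypothetical data item that notifies its consumers once it is complete."""

    def __init__(self, name):
        self.name = name
        self.value = None
        self.completed = False
        self.consumers = []  # tasks that take this item as an input

    def complete(self, value):
        # Completing the data is the event that drives execution.
        self.value = value
        self.completed = True
        for task in self.consumers:
            task.on_input_completed()


class Task:
    """Hypothetical task that runs as soon as all of its inputs are complete."""

    def __init__(self, name, func, inputs, output, log):
        self.name = name
        self.func = func
        self.inputs = inputs
        self.output = output
        self.log = log  # shared list recording the execution order
        for item in inputs:
            item.consumers.append(self)

    def on_input_completed(self):
        if all(item.completed for item in self.inputs):
            self.log.append(self.name)
            result = self.func(*[item.value for item in self.inputs])
            # Completing the output cascades to downstream tasks.
            self.output.complete(result)


def run_demo():
    log = []
    raw_a, raw_b = DataItem("raw_a"), DataItem("raw_b")
    cal_a, cal_b = DataItem("cal_a"), DataItem("cal_b")
    image = DataItem("image")
    Task("calibrate_a", lambda x: x * 2, [raw_a], cal_a, log)
    Task("calibrate_b", lambda x: x * 2, [raw_b], cal_b, log)
    Task("combine", lambda a, b: a + b, [cal_a, cal_b], image, log)
    # No scheduler loop: the arrival of data is what starts the work.
    raw_a.complete(1)
    raw_b.complete(2)
    return image.value, log
```

Note that nothing in `run_demo` calls a task directly; the "combine" task only runs once both calibrated items have been completed, which is the cascading behaviour described above.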
On the logical level, the one facing the scientists, we are developing a visual, domain-specific programming language similar to workflow editors such as Simulink, Kepler or Yabi. A prototype already exists, but it lacks more advanced functionality and is also pretty `ugly'. The goal of the CITS3200 project is to implement some of the more advanced features and interfaces, a more professional and modern look-and-feel, and an access control mechanism. It will involve web technologies, database technologies (potentially graph databases) and JSON-based interfaces to the underlying DALiuGE. The web interface will also have to interact with very advanced scheduling and optimisation algorithms, but the project will not have to implement any of those. Currently the main language used for DALiuGE is Python 2.7.12, and the prototype editor uses the GoJS JavaScript library. Other solutions are possible, but they will need to offer clearly demonstrated advantages over the existing technologies. We work in a fully established continuous integration environment with a professional tool-chain (Atlassian JIRA and Confluence, plus Jenkins) and expect that environment to be used for the development of this project as well.
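As a rough illustration of what a JSON-based interface between the editor and the engine could involve, the sketch below serialises a small logical graph to JSON and checks that every link refers to an existing node. The schema is an assumption made for this example only: the field names `nodes`, `links`, `key`, `category`, `text`, `from` and `to`, as well as the helpers `validate_graph` and `roundtrip`, are hypothetical and do not describe the format actually used by DALiuGE or by the prototype editor.

```python
import json


def validate_graph(graph):
    """Return a list of problems found in a hypothetical logical-graph dict."""
    problems = []
    keys = [node["key"] for node in graph.get("nodes", [])]
    if len(keys) != len(set(keys)):
        problems.append("duplicate node keys")
    known = set(keys)
    for link in graph.get("links", []):
        for end in ("from", "to"):
            if link.get(end) not in known:
                problems.append(
                    "link refers to unknown node: %r" % (link.get(end),)
                )
    return problems


def roundtrip(graph):
    """Serialise to JSON text and parse it back, as an exchange between
    editor and engine would."""
    return json.loads(json.dumps(graph, sort_keys=True))


# Illustrative graph only: one task between two data items.
EXAMPLE_GRAPH = {
    "nodes": [
        {"key": 1, "category": "Data", "text": "raw visibilities"},
        {"key": 2, "category": "Component", "text": "calibrate"},
        {"key": 3, "category": "Data", "text": "calibrated visibilities"},
    ],
    "links": [{"from": 1, "to": 2}, {"from": 2, "to": 3}],
}
```

A check of this kind could run on the server side before a graph is stored or handed on, so that a malformed graph from the web interface is rejected early rather than reaching the scheduling stage.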