How is it done?
Datamung is built on AWS Simple Workflow, which provides a strong guarantee that each step in a predefined workflow is executed in time. When Datamung runs a workflow that backs up a database, the workflow itself runs in Datamung's own AWS account, while the tasks that touch the user's resources, such as replicating the database, launching an EC2 instance, and uploading the result to S3, run in the user's AWS account with the AWS credentials the user provides.
High-level diagram
The following drawing shows an overview of the components in Datamung.
With the layout above, Datamung runs one of its predefined workflows to back up a database. The following sections describe each of these workflows.
The following are the important workflow definitions built with Simple Workflow.
Direct export data workflow
The workflow creates a cross-account IAM role and launches an EC2 instance in the user's AWS account, which runs the mysqldump command against the given MySQL instance. After the work finishes or fails, the workflow terminates the EC2 instance and deletes the cross-account IAM role. The source code of the workflow definition is here.
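The ordering of these steps, and the fact that cleanup happens whether the dump succeeds or fails, can be sketched in plain Python. This is only an illustration and not Datamung's actual workflow definition: the function name `direct_export`, the step labels, and the `run_dump` callable are hypothetical stand-ins for the real AWS calls.

```python
from typing import Callable, List


def direct_export(steps_log: List[str], run_dump: Callable[[], None]) -> bool:
    """Hypothetical sketch: set up resources, run the dump, always clean up."""
    # Stand-in for creating the cross-account IAM role.
    steps_log.append("create_iam_role")
    try:
        # Stand-in for launching the EC2 instance in the user's account.
        steps_log.append("launch_ec2_instance")
        try:
            run_dump()  # stand-in for running mysqldump on the instance
            steps_log.append("mysqldump_done")
            return True
        except Exception:
            steps_log.append("mysqldump_failed")
            return False
        finally:
            # Runs on success and on failure alike.
            steps_log.append("terminate_ec2_instance")
    finally:
        # The role is removed last, after the instance is gone.
        steps_log.append("delete_iam_role")
```

Nesting the two `try`/`finally` blocks mirrors the description: the instance is terminated first and the role is deleted afterwards, regardless of the outcome of the dump.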
Because the workflow is driven by AWS Simple Workflow, every step is bound by a timeout and retried upon failure. The overall workflow execution is bound by a timeout as well.
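To make the retry and timeout behavior concrete, here is a minimal sketch in plain Python. In Datamung these guarantees come from Simple Workflow itself, so this is not how the service implements them. The names `run_with_retry`, `max_attempts`, and `deadline_seconds` are assumptions made for the example, and the deadline is only checked between attempts rather than interrupting a running step.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def run_with_retry(
    step: Callable[[], T],
    max_attempts: int = 3,
    deadline_seconds: float = 60.0,
    clock: Callable[[], float] = time.monotonic,
) -> T:
    """Hypothetical sketch: retry a failing step within an overall deadline."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    started = clock()
    last_error: Optional[Exception] = None
    for _ in range(max_attempts):
        # Stop retrying once the overall deadline has passed.
        if clock() - started > deadline_seconds:
            raise TimeoutError("overall deadline exceeded")
        try:
            return step()
        except Exception as error:
            last_error = error  # remember the failure and try again
    assert last_error is not None
    raise last_error
```

A step that fails transiently is simply attempted again, and the last error is raised only after all attempts are used up.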
Convert snapshot workflow
This workflow converts an RDS snapshot into a MySQL dump in S3 by restoring the snapshot into an RDS instance, running the direct export data workflow against it, and terminating the RDS instance. The source code is here.
Export instance workflow
The export instance workflow is the top-level workflow. It either invokes the direct export data workflow, or takes a snapshot of the database instance to back up, runs the convert snapshot workflow, and deletes the snapshot. The source code is here.
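How the three workflows nest can be sketched as ordinary function calls. This is an illustration under stated assumptions rather than the real workflow code: the function names, the `use_snapshot` flag that selects the branch, the generated identifiers, and the use of `try`/`finally` for cleanup are all hypothetical, and each step only records a log entry instead of calling AWS.

```python
from typing import List


def direct_export(instance_id: str, log: List[str]) -> None:
    # Stand-in for the direct export data workflow.
    log.append(f"direct_export {instance_id}")


def convert_snapshot(snapshot_id: str, log: List[str]) -> None:
    # Restore the snapshot, export from the restored instance, then remove it.
    instance_id = f"{snapshot_id}-restored"  # hypothetical naming scheme
    log.append(f"restore_snapshot {snapshot_id} -> {instance_id}")
    try:
        direct_export(instance_id, log)
    finally:
        log.append(f"terminate_instance {instance_id}")


def export_instance(instance_id: str, use_snapshot: bool, log: List[str]) -> None:
    # Top-level workflow: export directly, or go through a snapshot.
    if not use_snapshot:
        direct_export(instance_id, log)
        return
    snapshot_id = f"{instance_id}-snapshot"  # hypothetical naming scheme
    log.append(f"create_snapshot {instance_id} -> {snapshot_id}")
    try:
        convert_snapshot(snapshot_id, log)
    finally:
        log.append(f"delete_snapshot {snapshot_id}")
```

Calling `export_instance("db1", True, log)` records the snapshot being created, restored, exported, and then the temporary instance and snapshot being removed in reverse order.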