Executing ad-hoc jobs with Amazon Step Functions

We came across a requirement from a client to process thousands of data records in a database, generate a report, and present it to the client. Following are some of the requirements for this:

  • Filter out the data based on the request
  • Monitor the state of every request
  • There could be multiple requests at a time
  • Processing a single request could take anywhere between 5 minutes to 5 hours

TL;DR ⚡

We create an AWS Step Function which takes the input and generate the custom reports from database using ECS Tasks.

Solution 🔥

Suppose we receive a request from Client A, which has id 1266cd5a-1dfa-44f2-83ca-534c30b38555 in our system. The client wants to generate a report on all the pending transactions in December 2021. This input from the client looks as follows:

{
    "filter": {
        "client_id": "1266cd5a-1dfa-44f2-83ca-534c30b38555",
        "start_date": "2021-01-01",
        "end_date": "2021-12-31",
        "status": "pending"
    }
}

On the code side, we take this input as an environment variable REPORT_REQUEST and send this request for another RESTful API to get the required data:

import os, json

def main():
    data_request_str = os.environ.get('REPORT_REQUEST', None)
    if data_request_str is None:
        raise EnvironmentError("REPORT_REQUEST isn't set")
    data_request = json.loads(data_request_str)
    # Get data using this data request

if __name__ == '__main__':
    main()

We Dockerize above mentioned script and push it to Amazon ECR which is AWS managed Docker container registry. We create a task definition which lets AWS know what kind of resources we need to run this Docker container, for example, we need 4 GB of RAM.

We create a Step Function, which takes a JSON input, Stringifies it and sets it as ECS Task environment variable REPORT_REQUEST. The Step Function definition looks like following:

{
  "Comment": "Report Task",
  "StartAt": "main",
  "States": {
    "main": {
      "Type": "Task",
      "Resource": "arn:aws:states:::ecs:runTask.sync",
      "Parameters": {
        "LaunchType": "FARGATE",
        "Cluster": "arn:aws:ecs:us-west-2:123456789012:cluster/reporting-cluster",
        "TaskDefinition": "arn:aws:ecs:us-west-2:123456789012:task-definition/client-report:1",
        "NetworkConfiguration": {
          "AwsvpcConfiguration": {
            "Subnets": [
              "subnet-0e365e9c4570c3fd2"
            ],
            "SecurityGroups": [
              "sg-07066bac0ae8b88t2"
            ],
            "AssignPublicIp": "DISABLED"
          }
        },
        "Overrides": {
          "ContainerOverrides": [
            {
              "Name": "main",
              "Environment": [
                {
                  "Name": "REPORT_REQUEST",
                  "Value.$": "States.JsonToString($)"
                }
              ]
            }
          ]
        }
      },
      "End": true
    }
  }
}

The runTask.sync blocks the step and waits for the Ecs Task to stop execution.

Now we just start a new state machine with the above input, name it client-a-report and run it by clicking Start execution.

The function States.JsonToString($) converts request JSON to a string.

If we receive another request from Client B, we can just start another state machine, which takes care of starting another state machine and both reports can run in parallel and we can monitor the state of all reports in the Amazon Step Function console.

Key Takeaways 🗒️

This solution helps to automate the repetitive adhoc tasks. We can achieve the same end result without using Step Functions, by directly using ECS run-task api. Step Functions help us to visualize the status of tasks and helps us view input in a more human readable manner. We can also add retries in case of failures directly from Amazon Step Function definition.

Leave a Comment

Your email address will not be published. Required fields are marked *