feat: allow additional permissions to StepFunctions plus defaults #6759 #6779

Draft
MengLinMaker wants to merge 3 commits into anomalyco:dev from MengLinMaker:redrive-step-function

MengLinMaker commented Apr 20, 2026

Addresses #6759, as the current permissions are too restrictive and cannot be extended.

This PR adds:

  • Redrive and stop permissions enabled by default.
  • The ability to extend permissions.
  • The ability to attach a role (unsure if this is a good idea).
  • An example to test the redrive ability.

More details are listed in the issue.
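
To make the first two bullets concrete, here is a minimal sketch of "defaults plus extra statements". The `PolicyStatement` shape and the `buildStatements` helper are hypothetical names for illustration only; the option names and structure in the PR itself may differ.

```typescript
// Hypothetical shape for illustration; not the PR's actual types.
interface PolicyStatement {
  actions: string[];
  resources: string[];
}

// Execution-level actions, including the stop and redrive actions
// this PR proposes to enable by default (assumed list).
const defaultExecutionActions: string[] = [
  "states:StartExecution",
  "states:DescribeExecution",
  "states:StopExecution",
  "states:RedriveExecution",
];

// Returns the default statement followed by any caller-supplied statements.
function buildStatements(extra: PolicyStatement[] = []): PolicyStatement[] {
  const defaults: PolicyStatement = {
    actions: [...defaultExecutionActions], // copy so callers cannot mutate the defaults
    resources: ["*"],
  };
  return [defaults, ...extra];
}
```

Calling `buildStatements()` yields only the defaults, while passing extra statements appends them without touching the default list.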

@MengLinMaker MengLinMaker marked this pull request as ready for review April 20, 2026 22:49
@MengLinMaker MengLinMaker marked this pull request as draft April 20, 2026 23:21
vimtor (Collaborator) commented Apr 23, 2026

i'm not super familiar with step functions, but i believe just adding the stop and redrive permissions should be enough

also, i believe you can already attach permissions by doing something like this:

sst.aws.StepFunctions.lambdaInvoke({
  name: "LambdaInvoke",
  function: {
    handler: "src/index.handler",
    timeout: "60 seconds",
    permissions: [...]
  }
});

MengLinMaker (Author) commented Apr 23, 2026

@vimtor Yep, the stop and redrive permissions are enough for now. But the permissions currently can't be extended cleanly if requirements change.

Unfortunately, the code you suggested is for running a Lambda inside Step Functions, and it only alters the permissions of that Lambda.

The Step Functions state machine (the orchestrator that runs the workflow) currently doesn't have permission to redrive failures (manually continue from where errors occurred). Large workflows, like the several-hour workflows I frequently run, benefit from this feature, especially with spot instances.

It's possible to override the IAM role via transform in a messy way:

// Similar IAM role to what SST provides
const StepFunctionsScrapePipelineRole = new aws.iam.Role('StepFunctionsScrapePipelineRole', {
    assumeRolePolicy: aws.iam.assumeRolePolicyForPrincipal({
        Service: 'states.amazonaws.com',
    }),
    inlinePolicies: [
        {
            name: 'inline',
            policy: aws.iam.getPolicyDocumentOutput({
                statements: [
                    {
                        actions: ['events:*'],
                        resources: ['*'],
                    },
                    {
                        actions: [
                            'logs:CreateLogDelivery',
                            'logs:CreateLogStream',
                            'logs:GetLogDelivery',
                            'logs:UpdateLogDelivery',
                            'logs:DeleteLogDelivery',
                            'logs:ListLogDeliveries',
                            'logs:PutLogEvents',
                            'logs:PutResourcePolicy',
                            'logs:DescribeResourcePolicies',
                            'logs:DescribeLogGroups',
                        ],
                        resources: ['*'],
                    },
                    {
                        actions: [
                            'states:StartExecution',
                            'states:DescribeExecution',
                            // I only need this action for redriving
                            'states:RedriveExecution',
                        ],
                        resources: ['*'],
                    },
                    ...StepScrapePipelineDefinition.getRoot().getPermissions(),
                ],
            }).json,
        },
    ],
})
const StepFunctionsScrapePipeline = new sst.aws.StepFunctions('StepFunctionsScrapePipeline', {
    definition: StepScrapePipelineDefinition,
    // What a mess here, but it works
    transform: {
        stateMachine: (args) => {
            args.roleArn = StepFunctionsScrapePipelineRole.arn
            args.definition = pulumi.output(args.definition).apply((definition) => {
                const parsed = JSON.parse(definition)
                const mapState = parsed.States?.[StepMapScrapeLocality.name]
                // Would be good if this could be specified in `sst.aws.StepFunctions.map`
                if (mapState?.Type === 'Map') mapState.ToleratedFailurePercentage = 5
                return JSON.stringify(parsed)
            })
        },
    },
})

Another issue is that setting ToleratedFailurePercentage on a Map state also requires very hacky code:

if (mapState?.Type === 'Map') mapState.ToleratedFailurePercentage = 5
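
For reference, the same patch could be pulled into a small standalone helper, sketched below. `setToleratedFailurePercentage` and the two interfaces are hypothetical names, not part of SST; the sketch only assumes the definition is a JSON string with a top-level `States` map, as in the transform above.

```typescript
// Hypothetical helper, not part of SST: patches one Map state in a
// serialized state machine definition.
interface AslState {
  Type: string;
  ToleratedFailurePercentage?: number;
  [key: string]: unknown;
}

interface AslDefinition {
  States?: Record<string, AslState>;
  [key: string]: unknown;
}

function setToleratedFailurePercentage(
  definition: string,
  stateName: string,
  percentage: number,
): string {
  if (percentage < 0 || percentage > 100) {
    throw new RangeError("percentage must be between 0 and 100");
  }
  const parsed = JSON.parse(definition) as AslDefinition;
  const state = parsed.States?.[stateName];
  // Only Map states are touched; other states and unknown names are left unchanged.
  if (state?.Type === "Map") state.ToleratedFailurePercentage = percentage;
  return JSON.stringify(parsed);
}
```

Inside the transform, this would replace the inline parse-and-patch callback passed to `apply`, keeping the Map-only guard in one place.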
