Implement split(string, separator) function to create arrays in workflows #331

@shoogle

In workflows, the existing join() function concatenates an array of strings to create one long string.

I want to do the opposite. I want to split a long string into an array of shorter strings.

The problem

Workflows triggered via workflow_dispatch or workflow_call can accept inputs.

  • An input must be of type string, number, or boolean (workflow_dispatch also offers choice and environment, both of which yield strings).
  • It cannot be an array.

That's a problem because arrays are needed to create a jobs matrix:

jobs:
  generate-translations:
    strategy:
      matrix:
        language: ${{ inputs.languages }}  # [fr, de, pt, pt_BR]
    steps:
      - run: echo "Generate ${{ matrix.language }} translation"

Also, arrays are useful to run workflow jobs or steps conditionally via contains():

steps:

  - if: ${{ contains(inputs.languages, 'pt') }}
    run: echo "Generate Portuguese translation"

  - if: ${{ contains(inputs.languages, 'pt_BR') }}
    run: echo "Generate Brazilian Portuguese translation"

Note: The above examples assume inputs.languages is an array, which is not currently possible.

Desired solution

Allowing arrays as input is probably quite difficult to implement, hence I am NOT asking for that.

Instead, I can already pass a string like this as input: fr de pt pt_BR, so I just need a function to split this string into an array, using space (or whatever character(s)) as the item separator.

I would use the split() function as follows to create a jobs matrix:

jobs:
  generate-translations:
    strategy:
      matrix:
        language: ${{ split(inputs.languages, ' ') }}  # fr de pt pt_BR
    steps:
      - run: echo "Generate ${{ matrix.language }} translation"

Or to run jobs or steps conditionally:

steps:

  - if: ${{ contains(split(inputs.languages, ' '), 'pt') }}
    run: echo "Generate Portuguese translation"

  - if: ${{ contains(split(inputs.languages, ' '), 'pt_BR') }}
    run: echo "Generate Brazilian Portuguese translation"

Note: The above examples assume inputs.languages is an ordinary string, such as fr de pt pt_BR.

Additional context

There are workarounds that already achieve something like the above. However, they are suboptimal.

Workarounds

WORKAROUND 1: Use contains() on a string instead of an array

Set inputs.languages to a string like ;fr;de;pt;pt_BR;, where ; (semicolon) is the separator.

In the workflow:

steps:

  - if: ${{ contains(inputs.languages, ';pt;') }}
    run: echo "Generate Portuguese translation"

  - if: ${{ contains(inputs.languages, ';pt_BR;') }}
    run: echo "Generate Brazilian Portuguese translation"

Caveats:

  1. The input string must start and end with the separator, which is unwieldy and easy to forget when typing in the workflow_dispatch dialog.
  2. Space cannot be used as the separator; a leading or trailing space might be stripped by the browser or by the workflow_dispatch dialog.
  3. This method only works for conditionals; it doesn't allow you to create a jobs matrix.

Notes:

  1. We can't do if: ${{ contains(inputs.languages, 'pt') }} (no separators) because that would allow an input like pt_BR to trigger the (non-Brazilian) Portuguese translation.
  2. We can't use regex (e.g. if: ${{ contains(inputs.languages, /(^| )pt( |$)/) }}) because it's not supported by the contains() function.

This is the method we used in MuseScore Studio for our inputs.platforms variable, although in that case we were able to name the platforms carefully so that contains() works without separators.
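
A possible refinement for caveat 1: the format() expression function can add the leading and trailing separators inside the workflow, so the input could be typed as fr;de;pt;pt_BR. This is a minimal sketch rather than a tested setup, and it still only helps with conditionals, not with building a matrix.

```yaml
steps:

  # format(';{0};', ...) wraps the raw input in separators before the substring check.
  - if: ${{ contains(format(';{0};', inputs.languages), ';pt;') }}
    run: echo "Generate Portuguese translation"

  - if: ${{ contains(format(';{0};', inputs.languages), ';pt_BR;') }}
    run: echo "Generate Brazilian Portuguese translation"
```

The same idea might make a space usable as the separator, for example format(' {0} ', inputs.languages) checked against ' pt ', because the surrounding spaces are added by the expression rather than typed in the dialog.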

WORKAROUND 2: Pass a JSON array as input

Set inputs.languages to a string like ["fr", "de", "pt", "pt_BR"]. (Note: this is all one string, not multiple strings.)

In the workflow, making use of the fromJSON() function:

jobs:
  generate-translations:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        language: ${{ fromJSON(inputs.languages) }}
    steps:
      - run: echo "Generate ${{ matrix.language }} translation"

Or:

steps:

  - if: ${{ contains(fromJSON(inputs.languages), 'pt') }}
    run: echo "Generate Portuguese translation"

  - if: ${{ contains(fromJSON(inputs.languages), 'pt_BR') }}
    run: echo "Generate Brazilian Portuguese translation"

Caveats:

  1. It's a pain to type JSON manually in the workflow_dispatch dialog.
  2. It's easy to make mistakes, such as forgetting to close quotes or adding a comma after the final list item.
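
One way to soften caveat 2 might be a validation step that rejects malformed input early with a readable message. The sketch below assumes jq is available on the runner (as Workaround 3 also does); the LANGUAGES variable name and the error text are illustrative.

```yaml
steps:

  - name: Validate the languages input
    env:
      LANGUAGES: ${{ inputs.languages }}  # illustrative variable name
    run: |
      # jq exits non-zero on malformed JSON, or (because of -e) when the filter yields false.
      if ! printf '%s' "$LANGUAGES" | jq -e 'type == "array"' > /dev/null; then
        echo "::error::inputs.languages must be a JSON array, e.g. [\"fr\", \"de\"]"
        exit 1
      fi
```

This only protects later steps in the same job. A matrix expression is evaluated before any step runs, so for the matrix case the check would have to live in an earlier job.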

WORKAROUND 3: Construct a JSON array in an earlier job or step

Set inputs.languages to a string like fr de pt pt_BR, where a space is the separator.

In the workflow, again making use of the fromJSON() function:

jobs:

  construct-json-job:
    runs-on: ubuntu-slim
    outputs:
      languages: ${{ steps.construct-json-step.outputs.languages }}
    steps:
      - id: construct-json-step
        env:
          LANGUAGES: ${{ inputs.languages }}  # passed via env so quotes in the input cannot break the script
        run: echo "languages=$(jq -nc --arg str "$LANGUAGES" '$str | split(" ")')" >> "$GITHUB_OUTPUT"

  generate-translations-job:
    needs: construct-json-job
    runs-on: ubuntu-latest
    strategy:
      matrix:
        language: ${{ fromJSON(needs.construct-json-job.outputs.languages) }}
    steps:
      - run: echo "Generate ${{ matrix.language }} translation"

Or:

steps:

  - id: construct-json
    env:
      LANGUAGES: ${{ inputs.languages }}  # passed via env so quotes in the input cannot break the script
    run: echo "languages=$(jq -nc --arg str "$LANGUAGES" '$str | split(" ")')" >> "$GITHUB_OUTPUT"

  - if: ${{ contains(fromJSON(steps.construct-json.outputs.languages), 'pt') }}
    run: echo "Generate Portuguese translation"

  - if: ${{ contains(fromJSON(steps.construct-json.outputs.languages), 'pt_BR') }}
    run: echo "Generate Brazilian Portuguese translation"

Caveats:

  1. This is a lot of extra code just to perform a simple string operation.
  2. Adding extra steps / jobs makes the build log more complicated.
  3. The multi-job method (needed to construct a matrix) spins up an extra runner, which is a huge waste of resources for such a trivial task.
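
A further wrinkle with this workaround: jq's split(" ") produces empty strings when the input has repeated, leading, or trailing spaces, and those would become empty matrix entries. A hedged sketch of a more forgiving step, assuming jq on the runner and using a hypothetical LANGUAGES environment variable:

```yaml
steps:

  - id: construct-json
    env:
      LANGUAGES: ${{ inputs.languages }}  # hypothetical variable name
    run: |
      # Split on single spaces, then drop the empty strings left by extra spaces.
      json=$(jq -nc --arg str "$LANGUAGES" '$str | split(" ") | map(select(. != ""))')
      echo "languages=$json" >> "$GITHUB_OUTPUT"
```

With this filter, an input such as "fr  de " (two spaces in the middle, one at the end) should yield the two-item array ["fr","de"] instead of one padded with empty strings.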

Labels: enhancement (New feature or request)
