
Allow using Pydantic I/O in with-based code #1192

Open
alicederyn opened this issue Sep 5, 2024 · 2 comments · May be fixed by #1189
Labels: type:enhancement (A general enhancement)

Comments

@alicederyn (Collaborator) commented on Sep 5, 2024

Is your feature request related to a problem? Please describe.
Migrating to the new decorator syntax (or, presumably, off it) is made more challenging by having to completely rewrite pipelines from the old arguments= syntax to the new Pydantic syntax in one go.

Describe the solution you'd like
Allow a mix of the new Pydantic syntax and the old arguments= syntax within "legacy" with-based pipelines. The returned object should work the same way as in decorator syntax, i.e. it should mimic a Pydantic object, its fields should return strings like "{{steps.writer.outputs.artifacts.int-artifact}}", and it should create automatic dependencies between DAG nodes. This would let users verify that they are creating the same YAML at more points along the journey, catching mistakes earlier.

This way of invoking Pydantic-based @script-decorated functions is currently broken (the argument is silently ignored), so existing usage should not be impacted.

Describe alternatives you've considered
We could allow a mixture inside the new syntax instead (or as well). I haven't investigated how much work doing it this way around would be.

Additional context
With this feature, the script_runner_io.py example workflow could be rewritten as:

with Workflow(generate_name="pydantic-io-") as w:
    with Steps(name="use-pydantic-io"):
        write_step = writer()
        pydantic_io(
            MyInput(
                artifact_int="{{steps.writer.outputs.artifacts.int-artifact}}",
                param_int=101,
                an_object=MyObject(a_dict={"my-new-key": "my-new-value"}),
            )
        )

The artifact_int parameter isn't ideal, though there may be a cleaner way to write it. However, I've been rewriting workflows top-down, which makes this simpler: if write_step had been migrated first, you could have used write_step.int_artifact instead of the raw template string.
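
For reference, the writer and pydantic_io functions used above come from that example. Below is a rough sketch of the Pydantic IO definitions they rely on; the imports, field names, and defaults are reconstructed from memory of the example and may differ between Hera versions:

from typing import Annotated

from pydantic import BaseModel

from hera.shared import global_config
from hera.workflows import Artifact, ArtifactLoader, NoneArchiveStrategy, Parameter, script
from hera.workflows.io.v2 import Input, Output  # Pydantic-v2 IO base classes

# Pydantic IO for scripts is behind an experimental feature flag.
global_config.experimental_features["script_pydantic_io"] = True


class MyObject(BaseModel):
    a_dict: dict


class MyInput(Input):
    # Parameter-annotated fields become workflow parameters;
    # Artifact-annotated fields are loaded by the Hera runner.
    param_int: Annotated[int, Parameter(name="param-int")] = 42
    an_object: Annotated[MyObject, Parameter(name="obj")] = MyObject(a_dict={"my-key": "a-value"})
    artifact_int: Annotated[int, Artifact(name="artifact-input", loader=ArtifactLoader.json)]


@script(constructor="runner")
def writer() -> Annotated[int, Artifact(name="int-artifact", archive=NoneArchiveStrategy())]:
    return 100


@script(constructor="runner")
def pydantic_io(my_input: MyInput) -> Output:
    # result is written to stdout and becomes the step's result output.
    return Output(result=my_input.param_int + my_input.artifact_int)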

alicederyn added the type:enhancement label on Sep 5, 2024
alicederyn linked pull request #1189 on Sep 5, 2024 that will close this issue
@elliotgunton (Collaborator) commented on Sep 10, 2024

Hey @alicederyn, I've had some more time to think about this - I don't think we can add this enhancement without adding more configuration and generally confusing users; there is no path for it to be a simple backwards-compatible enhancement. For example, if x = my_script_function() gives a Task today but None after a release (even if behind a config flag), that would not be good.

The new syntax is still new, unstable, and is fundamentally different from the old syntax, so I would prefer to keep them separate. Even allowing mixing the two syntaxes with the newer decorator feature flag as done in the PR will likely result in user code being stuck between the two syntaxes and then us being unable to graduate the decorator feature flag in future without breaking user code (and having angry users 🙂).

I would prefer to wait for user requests to gauge interest and figure out a better migration path. An alternative I'm thinking about is being able to pass Input objects to the arguments parameter, i.e.

with Workflow(generate_name="pydantic-io-") as w:
    with Steps(name="use-pydantic-io"):
        write_step = writer()
        pydantic_io(
            arguments=MyInput(
                artifact_int=write_step.get_artifact("int-artifact").with_name("artifact-input"),
                param_int=101,
                an_object=MyObject(a_dict={"my-new-key": "my-new-value"}),
            )
        )

This would be a backwards-compatible stepping stone (and an enhancement to the Pydantic IO feature in general) towards the new syntax. However, as noted in our Slack convo, the type validation in the code above would most likely be broken or need bypassing.
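
To make the validation concern concrete: assuming MyInput.artifact_int is annotated as int (as in the script_runner_io.py example), the value passed at DAG-construction time would be an Artifact reference rather than an int, so Pydantic would reject it unless the annotation is loosened or validation is bypassed. A hypothetical illustration:

# Hypothetical: assumes MyInput.artifact_int is annotated as int.
# At DAG-construction time this receives an Artifact reference, not an
# int, so Pydantic raises a ValidationError unless validation is relaxed.
MyInput(
    artifact_int=write_step.get_artifact("int-artifact").with_name("artifact-input"),
    param_int=101,
    an_object=MyObject(a_dict={"my-new-key": "my-new-value"}),
)

For comparison, the end state under the new decorator syntax would look like: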

w = Workflow(generate_name="pydantic-io-")
@w.steps()
def use_pydantic_io():
    write_step = writer()
    pydantic_io(
        MyInput(
            artifact_int=write_step.int_artifact,
            param_int=101,
            an_object=MyObject(a_dict={"my-new-key": "my-new-value"}),
        )
    )
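
Note that in the decorator syntax, write_step.int_artifact resolves to the same "{{steps.writer.outputs.artifacts.int-artifact}}" reference while also recording the dependency between the two steps - the behaviour the issue asks to make available in with-based code.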

@elliotgunton (Collaborator) commented:

Copying PR comment here

Hey @alicederyn, we discussed this PR and motivation behind it in the maintainer's meeting (notes here), @samj1912 agreed that we don't want to mix the two syntaxes as that will confuse users, so I've moved the PR into the "draft" state.

It seems the main goal is to avoid a big-bang migration, or at least to have a checkpoint approach as you explained. An alternative would be to work on #711, as the new decorators mean we can create a full Argo Workflow spec as Python stubs in Hera. This should mean we can construct the DAG too, so the only work left would be to convert old script templates to use Pydantic IO classes, which should match the YAML (because it would be generated from it - super awesome!!), and then you'd just copy/paste the function body into the stubs.

Let's discuss when you're back, no rush to reply! 😄
