Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Generalized program tree #1076

Closed
wants to merge 1 commit into from
Closed

Conversation

Lupus
Copy link

@Lupus Lupus commented Dec 13, 2020

This PR is a part of effort to address #1075. Scope of this PR is limited to only introduce the new IR type. Most of the work presented in this PR is due to @jordwalke (see rehp, I only brushed it up it and prepared for upstreaming.

@hhugo
Copy link
Member

hhugo commented Dec 13, 2020

Thanks. Would you mind restricting this PR To just introducing an additional IR. I don't want to review too many moving pieces at once.

@Lupus
Copy link
Author

Lupus commented Dec 13, 2020 via email

@Lupus
Copy link
Author

Lupus commented Dec 14, 2020

I've trimmed this PR down to only introducing an additional IR.

@Lupus Lupus force-pushed the generalized-program-tree branch 2 times, most recently from 20d1fec to 0d49ce5 Compare December 14, 2020 05:31
| RawSubstitution of expression

and expression =
| ERaw of raw_segment list
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we leave ERaw to a later PR

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've simplified ERaw as much as possible, but I'm not sure how to avoid it when plugging the new IR. caml_js_expr is currently parsing the JS code into the tree and embeds that directly. Converting JS tree to IR does not sound like a great idea.

compiler/lib/id.ml Outdated Show resolved Hide resolved
compiler/lib/id.ml Outdated Show resolved Hide resolved
compiler/lib/ir.ml Show resolved Hide resolved
@hhugo
Copy link
Member

hhugo commented Dec 14, 2020

I meant, restrict the PR to a new IR and plug it in - but exclude things that change the behavior

@Lupus Lupus force-pushed the generalized-program-tree branch 2 times, most recently from 2ac01a3 to 4fa830a Compare December 14, 2020 09:37
@Lupus
Copy link
Author

Lupus commented Dec 14, 2020

I meant, restrict the PR to a new IR and plug it in - but exclude things that change the behavior

Gotcha, I've restored the PR to original state and tweaked it a bit to produce exactly the same output. Tweaks included using EqEq in arity test and still generating 1 - x in generate.ml.

| Continue_statement of Javascript.Label.t option
| Break_statement of Javascript.Label.t option
| Return_statement of expression option
| Labelled_statement of Javascript.Label.t * (statement * Loc.t)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the IR contain primitives like this that aren't present in OCaml?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These control flow structures are well recognized and exist in some form in all high-level languages. What would you suggest for the IR? I didn't want to make non-cosmetic changes to generate.ml, at least in this PR. These primities don't exist in OCaml, but control flow deconstruction produces them right now and they are not too much specific for JavaScript (in Golang backend, these got transformed to Go AST seamlessly).

@Lupus
Copy link
Author

Lupus commented Jan 18, 2021

Hey @hhugo, have you had a chance to look more at this?

@mnxn
Copy link

mnxn commented Jan 18, 2021

I'm writing this as an outsider but is this new IR type sufficiently general?

I wouldn't expect constructs like Typeof, ERegexp, and ERaw to be present in a general IR. Similarly, I don't think JS-specific primitives/externs should be handled in generate.ml.

Is it feasible to handle the JS features in javascript_from_ir.ml instead?

@Lupus
Copy link
Author

Lupus commented Jan 19, 2021

ERaw is essential for interop with target language. Special primitive takes source code in target language as a string, possibly interpolates it with arguments and injects into IR, which is then processed by language backend, and target language code gets emitted during generation.

As of Typeof and friends, generate.ml has a number of primitives declared like ones below:

  register_bin_prim "caml_js_instanceof" `Pure (fun cx cy _ ->
      bool (J.EBin (J.InstanceOf, cx, cy)));
  register_un_prim "caml_js_typeof" `Pure (fun cx _ -> J.EUn (J.Typeof, cx))

I'm not completely sure if these could be somehow moved away from generate.ml, may be we could use ERaw and make Javascript backend register these primitives as macros instead.

I'm all for making the IR as general as possible, just wanted to take baby steps in the right direction :)

@Lupus
Copy link
Author

Lupus commented Jan 19, 2021

Or what you suggest @mnxn is to emit some EPrim in generate.ml and offload handling of primitives other than a set of well-known ones to target language IR processor? And consequently move all JS primitive related code from generate.ml into javascript_from_ir.ml?

@mnxn
Copy link

mnxn commented Jan 19, 2021

Or what you suggest @mnxn is to emit some EPrim in generate.ml and offload handling of primitives other than a set of well-known ones to target language IR processor? And consequently move all JS primitive related code from generate.ml into javascript_from_ir.ml?

Exactly.
I think that the primitives that start with caml_js should be handled by the javascript_from_ir module. Then the IR can be simplified and remove many JS variants.

@Lupus
Copy link
Author

Lupus commented Jan 19, 2021

Exactly.
I think that the primitives that start with caml_js should be handled by the javascript_from_ir module. Then the IR can be simplified and remove many JS variants.

Sounds good to me! Question to maintainers if this belongs to this PR, or better organized as subsequent one.

@Lupus
Copy link
Author

Lupus commented Feb 13, 2021

I'm looking into moving JS specific primitives outside of generate.ml as this makes sense in terms of generalization of IR.

Some are quite obvious like Prim (Extern "debugger", _). Some are not, like %closure.

Ideally primitives should be handled at two levels - first one would be generate.ml handling some built-in/special primitives so as not to repeat this code in each backend generator. And second one would be in specific backend, handling primitives that make sense only for one target language.

Probably passing a primitive from one level to another can be plugged in somewhere here:

let prim = Share.get_prim (runtime_fun ctx) name ctx.Ctx.share in
. Instead of generating generic ECall for primitives that we do not know - we can emit some EPrim IR node which can then be processed later by the target language backend.

I have not yet fully understood what that Share thing is doing, but it does track some primitives, in tandem with %closure that gets introduced in inliner (inline.ml). Splitting primitives in two levels would make target-language-specific primitives invisible for logic in generate.ml, they won't be tracked in Share. I'm not sure if that could hurt kittens. @hhugo could you please advise on this part? And do you think it makes sense to follow this path in general?

@hhugo
Copy link
Member

hhugo commented Dec 2, 2023

Is there still interest in such change ? Should it be rebased or closed ?

@Lupus
Copy link
Author

Lupus commented Dec 2, 2023

Not from my end unfortunately. Let's close this probably.

@Lupus Lupus closed this Dec 2, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants