Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Structured argument parsing #1408

Closed
wants to merge 18 commits into from
Closed

Structured argument parsing #1408

wants to merge 18 commits into from

Conversation

iwahbe
Copy link
Member

@iwahbe iwahbe commented Sep 28, 2023

This is part of the fix for pulumi/pulumi-libvirt#335.

2620ecf adds the new tests on the old parser. This lets 00a1cde show the change in the parse.

pulumi/pulumi-libvirt#335 needs mode.* fields to be parsed as associated with mode. Currently, each mode option is parsed as a sub-field. A future PR will roll unexpected arguments into their parents (actually fixing pulumi/pulumi-libvirt#335), but that requires the correct that this PR gives.


The effects of this PR can be observed from this test run: https://github.com/pulumi/pulumi-terraform-bridge/actions/runs/6910135157.

Note: this run happened before 3583b12

@iwahbe iwahbe self-assigned this Sep 28, 2023
@github-actions
Copy link

Diff for pulumi-random with merge commit 6a6e393

@github-actions
Copy link

Diff for pulumi-azuread with merge commit 6a6e393

@codecov-commenter
Copy link

codecov-commenter commented Sep 28, 2023

Codecov Report

Attention: 15 lines in your changes are missing coverage. Please review.

Comparison is base (fdc251f) 57.96% compared to head (3583b12) 58.28%.

Files Patch % Lines
pkg/tfgen/docs.go 94.94% 12 Missing and 3 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    pulumi/pulumi-terraform-bridge#1408      +/-   ##
==========================================
+ Coverage   57.96%   58.28%   +0.31%     
==========================================
  Files         286      286              
  Lines       39695    39938     +243     
==========================================
+ Hits        23009    23276     +267     
+ Misses      15344    15326      -18     
+ Partials     1342     1336       -6     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@github-actions
Copy link

Diff for pulumi-random with merge commit dbb032f

@github-actions
Copy link

Diff for pulumi-azuread with merge commit dbb032f

@iwahbe iwahbe force-pushed the iwahbe/1358 branch 2 times, most recently from 34c52fd to 6028865 Compare September 28, 2023 23:32
@github-actions
Copy link

Diff for pulumi-random with merge commit 083e3f1

@github-actions
Copy link

Diff for pulumi-azuread with merge commit 083e3f1

@github-actions
Copy link

Diff for pulumi-random with merge commit 45bdbb9

@github-actions
Copy link

Diff for pulumi-azuread with merge commit 45bdbb9

@github-actions
Copy link

Diff for pulumi-random with merge commit 4845dd6

@github-actions
Copy link

Diff for pulumi-azuread with merge commit 4845dd6

@github-actions
Copy link

Diff for pulumi-random with merge commit da7052b

@github-actions
Copy link

Diff for pulumi-azuread with merge commit da7052b

iwahbe added a commit to pulumi/pulumi-gcp that referenced this pull request Sep 29, 2023
@iwahbe iwahbe requested review from mjeffryes and a team September 29, 2023 00:41
"description": "Either `address`, `domain`, or both must be set"
},
"dns.hosts": {
"description": "a DNS host entry block. You can have one or more of these\nblocks in your DNS definition. You must specify both `ip` and `hostname`.\n\nAn advanced example of round-robin DNS (using DNS host templates) follows:\n\n```hcl\nresource \"libvirt_network\" \"my_network\" {\n...\ndns {\nhosts { flatten(data.libvirt_network_dns_host_template.hosts.*.rendered) }\n}\n...\n}\n\ndata \"libvirt_network_dns_host_template\" \"hosts\" {\ncount = var.host_count\nip = var.host_ips[count.index]\nhostname = \"my_host\"\n}\n```\n\nAn advanced example of setting up multiple SRV records using DNS SRV templates is:\n\n```hcl\ndata \"libvirt_network_dns_srv_template\" \"etcd_cluster\" {\ncount = var.etcd_count\nservice = \"etcd-server\"\nprotocol = \"tcp\"\ndomain = var.discovery_domain\ntarget = \"${var.cluster_name}-etcd-${count.index}.${discovery_domain}\"\n}\n\nresource \"libvirt_network\" \"k8snet\" {\n...\ndns {\nsrvs = [ flatten(data.libvirt_network_dns_srv_template.etcd_cluster.*.rendered) ]\n}\n...\n}\n```"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are hcl blocks so this is before example rendering? Trying to wrap my head around sequencing.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. We have a parse phase and then a render phase, with intermediate data represented in entityDocs. This change only effects the parse: raw markdown ([]byte) to entityDocs.


"github.com/pulumi/pulumi-terraform-bridge/v3/pkg/tfbridge"
"github.com/pulumi/pulumi-terraform-bridge/v3/pkg/tfgen/internal/testprovider"
)

func TestEntityDocsParsing(t *testing.T) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Much needed!

}
}
case ast.KindText:
if genericNestedRegexp.Match(node.Text(src)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does node.Text() do? It's been a while since I touched goldmark, I more recently wrestled with blackfriday. It can be really difficult to work with.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nodes are flyweight constructs. They don't have access to the text that they represent. node.Text(src []byte) []byte returns the text from src that node represents.


for _, v := range ret.Arguments {
v.description = strings.TrimRightFunc(v.description, unicode.IsSpace)
func renderMdNode(src []byte, n ast.Node) []byte {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've run into this before; so I can guess. why are we rendering back? Trying to run regexes on a subtree?

I think I had to do something like this before an it was very disappointing that markdown library didn't provide a full featured AST->markdown renderer.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Basically, the MD parser doesn't provide clean spans back into the original text. Having walked the parsed nodes to find what we want to extract, we need to render back to text the selected bit.

If it was possible, I would like this function to read return src[n.Span().Start:n.Span().End], but that isn't supported.

@t0yv0
Copy link
Member

t0yv0 commented Sep 29, 2023

GCP changes don't look good. Some undesired changes are being added to GCP.

» $schema diff resources gcp:accesscontextmanager/egressPolicy:EgressPolicy properties egressPolicyName description
+ 
+ 
+ - - -

@t0yv0
Copy link
Member

t0yv0 commented Sep 29, 2023

This kind of code is going to be very scary/ difficult to roll out based on the strength of unit tests alone. These are heuristic codes that clean up textual data. We don't have good automated data quality checks.

We need to slow down to figure out how we're rolling this out, which probably involves eyeballing the change carefully for every Tier 1 major provider. We can talk about automated DQ checks but we don't have those yet so this is probably going to need eyeballs anyway for the rollout.

I'm tempted to do some weekend improvements to my pus tool that helps categorize JSON diffs, or we could try Daniel's explode tools.

It's reasonable to budget 1d for every major provider to make sure it rolls out safe. We should probably also flag this so rollout is per provider.

The change is good and I think we should do it time permitting but I'd like to not break a lot of docs when fixing a few of them.

That sounds reasonable? Or we're feeling lucky here. CC @mjeffryes @mikhailshilkov

@iwahbe
Copy link
Member Author

iwahbe commented Sep 29, 2023

GCP changes don't look good. Some undesired changes are being added to GCP.

» $schema diff resources gcp:accesscontextmanager/egressPolicy:EgressPolicy properties egressPolicyName description
+ 
+ 
+ - - -

I don't know what those are, I think they are thematic breaks. schema has the diff backwards. This change is removing the - - -, not adding them.

@t0yv0
Copy link
Member

t0yv0 commented Sep 29, 2023

Oh we have that in the docs currently :) that's terrible. Getting back to this shortly, I'd like to eyeball the diff some more.

@github-actions
Copy link

Diff for pulumi-random with merge commit e2726a7

@github-actions
Copy link

Diff for pulumi-azuread with merge commit e2726a7

@github-actions
Copy link

Diff for pulumi-azuread with merge commit a3a0662

@github-actions
Copy link

Diff for pulumi-random with merge commit 89d84d4

@github-actions
Copy link

Diff for pulumi-azuread with merge commit 89d84d4

@github-actions
Copy link

Diff for pulumi-random with merge commit 385bf17

@github-actions
Copy link

Diff for pulumi-azuread with merge commit 385bf17

@github-actions
Copy link

github-actions bot commented Oct 2, 2023

Diff for pulumi-random with merge commit c34fbfc

@github-actions
Copy link

github-actions bot commented Oct 2, 2023

Diff for pulumi-azuread with merge commit c34fbfc

t0yv0 added a commit to pulumi/upgrade-provider that referenced this pull request Oct 14, 2023
Part of enabling better bridge downstream checks.

Example PR generated with this version:

pulumi/pulumi-aiven#395

```
(cd ~/code/pulumi-aiven && ~/code/upgrade-provider/upgrade-provider \
    pulumi/pulumi-aiven \
    --kind=bridge \
    --target-bridge-version=$(cd ~/code/pulumi-terraform-bridge && git rev-parse HEAD) \
    --pr-reviewers "t0yv0" \
    --pr-description "This is some more PR description linking back to pulumi/pulumi-terraform-bridge#1408 that is also expressible as pulumi/pulumi-terraform-bridge#1408")

```
We had a argument:

```md
* `reference_file_schema_uri` - (Optional) When creating an external table, the user can
  provide a reference file with the table schema. This is enabled for the following formats:
  AVRO, PARQUET, ORC.
```

Since this matched a nested object regexp: "`([a-z_]+)`.*following:?", we erroneously
tagged it as a new nesting, causing us to mislabel subsequent items. The new heuristic
doesn't allow list items to anchor nestings unless the list item matches the whole
regex. This is enforced by one of our tests.
Before we only removed from a set of known contents. This worked well for most providers,
but doesn't work for all providers. To avoid a regression, we revert to old behavior.
@iwahbe
Copy link
Member Author

iwahbe commented Apr 15, 2024

This PR is out of date and has been largely superseded by @guineveresaenger's doc work.

@iwahbe iwahbe closed this Apr 15, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants