-
Notifications
You must be signed in to change notification settings - Fork 44
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Structured argument parsing #1408
Conversation
Diff for pulumi-random with merge commit 6a6e393 |
Diff for pulumi-azuread with merge commit 6a6e393 |
Codecov ReportAttention:
Additional details and impacted files@@ Coverage Diff @@
## master pulumi/pulumi-terraform-bridge#1408 +/- ##
==========================================
+ Coverage 57.96% 58.28% +0.31%
==========================================
Files 286 286
Lines 39695 39938 +243
==========================================
+ Hits 23009 23276 +267
+ Misses 15344 15326 -18
+ Partials 1342 1336 -6 ☔ View full report in Codecov by Sentry. |
Diff for pulumi-random with merge commit dbb032f |
Diff for pulumi-azuread with merge commit dbb032f |
34c52fd
to
6028865
Compare
Diff for pulumi-random with merge commit 083e3f1 |
Diff for pulumi-azuread with merge commit 083e3f1 |
Diff for pulumi-random with merge commit 45bdbb9 |
Diff for pulumi-azuread with merge commit 45bdbb9 |
Diff for pulumi-random with merge commit 4845dd6 |
Diff for pulumi-azuread with merge commit 4845dd6 |
Diff for pulumi-random with merge commit da7052b |
Diff for pulumi-azuread with merge commit da7052b |
"description": "Either `address`, `domain`, or both must be set" | ||
}, | ||
"dns.hosts": { | ||
"description": "a DNS host entry block. You can have one or more of these\nblocks in your DNS definition. You must specify both `ip` and `hostname`.\n\nAn advanced example of round-robin DNS (using DNS host templates) follows:\n\n```hcl\nresource \"libvirt_network\" \"my_network\" {\n...\ndns {\nhosts { flatten(data.libvirt_network_dns_host_template.hosts.*.rendered) }\n}\n...\n}\n\ndata \"libvirt_network_dns_host_template\" \"hosts\" {\ncount = var.host_count\nip = var.host_ips[count.index]\nhostname = \"my_host\"\n}\n```\n\nAn advanced example of setting up multiple SRV records using DNS SRV templates is:\n\n```hcl\ndata \"libvirt_network_dns_srv_template\" \"etcd_cluster\" {\ncount = var.etcd_count\nservice = \"etcd-server\"\nprotocol = \"tcp\"\ndomain = var.discovery_domain\ntarget = \"${var.cluster_name}-etcd-${count.index}.${discovery_domain}\"\n}\n\nresource \"libvirt_network\" \"k8snet\" {\n...\ndns {\nsrvs = [ flatten(data.libvirt_network_dns_srv_template.etcd_cluster.*.rendered) ]\n}\n...\n}\n```" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are hcl
blocks so this is before example rendering? Trying to wrap my head around sequencing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. We have a parse phase and then a render phase, with intermediate data represented in entityDocs
. This change only effects the parse: raw markdown ([]byte
) to entityDocs
.
|
||
"github.com/pulumi/pulumi-terraform-bridge/v3/pkg/tfbridge" | ||
"github.com/pulumi/pulumi-terraform-bridge/v3/pkg/tfgen/internal/testprovider" | ||
) | ||
|
||
func TestEntityDocsParsing(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Much needed!
pkg/tfgen/docs.go
Outdated
} | ||
} | ||
case ast.KindText: | ||
if genericNestedRegexp.Match(node.Text(src)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does node.Text() do? It's been a while since I touched goldmark, I more recently wrestled with blackfriday. It can be really difficult to work with.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
node
s are flyweight constructs. They don't have access to the text that they represent. node.Text(src []byte) []byte
returns the text from src
that node
represents.
|
||
for _, v := range ret.Arguments { | ||
v.description = strings.TrimRightFunc(v.description, unicode.IsSpace) | ||
func renderMdNode(src []byte, n ast.Node) []byte { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've run into this before; so I can guess. why are we rendering back? Trying to run regexes on a subtree?
I think I had to do something like this before an it was very disappointing that markdown library didn't provide a full featured AST->markdown renderer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Basically, the MD parser doesn't provide clean spans back into the original text. Having walked the parsed nodes to find what we want to extract, we need to render back to text the selected bit.
If it was possible, I would like this function to read return src[n.Span().Start:n.Span().End]
, but that isn't supported.
GCP changes don't look good. Some undesired changes are being added to GCP.
|
This kind of code is going to be very scary/ difficult to roll out based on the strength of unit tests alone. These are heuristic codes that clean up textual data. We don't have good automated data quality checks. We need to slow down to figure out how we're rolling this out, which probably involves eyeballing the change carefully for every Tier 1 major provider. We can talk about automated DQ checks but we don't have those yet so this is probably going to need eyeballs anyway for the rollout. I'm tempted to do some weekend improvements to my It's reasonable to budget 1d for every major provider to make sure it rolls out safe. We should probably also flag this so rollout is per provider. The change is good and I think we should do it time permitting but I'd like to not break a lot of docs when fixing a few of them. That sounds reasonable? Or we're feeling lucky here. CC @mjeffryes @mikhailshilkov |
I don't know what those are, I think they are thematic breaks. |
Oh we have that in the docs currently :) that's terrible. Getting back to this shortly, I'd like to eyeball the diff some more. |
Diff for pulumi-random with merge commit e2726a7 |
Diff for pulumi-azuread with merge commit e2726a7 |
Diff for pulumi-azuread with merge commit a3a0662 |
Diff for pulumi-random with merge commit 89d84d4 |
Diff for pulumi-azuread with merge commit 89d84d4 |
Diff for pulumi-random with merge commit 385bf17 |
Diff for pulumi-azuread with merge commit 385bf17 |
Diff for pulumi-random with merge commit c34fbfc |
Diff for pulumi-azuread with merge commit c34fbfc |
Part of enabling better bridge downstream checks. Example PR generated with this version: pulumi/pulumi-aiven#395 ``` (cd ~/code/pulumi-aiven && ~/code/upgrade-provider/upgrade-provider \ pulumi/pulumi-aiven \ --kind=bridge \ --target-bridge-version=$(cd ~/code/pulumi-terraform-bridge && git rev-parse HEAD) \ --pr-reviewers "t0yv0" \ --pr-description "This is some more PR description linking back to pulumi/pulumi-terraform-bridge#1408 that is also expressible as pulumi/pulumi-terraform-bridge#1408") ```
We had a argument: ```md * `reference_file_schema_uri` - (Optional) When creating an external table, the user can provide a reference file with the table schema. This is enabled for the following formats: AVRO, PARQUET, ORC. ``` Since this matched a nested object regexp: "`([a-z_]+)`.*following:?", we erroneously tagged it as a new nesting, causing us to mislabel subsequent items. The new heuristic doesn't allow list items to anchor nestings unless the list item matches the whole regex. This is enforced by one of our tests.
Before we only removed from a set of known contents. This worked well for most providers, but doesn't work for all providers. To avoid a regression, we revert to old behavior.
This PR is out of date and has been largely superseded by @guineveresaenger's doc work. |
This is part of the fix for pulumi/pulumi-libvirt#335.
2620ecf adds the new tests on the old parser. This lets 00a1cde show the change in the parse.
pulumi/pulumi-libvirt#335 needs
mode.*
fields to be parsed as associated withmode
. Currently, eachmode
option is parsed as a sub-field. A future PR will roll unexpected arguments into their parents (actually fixing pulumi/pulumi-libvirt#335), but that requires the correct that this PR gives.The effects of this PR can be observed from this test run: https://github.com/pulumi/pulumi-terraform-bridge/actions/runs/6910135157.
Note: this run happened before 3583b12