Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: use gobwas/ws instead of gorilla/websocket #13932

Closed
wants to merge 5 commits into from

Conversation

Gekko0114
Copy link
Contributor

@Gekko0114 Gekko0114 commented Apr 29, 2023

Fixes #13824
Related to knative/pkg#2730

Proposed Changes

https://github.com/gorilla/websocket is deprecated and unmaintained.
We should switch to something that is and ensure everything works as expected (ie. draining in queue proxy etc).

  • replaced gorilla/websocket with gobwas/ws
  • 🧹 Update or clean up current behavior
    /kind deprecation

Release Note

use `gobwas/ws` instead of `gorilla/websocket`

@knative-prow knative-prow bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Apr 29, 2023
@knative-prow
Copy link

knative-prow bot commented Apr 29, 2023

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: Gekko0114
Once this PR has been reviewed and has the lgtm label, please assign pradnyavmw for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@knative-prow-robot knative-prow-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 29, 2023
@knative-prow
Copy link

knative-prow bot commented Apr 29, 2023

Welcome @Gekko0114! It looks like this is your first PR to knative/serving 🎉

@knative-prow
Copy link

knative-prow bot commented Apr 29, 2023

Hi @Gekko0114. Thanks for your PR.

I'm waiting for a knative member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@knative-prow knative-prow bot added needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. area/autoscale area/networking area/test-and-release It flags unit/e2e/conformance/perf test issues for product features labels Apr 29, 2023
@Gekko0114 Gekko0114 marked this pull request as ready for review April 29, 2023 13:48
@knative-prow knative-prow bot requested review from mgencur and ReToCode April 29, 2023 13:48
@knative-prow-robot knative-prow-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 29, 2023
@Gekko0114
Copy link
Contributor Author

Hi @dprotaso,

I've created a WIP in the downstream serving repo to test the changes I've made.
Could you have a look?
Thanks,

@nak3 nak3 added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Apr 30, 2023
@Gekko0114
Copy link
Contributor Author

/retest

@Gekko0114
Copy link
Contributor Author

/retest

@dprotaso
Copy link
Member

dprotaso commented May 2, 2023

Looks like the websocket tests are failing

@codecov
Copy link

codecov bot commented May 2, 2023

Codecov Report

Patch coverage: 61.53% and project coverage change: -0.02 ⚠️

Comparison is base (ea2a6c8) 86.25% compared to head (48c6162) 86.24%.

❗ Current head 48c6162 differs from pull request most recent head 0ec26f6. Consider uploading reports for the commit 0ec26f6 to get more accurate results

Additional details and impacted files
@@            Coverage Diff             @@
##             main   #13932      +/-   ##
==========================================
- Coverage   86.25%   86.24%   -0.02%     
==========================================
  Files         199      199              
  Lines       14749    14753       +4     
==========================================
+ Hits        12722    12723       +1     
- Misses       1726     1728       +2     
- Partials      301      302       +1     
Impacted Files Coverage Δ
pkg/activator/stat_reporter.go 41.66% <0.00%> (ø)
pkg/autoscaler/statserver/server.go 76.59% <62.50%> (-1.51%) ⬇️
pkg/autoscaler/statforwarder/processor.go 94.00% <100.00%> (ø)

☔ View full report in Codecov by Sentry.
📢 Do you have feedback about the report comment? Let us know in this issue.

@Gekko0114
Copy link
Contributor Author

Yes, the websocket tests in e2e testing are failing. The Dialer in gobwas/ws does not have a Proxy field, which I think is causing the problem, but I am not sure. If you have any suggestions, please let me know.

@knative-prow-robot knative-prow-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label May 2, 2023
@Gekko0114 Gekko0114 force-pushed the websocket branch 6 times, most recently from e1d1d7a to e311403 Compare May 7, 2023 04:03
@skonto
Copy link
Contributor

skonto commented May 8, 2023

@Gekko0114 I run the PR locally. So it seems you dont need the proxy stuff (websocket_test.go). However you need to have the host passed.
Otherwise you get :

    websocket.go:70: Connecting using websocket: url=ws://10.98.128.178:80/?delay=2, host=web-socket-with-timeout-does-not-exceed-t-ryusmvxm.serving-tests.example.com
Connecting using websocket: url=ws://10.98.128.178:80/?delay=2, host=web-socket-with-timeout-does-not-exceed-t-ryusmvxm.serving-tests.example.com

With c, resp, err := dialer.Dial(u.String(), http.Header{"Host": {domain}}) //, http.Header{"Host": {domain}})
tests pass:

$ go test -v -tags=e2e -count=1 ./test/e2e -run ^TestWebSocketWithTimeout$
...
t-with-timeout-does-not-exceed-t-lepqcrbf] Reconcile succeeded
    stream.go:305: I 11:12:57.247 controller-5bb8f4cf98-nt2bm [knative.dev.serving.pkg.reconciler.service.Reconciler] [serving-tests/web-socket-with-timeout-does-not-exceed-t-lepqcrbf] Reconcile succeeded
--- PASS: TestWebSocketWithTimeout (8.30s)
    --- PASS: TestWebSocketWithTimeout/does_not_exceed_timeout_seconds (8.30s)
PASS
ok  	knative.dev/serving/test/e2e	8.327s

@skonto
Copy link
Contributor

skonto commented May 8, 2023

So here is why the new framework does not work from the client side. The old framework uses direct http calls
that are being upgraded and it sets correctly the host header. So at the envoy side you should see:

[2023-05-08T14:20:07.071Z] "GET /?delay=2 HTTP/1.1" 101 DC 22 18 2005 - "-" "-" "58d25238-8bb8-417a-89eb-4b3c4d32adf2" "web-socket-with-timeout-ws-does-not-excee-xnsxlsss.serving-tests.example.com" "10.244.0.6:8012"

The new framework though since it uses tcp directly and upgrades later by editing the tcp con stuff directly
it only supports stuff without proxying anything. It means that if I set:

		header := http.Header{}
		header.Set("Host", domain)
...
		dialer := &ws.Dialer{
			Timeout: 45 * time.Second,
			OnStatusError: func(status int, reason []byte, resp io.Reader) {
				var b bytes.Buffer
				io.Copy(&b, resp)
				t.Logf("HTTP connection failed: %v Response=%+v. ResponseBody=%q", string(reason), status, b.String())
			},
			Header: ws.HandshakeHeaderHTTP(header),
		}

I get:

[2023-05-08T12:12:49.346Z] "GET /?delay=2 HTTP/1.1" 404 NR 0 0 0 - "-" "-" "ebcd34aa-8fa4-42a2-b9f0-106c484ce2f6" "10.98.128.178:80,web-socket-with-timeout-ws-does-not-excee-hbeckqih.serving-tests.example.com" "-"

Hosts are merged. I managed to make it work by commenting this line out:

httpWriteHeader(bw, headerHost, host)
and setting the header manually as above:

    stream.go:305: I 14:20:09.215 controller-5bb8f4cf98-nt2bm [knative.dev.serving.pkg.reconciler.route.Reconciler] [serving-tests/web-socket-with-timeout-ws-does-not-excee-xnsxlsss] Reconcile succeeded
--- PASS: TestWebSocketWithTimeoutWS (7.25s)
    --- PASS: TestWebSocketWithTimeoutWS/does_not_exceed_timeout_seconds (7.24s)
PASS
ok  	knative.dev/serving/test/e2e	7.269s

This is not to be used directly of course for our tests it is a proof of concept.

@Gekko0114
Copy link
Contributor Author

@skonto ,

I see. Thank you so much for your investigation!

This is not to be used directly of course for our tests it is a proof of concept.

So, we should fix this issue on gobwas/ws first, right?

@skonto
Copy link
Contributor

skonto commented May 8, 2023

Yes could you try start the discussion or do a PR there to get things going? What I would propose is that if user sets the host header the library should pass it during connection upgrades and still use the url for direct connections to some target (in this case it is the ingress ip that is being target). I am also a bit concerned about this inflexibility of the new lib.

cc @nak3 @dprotaso may have more to add here.

@Gekko0114
Copy link
Contributor Author

@skonto,
Sure. I will create an issue. Maybe I will create PR as well.
Also I have a bit concern that this lib doesn't seem to be updated so often.

@AlexVulaj
Copy link

Hey all - sorry to jump in here as I've been informed of this work by @skonto . I'm not sure if this changes your plans for this work at all, but a group I've started at Red Hat is currently in active discussions with the original maintainers of the Gorilla Web Toolkit to adopt the project and reboot it. While there are no hard timelines at this time, feel free to contact me if you have any questions or would like to know more.

@Gekko0114
Copy link
Contributor Author

Gekko0114 commented May 10, 2023

Hi @AlexVulaj
Thank you for the update.
Does this mean that we no longer need to replace gorilla/websocket with gobwas/ws?
Personally, I am fine with it since I found gorilla/websocket is a great library (also bobwas/ws is great).
If so, I will stop working on this project. Please keep me informed if there are any further updates regarding this matter. Thanks!

@Gekko0114
Copy link
Contributor Author

FYI
I created an issue about gobwas/wss bug.
gobwas/ws#170

@AlexVulaj
Copy link

@Gekko0114 Yes, it should be the case that you won't need to replace it; however, I will caution that I have no promises on an official timeline of exactly when that reboot will occur. What I can say is that there is interest from both parties (the Gorilla maintainers as well as our group of RH engineers) to move forward. Let me know if I can answer anything else!

@knative-prow-robot knative-prow-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label May 12, 2023
@knative-prow-robot
Copy link
Contributor

PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@dprotaso
Copy link
Member

Closing this for now - seems like RedHat might become owners of gorilla/websocket - so we'll pause this for now

#13824 (comment)

@dprotaso dprotaso closed this May 15, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/autoscale area/networking area/test-and-release It flags unit/e2e/conformance/perf test issues for product features do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Figure out a plan for https://github.com/gorilla/websocket
6 participants