requests for particular CDNs #5
@claustromaniac looking at https://www.cdnplanet.com, https://www.cdnoverview.com and https://www.webpagetest.org/ reveals a myriad of CDNs. Researching each and every one would be a gargantuan task and is unlikely to be achievable. I am thus wondering whether it would make sense to team up with them (perhaps via an API), e.g. https://github.com/turbobytes/cdnfinder or https://github.com/WPO-Foundation/webpagetest |
TL;DR: The main goal of this extension is to raise awareness. If you want more precise information you can always use those tools on your own. Be aware that they are not failsafe either, though.

I was aware of those, and what you suggest is reasonable. To be honest, I already asked myself if I should take that route at some point. However, my decision has always been to stick to my current path for a very simple reason: I don't want to rely on third parties. That's the whole point of this extension. I can't afford to make queries to third parties on every single request. That would be expensive.

Furthermore, the scope of this extension is different from the scope of those web tools. If anything, I'd say those guys are in favor of CDNs. They even use CDNs themselves, which means that I would be making queries to CDNs just to detect other CDNs. See the problem? I wouldn't give up my privacy just to detect CDNs. That doesn't make sense to me. My privacy is the very concern that led me to create this extension in the first place.

That being said, I admit that having hundreds of CDNs to research is not the only trade-off of my current path. Not relying on third parties also means that I have fewer resources to work with. The extension has very limited information to analyze, which means there are things it cannot do, and it misses and will always miss a number of CDNs. Moreover, I don't even want to start using an internal database of IP ranges, because I can barely maintain the extension as it is. That's the very reason I added heuristics, and I intend to keep improving that feature as much as possible. It won't tell you who the middle man is, but it is a pretty reliable indicator otherwise. |
Valid points for sure. What I meant by the API was not pulling from a 3rd party in real time but, say, periodically pulling their available header information and incorporating it into this WX, to lighten the research load. I would not mind lending a hand in expanding the header data used for detection. I am thus wondering whether the heuristics could be surfaced in the browser's developer panel: highlight in the network tab the row with the heuristic detection and, when such a row is clicked, highlight in the headers tab the header(s) that triggered the heuristics. Or perhaps a dedicated True View tab in the developer tools, something like this one in GC: https://chrome.google.com/webstore/detail/is-it-cached/naikbjeckbmjhngcejdmcjhoedhckglk I could then take a look and report particular headers for specific CDNs and thus help expand the CDN list |
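To make the idea of reporting the header(s) that triggered a detection more concrete, here is a minimal sketch in Python (for brevity only; it is not the extension's code). The signature table, the `HEURISTIC_HEADERS` set and the `classify` function are all hypothetical names chosen for illustration. The header names themselves are ones commonly seen in the wild, but which of them the extension actually checks is an assumption.

```python
# Hypothetical signature table: response header name -> service label.
# Illustrative only; this is NOT the extension's real detection list.
SIGNATURES = {
    "cf-ray": "Cloudflare",
    "x-amz-cf-id": "Amazon CloudFront",
    "x-github-request-id": "GitHub",
}

# Generic headers that only hint at some intermediary (assumed heuristic set).
HEURISTIC_HEADERS = {"via", "x-cache", "x-served-by"}


def classify(headers):
    """Return (label, triggering_headers) for one response's headers.

    `headers` maps header names to values; names are matched
    case-insensitively, as HTTP header names are case-insensitive.
    """
    lowered = {name.lower() for name in headers}

    # A known signature wins over the generic heuristics.
    for name, label in SIGNATURES.items():
        if name in lowered:
            return label, [name]

    # Otherwise report which generic headers fired, if any.
    hits = sorted(lowered & HEURISTIC_HEADERS)
    if hits:
        return "unknown intermediary (heuristic)", hits

    return None, []
```

Returning the triggering header names alongside the label is what would let a developer-tools panel highlight them, and it would also make reports of new CDN-specific headers easier to verify.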
I've come up with similar ideas myself but, for now, that wouldn't be practical. Even if you provided me with header data gathered by yourself, I'd still want to analyze it, and my stockpile of data to analyze is large enough as it is. Besides, it may not seem to you like my methods are efficient, but the truth is I've had scarce time to work on this, and so far I have invested a lot of that time coding (not researching). You'll just have to be patient. I will reconsider the idea sometime in the future, after the extension has matured enough. I appreciate your continued interest and willingness to help, though. Thanks for that. |
You're not a moron :( You're 👖! The extension treats GitHub as a CDN because it (I can go on As for why it shows while you're on GitHub... that's because I didn't bother implementing exceptions, since I don't consider that necessary.
I hope it's not too bad 😿 |
Let's say it's sort of a special case, but I considered it one worth adding because ... https://whotracks.me/trackers.html (look where github stands in that ranking). It's not like GitHub offers reverse proxy services or the like, if that's what you were wondering. (It does use reverse proxies tho). Plus... Microsoft. |
OK, thanks. I totally get the github pages, but to me that is not a CDN (well not if they use the github.io, but I see they can be anything, so yeah). Soz for the spam and stoopid Q's ... I'm a special kind of |
Nah, don't say that. Your question was actually good. As I said before, it's a special case. GitHub is not technically a CDN and AFAIK they don't officially offer such services. There have been ways to take advantage of their infrastructure for similar goals, but that's all. You made me start to think that I should probably move it out of that fieldset in the options page, because lumping it with the other (actual) CDNs doesn't do it justice. |
Similarly, I could make the extension detect corporations that threaten our privacy and/or security (even if they don't qualify as CDNs), and list them separately in the options... |
I would prefer for GitHub to remain a CDN as long as it acts as one (for 3rd party domains), thus falling into the category of a CDN (notwithstanding being owned by MS). Making that distinction (detecting corporations that threaten our privacy and/or security) would be helpful for the user, but how much extra work/effort would it put on your plate? Loading fonts, libraries, CSS (and media files) from a CDN is probably less threatening than login credentials or otherwise sensitive personal data being decrypted at the edge server. That goes back to what was discussed earlier, where domain protective services (e.g. firewall or geolocation obfuscation with SNI certificates) get mixed with CDN services, thus blurring the lines. |
If GitHub is a special case... (if you want this in FORTRAN or COBOL, I can do it for you)
But then if you start to expand to "blurred lines" material, seems like a lot of work. I kinda like the idea of a separate category |
Technically, GitHub is a platform for hosting Git repos, and GitHub Pages are meant to be just static pages hosted on GitHub's servers. That's what I mean when I say GitHub is not a CDN. The thing is, anyone can quite easily host static content on GitHub and then load it up from somewhere else. That would be the simplest way I can think of to use GitHub as a CDN. When it comes to GitHub Pages, that service is different than your typical CDN in that GitHub is merely hosting the sites. We can be (pretty darn) sure GitHub is the one at the other end of the communication (the very end, after all intermediary proxies, etc), which makes this somewhat less potentially risky than caching proxies offered by CDNs. Still, there are many legit reasons for wanting to detect content served by GitHub, that's why I added it. I'm just wondering if I should move it out of that group of options to avoid giving people the false impression that GitHub offers CDN services.
I meant that I could once more broaden the scope of this extension a bit, and have it focus not only on detecting CDNs but also on hosting sites (like GitHub) and such. If I did that, I should try to separate the items somehow so it becomes easier for users to understand what the extension is detecting. For starters, I should create a separate category in the options page for sites like GitHub, but I could also add some more information about each service somewhere... I'll have to think this through before I decide what to do.
Sure, it should be less risky in general, but that's when you think mostly about security. From a privacy standpoint, third-party content served by CDNs is just as potentially dangerous, if not more. @👖 If I list GitHub in the options as a hosting service or so, seeing it detected here shouldn't be confusing anymore (right?). Would you still prefer to not see it in the popup here? Just to be clear, I can do it like you said, I just don't personally care for it. Besides, not adding such exceptions could be useful: if you're on GitHub and the extension doesn't detect it, it could suggest something weird is going on (like phishing or something else). Whatcha think? |
Whatever you do, just be consistent. If GitHub is put in a new category, then treat it the same as others you would put in there re: badge counter etc
It's not worth the extra work, especially as the items in the new category grow. The distinction has already been made by being in a new category! cogito ergo drinkies 🍺 night night PS: would we get a shiny new color (green, glorious green https://en.wikipedia.org/wiki/Money_(Blackadder)) |
How is the risk to user privacy, as in user tracking/profiling, elevated compared to a domain utilizing user profiling/tracking without a CDN involved?
That seems inconsistent, then: GitHub is serving content to 3rd party domains and is owned by MS, and thus could potentially pose a risk to user privacy (in your own words). If GitHub gets excluded from detection, including heuristics, the WX would start losing its credibility.
In which case it transforms into a content delivery network for domains unaffiliated with GitHub. On Wikipedia, CDN is an umbrella term spanning different types of content delivery services: video streaming, software downloads, web and mobile content acceleration, licensed/managed CDN, transparent caching, and services to measure CDN performance, load balancing, multi-CDN switching and analytics and cloud intelligence. CDN vendors may cross over into other industries like security, with DDoS protection and web application firewalls (WAF), and WAN optimization. |
Loading libraries from a CDN may actually pose a security risk unless protected by SRI (Subresource Integrity) |
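For anyone unfamiliar with SRI: the page declares the expected hash of the file in an `integrity` attribute, and the browser refuses to use the file if the CDN serves different bytes. Below is a minimal sketch of how such a value is derived and checked, written in Python purely for illustration. The function names `sri_hash` and `matches` are hypothetical, and the sketch handles a single hash token only, whereas a real `integrity` attribute may list several.

```python
import base64
import hashlib

# Hash algorithms permitted by the Subresource Integrity specification.
ALLOWED = ("sha256", "sha384", "sha512")


def sri_hash(content: bytes, algorithm: str = "sha384") -> str:
    """Build an integrity value of the form '<algorithm>-<base64 digest>'."""
    if algorithm not in ALLOWED:
        raise ValueError(f"unsupported SRI algorithm: {algorithm}")
    digest = hashlib.new(algorithm, content).digest()
    return f"{algorithm}-{base64.b64encode(digest).decode('ascii')}"


def matches(content: bytes, integrity: str) -> bool:
    """Check content against a single integrity token (simplified)."""
    algorithm, _, _ = integrity.partition("-")
    if algorithm not in ALLOWED:
        return False
    return sri_hash(content, algorithm) == integrity
```

A site owner would compute `sri_hash` over the exact library file once and put the result in the tag that loads it. If the CDN later serves a modified file, the comparison fails, which is the protection SRI is meant to provide.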
My bad, I misread what you said before. For some weird reason I thought you were comparing third-party vs first-party, while you were actually only comparing the content type. I haven't slept well lately, sorry. 😅
I never said anything about excluding GitHub. I only said I considered it more appropriate to move GitHub into a separate category in the options page/menu, because GitHub does not (AFAIK) offer CDN services to web developers. If you visit a site and GitHub is detected, you know GitHub is at the other end of the communication. That's the key difference. Services like Cloudflare are used by good and bad people alike, but GitHub is always GitHub.
It is indeed a pretty broad term, but GitHub is clearly different than the thirty-something CDNs already detected by TS, and it doesn't even sell itself as a CDN. It's only a hosting service, at least for now. That's why I consider it appropriate to make that distinction somewhere, or at the very least it should be clearly stated somewhere that some hosting services are being thrown in that same bag. |
😟 |
Maybe in the next release. |
Use this issue for requests regarding individual CDNs.
Notes