Keynote "Bridging the gap between anti-censorship research and real world problems" @ FOCI 2023 #2623
xiaokangwang started this conversation in General
Replies: 1 comment
A great summary and outlook. Thank you for the effort to connect the open source community with academia.
A V2Ray developer has delivered a keynote speech at Free and Open Communications on the Internet (FOCI) 2023. Here are the script and slides.
Server name: "Which low-hanging fruits are better on the ground?" Host: "Bridging the gap between anti-censorship research and real world problems."
Yeah, that is a long title.
If someone ever asks you why you are into anti-censorship, what would your answer be? Here are some candidates.
Stars and destiny, euros and cryptocoins, degrees and tenure. Or maybe, to some degree: to make a difference? This presentation is for you if you are considering making a difference by rolling your research out into the real world and helping real users.
There isn't a lack of research in the anti-censorship field. However, a lot of it stays in its paper and never receives wide adoption. This is not great if real-world impact is what you are looking for. Here are some suggestions from me on how to find research topics that are more needed by the communities affected by internet censorship, and that could create a more powerful change in the real world.
I would like to say there are two clouds in the anti-censorship sky, but actually, there are four of them.
First, let's talk about stateful censorship of the network addresses used by anti-censorship tools.
Network addresses like IP addresses or domain names are identifiers of network resources that are persistent and costly to replace. These addresses are often blocked with stateful censorship that outlives any individual connection. Understanding how these blocks happen, and under what conditions they are lifted, will help users reduce the cost of operating their proxies and help anti-censorship tool designers build better tools. For the censor, this kind of censorship has a few advantages: the ability to inflict monetary damage lets the censor discourage the use of anti-censorship technology, and information gathered from multiple connections can be combined into a single blocking decision. We have heard a lot of reports from users about the use of certain anti-censorship tools or protocols resulting in an IP address or port being blocked, without a way to reproduce the block reliably. Research into this kind of blocking is intrinsically difficult, as the blocks are costly to reproduce and it is hard to establish concrete causal relationships and their necessary and sufficient conditions. To be specific, the academic research we would like to see includes analysis of how TLS-based proxies end up under persistent, stateful network-address censorship, and of the conditions under which blocked IP addresses and ports used by anti-censorship tools are unblocked.
Existing research has made great discoveries and analyses of protocols that were not designed to be censorship resistant, like DNS, HTTP, or the vanilla Tor relay service, and in the case of ICLab has even built a monitoring system. That being said, for users of anti-censorship tools, any website that is not hosted in their region will be reached through their anti-censorship tool of choice, so how exactly those websites are blocked does not matter that much to them.
Okay, here are some practical examples:
There is an ongoing discussion about the detection and blocking of TLS-based proxy protocols in China, with more than 150 pages of discussion. Yes… 150 pages, and it is still inconclusive.
net4people/bbs#129
Here is a quick summary of it:
In October 2022, a wave of blocking of TLS-based censorship circumvention protocols in China was spotted. TLS-based censorship circumvention protocols are a cluster of censorship-resistant protocols that try to avoid detection by tunneling the proxied traffic inside a TLS connection. Since a censor without the TLS session key cannot observe the content of the traffic, even if the inner protocol would be identifiable on its own, tunneling it within TLS makes it look like any other TLS traffic. Conveniently, TLS is also used to protect one of the most common protocols on the internet, HTTP. This enables a proxy to act as an HTTP service when reacting to an active probe, allowing it to blend into the vast number of HTTPS services. And even more conveniently, HTTP/1.1 itself allows a connection to be upgraded to a new protocol like WebSocket, effectively allowing a standard load balancer to forward proxy traffic to the anti-censorship tool while working as an ordinary web server otherwise.
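To make that last trick concrete, here is a minimal sketch in Go (my own illustration, not V2Ray's actual code; the path /updates/ws, the backend port 10086, and the certificate file names are made-up assumptions). It serves an ordinary website to anyone who probes it, and quietly forwards one path, including WebSocket upgrades, to a local anti-censorship backend.

```go
package main

import (
	"log"
	"net/http"
	"net/http/httputil"
	"net/url"
)

func main() {
	// The anti-censorship tool's WebSocket listener on localhost (assumed address).
	backend, err := url.Parse("http://127.0.0.1:10086")
	if err != nil {
		log.Fatal(err)
	}
	// Go's reverse proxy forwards Connection: Upgrade requests, so WebSocket works.
	ws := httputil.NewSingleHostReverseProxy(backend)

	mux := http.NewServeMux()
	mux.Handle("/updates/ws", ws)                        // covert WebSocket path
	mux.Handle("/", http.FileServer(http.Dir("./site"))) // ordinary website for probes

	// TLS termination; in practice this role is often played by a stock web
	// server or CDN sitting in front of the anti-censorship tool.
	log.Fatal(http.ListenAndServeTLS(":443", "cert.pem", "key.pem", mux))
}
```

A stock nginx or Caddy instance doing the same path-based routing is the more common deployment; the point is only that nothing about the front end looks proxy-specific.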
When I added WebSocket transport to V2Ray 7 years ago, I had no idea this design would eventually be adopted by so many anti-censorship tools, and that, with the censor's eye on it, its life would be finite.
Based on user reports, when a TLS-based proxy is set up, it will work for a while, and after a period of usage the port or IP could be blocked by the GFW. The blocking of an IP will persist for days to months, thus significantly longer than any individual connection. However, the exact mechanism behind this is still unclear. There are conflicting reports that hint at the possibility of TLS Client Hello and Server Hello fingerprinting, traffic shape analysis, traffic statistics analysis, and active probing, but there is yet to be any conclusive and reproducible analysis, mainly because the reports are conflicting and incomplete. For this reason, controlled experiments conducted with scientific methods are required for a more conclusive finding. Research in this area could guide the development of TLS-based proxies, and solve a mystery.
The research on active probing of Shadowsocks is one example of work focused on the blocking of one specific fully encrypted protocol. It would be nice if there were such research on the blocking of TLS-based proxies as well.
The second practical example concerns the need to understand how bans on IP addresses are lifted once an address has been blocked for anti-censorship usage. An IP address can change hands from time to time, so the censor needs to lift the ban once the unwanted service is no longer served on that address; otherwise, the censor would eventually ban services it never intended to ban. However, the exact mechanism of this process is not clear.
One might ask: if, at some monetary cost, the IP address can simply be replaced, why would one care about when the unban happens? Because IP addresses are finite, and with enough of them blocked, it becomes more and more likely that a random address allocated from a service provider's pool is already blocked. At the same time, a lot of self-hosting users are interested in ways to accelerate the unblocking process when getting a new IP address is too expensive in their case. The pool eventually filling up with blocked addresses is not a theoretical problem; we have seen it with Tor's dynamic bridge system. The design was simple: we host a set of bridges and monitor them for blocking, and as soon as a bridge is found to be blocked, it is recreated with a new IP address. Does this work? Yes, but less so over time. With the address pool accumulating blocked addresses, a freshly set up bridge is increasingly often dead on arrival because it lands on an already-blocked address. The questions are: given enough time, will the number of blocked addresses reach an equilibrium, or will all of them eventually get blocked? And on a smaller scale, if a user has a VPS with a blocked IP address, what is the best way to convince the censor to remove the block and restore it? All of these questions remain to be answered.
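To illustrate why the equilibrium question matters, here is a toy Monte Carlo sketch in Go (my own illustration, not data or code from Tor or from the talk; the pool size, bridge count, and blocking and unblocking delays are arbitrary assumptions). It models bridges being recreated on random addresses from a fixed pool and prints how much of the pool ends up blocked over time.

```go
package main

import (
	"fmt"
	"math/rand"
)

func main() {
	const (
		poolSize   = 100000 // provider's address pool (assumption)
		bridges    = 200    // bridges kept alive at any time (assumption)
		days       = 365
		blockAfter = 3  // days until the censor blocks an active bridge (assumption)
		unblockFor = 90 // days a blocked address stays blocked (assumption)
	)
	blockedUntil := make([]int, poolSize) // day until which each address is blocked
	age := make([]int, bridges)
	addr := make([]int, bridges)
	for i := range addr {
		addr[i] = rand.Intn(poolSize)
	}
	for day := 1; day <= days; day++ {
		blocked := 0
		for _, until := range blockedUntil {
			if until >= day {
				blocked++
			}
		}
		for i := range addr {
			age[i]++
			if age[i] >= blockAfter {
				// The censor blocks this bridge's address; the bridge is recreated
				// on a fresh random address, which may itself already be blocked.
				blockedUntil[addr[i]] = day + unblockFor
				addr[i] = rand.Intn(poolSize)
				age[i] = 0
			}
		}
		if day%90 == 0 {
			fmt.Printf("day %3d: %.1f%% of the pool is blocked\n",
				day, 100*float64(blocked)/float64(poolSize))
		}
	}
}
```

With these made-up parameters the blocked fraction levels off rather than reaching 100%; what the real blocking and unblocking behaviour looks like is exactly the measurement gap described above.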
One of the important priorities of on-the-ground anti-censorship tools is asymmetry. This means the tool should be easy to deploy, yet hard to block even if the design is public.
Specifically, easy to deploy means the anti-censorship tool should ideally depend only on commonly available services and platforms. For example, many users depend on commercially available smartphones that don't provide raw socket access or unrestricted memory usage. As a result, anti-censorship designs that require significant resources or special access are, in general, less likely to be widely adopted.
There are a couple of factors contributing to this preference. To understand them, it is necessary to understand the protocol adoption pipeline and the parties involved in it. The pipeline acts as a sequence of filters, and only the protocols that pass these filters end up widely used.

The first stage is the prototype phase, where the researcher or anti-censorship developer creates a reference implementation of the design. So far so good; at this stage, any design that works, works. Then comes the seed-user stage, where expert or eager users self-host the protocol to start its initial adoption. Protocols that require an unusual or complex setup are less likely to pass this filter, as most users do not have the specialized environment or knowledge.

Once these filters are passed, the more challenging parts arrive. The next step is client support, and unlike the initial reference implementation, this usually requires a separate developer's time and effort. This is typically the harder part. Without the original developer's enthusiasm, the client developer often requires the protocol to already have enough users and demand to justify the cost of development, while at the same time users are waiting for client support before they start using the protocol, creating a catch-22. One of the most notable filters here is iOS support: unlike Android, where only technological barriers exist, on iOS there are also administrative and monetary barriers. Sometimes setting up a company and payment method to satisfy Apple's rules without revealing the developer's identity, and evading know-your-customer rules, is no less challenging than writing functional proxy software. At this stage, more complex anti-censorship designs, or those requiring special access, are unlikely to pass. And there are a lot of graphical clients to write to cover users' commonly used platforms.

Finally, there is service-operator support, where service operators start to offer managed proxy services based on the protocol. As you would imagine, this also requires management software, and again time and effort from yet another developer. This final stage paves the way for the proxy protocol to become mainstream.
So, the adoption of a protocol requires the work of a lot of developers and users, and it is necessary to satisfy the interests of more than one party to reach mainstream status. If that is what one has in mind, it is worth considering early what will happen to the protocol further down the pipeline, to increase its chances of passing the later filters.
For fully volunteer open source projects, there is no monetary leverage to encourage the adoption of any protocol. The protocol needs to satisfy the needs of all the parties involved on its own, by its design alone, without any management leverage.
The same applies to anti-censorship designs from the academic community, with the additional difficulty of lacking a long-term maintenance commitment.
This is a long process, and with it, any anti-censorship protocol can take a significant time, on the scale of years, to be adopted. However, there is a shortcut that can accelerate the process. Currently, there is rising interest in multi-protocol anti-censorship projects like V2Ray. Such an application bundles more than one protocol and exposes them in the same way, allowing the effort of building the surrounding software ecosystem and user community to be shared among different protocols. This lets existing software projects serve as a promotion platform for new protocols and fast-forward their adoption. It took more than a year for V2Ray to receive its first Android client prototype; right now, it takes less than a week to get a new protocol supported on the Android platform.
As you might have imagined, we are also trying to get it supported on the iOS platform with an open source client. However, as you might also have imagined, this took a little longer than expected.
If one is looking to make it easier to promote an anti-censorship design, consider contributing it to an existing anti-censorship project with an established community, rather than letting it stay a separate project. Keep in mind that every project has its own coding style and design goals, and it may not always be feasible to accept a changeset that significantly impacts the overall design. Discuss your change with the maintainers before committing significant time to the integration.
As for being hard to block with a public design, the focus is on increasing the cost for the censor to build a system that can accurately identify the network traffic associated with the proposed anti-censorship design, even when the full research and the tool itself are public. This means tools relying on implementation imperfections of the censor are unlikely to be cost-effective for widespread deployment, as the censor can simply update their implementation to close the hole.
Some examples of anti-censorship tricks that rely on censor glitches include TCP stack desynchronization with incorrect checksums, as demonstrated in the Geneva research, or specific allow-listing rules that work well on a small scale. As you would imagine, with increasing popularity, protocols using this kind of trick become a prioritized target. Without asymmetry, the protocol implementation effort may receive insufficient payoff before the circumvention method is banned. As a result, volunteer or even full-time anti-censorship engineers will not be able to compete with state-sponsored censors in a euro-for-euro fight. This is why high-collateral-damage anti-censorship designs that blend into background traffic are needed.
What does a high-collateral-damage channel look like? It should allow a third party to relay traffic in a way that is indistinguishable from traffic generated by other programs and important services. This principle is exemplified by domain fronting. However, with the wide adoption of that technique, more and more CDN providers no longer allow traffic with mismatching Host header and server name. This gives the community a reason to find a new high-collateral-damage channel. Currently, there are a few notable candidates that could become the next domain fronting.
The first one is function-hosting services and other serverless platforms. There is rising interest in creating network services without managing servers, and one interesting side effect is that some of these services did not take censor-friendliness into consideration when designing their architecture. This allows users of serverless platforms to invoke functions without revealing persistent operator identifiers to an observer: either domain-fronting-like function calls without blockable identifiers, or identifiers that can be reset without financial cost.
Sometimes these serverless platforms have a unique advantage in that there is no cost associated with creating new identifiers like server names, or with holding more than one of them. Learning from how Z-Library deals with censorship by assigning each user a personal domain name, we could apply the same principle to the server names of serverless services. Each user, or each cluster of users, can receive a unique identifier, preventing censors from discovering the server names used by other users. That being said, the censor can always DDoS the domain to kick the service out of business with an unexpected bill, so countermeasures against traffic flooding are recommended and should be taken into consideration.
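For a sense of how small such a function can be, here is a hedged Go sketch of a generic relay handler of the kind a function-hosting platform might wrap (the entry point differs per platform, and the upstream URL proxy-backend.example.org is a made-up assumption). It forwards the request body to a fixed backend and streams the answer back, so the censored client only ever talks to the platform's shared front-end addresses.

```go
package main

import (
	"io"
	"net/http"
)

// Hypothetical backend the function relays to; in a real deployment this would
// be the operator's actual proxy ingress.
const upstream = "https://proxy-backend.example.org/relay"

func relay(w http.ResponseWriter, r *http.Request) {
	req, err := http.NewRequestWithContext(r.Context(), http.MethodPost, upstream, r.Body)
	if err != nil {
		http.Error(w, "bad request", http.StatusBadRequest)
		return
	}
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		http.Error(w, "upstream unreachable", http.StatusBadGateway)
		return
	}
	defer resp.Body.Close()
	w.WriteHeader(resp.StatusCode)
	io.Copy(w, resp.Body) // stream the backend's answer back to the client
}

func main() {
	// Local stand-in for the platform's own entry point; a serverless platform
	// would register the handler through its SDK instead.
	http.ListenAndServe(":8080", http.HandlerFunc(relay))
}
```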
The second one is miscellaneous APIs that provide a way to tunnel traffic. Some of the explored ideas include DNS over HTTPS and the AMP cache. There are more services that could serve as traffic relays, including image proxies, hosted queues, and publish-subscribe services. Individually, these services are easy to block and offer limited asymmetry. However, with engineering designed to reduce the marginal cost of supporting new services, the balance of power can be tilted toward the anti-censorship side.
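As a taste of what "an API that relays information" means in practice, here is a toy Go sketch of the raw mechanics behind DNS over HTTPS (RFC 8484): a hand-built DNS query is POSTed to a public resolver as an ordinary HTTPS request. A real tunnel would encode its payload into query names and read data back from the answers; this only shows that the round trip itself looks like any other HTTPS exchange. The resolver endpoint is Google's public one; any RFC 8484 resolver would do.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"net/http"
	"strings"
)

// buildQuery assembles a minimal DNS query for a TXT record of the given name.
func buildQuery(name string) []byte {
	msg := []byte{
		0x12, 0x34, // ID
		0x01, 0x00, // flags: recursion desired
		0x00, 0x01, // QDCOUNT = 1
		0x00, 0x00, 0x00, 0x00, 0x00, 0x00, // AN/NS/AR counts = 0
	}
	for _, label := range strings.Split(name, ".") {
		msg = append(msg, byte(len(label)))
		msg = append(msg, label...)
	}
	msg = append(msg, 0x00)       // end of name
	msg = append(msg, 0x00, 0x10) // QTYPE = TXT
	msg = append(msg, 0x00, 0x01) // QCLASS = IN
	return msg
}

func main() {
	query := buildQuery("example.com")
	resp, err := http.Post("https://dns.google/dns-query",
		"application/dns-message", bytes.NewReader(query))
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	answer, _ := io.ReadAll(resp.Body)
	fmt.Printf("got %d bytes of DNS answer over plain HTTPS\n", len(answer))
}
```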
There isn't a lack of research on traffic recognition; just search for "tunnel traffic recognition application". However, when it comes to making traffic harder to identify based on its shape, there is a lack of practical research that balances three conflicting goals: conserve traffic, thus minimizing protocol overhead including padding; transmit payload traffic as soon as possible to keep connections interactive; and avoid traffic-shape-based detection. Currently, anti-censorship tools often reduce the amount of padding to cut overhead, at the cost of weaker evasion of traffic shape analysis.
Traffic shape recognition is based on the principle that, without a specialized design, an encrypted tunnel does not hide the length, timing, and direction of the underlying traffic. Just as wrapping an item in a trash bag usually doesn't hide what is inside, without appropriate countermeasures the application protocol inside a tunnel is not hidden. Blocking SSH dynamic port forwarding isn't anything new. In fact, according to some research (https://sci-hub.st/10.1109/MILCOM47813.2019.9020938), the application protocol itself has a more identifiable fingerprint than the tunnelling protocol (pause) under traffic shape analysis.
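As a quick way to see what a tunnel exposes, here is a small Go sketch (an illustration I added, not part of any existing tool) that wraps a net.Conn and logs the direction, size, and timing gap of every read and write, roughly the features that traffic-shape classifiers work from. Note it observes application-level reads and writes, which only approximate the TCP/TLS record sizes a censor actually sees.

```go
package main

import (
	"fmt"
	"net"
	"time"
)

// shapeConn wraps a net.Conn and prints a (direction, size, gap) sample for
// every read and write, so a tool author can inspect their tunnel's shape.
type shapeConn struct {
	net.Conn
	last time.Time
}

func (c *shapeConn) sample(dir string, n int) {
	now := time.Now()
	fmt.Printf("%s %5d bytes  +%v\n", dir, n, now.Sub(c.last).Round(time.Millisecond))
	c.last = now
}

func (c *shapeConn) Read(p []byte) (int, error) {
	n, err := c.Conn.Read(p)
	if n > 0 {
		c.sample("<-", n)
	}
	return n, err
}

func (c *shapeConn) Write(p []byte) (int, error) {
	n, err := c.Conn.Write(p)
	if n > 0 {
		c.sample("->", n)
	}
	return n, err
}

func main() {
	// In a real tool you would wrap the connection your tunnel dials; this demo
	// just pokes a public HTTPS port to produce a couple of samples.
	raw, err := net.Dial("tcp", "example.com:443")
	if err != nil {
		panic(err)
	}
	conn := &shapeConn{Conn: raw, last: time.Now()}
	conn.Write([]byte("hello"))
	buf := make([]byte, 4096)
	conn.Read(buf)
}
```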
There are four primary ways to avoid traffic shape recognition in anti-censorship protocols, and there is currently a need for an effective way to combine and organize these methods into a system that reduces the traffic shape fingerprint while satisfying the deployment requirements.
The first method is padding. From a protocol designer's point of view, adding padding isn't a hard thing; the hard thing is deciding when and how much padding to add, and there are a lot of design choices to make. In general, two kinds of padding schemes currently exist in anti-censorship designs. The first is random padding, typically padding packets to a randomly or procedurally generated total length, or appending a random amount of padding data. The second is length replaying: recording a traffic sequence that the anti-censorship protocol wants to imitate, and then, while relaying traffic, sending packets of exactly those recorded lengths with the payload content replaced by the traffic being relayed. However, neither of these methods satisfies all three goals.
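To ground the first option, here is a minimal Go sketch of random-padding framing (my own toy wire format, not the format of any particular tool; it assumes the frame is then wrapped in an encryption layer such as TLS, and that a payload fits in one frame). Each frame carries a payload length, a padding length, the payload, and random padding bytes.

```go
package main

import (
	"bytes"
	"crypto/rand"
	"encoding/binary"
	"fmt"
	"io"
	"math/big"
)

const maxFrame = 1400 // assumed frame ceiling, roughly one MTU-sized record

// Frame layout: [2-byte payload len][2-byte padding len][payload][padding].
func writePadded(w io.Writer, payload []byte) error {
	maxPad := maxFrame - 4 - len(payload)
	if maxPad < 0 {
		maxPad = 0
	}
	r, err := rand.Int(rand.Reader, big.NewInt(int64(maxPad+1)))
	if err != nil {
		return err
	}
	padLen := int(r.Int64())

	frame := make([]byte, 4+len(payload)+padLen)
	binary.BigEndian.PutUint16(frame[0:2], uint16(len(payload)))
	binary.BigEndian.PutUint16(frame[2:4], uint16(padLen))
	copy(frame[4:], payload)
	rand.Read(frame[4+len(payload):]) // padding is random bytes
	_, err = w.Write(frame)
	return err
}

func readPadded(r io.Reader) ([]byte, error) {
	var hdr [4]byte
	if _, err := io.ReadFull(r, hdr[:]); err != nil {
		return nil, err
	}
	payloadLen := binary.BigEndian.Uint16(hdr[0:2])
	padLen := binary.BigEndian.Uint16(hdr[2:4])
	buf := make([]byte, int(payloadLen)+int(padLen))
	if _, err := io.ReadFull(r, buf); err != nil {
		return nil, err
	}
	return buf[:payloadLen], nil // padding is simply discarded
}

func main() {
	var wire bytes.Buffer
	writePadded(&wire, []byte("GET / HTTP/1.1"))
	fmt.Println("bytes on the wire:", wire.Len()) // varies from run to run
	payload, _ := readPadded(&wire)
	fmt.Printf("recovered payload: %q\n", payload)
}
```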
For random padding, the issue is twofold. If the random padding is insufficient, it has little effect, as the application's traffic has patterns that cannot be adequately masked by it. On the other hand, if the padding length is sufficiently large, the traffic takes on the shape of randomized padding, and if random-stream-looking fully encrypted traffic can be blocked, then so can random-looking lengths. This is compounded by the fact that most padding is added only when there is payload to send, instead of being sent independently of the payload traffic, which leaks timing and round-trip ordering information. For these reasons, random padding as currently implemented is not sufficient to deal with traffic shape recognition.
Length replaying, on the other hand, currently has the shortcoming of being unable to scale with the current traffic load. If a low-bandwidth trace is replayed, the payload connection will be slow and sometimes unresponsive; if a high-bandwidth trace is replayed, bandwidth consumption will be high even when the user does not have much to send.
Since neither is perfect, it might be necessary to find a new way to construct padding patterns that combines both methods and generates a plausible traffic shape on the fly. I know, generative adversarial networks; but those are unlikely to work on mobile or otherwise constrained environments, so something lighter, like a Markov chain over n-grams, could work better in the real world.
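Here is a rough Go sketch of what "something lighter" could look like: an order-1 Markov model over bucketed frame lengths (my own illustration; the bucket size and the toy training trace are assumptions). It is trained on lengths recorded from cover traffic and then samples a plausible next length on the fly, instead of replaying a fixed trace or drawing lengths uniformly at random.

```go
package main

import (
	"fmt"
	"math/rand"
)

const bucket = 64 // group lengths into 64-byte buckets (assumption)

// lengthModel maps a previous length bucket to the buckets observed after it.
type lengthModel struct {
	next map[int][]int
}

func train(lengths []int) *lengthModel {
	m := &lengthModel{next: make(map[int][]int)}
	for i := 1; i < len(lengths); i++ {
		prev, cur := lengths[i-1]/bucket, lengths[i]/bucket
		m.next[prev] = append(m.next[prev], cur)
	}
	return m
}

// sample picks a target frame length given the previous one; padding or
// splitting then stretches the real payload to roughly that size.
func (m *lengthModel) sample(prevLen int) int {
	candidates := m.next[prevLen/bucket]
	if len(candidates) == 0 {
		return prevLen // unseen context: fall back to repeating the last length
	}
	b := candidates[rand.Intn(len(candidates))]
	return b*bucket + rand.Intn(bucket)
}

func main() {
	// Toy training trace; a real model would be trained on recorded cover traffic.
	trace := []int{517, 1400, 1400, 310, 98, 1400, 620, 1400}
	model := train(trace)
	prev := 517
	for i := 0; i < 5; i++ {
		prev = model.sample(prev)
		fmt.Println("next target length:", prev)
	}
}
```

A higher-order model, conditioned on the last few lengths plus direction and timing, would be closer to what is actually needed, at a still modest cost.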
The second way to deal with traffic shape analysis is multiplexing the traffic: more than one payload connection is combined into a single proxy connection, increasing the difficulty of matching any application-level traffic shape. This method has the downside of making the protocol harder to implement, so it is less likely to be found in non-reference implementations. And yes, especially when it is optional.
The difficulty of adding traffic multiplexing is more an engineering one than a research one: the lack of a standardized way to multiplex traffic is the primary reason for missing client support, since client developers usually don't want to spend their development time rewriting complex library code that is not absolutely necessary to get connections working.
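The core of a mux layer is small; what is missing is standardization. Here is a toy Go frame format (my own illustration, not V2Ray's Mux.Cool, smux, or any other existing format) just to show the shape of the problem: every frame carries a stream ID and a length, so several payload connections can share one proxy connection.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"io"
)

// Frame layout: [4-byte stream ID][2-byte payload length][payload].
func writeFrame(w io.Writer, streamID uint32, payload []byte) error {
	hdr := make([]byte, 6)
	binary.BigEndian.PutUint32(hdr[0:4], streamID)
	binary.BigEndian.PutUint16(hdr[4:6], uint16(len(payload)))
	if _, err := w.Write(hdr); err != nil {
		return err
	}
	_, err := w.Write(payload)
	return err
}

func readFrame(r io.Reader) (uint32, []byte, error) {
	hdr := make([]byte, 6)
	if _, err := io.ReadFull(r, hdr); err != nil {
		return 0, nil, err
	}
	payload := make([]byte, binary.BigEndian.Uint16(hdr[4:6]))
	_, err := io.ReadFull(r, payload)
	return binary.BigEndian.Uint32(hdr[0:4]), payload, err
}

func main() {
	// Two payload streams interleaved on one "proxy connection" (a buffer here).
	var wire bytes.Buffer
	writeFrame(&wire, 1, []byte("stream one says hi"))
	writeFrame(&wire, 2, []byte("stream two says hi"))
	for i := 0; i < 2; i++ {
		id, payload, _ := readFrame(&wire)
		fmt.Printf("stream %d: %q\n", id, payload)
	}
}
```

A real design also needs open and close control frames, flow control, and fairness between streams, which is exactly the part nobody wants to re-implement per client.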
The third way to deal with traffic shape analysis is traffic splitting. Although traditional proxy software usually tunnels all the traffic of a given payload connection over the same proxy connection, it is also possible to split the traffic across more than one relay connection. However, there is currently no research on how to efficiently use more than one reliable transmission channel to carry a single reliable connection. I know, I know, one can treat these connections as unreliable and run a protocol designed for unreliable channels over them; however, this is not ideal, as retransmission messages will get stuck in the underlying retransmission queues as well, producing connections that are prone to melting down when network quality decreases. This should be solved with more advanced retransmission protocols that take advantage of the underlying connections' reliability. And this is yet another piece of research that on-the-ground anti-censorship tools could take advantage of.
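To make the splitting idea concrete, here is a toy Go sketch (my own illustration, not an existing scheduler): the payload stream is cut into sequence-numbered chunks and spread round-robin across several relay connections, and the receiver would reorder by sequence number before delivery. The open research question is the scheduling and retransmission strategy, not this plumbing.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"io"
	"strings"
)

// sendSplit spreads sequence-numbered chunks of the payload stream round-robin
// over the given relay connections. Chunk layout: [8-byte seq][4-byte len][data].
func sendSplit(payload io.Reader, relays []io.Writer) error {
	buf := make([]byte, 16*1024)
	var seq uint64
	for {
		n, err := payload.Read(buf)
		if n > 0 {
			hdr := make([]byte, 12)
			binary.BigEndian.PutUint64(hdr[0:8], seq)
			binary.BigEndian.PutUint32(hdr[8:12], uint32(n))
			// Naive round-robin; a real scheduler would weigh RTT, loss, and
			// per-path congestion instead of alternating blindly.
			w := relays[seq%uint64(len(relays))]
			if _, werr := w.Write(append(hdr, buf[:n]...)); werr != nil {
				return werr
			}
			seq++
		}
		if err == io.EOF {
			return nil
		}
		if err != nil {
			return err
		}
	}
}

func main() {
	// Two in-memory "relay connections" stand in for two real proxy connections.
	var relayA, relayB bytes.Buffer
	payload := strings.NewReader(strings.Repeat("some payload data ", 4096))
	if err := sendSplit(payload, []io.Writer{&relayA, &relayB}); err != nil {
		panic(err)
	}
	fmt.Println("bytes on relay A:", relayA.Len())
	fmt.Println("bytes on relay B:", relayB.Len())
}
```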
Currently, the deployed solutions around this issue are focused on adding random padding and multiplexing the traffic. However, with more research, it would be possible to add more tools to the mix to resist traffic shape analysis while considering the cost of such operations.
To adequately resist traffic shape analysis, an anti-censorship tool will need to combine the methods mentioned above. This will not be easy, as it requires solving more than one issue in a single piece of research. However, as more work is done in this field, we should be able to stay ahead of censors again.
For end users, accessibility is not only about being able to access the internet, but also about being able to do so with reasonable performance. Software not designed for anti-censorship purposes typically tries to reduce server and network load without dealing with situations where the network is purposefully degraded, so anti-censorship tools have to take performance into consideration in their protocol design. This is particularly important where network neutrality is evidently not honoured, as is the case wherever censorship is present. Frequent adverse network conditions include high packet loss and throttling. This sabotage can result in abysmally slow connection speeds, which encourages users to stop using anti-censorship tools while retaining plausible deniability for the censor. Improving transmission performance in an adverse network environment can therefore directly improve the user experience.
The issue of network performance is particularly acute in China, for a variety of reasons, both business and political. Here is the context:
China's government has a goal of providing affordable and accessible internet connections everywhere it rules. If that sounds too good to be true, that's because it is. Communication towers in rural and sparsely populated areas cost a lot, and the ISPs cannot hope to recover that cost from their users. To make matters worse, in order to enforce internet censorship effectively, there is no internet exchange available to the general public in China; instead, like business networks, Chinese ISPs exchange overseas traffic with other ISPs while ordinarily avoiding offering BGP sessions within China. And… you guessed it, these traffic exchanges are paid transit connections, since the established internet service providers, especially those in the US considered tier 0, won't give them uncompensated peering. So now the Chinese ISPs are basically losing big: they need to pay for infrastructure that cannot pay for itself, and for all the traffic generated by home users, they also lose money on internet exchange for connectivity.
But life finds a way, and so do businesses that survive: the ISPs came up with a clever idea to fix the money-losing problem. Remember that the government only mandated that the home network be affordable? It didn't say the network for data centers had to be affordable too. To recover their costs, the ISPs charge data centers surprisingly high fees for internet connectivity; since home users are connecting to websites hosted in those data centers, the cost of providing internet access is simply shifted onto business internet users. In the end, the government is happy that the ISPs fulfilled their mandate; website operators just pass the cost of operation on to users, for example by limiting download speed to less than 1 KB per second unless the user buys a paid plan; and users are happy that they can use the internet at an affordable price everywhere, sorry, I mean everywhere a person with a mainland China passport can travel without a visa and also safely, which is almost strictly a subset of mainland China. Okay, with such a long sentence, you have probably forgotten that I didn't mention the ISPs and assumed they are also happy. And yes, they are happy, after just one more thing.
Remember that website operators cover the traffic cost through inflated traffic fees? The ISPs can only charge *their* customers, that is, Chinese websites. For websites hosted outside China, they are still losing money, because they can't charge these overseas websites, while still paying for connectivity to these international internet services. So, they came up with a plan.
They simply restrict or throttle international connectivity, independently of national-level censorship, and charge extra, a significant amount, if an international service or user wishes to avoid the throttling. Since the Chinese government only mandates access to the "internet", effectively only domestic websites, this kind of restriction is fine with it. So basically, all internet services outside China have bad connectivity by default unless the operator pays a lot for better routing, often sold as "optimised China routes".
Okay, now back to the anti-censorship world. For anti-censorship service operators, if they are not paying a premium for traffic, the network service will be terrible by design, and this includes most VPSes with unmetered network access. Individual users and anti-censorship services that don't charge users for their traffic often need to set up in an environment where they don't pay much for network traffic, and as a result, users in China get poor network connectivity to these anti-censorship services.
There are a few fixes for this, and the one I think we need more academic work on is the technological solution.
As we just discussed, to limit international connectivity, ISPs in China often apply high packet loss and throttling to traffic sent to and received from overseas endpoints.
Existing research has tried adding forward error correction to QUIC, but the packet loss of 10% to 50% often observed on lossy networks in China is not adequately considered. And yes, that's the unique challenge of thinking big.
There are some existing solutions for the packet loss issue, usually based on customized automatic repeat request (ARQ) and forward error correction (FEC) systems. These include KCP and tinyfec, which use more aggressive packet retransmission and forward error correction to deal with packet loss. Yet, like all anti-censorship protocols, their time is limited, as more and more ISPs begin to throttle UDP connections, and users begin to tunnel UDP with fake TCP headers, which unfortunately doesn't work on mobile.
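For readers unfamiliar with forward error correction, here is the simplest possible Go sketch of the idea (an illustration only; real tools use stronger codes such as Reed-Solomon over larger groups): an XOR parity packet over a group of equal-sized packets lets the receiver rebuild any single lost packet without waiting a round trip for a retransmission.

```go
package main

import "fmt"

// parity returns the XOR of all packets in the group (assumed equal-sized).
func parity(group [][]byte, size int) []byte {
	p := make([]byte, size)
	for _, pkt := range group {
		for i := 0; i < size && i < len(pkt); i++ {
			p[i] ^= pkt[i]
		}
	}
	return p
}

// recoverLost rebuilds the single missing packet from the parity packet and
// the packets that did arrive.
func recoverLost(parityPkt []byte, received [][]byte) []byte {
	missing := make([]byte, len(parityPkt))
	copy(missing, parityPkt)
	for _, pkt := range received {
		for i := 0; i < len(missing) && i < len(pkt); i++ {
			missing[i] ^= pkt[i]
		}
	}
	return missing
}

func main() {
	group := [][]byte{
		[]byte("packet-0"),
		[]byte("packet-1"),
		[]byte("packet-2"),
	}
	p := parity(group, 8)
	// Pretend packet-1 was lost in transit: rebuild it from parity + survivors.
	rebuilt := recoverLost(p, [][]byte{group[0], group[2]})
	fmt.Printf("recovered: %q\n", rebuilt)
}
```

Sending one parity packet per k data packets costs roughly 1/k extra bandwidth; tuning that against the 10% to 50% loss rates mentioned above, together with a smarter ARQ, is the research gap.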
Okay, we have caught up with history and context, and here is what we could do next as the academic community.
We should always consider the performance implications of poor network connectivity when designing anti-censorship tools. This includes making sure a censorship-resistant design still holds its strength under strong network interference, and doing more research on improving network performance over networks with reduced connectivity.
Here are some example research topics that could help with network performance.
Research on splitting a TCP stream across multiple tunnels, as mentioned earlier, and improved automatic repeat request schemes that make use of more advanced forward error correction.
And with that, we are almost at the end of this presentation. Here is a list of all the research topics suggested in this keynote:
Understanding the mechanism behind the persistent blocking of IP addresses, and the lifting of such blocks, especially in the case of TLS-based proxies.
Finding the next high-collateral-damage censorship-resistant communication channel, based perhaps on push notifications, serverless platforms, or other APIs that can relay information.
Finding a way to generate padding lengths that balances the goals of censorship resistance, interactive communication, and conserving traffic.
Finding a way to split one reliable stream across more than one reliable stream while retaining efficiency.
Creating an improved automatic repeat request system that makes use of more advanced forward error correction.
And the general advice: anti-censorship tools aimed at widespread deployment should consider the deployment process of an anti-censorship protocol and the network environment in which these tools will be operated; and contribute your anti-censorship design to an existing anti-censorship project if you are willing and able.
That is my presentation today! Thanks for listening. Now it is Q&A time!