To be able to serve an arbitrary number of HTTP clients simultaneously, replicator separates independent transactions in fibers. Fibers can be though of as threads, except that they lack the overhead of real posix threads. Quoted from wikipedia:
"Typically fibers are implemented entirely in userspace. As a result, context switching between fibers in a process does not require any interaction with the kernel at all and is therefore extremely efficient: a context switch can be performed by locally saving the CPU registers used by the currently executing fiber and loading the registers required by the fiber to be executed. Since scheduling occurs in userspace, the scheduling policy can be more easily tailored to the requirements of the program's workload."
The last point can be considered an advantage over real threads, which have no control over place or time of switching and therefore require special care of being thread-safe. Fibers, on the other hand, switch at fixed points in code, which makes thread-safety considerably easier to achieve. The downside of fixed switching points is the risk of blocking calls in between, stalling the entire application. As the article continues:
"However, the use of blocking system calls in fibers can be problematic. If a fiber performs a system call that blocks, the other fibers in the process are unable to run until the system call returns. A typical example of this problem is when performing I/O: most programs are written to perform I/O synchronously. When an I/O operation is initiated, a system call is made, and does not return until the I/O operation has been completed. In the intervening period, the entire process is "blocked" by the kernel and cannot run, which starves other fibers in the same process from executing."
The above problem is a consequence of 'co-operative scheduling', which means that "a running fiber must explicitly 'yield' to allow another fiber to run". In practice - or at least, replicator's practice, one needs only assure that fibers yield right before every I/O operation, and are resumed only after this operation is guaranteed not to block. To this end, fibers communicate the current state of communication before yielding to the scheduler. Three states are supported:
- RECV: fiber is awaiting data
- SEND: fiber is queueing data
- WAIT: fiber is sleeping
Argument to the first two states is the target socket, which the scheduler adds to the total pool of 'select' monitored sockets. The third state will either wake the fiber after a certain time (WAIT #seconds), wake together with the first other fiber to resume operation (WAIT None), or force a queue run (WAIT 0). The latter is useful in case heavy processing risks stalling the server even without blocking system calls. Note again that the responsibility of yielding in time is with the programmer.
As of version 2.2 the Python language includes on object that can very easily be turned into the fiber just described: the generator. Complete with a yield statement to both put the fiber on hold and communitate the state to the server, this is precisely the kind of "resumable function" we are looking for. Other implementations of this idea speak of weightless threads, coroutines, microthreads, tasklets, etc. Also notable in this respect is Stackless Python, introducing true microthreads in the language itself, and stackless in PyPy. And, lastly, there is the asyncore module that is very similar to fibers except for its rigid framework. Indeed, the previous versions of replicator were based on this.
A Python fiber terminates when either it returns or raises an exception. The scheduler takes care that exceptions are handled without interfering with other fibers and prints an error message or, optionally, a complete traceback message. As for printing, the default behaviour is to withhold all output until the fiber terminates in order to have a better view of past per-fiber operation. This behaviour can be changed to direct output. Fibers need not worry about this; it is all handled by the scheduler which replaces sys.stdout with the configured printer upon fiber activation. This to prevent tedius passing around of various log objects.
Remains to explain how the system is put in action. This is by the fiber module's spawn method, which opens a socket at the specified localhost port and starts the main loop listening for incoming connections. For every such connection it creates an instance of the specified generator function. Two additional arguments are a timeout time, the time of inactivity after which a fiber is terminated, and a debug boolean flag to control printed output. The main loop is exited by sending SIGHUP or ^C, in either case raising a KeyboardInterrupt. Finally, note that a valid generator:
- takes two arguments: 1) the connecting socket and 2) its address
- yields only objects of type WAIT, SEND, RECV
Other than that there are no restrictions.