lectures/L30.tex: 9 additions & 9 deletions
@@ -13,7 +13,7 @@ \section*{Clusters and Cloud Computing}
multiple threads or multiple processes, you can do the same with multiple
computers. We'll survey techniques for programming for
performance using multiple computers; although there's overlap with
- distributed systems, we're looking more at calculations here.
+ distributed systems, we're looking more at calculations here rather than coordination mechanisms.
\paragraph*{Message Passing.} Rust encourages message-passing, but
a lot of your previous experience when working with C may have centred around
@@ -30,12 +30,12 @@ \section*{Clusters and Cloud Computing}
Interface}, a de facto standard for programming message-passing multi-
computer systems. This is, unfortunately, no longer the way.
MPI sounds good, but in practice people tend to use other things.
- Here's a detailed piece about the relevance of MPI today:~\cite{hpcmpi}, if
+ Here's a detailed piece about the relevance of MPI as of 10 years ago:~\cite{hpcmpi}, if
you are curious.
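As a reminder of what the message-passing model the notes refer to looks like within a single program, here is a minimal sketch using Rust's standard {\tt std::sync::mpsc} channel (the worker count and the toy computation are arbitrary choices for illustration); MPI and the tools below apply the same send/receive idea across machines rather than threads.

\begin{verbatim}
use std::sync::mpsc;
use std::thread;

fn main() {
    // Channel: the sending half goes to the workers, the receiving
    // half stays with the coordinator. No shared mutable state.
    let (tx, rx) = mpsc::channel();

    for id in 0..4u64 {
        let tx = tx.clone();
        thread::spawn(move || {
            // Each worker computes a partial result and sends it as a message.
            let partial_sum: u64 = (0..1_000u64).map(|x| x + id).sum();
            tx.send((id, partial_sum)).expect("receiver hung up");
        });
    }
    drop(tx); // drop the original sender so the receive loop can terminate

    for (id, partial) in rx {
        println!("worker {id} sent {partial}");
    }
}
\end{verbatim}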
\paragraph{REST}
We've already seen asynchronous I/O using HTTP (curl) which we could use to
- consume a REST API as one mechanism for multi-computer communication. You
+ interact with a REST API as one mechanism for multi-computer communication. You
may have also learned about sockets and know how to use those, which would
underlie a lot of the mechanisms we're discussing. The socket approach is too
low-level for what we want to discuss, while the REST API approach is at a
@@ -54,11 +54,12 @@ \section*{Clusters and Cloud Computing}
Communication is based around the idea of producers writing a record (some data element, like an invoice) into a topic (categorizing messages) and consumers taking the item from the topic and doing something useful with it. A message remains available for a fixed period of time and can be replayed if needed. I think at this point you have enough familiarity with the concept of the producer-consumer problem and channels/topics/subscriptions that we don't need to spend a lot of time on it.
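A producer along these lines might look roughly like the following sketch. It assumes the third-party {\tt rdkafka} crate (bindings to librdkafka) and a broker reachable at {\tt localhost:9092}; the topic name {\tt invoices} and the key/payload are made up for illustration.

\begin{verbatim}
// Sketch only: assumes the `rdkafka` crate as a dependency and a
// Kafka broker running on localhost:9092.
use rdkafka::config::ClientConfig;
use rdkafka::producer::{BaseProducer, BaseRecord, Producer};
use std::time::Duration;

fn main() {
    let producer: BaseProducer = ClientConfig::new()
        .set("bootstrap.servers", "localhost:9092")
        .create()
        .expect("producer creation error");

    // Write one record (an "invoice") into the `invoices` topic; the key
    // influences which partition the record lands in.
    producer
        .send(
            BaseRecord::to("invoices")
                .key("customer-42")
                .payload("{\"total\": 18.99}"),
        )
        .unwrap_or_else(|(e, _)| panic!("enqueue failed: {:?}", e));

    // Make sure the record actually goes out before the program exits.
    let _ = producer.flush(Duration::from_secs(5));
}
\end{verbatim}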
- Kafka's basic strategy is to write things into an immutable log. The log is split into different partitions; you choose how many when creating the topic, where more partitions equals higher parallelism. The producer writes something and it goes into one of the partitions. Consumers read from each one of the partitions and writes down its progress (``commit its offset'') to keep track of how much of the topic it has consumed. See this image from \url{kafka.apache.org}:
+ Kafka's basic strategy is to write things into an immutable log. The log is split into different partitions; you choose how many when creating the topic, where more partitions equals higher parallelism. The producer writes something and it goes into one of the partitions. Consumers read from each one of the partitions and writes down its progress (``commits its offset'') to keep track of how much of the topic it has consumed. See this image from \url{kafka.apache.org}:
The nice part about such an architecture is that we can provision the parallelism that we want, and the logic for the broker (the system between the producer and the consumer, that is, Kafka) is simple. Also, consumers can take items and deal with them at their own speed and there's no need for consumers to coordinate; they manage their own offsets. Messages are removed from the topic based on their expiry, so it's not important for consumers to get them out of the queue as quickly as possible.
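To make the partition-and-offset bookkeeping concrete, here is a toy in-memory model in plain Rust (this is not Kafka's actual API): a topic is a set of append-only partition logs, the producer hashes the key to choose a partition, and each consumer commits its own offset per partition.

\begin{verbatim}
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// A topic: several append-only logs ("partitions").
struct Topic {
    partitions: Vec<Vec<String>>,
}

impl Topic {
    fn new(num_partitions: usize) -> Self {
        Topic { partitions: vec![Vec::new(); num_partitions] }
    }

    /// Producer side: the record's key decides which partition it goes to.
    fn produce(&mut self, key: &str, record: &str) {
        let mut h = DefaultHasher::new();
        key.hash(&mut h);
        let p = (h.finish() as usize) % self.partitions.len();
        self.partitions[p].push(record.to_string());
    }
}

/// Consumer side: remembers how far it has read ("committed offset")
/// in each partition, independently of any other consumer.
struct Consumer {
    offsets: HashMap<usize, usize>,
}

impl Consumer {
    fn new() -> Self {
        Consumer { offsets: HashMap::new() }
    }

    /// Read everything new in one partition, then commit the offset.
    fn poll(&mut self, topic: &Topic, partition: usize) -> Vec<String> {
        let start = *self.offsets.get(&partition).unwrap_or(&0);
        let log = &topic.partitions[partition];
        let new = log[start..].to_vec();
        self.offsets.insert(partition, log.len()); // "commit its offset"
        new
    }
}

fn main() {
    let mut topic = Topic::new(3);
    topic.produce("customer-1", "invoice A");
    topic.produce("customer-2", "invoice B");
    topic.produce("customer-1", "invoice C");

    let mut consumer = Consumer::new();
    for p in 0..3 {
        for record in consumer.poll(&topic, p) {
            println!("partition {p}: {record}");
        }
    }
}
\end{verbatim}

A real deployment stores committed offsets in Kafka itself rather than in consumer memory, but the point is the same: each consumer's progress is its own bookkeeping, so consumers need not coordinate with each other.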
@@ -104,8 +105,7 @@ \subsection*{Cloud Computing}
instances, that you've started up. Providers offer different instance
sizes, where the sizes vary according to the number of cores, local
storage, and memory. Some instances even have GPUs, but it seemed
- uneconomic to use this for Assignment 3, at least in previous years (I
- have not done the calculation this year).
+ uneconomic to use this for Assignment 3.
Instead we have the {\tt ecetesla} machines.
\paragraph{Launching Instances.} When you need more compute power,
@@ -152,7 +152,7 @@ \section*{Clusters versus Laptops}
\paragraph{Results.} 128 cores don't consistently beat a laptop at PageRank: e.g. 249--857s on the twitter\_rv dataset for the big data system vs 300s for the laptop, and they are 2$\times$ slower for label
propagation, at 251--1784s for the big data system vs 153s on