Commit 0b94a4d

Fix #89 missing closing bracket in CUDA
1 parent 342930b commit 0b94a4d

1 file changed: +1 -1 lines changed

lectures/L21.tex

Lines changed: 1 addition & 1 deletion
@@ -184,7 +184,7 @@ \section*{GPUs: Heterogeneous Programming}
Let's break it down. The calculation of forces between bodies takes some \texttt{float4} and \texttt{float3} arguments. In the Rust example, I made my own \texttt{Point} and \texttt{Acceleration} types. CUDA uses \textit{vector types}, which are a group of $n$ of that primitive type. So a \texttt{float4} is a grouping of four \texttt{float}s where the components are referred to as \texttt{x, y, z, w}. There exist vector types for the standard C primitives (e.g., \texttt{int, uint, float, double, char}, and some more) in sizes of 1 through 4. It's just a nice way to package up related values without needing a custom structure (although you can send structures in to kernels). When we get to the host code you'll see that I've had to modify its representation of the data as well.

187-
The function also is prefixed with \texttt{\_\_device\_\_} which indicates that it will be called from another function when running on the GPU. A \texttt{\_\_global\_\_} function (as the calculation of forces is called from the host, and such global functions can call device functions but not other global functions. Device functions can call only other device functions. So it makes it clear where the entry points are from host code. In some OOP-sense, you could consider the device functions to be ``private'', not that I encourage you to think that way.
187+
The function also is prefixed with \texttt{\_\_device\_\_} which indicates that it will be called from another function when running on the GPU. A \texttt{\_\_global\_\_} function (as the calculation of forces is called from the host, and such global functions can call device functions but not other global functions). Device functions can call only other device functions. So it makes it clear where the entry points are from host code. In some OOP-sense, you could consider the device functions to be ``private'', not that I encourage you to think that way.

The only other thing that really stands out is the \texttt{extern "C"} declaration at the beginning of the global function. This disables what is called \textit{name mangling} or \textit{name decoration}, which is to say a compiler trying to differentiate between multiple functions with the same name. If this is too compiler-magic to worry about, just place this magic spell in front of the function call and it prevents the compiler from telling you it can't find the function by the name you specified. More modern versions of \texttt{nvcc} may not have this problem, mind you.
