diff --git a/README.md b/README.md
index 1b1e9cf..642717c 100644
--- a/README.md
+++ b/README.md
@@ -26,7 +26,7 @@ Check out oour [Get Started Guide](https://fszewczyk.github.io/shkyera-grad/md_d
 
 Here's a small example showcasing a feed-forward network learning the XOR function. Check out the `examples/` folder for more examples.
 
-```cpp
+```{.cpp}
 #include "shkyera-grad/include/ShkyeraGrad.hpp"
 
 int main() {
diff --git a/docs/tutorials/Cheatsheet.md b/docs/tutorials/Cheatsheet.md
index 364df21..8b057aa 100644
--- a/docs/tutorials/Cheatsheet.md
+++ b/docs/tutorials/Cheatsheet.md
@@ -6,7 +6,7 @@ This page contains all the info you need to develop your models using Shkyera Gr
 
 Almost all of the classes in _Shkyera Grad_ are implemented using templates. To simplify creation of these objects, we introduced a standard way to instantiate objects with floating-point template parameters, i.e.
 
-```cpp
+```{.cpp}
 Linear32 = Linear
 Optimizer32 = Optimizer>
 Loss::MSE64 = Loss::MSE
@@ -20,7 +20,7 @@ Adam64 = Adam
 
 Here's a full list of available layers:
 
-```cpp
+```{.cpp}
 auto linear = Linear32::create(inputSize, outputSize);
 auto dropout = Dropout32::create(inputSize, outputSize, dropoutRate);
 ```
@@ -29,7 +29,7 @@ auto dropout = Dropout32::create(inputSize, outputSize, dropoutRate);
 
 These are all implemented optimizers:
 
-```cpp
+```{.cpp}
 auto simple = Optimizer32(network->parameters(), learningRate);
 auto sgdWithMomentum = SGD32(network->parameters(), learningRate, momentum = 0.9);
 auto adam = Adam32(network->parameters(), learningRate, beta1 = 0.9, beta2=0.999, epsilon=1e-8);
@@ -39,7 +39,7 @@ auto adam = Adam32(network->parameters(), learningRate, beta1 = 0.9, beta2=0.999
 
 Optimization can be performed according to these predefined loss functions:
 
-```cpp
+```{.cpp}
 auto L1 = Loss::MAE32;
 auto L2 = Loss::MSE32;
 auto crossEntropy = Loss::CrossEntropy32;
@@ -49,7 +49,7 @@ auto crossEntropy = Loss::CrossEntropy32;
 
 Simply copy-pase this code to quickly train your network:
 
-```cpp
+```{.cpp}
 using T = Type::float32; // feel free to change it to float64
 
 auto optimizer = Adam(network->parameters(), 0.05);
diff --git a/docs/tutorials/GetStarted.md b/docs/tutorials/GetStarted.md
index 5894c85..f750ee4 100644
--- a/docs/tutorials/GetStarted.md
+++ b/docs/tutorials/GetStarted.md
@@ -12,7 +12,7 @@ git clone https://github.com/fszewczyk/shkyera-grad.git
 
 and import the main file of the library inside your own project.
 
-```cpp
+```{.cpp}
 #include "shkyera-grad/include/ShkyeraGrad.hpp"
 ```
 
@@ -24,7 +24,7 @@ Now, you can use all the features of this small engine.
 
 Internally, _Shkyera Grad_ **always** operates on individual scalars. For most purposes, you do not need to deal with them directly, but it's nice to understand how they work. Each scalar is wrapped inside a `Value` class. However, you should never instantiate objects of this type yourself. Instead, you should use the provided interface in the following way.
 
-```cpp
+```{.cpp}
 // Creates a floating-point scalar
 ValuePtr a = Value::create(5.2);
 ValuePtr a = Value::create(5.2);
@@ -45,7 +45,7 @@ auto c = Value::create(7);
 
 You can also perform various operations directly on scalars!
 
-```cpp
+```{.cpp}
 using T = Type::float32;
 
 auto a = Value::create(2.1);
@@ -60,7 +60,7 @@ auto e = (a + b - c)->pow(d);
 
 The magic behind the _Shkyera Grad_ is that it keeps track of all the operations, so that you can later calculate the derivatives of your expression.
 
-```cpp
+```{.cpp}
 auto a = Value::create(2.0);
 auto b = Value::create(3.0);
 auto c = a * b;
@@ -78,7 +78,7 @@ If you want some refreshment on derivatives, check out [this wonderful video](ht
 
 Multiple scalars can be grouped together in a `Vector` to simplify operating on them. Input to any `Module` (more on them later) is a `Vector`. This abstraction provides some functionality that allows you to compute, for example a dot product.
 
-```cpp
+```{.cpp}
 // The easiest way to create a Vector
 auto a = Vector::of({1, 2, 3});
 
@@ -104,7 +104,7 @@ Nice! You got the basics! Let's build a network. The best way to create a model
 
 You can create your first neural network using `SequentialBuilder` in the following way.
 
-```cpp
+```{.cpp}
 auto network = SequentialBuilder::begin()
     .add(Linear::create(2, 15)) // Adds a layer with 2 inputs and 15 outputs
     .add(ReLU::create()) // Adds a ReLU activation function
@@ -122,7 +122,7 @@ auto network = SequentialBuilder::begin()
 
 To train our network, we need to define an `Optimizer` that will optimizer the parameters as well as the `Loss` function that we will minimize. _Shkyera Grad_ comes with a set of well-known optimizers and loss functions. Again, check out the Cheat Sheet for a complete list.
 
-```cpp
+```{.cpp}
 // Simple stochastic gradient descent optimizer with 0.01 learning rate
 auto optimizer = Optimizer(network->parameters(), 0.01);
 
@@ -141,7 +141,7 @@ auto awesomeCustomOptimizer = Adam32(network->parameters(), 0.01, beta1, beta2,
 
 Here's a list of some available `Loss` functions:
 
-```cpp
+```{.cpp}
 Loss::MAE // Mean Absolute Error
 Loss::MSE // Mean Squared Error
 Loss::CrossEntropy // Cross Entropy Loss - good for classification
@@ -164,7 +164,7 @@ XOR (Exclusive OR) is a simple Boolean function that maps two values two one:
 
 Here, we basically pase the table above into `Vector`s.
 
-```cpp
+```{.cpp}
 std::vector xs;
 std::vector ys;
 
@@ -179,7 +179,7 @@ xs.push_back(Vec32::of({1, 1})); ys.push_back(Vec32::of({0}));
 
 We define a simple neural network to predict this function. Our network has a total of three layers. It is a bit of an overkill for this task, but we will use it for learning purposes.
 
-```cpp
+```{.cpp}
 auto network = SequentialBuilder::begin()
     .add(Linear32::create(2, 15))
     .add(ReLU32::create())
@@ -194,14 +194,14 @@ auto network = SequentialBuilder::begin()
 
 Now, we just need to specify the optimizer and the loss function we want to use:
 
-```cpp
+```{.cpp}
 auto optimizer = Adam32(network->parameters(), 0.05);
 auto lossFunction = Loss::MSE;
 ```
 
 We train our model for 100 epochs. After each epoch, we pring the average loss.
 
-```cpp
+```{.cpp}
 for (size_t epoch = 0; epoch < 100; epoch++) { // We train for 100 epochs
     auto epochLoss = Val32::create(0);
 
@@ -223,7 +223,7 @@ for (size_t epoch = 0; epoch < 100; epoch++) { // We train for 100
 
 After the training, let's inspect how our network behaves.
 
-```cpp
+```{.cpp}
 for (size_t sample = 0; sample < xs.size(); ++sample) { // Go through each example
     Vec32 pred = network->forward(xs[sample]); // Predict result
     std::cout << xs[sample] << " -> " << pred << "\t| True: " << ys[sample] << std::endl;