Add sgd docs #5208
Conversation
@SaviDahegaonkar I am facing an issue related to the workflow awaiting approval. How can I solve it?
Hi @SaviDahegaonkar, could you please review this PR when you get a chance? Let me know if any adjustments are needed.
Hey @pratik305,
I had a look at your file and suggested some changes; please make them as soon as possible. Also include an example section and a syntax section, as the related issue requires, and place your file in the correct path.
@@ -0,0 +1,61 @@
---
Title: 'Stochastic Gradient Desent'
Suggested change:
Title: 'Stochastic Gradient Desent'
Title: 'Stochastic Gradient Descent'
@@ -0,0 +1,61 @@
---
Title: 'Stochastic Gradient Desent'
Description: 'Stochastic Gradient Desent is optimizer algorithm that minimizes the loss functions in machine learning and deep learning models.'
Suggested change:
Description: 'Stochastic Gradient Desent is optimizer algorithm that minimizes the loss functions in machine learning and deep learning models.'
Description: 'Stochastic Gradient Descent is an optimizer algorithm that minimizes the loss functions in machine learning and deep learning models.'
Subjects:
- 'Machine Learning'
- 'Deep Learning'
- 'Computer Science'
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Include only those subjects that are part of the subjects.md file; you can add one if it is not in the list. Here, Deep Learning is not part of the subjects.md file, so you can add it there if you want to include it in your PR.
- 'AI'
- 'Neural Network'
- 'Optimizer'
Include only those tags that are part of the tags.md list. Here, Optimizer is not part of the tags list, so add it to tags.md so that you can use it in your PR.
- 'paths/data-science'
---

**Stochastic Gradient Descent** (SGD) is a optimization algorithm. It is variant of gradient descent optimizer. The SGD minimize the loss function of machine learning algorithms and deep learning algorithms during backpropagation to update the weight and bias in Artificial Neural Networks.
Suggested change:
**Stochastic Gradient Descent** (SGD) is a optimization algorithm. It is variant of gradient descent optimizer. The SGD minimize the loss function of machine learning algorithms and deep learning algorithms during backpropagation to update the weight and bias in Artificial Neural Networks.
**Stochastic Gradient Descent** (SGD) is an optimization algorithm. It is variant of gradient descent optimizer. The SGD minimizes the loss function of machine learning algorithms and deep learning algorithms during backpropagation to update the weights and biases in Artificial Neural Networks.
The term stochastic mean randomness on which algorithm based upon. In this algorithm instead of taking whole dataset like grdient descent we take single randomly selected data point or small batch of data.suppose if the data set contains 500 rows SGD update the model parameters 500 times in one cycle or one epoch.
Suggested change:
The term stochastic mean randomness on which algorithm based upon. In this algorithm instead of taking whole dataset like grdient descent we take single randomly selected data point or small batch of data.suppose if the data set contains 500 rows SGD update the model parameters 500 times in one cycle or one epoch.
The term `stochastic` means randomness on which the algorithm is based. In this algorithm, instead of taking whole datasets like `gradient descent`, we take single randomly selected data points or small batches of data. Suppose if the data set contains 500 rows SGD updates the model parameters 500 times in one cycle or one epoch.
Suggested change:
$$
\large \theta = \theta - \alpha * \nabla J((\theta ; x_iy_i))
$$
$$
\large \theta = \theta - \alpha \cdot \nabla J(\theta ; x_i, y_i)
$$
Typically, \cdot is used for multiplication to keep the LaTeX notation standard; the * symbol is not used.
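As a purely illustrative aside (the numbers here are hypothetical, not from the PR), a single step of this update rule with a scalar parameter $\theta = 0.50$, learning rate $\alpha = 0.01$, and a per-sample gradient $\nabla J(\theta ; x_i, y_i) = 2.0$ works out to:

$$
\theta \leftarrow 0.50 - 0.01 \cdot 2.0 = 0.48
$$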
Suggested change:
## Advantages
- **Faster convergence:** SGD updates parameters more frequently hence it takes less time to converge especially for large datasets.
- **Reduced Computation Time:** SDD takes only subset of dataset or batch for each update. This makes it easy to handle large datasets and compute faster.
- **Avoid Local Minima:** The noise introduced by updating parameters with individual data points or small batches can help escape local minima.This can potentially lead to better solutions in complex, non-convex optimization problems.
- **Online Learning:** SGD can be used in scenarios where data is arriving sequentially (online learning).- It allows models to be updated continuously as new data comes in.
## Advantages
- **Faster convergence:** SGD updates parameters more frequently hence it takes less time to converge especially for large datasets.
- **Reduced Computation Time:** SGD takes only a subset of dataset or batch for each update. This makes it easy to handle large datasets and compute faster.
- **Avoid Local Minima:** The noise introduced by updating parameters with individual data points or small batches can help escape local minima.This can potentially lead to better solutions in complex, non-convex optimization problems.
- **Online Learning:** SGD can be used in scenarios where data is arriving sequentially (online learning).- It allows models to be updated continuously as new data comes in.
Suggested change:
## Practical Tips And Tricks When Using SGD
- Shuffle data before training
- Use mini batches(batch size 32)
- Normalize input
- Choose suitable learning rate (0.01)
## Practical Tips And Tricks When Using SGD
- Shuffle data before training
- Use mini batches(batch size 32)
- Normalize input
- Choose a suitable learning rate (0.01)
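To make these tips concrete, here is a minimal sketch using scikit-learn's SGDClassifier; the dataset and hyperparameters are illustrative assumptions, not part of the PR, and the same ideas apply to other libraries:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Toy dataset standing in for real training data.
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

# StandardScaler normalizes the inputs; SGDClassifier shuffles the data
# each epoch (shuffle=True) and uses a constant learning rate of 0.01.
model = make_pipeline(
    StandardScaler(),
    SGDClassifier(learning_rate="constant", eta0=0.01, shuffle=True, max_iter=1000),
)
model.fit(X, y)
print(model.score(X, y))
```

Mini-batch updates (for example, batch size 32) are not exposed directly through fit here, but they can be approximated by calling partial_fit on successive batches.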
@SaviDahegaonkar all changes are done.
Hey @pratik305,
I have suggested a few grammatical corrections and also included a relevant syntax and example section. Please make the changes as soon as possible.
Thanks,
Savi
- 'Computer Science'
Tags:
- 'AI'
- 'Neural Network'
Suggested change:
- 'Neural Network'
- 'Neural Networks'
@@ -0,0 +1,106 @@
---
Title: 'Stochastic Gradient Descent'
Description: 'Stochastic Gradient Descent is an optimizer algorithm that minimizes the loss functions in machine learning and deep learning models.'
Suggested change:
Description: 'Stochastic Gradient Descent is an optimizer algorithm that minimizes the loss functions in machine learning and deep learning models.'
Description: 'Stochastic Gradient Descent is an optimizer algorithm that minimizes the loss function in machine learning and deep learning models.'
- 'paths/data-science'
---

**Stochastic Gradient Descent** (SGD) is an optimization algorithm. It is variant of gradient descent optimizer. The SGD minimizes the loss function of machine learning algorithms and deep learning algorithms during backpropagation to update the weights and biases in Artificial Neural Networks.
Suggested change:
**Stochastic Gradient Descent** (SGD) is an optimization algorithm. It is variant of gradient descent optimizer. The SGD minimizes the loss function of machine learning algorithms and deep learning algorithms during backpropagation to update the weights and biases in Artificial Neural Networks.
**Stochastic Gradient Descent** (SGD) is an optimization algorithm. It is a variant of gradient descent optimizer. The SGD minimizes the loss function of machine learning algorithms and deep learning algorithms during backpropagation to update the weights and biases in Artificial Neural Networks.
- At each iteration, a random sample is selected from the training dataset.
- The gradient of the cost function with respect to the model parameters is computed based on the selected sample.
- The model parameters are updated using the computed gradient and the learning rate.
- The process is repeated for multiple iterations until convergence or a specified number of epochs.
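As an illustration only (this sketch is not part of the reviewed file), the four steps above for a linear model with a squared-error loss could be written in NumPy as:

```python
import numpy as np

rng = np.random.default_rng(0)

def sgd_linear_regression(X, y, learning_rate=0.01, n_iterations=5000):
    theta = np.zeros(X.shape[1])
    for _ in range(n_iterations):
        # 1. Select a random sample from the training dataset.
        i = rng.integers(len(y))
        # 2. Compute the gradient of the squared error for that sample.
        error = X[i] @ theta - y[i]
        gradient = 2 * error * X[i]
        # 3. Update the parameters using the gradient and the learning rate.
        theta -= learning_rate * gradient
    # 4. The loop repeats until the iteration budget (or convergence) is reached.
    return theta

# Tiny noiseless example: y = 1 + 2 * x, with a column of ones as the bias term.
X = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y = np.array([3.0, 5.0, 7.0])
print(sgd_linear_regression(X, y))  # Approaches [1.0, 2.0]
```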
## Syntax
- Learning Rate (α): A hyperparameter that controls the size of the update step.
- Number of Iterations: The number of times the algorithm will iterate over the dataset.
- Loss Function: The function that measures the error of the model predictions.
- Gradient Calculation: The method for computing gradients based on the loss function.
The syntax must be included inside backticks in a pseudo code block. I can't find any syntax here. Please include one.
@SaviDahegaonkar there is no single syntax for SGD. We can use SGD with TensorFlow, PyTorch, NumPy, or scikit-learn, and each one has a different syntax. What should I do?
Can I add syntax like this:
SGD(learning_rate, n_iterations, loss_function, gradient_calculation)
and then add code like:
def stochastic_gradient_descent(X, y, theta, learning_rate, n_iterations):
    for iteration in range(n_iterations):
        for i in range(len(y)):
            gradient = compute_gradient(X[i], y[i], theta)
            theta -= learning_rate * gradient
    return theta
@SaviDahegaonkar sir, please reply soon so that I can make the changes.
Yes, sure you can; I will approve this PR.
@SaviDahegaonkar all done.
Hey @pratik305,
LGTM!
Thanks,
Savi
Suggested change:
def stochastic_gradient_descent(X, y, theta, learning_rate, n_iterations):
for iteration in range(n_iterations):
for i in range(len(y)):
gradient = compute_gradient(X[i], y[i], theta)
theta -= learning_rate * gradient
return theta
def stochastic_gradient_descent(X, y, theta, learning_rate, n_iterations):
    for iteration in range(n_iterations):
        for i in range(len(y)):
            gradient = compute_gradient(X[i], y[i], theta)
            theta -= learning_rate * gradient
    return theta
Added some proper indentation.
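For readers following along, a self-contained version of this snippet might look as follows; the compute_gradient helper is a hypothetical stand-in that assumes a squared-error loss for a linear model:

```python
import numpy as np

# Hypothetical helper: gradient of the squared-error loss for one sample.
def compute_gradient(x_i, y_i, theta):
    prediction = np.dot(x_i, theta)
    return 2 * (prediction - y_i) * x_i

def stochastic_gradient_descent(X, y, theta, learning_rate, n_iterations):
    for iteration in range(n_iterations):
        for i in range(len(y)):
            gradient = compute_gradient(X[i], y[i], theta)
            theta -= learning_rate * gradient
    return theta

# Toy data where y = 2 * x, so theta should move toward [2.0].
X = np.array([[1.0], [2.0], [3.0], [4.0]])
y = np.array([2.0, 4.0, 6.0, 8.0])
theta = stochastic_gradient_descent(X, y, np.zeros(1), learning_rate=0.01, n_iterations=100)
print(theta)  # Roughly [2.0]
```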
@avdhoottt all changes are done. Can you review this, please?
@avdhoottt when will it get merged?
@avdhoottt When will this get merged? Tell me if there is anything to change.
@Maheshwaran17 can you please check this?
@avdhoottt when will it get merged, sir? Is there any problem?
LGTM!
Hey @pratik305, sorry for the late reply. I'm merging this PR. Thank you so much for contributing!
👋 @pratik305 🎉 Your contribution(s) can be seen here: https://www.codecademy.com/resources/docs/ai/neural-networks/stochastic-gradient-descent Please note it may take a little while for changes to become visible.
@avdhoottt thank you, sir.
Description
This pull request introduces a new documentation file on Stochastic Gradient Descent (SGD) in the stochastic-gradient-descent.md file. SGD is a widely used optimization algorithm in machine learning and deep learning due to its efficiency and scalability, particularly for large datasets. However, its effectiveness and stability can be significantly influenced by how it is implemented and tuned. This new documentation aims to offer clear, comprehensive insights into SGD’s operation, benefits, and limitations, helping users to better understand and apply this algorithm in their projects.
Issue Solved
Closes #4527
Type of Change
Checklist
- My changes are based off of the main branch.
- The related issue is linked under the Issues Solved section.