diff --git a/bin/concept-of-the-week.txt b/bin/concept-of-the-week.txt index b2e6e340036..d1c5b89855e 100644 --- a/bin/concept-of-the-week.txt +++ b/bin/concept-of-the-week.txt @@ -1 +1 @@ -content/ruby/concepts/gems/gems.md \ No newline at end of file +content/c/concepts/user-input/user-input.md \ No newline at end of file diff --git a/content/ai/concepts/neural-networks/terms/convolutional-neural-networks/convolutional-neural-networks.md b/content/ai/concepts/neural-networks/terms/convolutional-neural-networks/convolutional-neural-networks.md index 40479a40768..2925ed6fa5e 100644 --- a/content/ai/concepts/neural-networks/terms/convolutional-neural-networks/convolutional-neural-networks.md +++ b/content/ai/concepts/neural-networks/terms/convolutional-neural-networks/convolutional-neural-networks.md @@ -1,6 +1,6 @@ --- Title: 'Convolutional Neural Networks' -Description: 'Convolutional Neural Networks are a type of neural network that are primarily used for computer vision tasks, such as image classification, object detection, and semantic segmentation.' +Description: 'Convolutional Neural Networks (CNNs) are neural networks primarily used for computer vision tasks like image classification, object detection, and segmentation.' Subjects: - 'Machine Learning' - 'Computer Science' diff --git a/content/ai/concepts/neural-networks/terms/gradient-descent/gradient-descent.md b/content/ai/concepts/neural-networks/terms/gradient-descent/gradient-descent.md new file mode 100644 index 00000000000..ef4f66b5957 --- /dev/null +++ b/content/ai/concepts/neural-networks/terms/gradient-descent/gradient-descent.md @@ -0,0 +1,112 @@ +--- +Title: 'Gradient Descent' +Description: 'Gradient Descent is an optimization algorithm that minimizes a cost function by iteratively adjusting parameters in the direction of the negative gradient.' 
+Subjects: + - 'Machine Learning' + - 'Data Science' + - 'Computer Science' +Tags: + - 'AI' + - 'Machine Learning' + - 'Neural Networks' + - 'Functions' +CatalogContent: + - 'paths/data-science' + - 'paths/machine-learning' +--- + +**Gradient Descent** is an optimization algorithm commonly used in machine learning and neural networks to minimize a cost function. Its goal is to iteratively find the optimal parameters (weights) that minimize the error or loss. + +In neural networks, gradient descent computes the gradient (derivative) of the cost function with respect to each parameter. It then updates the parameters in the direction of the negative gradient, effectively reducing the cost with each step. + +## Types of Gradient Descent + +There are three main types of gradient descent: + +| Type | Description | +| ------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | +| **Batch Gradient Descent** | Uses the entire dataset to compute the gradient and update the weights. Typically slower but more accurate for large datasets. | +| **Stochastic Gradient Descent (SGD)** | Uses a single sample to compute the gradient and update the weights. It is faster, but the updates are noisy and can cause fluctuations in the convergence path. | +| **Mini-batch Gradient Descent** | A compromise between batch and stochastic gradient descent, using a small batch of samples to compute the gradient. It balances the speed and accuracy of the learning process. | + +## Gradient Descent Update Rule + +The basic update rule for gradient descent is: + +```pseudo +theta = theta - learning_rate * gradient_of_cost_function +``` + +- `theta`: The parameter (weight) of the model that is being optimized. +- `learning_rate`: A hyperparameter that controls the step size. 
+- `gradient_of_cost_function`: The gradient (derivative) of the cost function with respect to the parameters, indicating the direction and magnitude of the change needed. + +## Syntax + +Here's a basic syntax for Gradient Descent in the context of machine learning, specifically for updating the model parameters (weights) in order to minimize the cost function: + +```pseudo +# Initialize parameters (weights) and learning rate +theta = initial_value # Model Parameters (weights) +learning_rate = value # Learning rate (step size) +iterations = number_of_iterations # Number of iterations + +# Repeat until convergence +for i in range(iterations): + # Calculate the gradient of the cost function + gradient = compute_gradient(X, y, theta) # Gradient calculation + + # Update the parameters (weights) + theta = theta - learning_rate * gradient # Update rule + + # Optionally, compute and store the cost (for monitoring convergence) + cost = compute_cost(X, y, theta) + store(cost) +``` + +## Example + +In the following example, we implement simple gradient descent to minimize the cost function of a linear regression problem: + +```py +import numpy as np + +# Sample data (X: inputs, y: actual outputs) +X = np.array([1, 2, 3, 4, 5]) +y = np.array([1, 2, 1.3, 3.75, 2.25]) + +# Parameters initialization +theta = 0.0 # Initial weight +learning_rate = 0.01 # Step size +iterations = 1000 # Number of iterations + +# Cost function (Mean Squared Error) +def compute_cost(X, y, theta): + m = len(y) + cost = (1/(2*m)) * np.sum((X*theta - y)**2) # The cost function for linear regression + return cost + +# Gradient Descent function +def gradient_descent(X, y, theta, learning_rate, iterations): + m = len(y) + cost_history = [] + + for i in range(iterations): + gradient = (1/m) * np.sum(X * (X*theta - y)) # Derivative of cost function + theta = theta - learning_rate * gradient # Update theta + cost_history.append(compute_cost(X, y, theta)) # Track cost + return theta, cost_history + +# Run 
Gradient Descent +theta_optimal, cost_history = gradient_descent(X, y, theta, learning_rate, iterations) + +print(f"Optimal Theta: {theta_optimal}") +``` + +The output for the above code will be something like this: + +```shell +Optimal Theta: 0.6390909090909086 +``` + +> **Note**: The optimal `theta` value will be an approximation, as the gradient descent approach iteratively updates the weight to reduce the cost function. diff --git a/content/blockchain/blockchain.md b/content/blockchain/blockchain.md new file mode 100644 index 00000000000..516db996db0 --- /dev/null +++ b/content/blockchain/blockchain.md @@ -0,0 +1,29 @@ +--- +Title: 'Blockchain' +Description: 'Blockchain is a decentralized ledger that securely records transactions, ensuring transparency, trust, and immutability without a central authority.' +Codecademy Hub Page: 'https://www.codecademy.com/catalog/subject/blockchain' +CatalogContent: + - 'rust-for-programmers' + - 'paths/computer-science' +--- + +**Blockchain** is a decentralized and distributed digital ledger that securely records transactions across multiple nodes in a network. It ensures data integrity through cryptographic techniques and transparency by allowing participants to access an immutable, shared history of transactions. By eliminating the need for a central authority, blockchain enables trust and collaboration in various applications, from cryptocurrencies to supply chain management. + +Blockchain’s origins trace back to 1991 when cryptographers Stuart Haber and W. Scott Stornetta introduced a system for timestamping digital documents. The technology gained prominence in 2008 with Satoshi Nakamoto’s creation of Bitcoin, the first decentralized cryptocurrency using blockchain as its backbone. Over time, its applications have expanded beyond cryptocurrencies to include smart contracts, supply chain management, and enterprise solutions. 
+ +Key principles of Blockchain include: + +- **Decentralization**: No central authority; data is shared across nodes. +- **Cryptographic Security**: Data integrity ensured by encryption. +- **Consensus Mechanisms**: Agreement protocols like Proof of Work or Proof of Stake. + +## Types of Blockchains + +1. **Public Blockchains**: + Open and decentralized networks where anyone can participate, read, or write data. These blockchains prioritize transparency and security, making them ideal for cryptocurrencies (e.g., Bitcoin, Ethereum). However, they may face scalability challenges and require significant energy for consensus. + +2. **Private Blockchains**: + Permissioned networks with restricted access, used by organizations to enhance efficiency and control. Only authorized participants can interact with the network, making it suitable for use cases like supply chain management or internal data sharing (e.g., Hyperledger, Corda). + +3. **Consortium Blockchains**: + Blockchains managed collaboratively by a group of organizations. These hybrid systems strike a balance between decentralization and controlled access, often used in industries where shared authority is required, such as banking or healthcare (e.g., R3 Corda, Quorum). diff --git a/content/lua/concepts/strings/terms/lower/lower.md b/content/lua/concepts/strings/terms/lower/lower.md index 18f9b5c09df..078483c8fee 100644 --- a/content/lua/concepts/strings/terms/lower/lower.md +++ b/content/lua/concepts/strings/terms/lower/lower.md @@ -1,5 +1,5 @@ --- -Title: 'lower()' +Title: '.lower()' Description: 'Returns a copy of the string given, with all uppercase characters transformed to lowercase.' 
Subjects: - 'Code Foundations' diff --git a/content/numpy/concepts/built-in-functions/terms/sort/sort.md b/content/numpy/concepts/built-in-functions/terms/sort/sort.md new file mode 100644 index 00000000000..87e8b86680d --- /dev/null +++ b/content/numpy/concepts/built-in-functions/terms/sort/sort.md @@ -0,0 +1,79 @@ +--- +Title: '.sort()' +Description: 'Sorts an array in ascending order along the specified axis and returns a sorted copy of the input array.' +Subjects: + - 'Computer Science' + - 'Data Science' +Tags: + - 'Arrays' + - 'Functions' + - 'NumPy' +CatalogContent: + - 'learn-python-3' + - 'paths/data-science' +--- + +In NumPy, the **`.sort()`** function sorts the elements of an array or matrix along a specified axis. It returns a new array with elements sorted in ascending order, leaving the original array unchanged. Sorting can be performed along different axes (such as rows or columns in a 2D array), with the default being along the last axis (`axis=-1`). + +## Syntax + +```pseudo +numpy.sort(a, axis=-1, kind=None, order=None) +``` + +- `a`: The array of elements to be sorted. +- `axis`: The axis along which to sort. If set to `None`, the array is flattened before sorting. The default is `-1`, which sorts along the last axis. +- `kind`: The sorting algorithm to use. The options are: + - [`'quicksort'`](https://www.codecademy.com/resources/docs/general/algorithm/quick-sort): Default algorithm, a fast, comparison-based algorithm. + - [`'mergesort'`](https://www.codecademy.com/resources/docs/general/algorithm/merge-sort): Stable sort using a divide-and-conquer algorithm. + - [`'heapsort'`](https://www.codecademy.com/resources/docs/general/algorithm/heap-sort): A comparison-based sort using a heap. + - `'stable'`: A stable sorting algorithm, typically mergesort. +- `order`: If `a` is a structured array, this specifies the field(s) to sort by. If not provided, sorting will be done based on the order of the fields in `a`. 
+ +## Example + +The following example demonstrates how to use the `.sort()` function with various parameters: + +```py +import numpy as np + +arr = np.array([[3, 1, 2], [6, 4, 5]]) + +print(np.sort(arr)) +print(np.sort(arr, axis=0)) +print(np.sort(arr, axis=None)) +``` + +This example results in the following output: + +```shell +[[1 2 3] + [4 5 6]] +[[3 1 2] + [6 4 5]] +[1 2 3 4 5 6] +``` + +## Codebyte Example + +Run the following codebyte example to better understand the `.sort()` function: + +```codebyte/python +import numpy as np + +arr = np.array([[23, 54, 19], [45, 34, 12]]) + +print("Original array:") +print(arr) + +# Sort along axis 0 (sort by columns) +print("\nSorted array along axis 0 (columns):") +print(np.sort(arr, axis=0)) + +# Sort along axis 1 (sort by rows) +print("\nSorted array along axis 1 (rows):") +print(np.sort(arr, axis=1)) + +print("\nSorted array (flattened):") +print(np.sort(arr, axis=None)) +``` diff --git a/content/numpy/concepts/math-methods/terms/deg2rad/deg2rad.md b/content/numpy/concepts/math-methods/terms/deg2rad/deg2rad.md new file mode 100644 index 00000000000..2432eca0f88 --- /dev/null +++ b/content/numpy/concepts/math-methods/terms/deg2rad/deg2rad.md @@ -0,0 +1,65 @@ +--- +Title: '.deg2rad()' +Description: 'Converts angles from degrees to radians.' +Subjects: + - 'Computer Science' + - 'Data Science' + - 'Web Development' +Tags: + - 'Math' + - 'NumPy' +CatalogContent: + - 'learn-python-3' + - 'paths/computer-science' +--- + +In NumPy, the **`.deg2rad()`** function converts an angle from degrees to radians. + +> **Note:** In NumPy, the default unit for angles is radians. Therefore, the `.deg2rad()` function is used to convert angle values from degrees to radians. + +## Syntax + +```pseudo +numpy.deg2rad(x, out=None) +``` + +- `x`: The input array (or scalar) containing angles in degrees that need to be converted to radians. +- `out` (Optional): A location where the result is stored. If not specified, a new array is returned. 
+ +## Example + +In this example, the code converts an angle measured in degrees to radians using the `numpy.deg2rad()` function: + +```py +import numpy as np + +# Angle of the board on a table measured in degrees +angle_degrees = 45 + +# Convert the angle to radians +angle_radians = np.deg2rad(angle_degrees) + +# Output the result +print(f"Angle in degrees: {angle_degrees}") +print(f"Angle in radians: {angle_radians}") +``` + +The code above produces the following output: + +```shell +Angle in degrees: 45 +Angle in radians: 0.7853981633974483 +``` + +## Codebyte Example + +Run the codebyte example below to understand how the `.deg2rad()` function works: + +```codebyte/python +import numpy as np + +degrees = 170 +radians = np.deg2rad(degrees) + +print(f"{degrees} degrees is {radians} radians.") +``` diff --git a/content/numpy/concepts/math-methods/terms/square/square.md b/content/numpy/concepts/math-methods/terms/square/square.md new file mode 100644 index 00000000000..75e6f655f02 --- /dev/null +++ b/content/numpy/concepts/math-methods/terms/square/square.md @@ -0,0 +1,127 @@ +--- +Title: '.square()' +Description: 'Calculates the square of each element in an array.' +Subjects: + - 'Computer Science' + - 'Data Science' + - 'Discrete Math' +Tags: + - 'Arrays' + - 'Functions' + - 'NumPy' +CatalogContent: + - 'learn-python-3' + - 'paths/computer-science' +--- + +In NumPy, the **`.square()`** method computes the square of a number or the square of the elements in an array. It is commonly used in mathematical calculations, machine learning, data analysis, engineering, and graphics. + +## Syntax + +```pseudo +numpy.square(x, out = None, where = True, dtype = None) +``` + +- `x`: The input data, which can be a number, an array, or a multidimensional array. +- `out` (Optional): A location where the result is stored. If provided, it must have the same shape as the expected output. +- `where` (Optional): A boolean array specifying which elements to compute. 
The result is only computed for elements where `where` is `True`. +- `dtype` (Optional): The desired data type for the output array. If not specified, it defaults to the data type of `x`. + +## Examples + +### Modifying the output array + +The output array for NumPy operations cannot be a Python [list](https://www.codecademy.com/resources/docs/python/built-in-functions/list) because lists are not optimized for numerical computations. NumPy arrays are composed of contiguous blocks of memory, which enhances performance. Therefore, the array passed for the `out` parameter must be a NumPy array, for example one created with the `numpy.array` function: + +```py +import numpy as np + +output_array = np.array([0, 0, 0, 0, 0]) +``` + +This array can then be used as the `out` parameter in the `numpy.square()` function: + +```py +import numpy as np + +output_array = np.array([0, 0, 0, 0, 0]) + +array = [1, 2, 3, 4, 5] +np.square(array, out = output_array) +print(output_array) +``` + +This generates the output as follows: + +```shell +[ 1  4  9 16 25] +``` + +### Operating conditionally + +Using the `where` parameter, the function will execute conditionally. The `where` parameter specifies where to apply the operation, based on a condition. If the condition is `True` at a particular index, the corresponding element in the array will be squared. If the condition is `False`, that position keeps the value already stored in the `out` array (without `out`, it is left uninitialized). For instance: + +```py +import numpy as np + +array = np.array([1, 2, 3, 4, 5]) +conditions = np.array([False, True, True, False, True]) + +result = np.square(array, out=array.copy(), where=conditions) +print(result) +``` + +Output: + +```shell +[ 1  4  9  4 25] +``` + +The `where` parameter takes a boolean array or condition. It determines where the squaring operation will take place: + +- True at an index: The element at that index will be squared. +- False at an index: The position keeps the value already stored in the `out` array. 
+ +If the `where` parameter is set to a single boolean value (either `True` or `False`), the entire array is either squared (if `True`) or skipped (if `False`), in which case the output keeps the existing contents of `out`. + +### Changing types + +Sometimes, it is important to increase or decrease the size of the data type of the output array. This can be done by setting the `dtype` parameter to a NumPy data type, like: + +```py +import numpy as np +array = np.array([1, 2, 3, 4, 5]) # Ensuring it's a numpy array +result = np.square(array, dtype=np.float32) + +# Print the result +print(result) +``` + +Output generated will be as follows: + +```shell +[ 1.  4.  9. 16. 25.] +``` + +## Codebyte Example + +Run the following example to understand how the `.square()` method works: + +```codebyte/python +import numpy as np + +# Create a NumPy array +array = np.array([1, 2, 3, 4, 5]) + +# Create an output array initialized with zeros +output_array = np.zeros_like(array) + +# Set the condition for the 'where' parameter (square values where condition is True) +conditions = np.array([False, True, True, False, True]) + +# Use numpy.square() with the 'out' and 'where' parameters +result = np.square(array, out=output_array, where=conditions) + +# Print the result +print("Squared values with conditions:", result) +``` diff --git a/content/plotly/concepts/graph-objects/terms/candlestick/candlestick.md b/content/plotly/concepts/graph-objects/terms/candlestick/candlestick.md new file mode 100644 index 00000000000..d7ff9b6d285 --- /dev/null +++ b/content/plotly/concepts/graph-objects/terms/candlestick/candlestick.md @@ -0,0 +1,88 @@ +--- +Title: '.Candlestick()' +Description: 'Creates candlestick charts to visualize financial data, showing open, high, low, and close values over time.' 
+Subjects: + - 'Data Science' + - 'Data Visualization' +Tags: + - 'Data' + - 'Finance' + - 'Plotly' + - 'Graphs' + - 'Data Visualization' +CatalogContent: + - 'learn-python-3' + - 'paths/data-visualization' +--- + +The **`.Candlestick()`** method in Plotly's [`graph_objects`](https://www.codecademy.com/resources/docs/plotly/graph-objects) module is used to create candlestick charts, widely used for visualizing financial data. A candlestick chart displays four key data points for a specific time period: + +1. **Open**: The starting value of the asset. +2. **High**: The highest value achieved during the time period. +3. **Low**: The lowest value during the period. +4. **Close**: The final value of the asset. + +Candlestick charts are commonly used to identify trends and patterns in stock prices and forex, helping analysts and traders visualize market behavior and make informed decisions. + +## Syntax + +```pseudo +import plotly.graph_objects as go + +go.Candlestick(x=None, open=None, high=None, low=None, close=None, increasing=None, ...) +``` + +- `x`: Represents the x-axis values, typically dates or time intervals for the candlestick chart. +- `open`: Represents the opening price of the asset for each time period. +- `high`: Represents the highest price of the asset for each time period. +- `low`: Represents the lowest price of the asset for each time period. +- `close`: Represents the closing price of the asset for each time period. +- `increasing`: Customizes the appearance of candles in cases where the closing price is higher than the opening price. The line color, width, or other styles can be defined. + +> **Note**: The ellipsis (`...`) indicates that additional optional parameters can be specified to customize the candlestick chart further. + +## Example + +The following code example creates a candlestick chart using Plotly's `.Candlestick()` method. 
The x-axis represents dates or time periods, and the y-axis displays the opening, highest, lowest, and closing prices for each time period. + +```py +import plotly.graph_objects as go + +# Sample data +dates = ['2024-12-01', '2024-12-02', '2024-12-03'] +open_prices = [100, 105, 110] +high_prices = [110, 115, 120] +low_prices = [95, 100, 105] +close_prices = [105, 110, 115] + +# Create the figure +fig = go.Figure(data=[go.Candlestick( + # Dates or time periods for the x-axis. + x=dates, + # Opening prices for each date. + open=open_prices, + # Highest prices for each date. + high=high_prices, + # Lowest prices for each date. + low=low_prices, + # Closing prices for each date. + close=close_prices +)]) + +# Customize layout +fig.update_layout( + title='Sample Candlestick Chart', + xaxis_title='Date', + yaxis_title='Price', + xaxis_rangeslider_visible=False +) + +# Display the figure +fig.show() +``` + +This example generates an interactive candlestick chart that displays the price movements over specific dates. + +The above code generates the following output: + +![Candlestick example Plotly](https://raw.githubusercontent.com/Codecademy/docs/main/media/candlestick-example.png) diff --git a/content/plotly/concepts/graph-objects/terms/histogram2dContour/histogram2dContour.md b/content/plotly/concepts/graph-objects/terms/histogram2dContour/histogram2dContour.md new file mode 100644 index 00000000000..1205fe0c484 --- /dev/null +++ b/content/plotly/concepts/graph-objects/terms/histogram2dContour/histogram2dContour.md @@ -0,0 +1,64 @@ +--- +Title: '.Histogram2dContour()' +Description: 'Creates 2D histograms with contours for visualizing density distributions in data.' 
+Subjects: + - 'Data Science' + - 'Data Visualization' +Tags: + - 'Data' + - 'Data Structures' + - 'Plotly' +CatalogContent: + - 'learn-python-3' + - 'paths/data-science' +--- + +The **`.Histogram2dContour()`** method in Plotly's `graph_objects` module creates a 2D histogram with contour lines to visualize the joint distribution of two variables. It uses a grid where color intensity represents the count or aggregated values within each cell, while the contour lines indicate regions of equal density. This method visualizes relationships and density in bivariate data, helping to uncover patterns and trends. + +## Syntax + +```pseudo +plotly.graph_objects.Histogram2dContour(x=None, y=None, nbinsx=None, nbinsy=None, colorscale=None, contours=None, ...) +``` + +- `x`: Input data for the x-axis. +- `y`: Input data for the y-axis. +- `nbinsx` (Optional): The number of bins (intervals) used to divide the x-axis range. If not specified (`None`), Plotly automatically calculates an appropriate number of bins based on the data. +- `nbinsy` (Optional): The number of bins (intervals) used to divide the y-axis range. +- `colorscale` (Optional): Defines the color scale used to fill the contours. +- `contours` (Optional): Configuration for contour lines (e.g., `start`, `end`, `size`). + +> **Note**: To customize the 2D histogram contour plot, more options are available than those listed above, as indicated by the ellipsis (`...`) in the syntax. 
+ +## Example + +The following example showcases the use of the `.Histogram2dContour()`: + +```py +import plotly.graph_objects as go + +# Sample data +x = [1, 2, 2, 3, 3, 3, 4, 4, 4, 4] +y = [10, 9, 8, 7, 6, 5, 4, 3, 2, 1] + +# Create the histogram with contours +fig = go.Figure( + go.Histogram2dContour( + x=x, + y=y, + nbinsx=5, + nbinsy=5, + colorscale='Viridis', + contours=dict(start=0, end=4, size=1) + ) +) + +# Show the figure +fig.show() +``` + +The example demonstrates how to use `.Histogram2dContour()` to create a two-dimensional histogram that includes contour lines which are used to visualize the joint distribution between two variables. + +The above code generates the following output: + +![Histogram2dContour in Plotly](https://raw.githubusercontent.com/Codecademy/docs/main/media/histogram2dcontour-example.png) diff --git a/content/python/concepts/enum/enum.md b/content/python/concepts/enum/enum.md new file mode 100644 index 00000000000..dcf8a0c759f --- /dev/null +++ b/content/python/concepts/enum/enum.md @@ -0,0 +1,104 @@ +--- +Title: 'enum' +Description: 'A class that defines a set of named values, providing a structured way to represent constant values in a readable manner.' +Subjects: + - 'Code Foundations' + - 'Computer Science' +Tags: + - 'Data Types' + - 'Enum' + - 'Python' + - 'Variables' +CatalogContent: + - 'learn-python-3' + - 'paths/computer-science' +--- + +**`Enum`** (short for _enumeration_) is a class in Python used to define a set of named, immutable constants. Enumerations improve code readability and maintainability by replacing magic numbers or strings with meaningful names. Enums are part of Python's built-in `enum` module, introduced in Python 3.4. + +> **Note:** Magic numbers are unclear, hardcoded values in code. For example, `80` in a speed-checking program might be confusing. Replacing it with an enum constant, like `SpeedLimit.HIGHWAY`, makes the code easier to read and maintain. 
+ +## Syntax + +```pseudo +from enum import Enum + +class EnumName(Enum): + MEMBER1 = value1 + MEMBER2 = value2 +``` + +- `EnumName`: The name of the enum class. +- `MEMBER1`, `MEMBER2`: Names of the constants. +- `value1`, `value2`: Values assigned to the constants (e.g., numbers or strings). + +## `enum` Module + +The `enum` module provides the `Enum` class for creating enumerations. It also includes: + +- `IntEnum`: Ensures that the values of the enumeration are integers. +- `Flag`: Allows combining constants with bitwise operations. +- `auto`: Automatically assigns values to the enumeration members. + +Enum members also provide attributes such as: + +- `.name`: Returns the name of the enum member (as a string). +- `.value`: Returns the value assigned to the enum member. + +## Example + +This example demonstrates how to create an enum for days of the week with integer values: + +```py +from enum import Enum + +class Weekday(Enum): + MONDAY = 1 + TUESDAY = 2 + WEDNESDAY = 3 + +# Accessing members +print(Weekday.MONDAY) +print(Weekday.MONDAY.name) +print(Weekday.MONDAY.value) + +# Iterating through members +for day in Weekday: + print(day) +``` + +This example results in the following output: + +```shell +Weekday.MONDAY +MONDAY +1 +Weekday.MONDAY +Weekday.TUESDAY +Weekday.WEDNESDAY +``` + +## Codebyte + +This example demonstrates how enums can represent traffic light states and associate actions with each state: + +```codebyte/python +from enum import Enum + +class TrafficLight(Enum): + RED = 'Stop' + YELLOW = 'Caution' + GREEN = 'Go' + +def traffic_action(light): + if light == TrafficLight.RED: + return "Stop your car." + elif light == TrafficLight.YELLOW: + return "Prepare to stop." + elif light == TrafficLight.GREEN: + return "You can go." 
+ +# Example usage +current_light = TrafficLight.RED +print(traffic_action(current_light)) +``` diff --git a/content/python/concepts/loops/loops.md b/content/python/concepts/loops/loops.md index 8110b647798..4b0c1771205 100644 --- a/content/python/concepts/loops/loops.md +++ b/content/python/concepts/loops/loops.md @@ -118,6 +118,21 @@ for i in big_number_list: print(i) ``` +## Pass Keyword + +The `pass` keyword is used as a placeholder statement to allow empty loops, functions or classes to be included in an executable code block without throwing an error. This is common when structuring future implementations. + +```py +# Nested loop with a placeholder for incomplete logic +for i in range(3): + for j in range(3): + if i == j: + # Placeholder for future implementations + pass + else: + print(f"i: {i}, j:{j}") +``` + ## Video Walkthrough In this video, you will learn how to use the for and while loops in a Python script. diff --git a/content/python/concepts/os-path-module/terms/join/join.md b/content/python/concepts/os-path-module/terms/join/join.md index 2f14309d38a..a66397cff9f 100644 --- a/content/python/concepts/os-path-module/terms/join/join.md +++ b/content/python/concepts/os-path-module/terms/join/join.md @@ -31,7 +31,7 @@ import os.path cc_courses_slug = "https://www.codecademy.com/catalog" -python_3_lessons_slug = "learn-python-3/lessons/" +python_3_lessons_slug = "learn-python-3/lessons" second_lesson_slug = "string-methods/exercises/introduction-ii" diff --git a/content/python/concepts/sql-connectors/terms/pyodbc/pyodbc.md b/content/python/concepts/sql-connectors/terms/pyodbc/pyodbc.md new file mode 100644 index 00000000000..3596bad0f66 --- /dev/null +++ b/content/python/concepts/sql-connectors/terms/pyodbc/pyodbc.md @@ -0,0 +1,108 @@ +--- +Title: 'pyodbc' +Description: 'pyodbc is a library in Python that provides a bridge between Python applications and ODBC-compliant databases, allowing efficient database operations.' 
+Subjects: + - 'Data Science' + - 'Web Development' + - 'Developer Tools' +Tags: + - 'Database' + - 'SQL' + - 'Python' +CatalogContent: + - 'learn-python-3' + - 'paths/data-science' +--- + +**`pyodbc`** is a Python library that enables Python programs to interact with databases through **ODBC (Open Database Connectivity)**, a standard API for accessing database management systems (DBMS). It provides a powerful and efficient way to execute SQL queries, retrieve results, and perform other database operations. + +## Installation + +To install `pyodbc`, `pip` can be used: + +```bash +pip install pyodbc +``` + +## Syntax + +A basic connection to an ODBC database and query execution with `pyodbc` follows this structure: + +```pseudo +import pyodbc + +# Connect to the database +connection = pyodbc.connect("Driver={Driver_Name};" + "Server=server_name;" + "Database=database_name;" + "UID=user_id;" + "PWD=password;") + +# Create a cursor object +cursor = connection.cursor() + +# Execute a query +cursor.execute("SQL QUERY") + +# Fetch results +rows = cursor.fetchall() + +# Process results +for row in rows: + print(row) + +# Close the connection +connection.close() +``` + +## Key Parameters + +- `Driver`: Specifies the ODBC driver to use for the connection. +- `Server`: The database server's address or name. +- `Database`: The name of the database to connect to. +- `UID` and `PWD`: The username and password for authentication. They are case-sensitive in most databases. + +> **Note**: Connection string formats depend on the database type. Refer to [connectionstrings.com](https://www.connectionstrings.com/) for specific examples. 
+ +## Example + +The following example demonstrates connecting to a Microsoft SQL Server, querying a table, and printing the results: + +```py +import pyodbc + +# Define connection string +connection_string = ("Driver={ODBC Driver 17 for SQL Server};" + "Server=localhost;" + "Database=TestDB;" + "UID=sa;" + "PWD=your_password;") + +try: + # Establish connection + conn = pyodbc.connect(connection_string) + cursor = conn.cursor() + + # Execute a SQL query + cursor.execute("SELECT * FROM Employees") + + # Fetch and print results + for row in cursor: + print(row) + +except pyodbc.Error as ex: + print("An error occurred:", ex) + +finally: + # Close the connection + if 'conn' in locals(): + conn.close() +``` + +## Use Cases + +Here are some use cases for `pyodbc`: + +- Connecting to a variety of databases (e.g., SQL Server, MySQL, PostgreSQL) via ODBC +- Executing dynamic SQL queries +- Efficiently handling large datasets diff --git a/content/python/concepts/sql-connectors/terms/sqlite3/sqlite3.md b/content/python/concepts/sql-connectors/terms/sqlite3/sqlite3.md new file mode 100644 index 00000000000..b7309b108dd --- /dev/null +++ b/content/python/concepts/sql-connectors/terms/sqlite3/sqlite3.md @@ -0,0 +1,147 @@ +--- +Title: 'SQLite3' +Description: 'SQLite3 is a library used to connect to SQLite databases.' +Subjects: + - 'Computer Science' + - 'Data Science' +Tags: + - 'SQLite' + - 'Documentation' +CatalogContent: + - 'learn-python-3' + - 'paths/computer-science' +--- + +The **`sqlite3`** library is used to connect to SQLite databases and provides functions to interact with them. It can also be used for prototyping while developing an application. + +## Syntax + +```pseudo +import sqlite3 +``` + +The `sqlite3` library handles the communication with the databases. 
+ +## Create a Connection + +To work with a database, a connection must first be established using the **`.connect()`** function: + +```py +import sqlite3 + +con = sqlite3.connect("mydb_db.db") +``` + +## Create a Cursor + +A cursor is required to execute SQL statements and the **`.cursor()`** function creates one: + +```py +curs = con.cursor() +``` + +## Create a Table + +The **`.execute()`** function can be used to create a table: + +```py +curs.execute('''CREATE TABLE persons( + name TEXT, + age INTEGER, + gender TEXT) + ''') +``` + +## Insert a Value Into the Table + +To insert values into the table, the SQL statement is executed with the `.execute()` function: + +```py +curs.execute('''INSERT INTO persons VALUES( + 'Alice', 21, 'female')''') +``` + +## Insert Multiple Values Into the Table + +To insert multiple values into the table, the SQL statement is executed using the **`.executemany()`** function with a list of tuples: + +```py +new_persons = [('Bob', 26, 'male'), + ('Charlie', 19, 'male'), + ('Daisy', 18, 'female') + ] + +curs.executemany('''INSERT INTO persons VALUES(?, ?, ?)''', new_persons) +``` + +## Commit the Transaction + +The **`.commit()`** function saves the inserted values to the database permanently: + +```py +con.commit() +``` + +## Check the Inserted Rows + +To check all the inserted rows, the **`.fetchall()`** function can be used: + +```py +result = curs.execute("SELECT * FROM persons") + +result.fetchall() +``` + +## Close the Connection + +After completing all the transactions, the connection can be closed with **`.close()`**: + +```py +con.close() +``` + +## Codebyte Example + +Here's a codebyte example showing how to connect to an SQLite database, create a table, insert/query data, and close the connection: + +```codebyte/python +import sqlite3 +# Create a connection to the database +con = sqlite3.connect("mydb_db.db") + +# Create a cursor to execute SQL statements +curs = con.cursor() + +# Drop the table if it already exists so the example starts fresh
+curs.execute('''DROP TABLE IF EXISTS persons''') + +# Create a new table +curs.execute('''CREATE TABLE persons( + name TEXT, + age INTEGER, + gender TEXT) + ''') + +# Insert a value into the table +curs.execute('''INSERT INTO persons VALUES( +'Alice', 21, 'female')''') + +# Insert multiple values into the table +new_persons = [('Bob', 26, 'male'), + ('Charlie', 19, 'male'), + ('Daisy', 18, 'female') + ] + +curs.executemany('''INSERT INTO persons VALUES(?, ?, ?)''', new_persons) + +# Commit the transaction to database +con.commit() + +# Check the inserted rows +result = curs.execute("SELECT * FROM persons") +printout = result.fetchall() +print(printout) + +# Close the connection +con.close() +``` diff --git a/content/python/concepts/statsmodels/terms/ols/ols.md b/content/python/concepts/statsmodels/terms/ols/ols.md new file mode 100644 index 00000000000..6dcb36866cd --- /dev/null +++ b/content/python/concepts/statsmodels/terms/ols/ols.md @@ -0,0 +1,80 @@ +--- +Title: 'Ordinary Least Squares' +Description: 'Uses Ordinary Least Squares (OLS) to perform linear regression in order to reduce prediction errors and evaluate associations between variables.' +Subjects: + - 'Computer Science' + - 'Data Science' + - 'Data Visualization' + - 'Machine Learning' +Tags: + - 'Data' + - 'Linear Regression' + - 'Machine Learning' +CatalogContent: + - 'learn-python-3' + - 'paths/data-science-foundations' +--- + +**Ordinary least squares** (OLS) is a statistical method that reduces the sum of squared residuals to assess the correlation between independent and dependent variables. In linear regression, it is widely used to predict values and analyze correlations between variables. 
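To make "reducing the sum of squared residuals" concrete, here is a minimal sketch of the closed-form solution for a single independent variable, written in plain Python. The data values and variable names are illustrative assumptions, and this is not how `statsmodels` computes the fit internally.

```py
# Illustrative data (assumed for this sketch)
x = [1, 2, 3, 4, 5]
y = [50, 55, 60, 65, 70]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Slope = sum of cross-deviations / sum of squared deviations of x
numerator = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
denominator = sum((xi - mean_x) ** 2 for xi in x)
slope = numerator / denominator

# The fitted line passes through the point of means
intercept = mean_y - slope * mean_x

# Sum of squared residuals: the quantity OLS minimizes
ssr = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))

print(slope, intercept, ssr)
```

Because this sample data lies exactly on a line, the residuals are all zero; with real data the sum of squared residuals is positive, and the OLS line is the one that makes it as small as possible.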
+ +## Syntax + +Here's the syntax to implement Ordinary Least Squares in Python: + +```pseudo +import statsmodels.api as sm # Import the statsmodels library + +# Add a constant to the independent variable(s) for the intercept +X = sm.add_constant(X) # Method to add a constant to X + +# Fit the OLS model +model = sm.OLS(y, X).fit() # `OLS` function applied to y (dependent variable) and X (independent variables) + +# Access the model summary +model.summary() # Method to get summary statistics +``` + +- `sm.add_constant(X)`: Adds an intercept (constant term) to the independent variables X. +- `sm.OLS(y, X)`: Creates the OLS model with y as the dependent variable and X as the independent variables. +- `model.summary()`: Displays the model's results, including coefficients and `R-squared` values. + +## Example + +Here's an example predicting `test_scores` based on `hours_studied`: + +```py +import statsmodels.api as sm +import matplotlib.pyplot as plt +import numpy as np + +# Hours studied and corresponding test scores +hours_studied = [1, 2, 3, 4, 5] +test_scores = [50, 55, 60, 65, 70] + +# Add a constant to the independent variable +hours_with_constant = sm.add_constant(hours_studied) + +# Fit the OLS model +model = sm.OLS(test_scores, hours_with_constant).fit() + +# Display the summary of the model +print(model.summary()) + +# Predict the test scores using OLS model +predicted_scores = model.predict(hours_with_constant) + +# Plot the data and line +plt.scatter(hours_studied, test_scores, color='blue', label='Observed data') +plt.plot(hours_studied, predicted_scores, color='red', label='Fitted line') + +# Displaying the plot +plt.xlabel('Hours Studied') +plt.ylabel('Test Scores') +plt.title('OLS Regression: Test Scores vs Hours Studied') +plt.legend() + +# Show the plot +plt.show() +``` + +!
[Regression plot](https://raw.githubusercontent.com/Codecademy/docs/main/media/ols-model-example.png) diff --git a/content/python/concepts/type-hints/type-hints.md b/content/python/concepts/type-hints/type-hints.md new file mode 100644 index 00000000000..a360f8ffeea --- /dev/null +++ b/content/python/concepts/type-hints/type-hints.md @@ -0,0 +1,117 @@ +--- +Title: 'Type Hints' +Description: 'Specify expected data types for variables, function arguments, and return values, improving code readability and aiding static analysis.' +Subjects: + - 'Code Foundations' + - 'Computer Science' +Tags: + - 'Python' + - 'Types' +CatalogContent: + - 'learn-python-3' + - 'paths/computer-science' +--- + +**Type hints** in Python are a feature that enables developers to specify the expected data types of variables, function arguments, and return values. It was introduced in Python 3.5. + +> **Note**: Type hints are part of the **`typing` module**, which provides a comprehensive set of tools for type annotations. + +Type hints help developers write more robust code by allowing tools like linters and IDEs to catch type-related errors before runtime. + +## Syntax + +This is the general syntax for type hints in function annotations: + +```pseudo +from typing import List, Dict, Union + +def function_name(parameter_name: parameter_type) -> return_type: + # Function body +``` + +- `parameter_name`: This represents the name of the parameter that the function accepts. +- `parameter_type`: This indicates the expected data type of `parameter_name`. +- `return_type`: This specifies the data type of the value that the function will return. + +### Commonly Used Type Hints + +- `int`, `float`, `str`, `bool`: These are the basic data types. +- `List[ElementType]`: This is a list containing elements of `ElementType`. +- `Dict[KeyType, ValueType]`: This is a dictionary with keys of `KeyType` and values of `ValueType`. +- `Union[Type1, Type2]`: This is a value that can be of either `Type1` or `Type2`. 
+- `Optional[Type]`: This indicates that a value can be of `Type` or `None`. + +> **Note**: Starting with Python 3.7, PEP 563 allows type annotations to be stored as strings and evaluated only when needed, optimizing runtime performance. From Python 3.10 onwards, PEP 604 introduces the `|` operator as a concise alternative to `Union`, simplifying syntax for type annotations. + +## Example + +This is an example of a function using type hints: + +```py +from typing import List, Dict, Union + +def process_data(data: List[Dict[str, Union[int,str]]]) -> List[str]: + """ + Processes a list of dictionaries to extract string values. + + Args: + (data: List[Dict[str, Union[int,str]]]): A list of dictionaries including string keys and integer or string values. + + PEP 604 Args: + (data: List[Dict[str, int | str]]): A list of dictionaries including string keys and integer or string values as per PEP 604. + + Returns: + List[str]: A list of string values extracted from the dictionaries. + """ + + result = [] + + for item in data: + for key, value in item.items(): + if isinstance(value, str): + result.append(value) + return result + +# Example usage +data = [ + {"name": "Alice", "age": 25}, + {"name": "Bob", "city": "New York"} +] + +output = process_data(data) + +print(output) +``` + +The above example would output the following: + +```shell +['Alice', 'Bob', 'New York'] +``` + +## Codebyte Example + +Here is a codebyte example demonstrating the usage of type hints: + +```codebyte/python +from typing import List, Optional + +def greet(name: Optional[str] = None) -> str: + """ + Args: + name (Optional[str]): Name of the person to greet. Defaults to None. + + Returns: + str: A greeting message. + """ + + if name: + return f"Hello, {name}!" + return "Hello, World!" + +# Test the function +print(greet("Dani")) +print(greet()) +``` + +> **Note**: While type hints enhance code clarity and facilitate static analysis during development, they do not affect how Python executes the code. 
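The note above can be checked directly. The following is a minimal sketch (the function name `double` is an illustrative assumption) showing that an annotated function still accepts a value of a different type at runtime, and that the hints are stored as metadata on the function:

```py
def double(value: int) -> int:
    return value * 2

# The annotation says int, but Python does not enforce it at runtime
print(double(4))
print(double("ab"))

# The hints are stored as metadata on the function object
print(double.__annotations__)
```

The second call runs without error and repeats the string. A static type checker, run separately from the program, is what would typically report the mismatch.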
diff --git a/content/pytorch/concepts/tensor-operations/terms/index-reduce/index-reduce.md b/content/pytorch/concepts/tensor-operations/terms/index-reduce/index-reduce.md new file mode 100644 index 00000000000..a10127a73e8 --- /dev/null +++ b/content/pytorch/concepts/tensor-operations/terms/index-reduce/index-reduce.md @@ -0,0 +1,64 @@ +--- +Title: '.index_reduce_()' +Description: 'Reduces a tensor along a specified dimension using indices to map input elements to positions in the output tensor, applying reduction operations such as sum, product, or mean.' +Subjects: + - 'Computer Science' + - 'Data Science' +Tags: + - 'Data Structures' + - 'Functions' + - 'Index' + - 'Values' +CatalogContent: + - 'intro-to-py-torch-and-neural-networks' + - 'paths/computer-science' +--- + +In PyTorch, **`.index_reduce_()`** performs an in-place reduction operation (such as sum, product, or mean) on a [tensor](https://www.codecademy.com/resources/docs/pytorch/tensors) along a specified dimension. It uses an index tensor to map input elements to positions in the output tensor, effectively aggregating values with the same index. + +## Syntax + +```pseudo +Tensor.index_reduce_(dim, index, source, reduce, *, include_self=True) +``` + +- `dim`: The axis of the tensor along which the reduction is performed. +- `index`: A 1D tensor containing indices that map the elements in the source tensor to specific positions in the current tensor. +- `source`: The tensor whose values are reduced and added to the current tensor at positions specified by `index`. +- `reduce`: Specifies the reduction operation to apply. Possible values include: + - `'prod'`: Product of elements with the same index. + - `'mean'`: Mean of elements with the same index. + - `'amax'`: Maximum of elements with the same index. + - `'amin'`: Minimum of elements with the same index. +- `include_self` (Optional): Determines whether the existing values in the current tensor are included in the reduction operation. 
+ - If `True`, the values already present in the tensor are included. If no value is provided for the parameter, `include_self` defaults to `True`. + - If `False`, only the `source` tensor values contribute to the reduction. + +## Example + +The following example demonstrates the usage of the `.index_reduce_()` method: + +```py +import torch + +# Define the target tensor +target = torch.zeros(2) + +# Source tensor +source = torch.tensor([1.0, 2.0, 3.0, 4.0]) + +# Indices mapping source to target +index = torch.tensor([0, 1, 0, 1], dtype=torch.long) # Ensure index tensor is of type 'long' + +# Perform in-place reduction using 'mean' along the 0th dimension (rows) +target.index_reduce_(dim=0, index=index, source=source, reduce='mean') +print(target) +``` + +The above code produces the following output: + +```shell +tensor([1.3333, 2.0000]) +``` + +This code reduces the `source` tensor along dimension 0 by averaging (`'mean'` reduce) the values mapped to the same indices in the `index` tensor, updating the `target` tensor in place. Because `include_self` defaults to `True`, the initial zeros in `target` are part of each mean: index `0` gives (0 + 1 + 3) / 3 = 1.3333 and index `1` gives (0 + 2 + 4) / 3 = 2.0. diff --git a/content/pytorch/concepts/tensor-operations/terms/masked-select/masked-select.md b/content/pytorch/concepts/tensor-operations/terms/masked-select/masked-select.md new file mode 100644 index 00000000000..7310c4060c5 --- /dev/null +++ b/content/pytorch/concepts/tensor-operations/terms/masked-select/masked-select.md @@ -0,0 +1,57 @@ +--- +Title: '.masked_select()' +Description: 'Selects elements from a tensor, based on a boolean mask, and returns them as a 1D tensor.' +Subjects: + - 'Computer Science' + - 'Data Science' +Tags: + - 'Data Structures' + - 'Functions' + - 'Index' + - 'Values' +CatalogContent: + - 'intro-to-py-torch-and-neural-networks' + - 'paths/computer-science' +--- + +In PyTorch, **`.masked_select()`** is a function that selects elements from an input tensor based on a boolean mask of the same shape. It returns a new 1D tensor containing the elements where the corresponding mask value is `True`.
+ +## Syntax + +```pseudo +torch.masked_select(input, mask, *, out=None) +``` + +- `input`: The input tensor from which elements will be selected. +- `mask`: A boolean tensor of the same shape as input, where `True` indicates the elements to be selected. +- `out` (Optional): A tensor to store the result. If provided, the selected elements will be written to this tensor instead of creating a new one. + +## Example + +Here's an example of using `.masked_select()` in PyTorch: + +```py +import torch + +# Create an input tensor +input_tensor = torch.tensor([1, 2, 3, 4, 5]) + +# Create a mask tensor with boolean values +mask = torch.tensor([True, False, True, False, True]) + +# Use masked_select to extract elements from the input tensor where the mask is True +selected_elements = torch.masked_select(input_tensor, mask) + +# Print the selected elements +print(selected_elements) +``` + +The code above generates the output as follows: + +```shell +tensor([1, 3, 5]) +``` + +In this example, the `input_tensor` contains elements `[1, 2, 3, 4, 5]`, and the `mask` tensor contains boolean values `[True, False, True, False, True]`. The `masked_select()` function selects elements from the `input_tensor` where the corresponding mask value is `True`, resulting in the tensor `[1, 3, 5]`. + +The `.masked_select()` function is useful for filtering elements from a tensor based on conditions specified by the mask tensor. It can be applied in various scenarios, such as selecting specific elements for further processing, analysis, or model training. 
diff --git a/content/pytorch/concepts/tensor-operations/terms/movedim/movedim.md b/content/pytorch/concepts/tensor-operations/terms/movedim/movedim.md new file mode 100644 index 00000000000..33689588f81 --- /dev/null +++ b/content/pytorch/concepts/tensor-operations/terms/movedim/movedim.md @@ -0,0 +1,131 @@ +--- +Title: '.movedim()' +Description: 'Returns a tensor with the dimensions moved from the positions specified in source to the positions specified in destination.' +Subjects: + - 'AI' + - 'Data Science' +Tags: + - 'AI' + - 'Arrays' + - 'Data Structures' + - 'Deep Learning' +CatalogContent: + - 'intro-to-py-torch-and-neural-networks' + - 'paths/computer-science' +--- + +In PyTorch, **`.movedim()`** is used to move specific dimensions of the input tensor to specified positions, while the dimensions that are not explicitly mentioned remain in their original order. + +## Syntax + +```pseudo +torch.movedim(input, source, destination) +``` + +- `input`: The input tensor whose dimensions are to be rearranged. +- `source`: The dimensions to be moved. Can be a single integer or a tuple of integers. +- `destination`: The target positions for the dimensions specified in `source`. It should have the same length as `source`.
+ +## Example + +The following example demonstrates the use of `.movedim()`: + +```py +import torch + +# Define a 1D tensor +a = torch.tensor([[1, 2, 3, -8]]) + +# Define a 2D tensor +b = torch.tensor([[1, 2, 3, -8], + [4, 3, 8, 0], + [-1, 7, 6, 3], + [5, 6, 9, 0]]) + +# Define a 3D tensor +c = torch.randn(2, 2, 3) + +# Define a 4D tensor +d = torch.randn(2, 3, 2, 3) + +# Move dimension 0 to dimension 1 for 1D tensor +a1 = torch.movedim(a, 0, 1) +print("One Dimensional tensor:") +print(a1) +print("\n") + +# Move dimension 0 to dimension 1 for 2D tensor +b1 = torch.movedim(b, 0, 1) +print("Two Dimensional tensor:") +print(b1) +print("\n") + +# Move dimension 0 to dimension 1 for 3D tensor +c1 = torch.movedim(c, 0, 1) +print("Three Dimensional tensor (Dim 1):") +print(c1) +print("\n") + +# Move dimension 0 to dimension 2 for 3D tensor +c2 = torch.movedim(c, 0, 2) +print("Three Dimensional tensor (Dim 2):") +print(c2) +print("\n") + +# Move dimensions [0, 1] to positions [2, 3] for 4D tensor +d1 = torch.movedim(d, [0, 1], [2, 3]) +print("Four Dimensional tensor:") +print(d1) +``` + +This example will generate the following output: + +```shell +One Dimensional tensor: +tensor([[ 1], + [ 2], + [ 3], + [-8]]) + +Two Dimensional tensor: +tensor([[ 1, 4, -1, 5], + [ 2, 3, 7, 6], + [ 3, 8, 6, 9], + [-8, 0, 3, 0]]) + +Three Dimensional tensor (Dim 1): +tensor([[[ 1.0064, -1.2284, -1.1452], + [-0.9374, 1.2943, -1.7862]], + + [[ 0.4316, 3.1050, -0.4264], + [-0.9219, 1.6863, -0.3411]]]) + +Three Dimensional tensor (Dim 2): +tensor([[[ 1.0064, -0.9374], + [-1.2284, 1.2943], + [-1.1452, -1.7862]], + + [[ 0.4316, -0.9219], + [ 3.1050, 1.6863], + [-0.4264, -0.3411]]]) + +Four Dimensional tensor: +tensor([[[[ 0.0753, 1.5373, 0.0765], + [-3.1675, 0.2926, 0.5799]], + + [[-0.1520, -0.4855, 1.9026], + [-1.6107, 0.5367, -0.3401]], + + [[-0.9148, -0.6213, 0.5939], + [-0.6407, -1.0397, -0.7044]]], + + + [[[ 0.3897, 0.6399, 1.0818], + [ 0.7111, -1.3950, -1.3415]], + + [[-0.3749, 2.3008, 
-0.2464], + [ 1.4121, -0.3554, -0.5184]], + + [[-0.3224, -0.9296, 0.1633], + [-0.2641, 0.8230, 0.1766]]]]) +``` diff --git a/content/pytorch/concepts/tensor-operations/terms/permute/permute.md b/content/pytorch/concepts/tensor-operations/terms/permute/permute.md new file mode 100644 index 00000000000..40ce74b2d96 --- /dev/null +++ b/content/pytorch/concepts/tensor-operations/terms/permute/permute.md @@ -0,0 +1,55 @@ +--- +Title: '.permute()' +Description: 'Returns a view of the given tensor with its dimensions permuted or rearranged according to a specific order.' +Subjects: + - 'AI' + - 'Data Science' +Tags: + - 'AI' + - 'Data Types' + - 'Deep Learning' + - 'Functions' +CatalogContent: + - 'intro-to-py-torch-and-neural-networks' + - 'paths/data-science' +--- + +In PyTorch, the **`.permute()`** function returns a view of a given [tensor](https://www.codecademy.com/resources/docs/pytorch/tensors) with its dimensions permuted or rearranged according to a specific order. + +## Syntax + +```pseudo +torch.permute(input, dims) +``` + +- `input`: The tensor whose dimensions are to be permuted. +- `dims`: The order in which the dimensions are to be permuted. 
+ +## Example + +The following example demonstrates the usage of the `.permute()` function: + +```py +import torch + +# Create a tensor of size (2, 3, 4) +ten = torch.randn(2, 3, 4) + +# Permute the dimensions of the tensor in the order (2, 0, 1) +res = torch.permute(ten, (2, 0, 1)) + +# Print the size of the resultant tensor +print(res.size()) +``` + +In the above example, the order `(2, 0, 1)`: + +- Moves the dimension located at index `2` to index `0` +- Moves the dimension located at index `0` to index `1` +- Moves the dimension located at index `1` to index `2` + +The above code produces the following output: + +```shell +torch.Size([4, 2, 3]) +``` diff --git a/content/pytorch/concepts/tensor-operations/terms/row-stack/row-stack.md b/content/pytorch/concepts/tensor-operations/terms/row-stack/row-stack.md new file mode 100644 index 00000000000..4fef9022b7d --- /dev/null +++ b/content/pytorch/concepts/tensor-operations/terms/row-stack/row-stack.md @@ -0,0 +1,51 @@ +--- +Title: '.row_stack()' +Description: 'Stacks or arranges a sequence of tensors vertically (row-wise).' +Subjects: + - 'AI' + - 'Data Science' +Tags: + - 'AI' + - 'Data Types' + - 'Deep Learning' + - 'Functions' +CatalogContent: + - 'intro-to-py-torch-and-neural-networks' + - 'paths/data-science' +--- + +In PyTorch, the **`.row_stack()`** function stacks or arranges a sequence of [tensors](https://www.codecademy.com/resources/docs/pytorch/tensors) vertically (row-wise). It is an alias or alternative for the **`.vstack()`** function. + +## Syntax + +```pseudo +torch.row_stack(tensors, *, out=None) +``` + +- `tensors`: The sequence of tensors to be stacked vertically. +- `out` (Optional): A tensor to store the output. It must have the correct shape to accommodate the result. 
+ +## Example + +The following example demonstrates the usage of the `.row_stack()` function: + +```py +import torch + +# Create two tensors +ten1 = torch.tensor([12, 23, 34]) +ten2 = torch.tensor([45, 56, 67]) + +# Stack the tensors vertically +res = torch.row_stack((ten1, ten2)) + +# Print the resultant tensor +print(res) +``` + +The above code produces the following output: + +```shell +tensor([[12, 23, 34], + [45, 56, 67]]) +``` diff --git a/content/pytorch/concepts/tensor-operations/terms/scatter/scatter.md b/content/pytorch/concepts/tensor-operations/terms/scatter/scatter.md new file mode 100644 index 00000000000..b3a3bcb5efb --- /dev/null +++ b/content/pytorch/concepts/tensor-operations/terms/scatter/scatter.md @@ -0,0 +1,58 @@ +--- +Title: '.scatter()' +Description: 'Writes values from a source into specific locations of a tensor along a specified dimension, based on indices.' +Subjects: + - 'AI' + - 'Data Science' +Tags: + - 'AI' + - 'Data Types' + - 'Deep Learning' + - 'Functions' +CatalogContent: + - 'intro-to-py-torch-and-neural-networks' + - 'paths/data-science' +--- + +In PyTorch, the **`.scatter()`** function writes values from a source ([tensor](https://www.codecademy.com/resources/docs/pytorch/tensors) or scalar) into specific locations of a tensor along a specified dimension, based on given indices. + +## Syntax + +```pseudo +torch.scatter(ten, dim, index, src) +``` + +- `ten`: The tensor where the values are to be inserted. +- `dim`: The dimension along which the values are to be inserted. +- `index`: The tensor which specifies the locations in `ten` where the values are to be inserted. +- `src`: The tensor which contains the values to be inserted. 
+ +## Example + +The following example demonstrates the usage of the `.scatter()` function: + +```py +import torch + +# Create a tensor +ten = torch.tensor([[11, 12, 13, 14, 15], [16, 17, 18, 19, 20]]) + +# Create a tensor containing the locations +index = torch.tensor([[0, 2], [1, 3]]) + +# Create a tensor containing the values +src = torch.tensor([[21, 23], [27, 29]]) + +# Insert the given values into specified locations along dimension 1 in the original tensor +res = torch.scatter(ten, 1, index, src) + +# Print the resultant tensor +print(res) +``` + +The above code produces the following output: + +```shell +tensor([[21, 12, 23, 14, 15], + [16, 27, 18, 29, 20]]) +``` diff --git a/content/pytorch/concepts/tensor-operations/terms/select/select.md b/content/pytorch/concepts/tensor-operations/terms/select/select.md new file mode 100644 index 00000000000..947c581f81c --- /dev/null +++ b/content/pytorch/concepts/tensor-operations/terms/select/select.md @@ -0,0 +1,59 @@ +--- +Title: '.select()' +Description: 'Selects a specific slice along the given dimension in a tensor.' +Subjects: + - 'Computer Science' + - 'Machine Learning' +Tags: + - 'Functions' + - 'Machine Learning' + - 'Methods' + - 'Python' +CatalogContent: + - 'intro-to-py-torch-and-neural-networks' + - 'paths/computer-science' +--- + +The **`.select()`** method in PyTorch returns a specific slice of a [tensor](https://www.codecademy.com/resources/docs/pytorch/tensors) along a specified dimension, reducing the dimensionality of the output tensor by one compared to the input tensor. + +## Syntax + +```pseudo +torch.select(input, dim, index) +``` + +- `input`: The input tensor. +- `dim`: The dimension along which to select. +- `index`: The index of the slice to select along the specified dimension. 
+ +## Example + +The following example illustrates the usage of `.select()` method: + +```py +import torch + +# 2D tensor +tensor = torch.tensor([[10, 20], [30, 40], [50, 60]]) +print("Input Tensor: ", tensor) + +# Select a row (dim=0) +row = torch.select(tensor, 0, 1) +print("\nSelected Row (dim=0, index=1):", row) + +# Select a column (dim=1) +col = torch.select(tensor, 1, 0) +print("\nSelected Column (dim=1, index=0):", col) +``` + +The above code gives the following output: + +```shell +Input Tensor: tensor([[10, 20], + [30, 40], + [50, 60]]) + +Selected Row (dim=0, index=1): tensor([30, 40]) + +Selected Column (dim=1, index=0): tensor([10, 30, 50]) +``` diff --git a/content/pytorch/concepts/tensors/terms/argwhere/argwhere.md b/content/pytorch/concepts/tensors/terms/argwhere/argwhere.md new file mode 100644 index 00000000000..3f3871fa26c --- /dev/null +++ b/content/pytorch/concepts/tensors/terms/argwhere/argwhere.md @@ -0,0 +1,65 @@ +--- +Title: '.argwhere()' +Description: 'Returns the indices of elements in a tensor that satisfy a specified condition, arranged in a 2D tensor.' +Subjects: + - 'AI' + - 'Data Science' +Tags: + - 'AI' + - 'Deep Learning' + - 'Functions' + - 'Machine Learning' +CatalogContent: + - 'intro-to-py-torch-and-neural-networks' + - 'py-torch-for-classification' +--- + +In PyTorch, **`.argwhere()`** returns the indices of elements in a tensor that satisfy a specified condition. It is useful for finding the positions of elements in a tensor that meet specific conditions, such as values greater than a threshold. + +## Syntax + +```pseudo +torch.argwhere(input) +``` + +- `input`: A tensor containing the elements to be checked. The condition will be applied to this tensor. + +It returns a 2D tensor containing the indices of the elements in the input tensor that satisfy the specified condition. Each row in the resulting tensor represents the indices of an element that meets the condition. 
+ +## Example + +In this example, `.argwhere()` is used to find the indices of elements in the tensor that are greater than _0_, equal to _0_, and less than _2_: + +```py +import torch + +# Define a tensor +tensor = torch.tensor([[0, 1], [2, 0], [-1, 3]]) + +# Case 1: Use argwhere to find indices of elements greater than 0 +indices_case_1 = torch.argwhere(tensor > 0) + +# Case 2: Use argwhere to find indices of elements equal to 0 +indices_case_2 = torch.argwhere(tensor == 0) + +# Case 3: Use argwhere to find indices of elements less than 2 +indices_case_3 = torch.argwhere(tensor < 2) + +print("Case 1 (elements > 0):", indices_case_1) +print("Case 2 (elements == 0):", indices_case_2) +print("Case 3 (elements < 2):", indices_case_3) +``` + +Here is the output for the above example: + +```shell +Case 1 (elements > 0): tensor([[0, 1], + [1, 0], + [2, 1]]) +Case 2 (elements == 0): tensor([[0, 0], + [1, 1]]) +Case 3 (elements < 2): tensor([[0, 0], + [0, 1], + [1, 1], + [2, 0]]) +``` diff --git a/content/scipy/concepts/scipy-integrate/scipy-integrate.md b/content/scipy/concepts/scipy-integrate/scipy-integrate.md new file mode 100644 index 00000000000..04b48e13a97 --- /dev/null +++ b/content/scipy/concepts/scipy-integrate/scipy-integrate.md @@ -0,0 +1,46 @@ +--- +Title: 'scipy.integrate' +Description: 'Provides functions for numerical integration, solving ordinary differential equations, and handling integrals over a range of functions.' +Subjects: + - 'Computer Science' + - 'Data Science' +Tags: + - 'Algorithms' + - 'Data' + - 'Filter' +CatalogContent: + - 'learn-python-3' + - 'paths/computer-science' +--- + +**`scipy.integrate`** is a submodule of SciPy that provides tools for numerical integration and solving differential equations. It supports both single and multi-dimensional integrals, offering efficient methods for handling integrals of functions, ordinary differential equations (ODEs), and more. 
Key features include: + +- **Numerical Integration**: Calculate definite integrals of functions. +- **Ordinary Differential Equations (ODEs)**: Solve initial value problems for ODEs. +- **Multiple Integration**: Handle higher-dimensional integrals over specified ranges. +- **Integration of Systems of ODEs**: Solve coupled systems of ODEs with multiple variables. + +`scipy.integrate` is a powerful tool for working with integrals and differential equations in scientific computing and engineering applications. + +## Syntax + +Here's a generic syntax outline for using `scipy.integrate`: + +```pseudo +import scipy.integrate + +# Example: Numerical integration (definite integral) +result, error = scipy.integrate.function_name(function, bounds, *args, **kwargs) + +# Example: Solving an ODE +solution = scipy.integrate.function_name(function, time_points, initial_conditions, *args, **kwargs) + +# Example: Multi-dimensional integration +result = scipy.integrate.function_name(function, bounds, *args, **kwargs) +``` + +- `scipy.integrate.function_name`: Replace this with the specific function you want to use (e.g., `quad`, `odeint`, `dblquad`). +- `*args`: Positional arguments specific to the function. +- `**kwargs`: Keyword arguments that can be used to modify the behavior of the function. + +This structure applies to most functions in `scipy.integrate`: an integration or ODE-solving task is defined and then applied to the data. Functions such as `quad()`, `odeint()`, `trapezoid()`, and `dblquad()` cover a range of numerical integration and differential equation tasks. The exact argument order differs between functions, so the outline above should be checked against each function's own signature.
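To give a sense of what numerical integration does, the sketch below approximates a definite integral with the trapezoidal rule in plain Python. `trapezoid_rule` is a hypothetical helper written for illustration; it is not SciPy's implementation, and the functions in `scipy.integrate` generally use more sophisticated methods.

```py
import math

def trapezoid_rule(func, lower, upper, steps=1000):
    """Approximate the integral of func from lower to upper using `steps` trapezoids."""
    width = (upper - lower) / steps
    # The two endpoints each contribute half weight
    total = 0.5 * (func(lower) + func(upper))
    # Interior points contribute full weight
    for i in range(1, steps):
        total += func(lower + i * width)
    return total * width

# The exact integral of sin(x) from 0 to pi is 2
approx = trapezoid_rule(math.sin, 0, math.pi)
print(approx)
```

Increasing `steps` narrows the trapezoids and brings the approximation closer to the exact value.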
diff --git a/content/scipy/concepts/scipy-optimize/scipy-optimize.md b/content/scipy/concepts/scipy-optimize/scipy-optimize.md new file mode 100644 index 00000000000..b5bd8b74c75 --- /dev/null +++ b/content/scipy/concepts/scipy-optimize/scipy-optimize.md @@ -0,0 +1,67 @@ +--- +Title: 'scipy.optimize' +Description: 'The Optimize module in SciPy has algorithms for optimization and root-finding, solving tasks like curve fitting, parameter estimation, and resource allocation.' +Subjects: + - 'Data Science' + - 'Machine Learning' +Tags: + - 'Python' + - 'Optimization' + - 'Mathematics' +CatalogContent: + - 'learn-python' + - 'paths/data-science' +--- + +The **`scipy.optimize`** module is part of the [SciPy](https://www.codecademy.com/resources/docs/scipy) library for scientific computing in [Python](https://www.codecademy.com/resources/docs/python). It provides a variety of optimization and root-finding routines designed to solve mathematical problems, such as finding minima or maxima of functions, solving systems of equations, and performing linear or nonlinear optimizations. Whether tuning model parameters, allocating resources, or fitting complex curves, `scipy.optimize` offers a rich toolbox for improving decision-making and model performance. + +## Functions in `scipy.optimize` + +### Minimization + +Minimizes a scalar function (i.e., finds the values that minimize the objective function). It has the following syntax: + +```pseudo +optimize.minimize(fun, x0, method=...) +``` + +- `fun`: The objective function to minimize. +- `x0`: Initial guess. +- `method`: Algorithm to use (e.g., `'BFGS'`, `'Nelder-Mead'`, etc.). + +### Root-Finding + +Finds the roots (or solutions) of a function, i.e., the points where the function equals zero. It has a syntax: + +```pseudo +optimize.root(fun, x0, method=...) +``` + +- `fun`: The function for which the root is sought. +- `x0`: Initial guess. +- `method`: Algorithm to use (e.g., `'hybr'`, `'broyden1'`). 
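To illustrate what "finding a root" means, the sketch below uses the bisection method in plain Python. `bisect_root` is a hypothetical helper for illustration only; it is not the algorithm behind the `'hybr'` or `'broyden1'` methods, and it assumes the function changes sign over the given interval.

```py
def bisect_root(func, low, high, tolerance=1e-8):
    """Find a root of func in [low, high], assuming func(low) and func(high) have opposite signs."""
    if func(low) * func(high) > 0:
        raise ValueError("func(low) and func(high) must have opposite signs")
    while high - low > tolerance:
        mid = (low + high) / 2
        # Keep the half-interval in which the sign change occurs
        if func(low) * func(mid) <= 0:
            high = mid
        else:
            low = mid
    return (low + high) / 2

# x**2 - 4 equals zero at x = 2 within the interval [0, 5]
root = bisect_root(lambda x: x**2 - 4, 0, 5)
print(root)
```

Each iteration halves the interval that contains the root, so the result is accurate to within the chosen `tolerance`.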
+ +### Linear Programming + +Solves linear optimization problems: minimizing a linear objective function subject to linear constraints (to maximize, negate `c`): + +```pseudo +optimize.linprog(c, A_ub=..., b_ub=..., A_eq=..., b_eq=..., bounds=..., method='highs') +``` + +- `c`: Coefficients of the linear objective function. +- `A_ub`, `b_ub`: Inequality constraints. +- `A_eq`, `b_eq`: Equality constraints. +- `bounds`: Variable bounds. + +### Curve Fitting + +Fits a model to observed data by performing nonlinear least squares fitting, finding the parameters that minimize the difference between the observed data and the model. The syntax is: + +```pseudo +optimize.curve_fit(f, xdata, ydata, p0=...) +``` + +- `f`: The model function, `f(x, …)`. +- `xdata`, `ydata`: The observed data. +- `p0`: Initial guess for the parameters. diff --git a/content/scipy/concepts/scipy-signal/scipy-signal.md b/content/scipy/concepts/scipy-signal/scipy-signal.md new file mode 100644 index 00000000000..21429a9f7b7 --- /dev/null +++ b/content/scipy/concepts/scipy-signal/scipy-signal.md @@ -0,0 +1,46 @@ +--- +Title: 'scipy.signal' +Description: 'Provides functions for signal processing tasks such as filtering, spectral analysis, and signal generation.' +Subjects: + - 'Computer Science' + - 'Data Science' +Tags: + - 'Algorithms' + - 'Data' + - 'Filter' +CatalogContent: + - 'learn-python-3' + - 'paths/computer-science' +--- + +**`scipy.signal`** is a submodule of SciPy that provides tools for signal processing, including filter design, spectral analysis, and convolution. It supports both continuous and discrete signals, with applications in areas like audio processing, communications, and data analysis. Key features include: + +- **Filter Design and Application**: Design and apply various types of filters. +- **Fourier Transform**: Analyze frequency components of signals. +- **Convolution and Correlation**: Apply convolution and correlation for signal processing tasks. 
+- **Signal Generation**: Generate standard test signals like sinusoids and square waves. + +`scipy.signal` is a powerful tool for working with signals in scientific and engineering fields. + +## Syntax + +Here's a generic syntax outline for using `scipy.signal`: + +```pseudo +import scipy.signal + +# Example: Designing a filter +b, a = scipy.signal.function_name(*args, **kwargs) + +# Example: Applying the filter to a signal +y = scipy.signal.function_name(b, a, x) + +# Example: Signal processing task (e.g., convolution, correlation) +result = scipy.signal.function_name(x, y, *args, **kwargs) +``` + +- `scipy.signal.function_name`: Replace this with the specific function you want to use (e.g., `butter`, `filtfilt`, `convolve`). +- `*args`: Positional arguments specific to the function. +- `**kwargs`: Keyword arguments that can be used to modify the behavior of the function. + +This structure applies to most functions in `scipy.signal`: a signal processing task is defined and then applied to the data. Functions such as `lfilter()`, `wiener()`, `correlate()`, `resample()`, `csd()`, and `spectrogram()` cover a wide range of signal processing tasks. diff --git a/content/scipy/concepts/scipy-stats/scipy-stats.md b/content/scipy/concepts/scipy-stats/scipy-stats.md new file mode 100644 index 00000000000..0c934af1fa4 --- /dev/null +++ b/content/scipy/concepts/scipy-stats/scipy-stats.md @@ -0,0 +1,79 @@ +--- +Title: 'scipy.stats' +Description: 'scipy.stats is a Python module offering statistical functions, distributions, and hypothesis tests for data analysis.' +Subjects: + - 'Data Science' + - 'Machine Learning' +Tags: + - 'Distributions' + - 'Hypothesis Testing' + - 'Python' + - 'Statistics' +CatalogContent: + - 'learn-python' + - 'paths/data-science' +--- + +The **`scipy.stats`** module is part of the broader [SciPy](https://www.codecademy.com/resources/docs/scipy) library for scientific computing in Python. 
It provides functionality for working with various probability distributions, conducting hypothesis tests, and computing descriptive statistics. By leveraging `scipy.stats`, data scientists and analysts can quickly explore their data, model it using theoretical distributions, and draw meaningful conclusions through statistical inference. + +## Probability Distributions + +`scipy.stats` provides a wide range of distributions (e.g., Normal, Exponential, Binomial) with methods to work with them. For example, for the Normal distribution: + +```pseudo +stats.norm.pdf(x) # Probability Density Function +stats.norm.cdf(x) # Cumulative Distribution Function +stats.norm.rvs(size=n) # Generate random samples +``` + +- `pdf`: Returns the probability density function (PDF) value at a given point for continuous distributions. +- `cdf`: Gives the probability that a random variable is less than or equal to a certain value. +- `rvs`: Draws random samples from the specified distribution. + +These methods can be used with other distributions available in `scipy.stats` by replacing `norm` with the desired distribution (e.g., `expon`, `binom`). Discrete distributions such as `binom` provide `pmf` in place of `pdf`. + +## Descriptive Statistics + +Compute common statistical measures with both `numpy` and `scipy.stats`: + +```pseudo +np.mean(data) +np.median(data) +stats.mode(data) +stats.describe(data) +``` + +- `mean()`: Computes the average value of the data. +- `median()`: Finds the middle value separating the higher and lower halves of the data. +- `mode()`: Returns the most frequently occurring value (for multi-modal data, it returns the smallest mode). +- `describe()`: Provides a quick summary of the data, including count, min, max, mean, variance, skewness, and kurtosis. + +> **Note**: While `mean` and `median` are part of `numpy`, `mode` and `describe` belong to `scipy.stats`. 
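The distribution methods and `describe()` can be combined in a short run. This is a minimal sketch using the standard normal and exponential distributions; the sample size and seed are arbitrary choices for illustration:

```python
import numpy as np
from scipy import stats

# Standard normal distribution (mean 0, standard deviation 1)
density_at_zero = stats.norm.pdf(0.0)  # height of the bell curve at its peak
prob_below_zero = stats.norm.cdf(0.0)  # half of the mass lies below the mean

# The same methods work after swapping in another distribution
expon_cdf_at_one = stats.expon.cdf(1.0)

# Draw reproducible samples and summarize them
samples = stats.norm.rvs(size=1000, random_state=0)
summary = stats.describe(samples)

print(density_at_zero, prob_below_zero, expon_cdf_at_one)
print(summary.nobs, summary.mean, summary.variance)
```

`describe()` returns a named tuple, so individual fields such as `summary.mean` or `summary.skewness` can be read directly.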
+ +## Hypothesis Testing + +Perform a variety of statistical tests to assess differences or relationships: + +```pseudo +stats.ttest_ind(group1, group2) # Independent t-test +stats.chisquare(observed, expected) # Chi-square test +stats.mannwhitneyu(group1, group2) # Mann-Whitney U test +``` + +- `ttest_ind()`: Checks if the means of two independent samples differ significantly. +- `chisquare()`: Compares observed frequencies to expected frequencies for a goodness-of-fit test. +- `mannwhitneyu()`: Tests for differences in the distribution of two independent samples (non-parametric). + +## Correlation and Regression + +Evaluate relationships between variables: + +```pseudo +stats.pearsonr(x, y) # Pearson correlation +stats.spearmanr(x, y) # Spearman rank correlation +stats.kendalltau(x, y) # Kendall’s Tau correlation +``` + +- `pearsonr()`: Measures linear correlation between two datasets. +- `spearmanr()`: Measures rank-based (monotonic) correlation, so it also captures non-linear monotonic relationships and is less sensitive to outliers than Pearson correlation. +- `kendalltau()`: Measures the ordinal association between two variables based on concordant and discordant pairs of ranks. diff --git a/content/scipy/scipy.md b/content/scipy/scipy.md new file mode 100644 index 00000000000..0187279f9d9 --- /dev/null +++ b/content/scipy/scipy.md @@ -0,0 +1,10 @@ +--- +Title: 'SciPy' +Description: 'SciPy is a Python-based library that builds on NumPy’s array operations to provide a wide range of mathematical, scientific, and engineering tools.' +Codecademy Hub Page: 'https://www.codecademy.com/catalog/subject/data-science' +CatalogContent: + - 'learn-data-science' + - 'paths/data-science-foundations' +--- + +**`SciPy`** is a widely used open-source [Python](https://www.codecademy.com/enrolled/courses/learn-python-3) library that provides various scientific and numerical computing tools. 
Built on top of [NumPy’s](https://www.codecademy.com/resources/docs/numpy) robust array manipulation capabilities, SciPy extends Python with specialized modules for tasks such as optimization, signal processing, integration, statistics, image processing, and more. Its goal is to provide a consistent collection of high-level mathematical functions and algorithms so scientists, engineers, and data analysts can perform advanced computations efficiently, often without switching to lower-level languages like [C](https://www.codecademy.com/resources/docs/c). diff --git a/content/sklearn/concepts/biclustering/biclustering.md b/content/sklearn/concepts/biclustering/biclustering.md new file mode 100644 index 00000000000..f0ba785c198 --- /dev/null +++ b/content/sklearn/concepts/biclustering/biclustering.md @@ -0,0 +1,111 @@ +--- +Title: 'Biclustering' +Description: 'A technique for grouping rows and columns of a matrix to discover local patterns in data.' +Subjects: + - 'Data Science' + - 'Data Visualization' + - 'Machine Learning' +Tags: + - 'Machine Learning' + - 'Scikit-learn' + - 'Unsupervised learning' +CatalogContent: + - 'learn-python-3' + - 'paths/data-science' +--- + +**Biclustering** is a form of unsupervised machine learning that groups the rows and columns of a data matrix simultaneously to reveal previously unknown patterns. It is commonly used in gene expression analysis, text mining, and recommendation systems, and it captures more localized relationships than conventional clustering, which groups only rows or only columns. Scikit-learn provides spectral co-clustering and spectral biclustering algorithms, implemented as classes with a `fit` method, enabling efficient pattern discovery in complex datasets. 
+ +## Syntax + +Here's a syntax that shows the implementation of biclustering using sklearn: + +```pseudo +from sklearn.cluster import SpectralCoclustering, SpectralBiclustering + +# For Spectral Co-clustering +model = SpectralCoclustering(n_clusters=number_of_biclusters, random_state=seed) +model.fit(data_matrix) + +# For Spectral Bi-clustering +model = SpectralBiclustering(n_clusters=number_of_biclusters, method="log", random_state=seed) +model.fit(data_matrix) +``` + +- `n_clusters`: Number of biclusters to create. +- `random_state`: Seeds the random number generator so that results are reproducible. +- `method` (for `SpectralBiclustering`): Specifies the normalization variant: `bistochastic` (the default), `scale`, or `log`. The `log` method applies logarithmic scaling, while `bistochastic` normalizes rows and columns. The choice of method can affect the results depending on the dataset. + +> **Note**: sklearn does not provide a single generic `Bicluster` estimator; biclustering is performed with specific classes such as `SpectralCoclustering` and `SpectralBiclustering`. 
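For the co-clustering variant in the syntax above, a minimal sketch might look like the following. It assumes a synthetic matrix from `make_biclusters`; the matrix shape, noise level, and seeds are illustrative choices:

```python
from sklearn.cluster import SpectralCoclustering
from sklearn.datasets import make_biclusters
from sklearn.metrics import consensus_score

# Synthetic 30x30 matrix with 3 planted biclusters (rows and columns are shuffled)
data, true_rows, true_columns = make_biclusters(
    shape=(30, 30), n_clusters=3, noise=0.5, random_state=0
)

model = SpectralCoclustering(n_clusters=3, random_state=0)
model.fit(data)

# Each row and each column is assigned to exactly one bicluster
print("Row labels:", model.row_labels_)
print("Column labels:", model.column_labels_)

# Similarity between the found and the planted biclusters (1.0 is a perfect match)
score = consensus_score(model.biclusters_, (true_rows, true_columns))
print("Consensus score:", score)
```

Unlike `SpectralBiclustering`, which assumes a checkerboard structure, `SpectralCoclustering` places every row and every column in exactly one bicluster.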
+ +## Example + +Here's an example of implementing biclustering using `SpectralBiclustering` from sklearn: + +```py +import numpy as np +from sklearn.cluster import SpectralBiclustering + +# Sample data matrix +data_matrix = np.array([[1, 1, 0, 0], + [1, 1, 0, 0], + [0, 0, 1, 1], + [0, 0, 1, 1]]) + +# Apply Spectral Biclustering +model = SpectralBiclustering(n_clusters=2, random_state=42) +model.fit(data_matrix) + +# Get the bicluster labels for rows and columns +row_labels = model.rows_ +column_labels = model.columns_ + +# Print biclusters +print("Row Biclusters:", row_labels) +print("Column Biclusters:", column_labels) +``` + +The above code results in the following output: + +```shell +Row Biclusters: [[False False True True] + [False False True True] + [ True True False False] + [ True True False False]] +Column Biclusters: [[False False True True] + [ True True False False] + [False False True True] + [ True True False False]] +``` + +- In the **Row Biclusters**, `True` in a position means that the corresponding row is part of the bicluster. +- Similarly, in the **Column Biclusters**, `True` indicates that the corresponding column is part of the bicluster. 
+ +## Codebyte Example + +This example demonstrates how to perform Spectral Biclustering on a simple **6x6** binary data matrix using `SpectralBiclustering` from `sklearn`: + +```codebyte/python +import numpy as np +from sklearn.cluster import SpectralBiclustering + +# Sample 6x6 data matrix +data_matrix = np.array([[1, 1, 0, 0, 1, 1], + [1, 1, 0, 0, 1, 1], + [0, 0, 1, 1, 0, 0], + [0, 0, 1, 1, 0, 0], + [1, 1, 0, 0, 1, 1], + [1, 1, 0, 0, 1, 1]]) + +# Apply Spectral Biclustering +model = SpectralBiclustering(n_clusters=2, random_state=42) +model.fit(data_matrix) + +# Get the bicluster labels for rows and columns +row_labels = model.rows_ +column_labels = model.columns_ + +# Print the resulting biclusters for rows and columns +print("Row Biclusters:", row_labels) +print("Column Biclusters:", column_labels) +``` diff --git a/content/sklearn/concepts/linear-discriminant-analysis/linear-discriminant-analysis.md b/content/sklearn/concepts/linear-discriminant-analysis/linear-discriminant-analysis.md new file mode 100644 index 00000000000..d5820599d46 --- /dev/null +++ b/content/sklearn/concepts/linear-discriminant-analysis/linear-discriminant-analysis.md @@ -0,0 +1,121 @@ +--- +Title: 'Linear Discriminant Analysis' +Description: 'Linear Discriminant Analysis aims to project data onto a lower-dimensional space while preserving the information that discriminates between different classes.' +Subjects: + - 'Data Science' + - 'Machine Learning' +Tags: + - 'Machine Learning' + - 'Scikit-learn' + - 'Supervised Learning' + - 'Unsupervised Learning' +CatalogContent: + - 'learn-python-3' + - 'paths/computer-science' +--- + +In Sklearn, **Linear Discriminant Analysis (LDA)** is a supervised algorithm that aims to project data onto a lower-dimensional space while preserving the information that discriminates between different classes. LDA finds a set of directions in the original feature space that maximize the separation between the classes. 
These directions are called discriminant directions. By projecting the data onto these directions, LDA reduces the dimensionality of the data while retaining the information that is most relevant for classification. + +## Syntax + +```pseudo +from sklearn.discriminant_analysis import LinearDiscriminantAnalysis + +# Create an LDA model +model = LinearDiscriminantAnalysis( + solver='svd', + shrinkage=None, + priors=None, + n_components=None, + store_covariance=False, + tol=0.0001, + covariance_estimator=None +) + +# Fit the model to the training data +model.fit(X_train, y_train) + +# Make predictions on the new data +y_pred = model.predict(X_test) +``` + +- `solver`: The solver to be used. Common options include: + - `svd`: Singular Value Decomposition (default). + - `lsqr`: Least Squares Solution. + - `eigen`: Eigenvalue Decomposition. +- `shrinkage`: Controls the amount of shrinkage applied to the covariance matrix. It is only supported by the `lsqr` and `eigen` solvers. Common options include: + - `None`: No shrinkage (default). + - `auto`: Automatic shrinkage utilizing the Ledoit-Wolf lemma. + - A float between 0 and 1: A fixed shrinkage value. +- `priors`: Prior probabilities of the classes. The default value is `None`, in which case the priors are inferred from the training data. +- `n_components`: The number of components (at most `min(n_classes - 1, n_features)`) kept for dimensionality reduction. The default value is `None`. +- `store_covariance`: If set to `True`, it explicitly calculates the weighted within-class covariance matrix when `solver` is set to `svd`. The default value is `False`. +- `tol`: The absolute threshold for a singular value to be considered significant when estimating the rank of the data. It is only used when `solver` is set to `svd`. The default value is `0.0001`. +- `covariance_estimator`: An estimator object used to estimate the covariance matrices in place of the empirical estimator. 
It is only supported by the `lsqr` and `eigen` solvers. The default value is `None`. 
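The parameters above also cover dimensionality reduction and shrinkage, not only prediction. This is a minimal sketch, assuming the Iris dataset; the variable names are illustrative:

```python
from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

X, y = load_iris(return_X_y=True)

# Iris has 3 classes, so at most 2 discriminant directions are available
lda = LinearDiscriminantAnalysis(n_components=2)
X_projected = lda.fit_transform(X, y)

# Shrinkage requires the 'lsqr' or 'eigen' solver
lda_shrunk = LinearDiscriminantAnalysis(solver="lsqr", shrinkage="auto")
lda_shrunk.fit(X, y)

print("Projected shape:", X_projected.shape)
print("Explained variance ratio:", lda.explained_variance_ratio_)
print("Training accuracy with shrinkage:", lda_shrunk.score(X, y))
```

The projection reduces the four original features to two discriminant components, which is the largest number possible for a three-class problem.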
+ +## Example + +The following example demonstrates the implementation of LDA: + +```py +from sklearn.discriminant_analysis import LinearDiscriminantAnalysis +from sklearn.datasets import load_iris +from sklearn.model_selection import train_test_split +from sklearn.metrics import accuracy_score + +# Load the Iris dataset +iris = load_iris() +X = iris.data +y = iris.target + +# Create training and testing sets by splitting the dataset +X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42) + +# Create an LDA model +model = LinearDiscriminantAnalysis() + +# Fit the model to the training data +model.fit(X_train, y_train) + +# Make predictions on the new data +y_pred = model.predict(X_test) + +# Evaluate the model +print("Accuracy:", accuracy_score(y_test, y_pred)) +``` + +The above code produces the following output: + +```shell +Accuracy: 1.0 +``` + +## Codebyte Example + +The following codebyte example demonstrates the implementation of LDA: + +```codebyte/python +from sklearn.discriminant_analysis import LinearDiscriminantAnalysis +from sklearn.datasets import load_iris +from sklearn.model_selection import train_test_split +from sklearn.metrics import accuracy_score + +# Load the Iris dataset +iris = load_iris() +X = iris.data +y = iris.target + +# Create training and testing sets by splitting the dataset +X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.4, random_state=44) + +# Create an LDA model +model = LinearDiscriminantAnalysis() + +# Fit the model to the training data +model.fit(X_train, y_train) + +# Make predictions on the test set +y_pred = model.predict(X_test) + +# Evaluate the model +print("Accuracy:", accuracy_score(y_test, y_pred)) +``` diff --git a/content/sklearn/concepts/multiclass-classification/multiclass-classification.md b/content/sklearn/concepts/multiclass-classification/multiclass-classification.md new file mode 100644 index 00000000000..1df559b9805 --- /dev/null +++ 
b/content/sklearn/concepts/multiclass-classification/multiclass-classification.md @@ -0,0 +1,122 @@ +--- +Title: 'Multiclass Classification' +Description: 'Multiclass classification is a supervised machine learning task where instances are categorized into one of three or more distinct classes.' +Subjects: + - 'AI' + - 'Data Science' + - 'Machine Learning' +Tags: + - 'Classification' + - 'Multitask Learning' + - 'Scikit-learn' + - 'Supervised Learning' +CatalogContent: + - 'learn-python-3' + - 'paths/intermediate-machine-learning-skill-path' +--- + +In [Sklearn](https://www.codecademy.com/resources/docs/sklearn), **Multiclass Classification** is a supervised machine learning task where instances are categorized into one of three or more distinct classes. Unlike binary classification, which involves two classes, multiclass classification requires the model to differentiate among multiple categories. + +Multiclass classification in Sklearn is implemented using algorithms such as [`Decision Trees`](https://www.codecademy.com/resources/docs/sklearn/decision-trees), [`Support Vector Machines (SVMs)`](https://www.codecademy.com/resources/docs/sklearn/support-vector-machines), and `Logistic Regression`. These algorithms handle multiple classes through strategies like One-vs-Rest (OvR) or One-vs-One (OvO), depending on the model and configuration. + +> **Note:** Sklearn offers many algorithms for multi-class classification. + +## Syntax + +Sklearn offers a variety of algorithms for multiclass classification. 
Below is an example syntax for performing multiclass classification using `RandomForestClassifier` in sklearn: + +```pseudo +from sklearn.datasets import make_classification +from sklearn.model_selection import train_test_split +from sklearn.ensemble import RandomForestClassifier # Replace with your classifier +from sklearn.metrics import classification_report + +# Generate a synthetic dataset (3 classes need at least 3 informative features with the default settings) +X, y = make_classification(n_samples=1000, n_features=20, n_informative=3, n_classes=3, random_state=42) + +# Split the dataset into training and testing sets +X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42) + +# Create the classifier (can be any model that supports multiclass classification) +clf = RandomForestClassifier(random_state=42) + +# Fit the model +clf.fit(X_train, y_train) + +# Make predictions +y_pred = clf.predict(X_test) + +# Evaluate the model +print(classification_report(y_test, y_pred)) +``` + +## Example + +The following example loads the `iris` dataset, splits it into training and testing sets (80% training, 20% testing), trains a `RandomForestClassifier`, makes predictions on the test data, and then calculates and prints the accuracy of the model: + +```py +from sklearn.datasets import load_iris +from sklearn.model_selection import train_test_split +from sklearn.ensemble import RandomForestClassifier +from sklearn.metrics import accuracy_score + +# Load the Iris dataset (for multiclass classification) +data = load_iris() +X, y = data.data, data.target + +# Split the dataset into training and testing sets (80% train, 20% test) +X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42) + +# Initialize the RandomForestClassifier +model = RandomForestClassifier() + +# Train the model on the training data +model.fit(X_train, y_train) + +# Make predictions on the test data +y_pred = model.predict(X_test) + +# 
Evaluate the model by calculating accuracy +accuracy = accuracy_score(y_test, y_pred) + +# Print 
the accuracy of the model +print(f"Accuracy: {accuracy:.2f}") +``` + +The code produces the following output: + +```shell +Accuracy: 1.00 +``` + +## Codebyte Example + +The following codebyte example trains a `RandomForestClassifier` for multiclass classification on synthetic data and reports its accuracy on the test set: + +```codebyte/python +from sklearn.ensemble import RandomForestClassifier +from sklearn.datasets import make_classification +from sklearn.model_selection import train_test_split +from sklearn.metrics import accuracy_score + +# Generate synthetic data for multiclass classification (3 classes) +# n_informative=3 is needed: the default of 2 is too small for 3 classes +X, y = make_classification(n_samples=1000, n_features=20, n_informative=3, n_classes=3, random_state=42) + +# Split the dataset into training and testing sets (80% train, 20% test) +X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42) + +# Initialize the RandomForestClassifier +model = RandomForestClassifier() + +# Train the model on the training data +model.fit(X_train, y_train) + +# Make predictions on the test data +y_pred = model.predict(X_test) + +# Evaluate the model by calculating accuracy +accuracy = accuracy_score(y_test, y_pred) + +# Print the accuracy of the model +print(f"Accuracy: {accuracy:.2f}") +``` diff --git a/content/sklearn/concepts/multilabel-classification/multilabel-classification.md b/content/sklearn/concepts/multilabel-classification/multilabel-classification.md new file mode 100644 index 00000000000..1a517c24ce3 --- /dev/null +++ b/content/sklearn/concepts/multilabel-classification/multilabel-classification.md @@ -0,0 +1,121 @@ +--- +Title: 'Multilabel Classification' +Description: 'Multilabel classification is a machine learning task where each instance can be assigned multiple labels or categories simultaneously.' 
+Subjects: + - 'Computer Science' + - 'Data Science' + - 'Data Visualization' + - 'Machine Learning' +Tags: + - 'AI' + - 'Classification' + - 'Natural Language Processing' + - 'Scikit-learn' +CatalogContent: + - 'learn-python-3' + - 'paths/intermediate-machine-learning-skill-path' +--- + +In sklearn, **Multilabel Classification** assigns multiple labels to a single instance, allowing models to predict multiple outputs simultaneously. This method differs from traditional classification, where each instance belongs to only one class. + +Scikit-learn offers tools like `OneVsRestClassifier`, `ClassifierChain`, and `MultiOutputClassifier` to handle multilabel classification and enable efficient model training and evaluation. + +## Syntax + +Here's the syntax for using multilabel classification in sklearn: + +```pseudo +from sklearn.multioutput import MultiOutputClassifier +from sklearn.ensemble import RandomForestClassifier +from sklearn.model_selection import train_test_split + +# Step 1: Initialize the base classifier +base_model = RandomForestClassifier(random_state=42) + +# Step 2: Create a MultiOutputClassifier wrapper for multilabel classification +multi_label_model = MultiOutputClassifier(base_model) + +# Step 3: Train the model using the training dataset +multi_label_model.fit(X_train, y_train) + +# Step 4: Make predictions on the test dataset +predicted_labels = multi_label_model.predict(X_test) + +# Step 5: Evaluate predictions or use the results +print(predicted_labels) +``` + +- `RandomForestClassifier`: The base classifier for multilabel classification. +- `MultiOutputClassifier`: A wrapper to extend the base classifier for multilabel tasks. +- `Training and testing`: The model is trained with `fit()` and predictions are made using `predict()`. 
+ +## Example + +This code demonstrates multilabel classification using scikit-learn by training a model to assign multiple labels: + +```py +from sklearn.datasets import make_multilabel_classification +from sklearn.ensemble import RandomForestClassifier +from sklearn.multioutput import MultiOutputClassifier +from sklearn.metrics import classification_report + +# Generate synthetic multilabel data +X, y = make_multilabel_classification(n_samples=100, n_features=10, n_classes=3, n_labels=2, random_state=42) + +# Initialize a base classifier +base_classifier = RandomForestClassifier() + +# Wrap the base classifier for multilabel classification +model = MultiOutputClassifier(base_classifier) + +# Train the model +model.fit(X, y) + +# Predict labels for new data +predictions = model.predict(X[:5]) + +# Display predictions +print("Predicted Labels for First 5 Samples:") +print(predictions) +``` + +The code produces the following output: + +```shell +Predicted Labels for First 5 Samples: +[[1 1 0] + [1 1 0] + [0 0 1] + [1 1 1] + [0 1 0]] +``` + +## Codebyte Example + +The following codebyte example trains a Random Forest classifier for multilabel classification on a synthetic dataset and predicts multiple labels for the first four samples: + +```codebyte/python +# This code demonstrates multilabel classification using scikit-learn. 
+from sklearn.datasets import make_multilabel_classification +from sklearn.ensemble import RandomForestClassifier +from sklearn.multioutput import MultiOutputClassifier + +# Generate synthetic multilabel data +X, y = make_multilabel_classification(n_samples=100, n_features=10, n_classes=3, n_labels=2, random_state=42) + +# Initialize a Random Forest classifier +classifier = RandomForestClassifier() + +# Wrap the classifier for multilabel classification +multi_label_model = MultiOutputClassifier(classifier) + +# Train the model on the dataset +multi_label_model.fit(X, y) + +# Predict labels for the first 4 samples +predictions = multi_label_model.predict(X[:4]) + +# Display the predictions +print("Predicted labels for the first 4 samples:") +print(predictions) +``` diff --git a/content/sklearn/concepts/multioutput-regression/multioutput-regression.md b/content/sklearn/concepts/multioutput-regression/multioutput-regression.md new file mode 100644 index 00000000000..e1764d5b1c8 --- /dev/null +++ b/content/sklearn/concepts/multioutput-regression/multioutput-regression.md @@ -0,0 +1,166 @@ +--- +Title: 'Multioutput Regression' +Description: 'Multioutput regression is a type of regression task where the model predicts multiple dependent variables (outputs) simultaneously for each input.' +Subjects: + - 'Data Science' + - 'Machine Learning' +Tags: + - 'Classification' + - 'Multitask Learning' + - 'MultiTaskLasso' + - 'Scikit-learn' +CatalogContent: + - 'learn-python-3' + - 'paths/data-science' +--- + +In [sklearn](https://www.codecademy.com/resources/docs/sklearn), **Multioutput Regression** is a type of regression task where the model predicts multiple dependent variables (outputs) simultaneously for each input, allowing for the modeling of relationships between multiple target variables and the features, which can improve prediction accuracy when outputs are correlated. 
+ +This can be achieved using the `MultiOutputRegressor` class, which wraps a single-output regressor (like [`LinearRegression`](https://www.codecademy.com/resources/docs/sklearn/linear-regression-analysis) or [`DecisionTreeRegressor`](https://www.codecademy.com/resources/docs/sklearn/decision-trees)) and fits a separate model for each target variable. The model then predicts all outputs at once for each input, treating each target independently. + +## Syntax + +```pseudo +from sklearn.multioutput import MultiOutputRegressor + +multi_output_regressor = MultiOutputRegressor(estimator, n_jobs=None) +``` + +- `estimator`: The base regressor that is used to fit each target independently. This can be any regression model that supports single-output regression (e.g., `LinearRegression`, `DecisionTreeRegressor`, etc.). +- `n_jobs`: The number of jobs to run in parallel for fitting the models. + - If `None`, it defaults to 1 (single-threaded). + - If `-1`, it uses all available processors. + - If `int > 1`, it uses that many processors for parallel computation. 
+ +## Example + +In the following example, a multi-output regression model is trained using `MultiOutputRegressor` with `LinearRegression` as the base `estimator` to predict two target variables from a dataset with 100 samples and 10 features: + +```py +from sklearn.datasets import make_regression +from sklearn.linear_model import LinearRegression +from sklearn.multioutput import MultiOutputRegressor + +# Generate a dataset with multiple targets +X, y = make_regression(n_samples=100, n_features=10, n_targets=2, random_state=42) + +# Create the base regressor +base_regressor = LinearRegression() + +# Initialize MultiOutputRegressor with the base regressor +multi_output_regressor = MultiOutputRegressor(base_regressor) + +# Fit the model +multi_output_regressor.fit(X, y) + +# Make predictions +predictions = multi_output_regressor.predict(X) +print(predictions) +``` + +The code above generates output as follows: + +```shell +[[ 120.68784134 275.71483026] + [ 253.98296996 563.67307766] + [ 37.84961654 83.88732044] + [-116.11399733 -517.34439275] + [ 292.0729889 342.1211352 ] + [ 126.5187794 394.60780464] + [ 74.14550766 117.86120484] + [ 34.74603745 293.23646551] + [ -1.57480398 -146.33293402] + [ 287.3570598 248.60804028] + [ 24.46227084 20.53247664] + [ 57.5037778 -52.07222977] + [ 33.46389769 59.9293089 ] + [-184.35231748 -145.97759938] + [ -18.0696738 -233.21065317] + [ 97.75493413 216.27609409] + [-224.33987424 -283.50617896] + [ -44.21983413 116.33800462] + [ 37.40886282 177.30394333] + [ 245.13484874 296.85999202] + [ -87.59651931 -39.75259675] + [-202.99155718 -222.10609199] + [ 41.24869185 181.88917186] + [ -17.87045638 -20.97891509] + [ 48.61661067 -165.6237776 ] + [-295.61268808 -528.01153829] + [ 58.07439548 173.69529786] + [ -71.14511833 -132.69257743] + [ -56.87043841 -190.48556695] + [ 49.51678317 137.10430708] + [ 26.66526388 83.57299169] + [ 1.14129753 36.65874573] + [ -7.74468723 6.85375096] + [ 3.73294889 261.10555969] + [ 56.44376756 40.51403006] + [ 
-1.99224336 151.40524829] + [-131.39863716 -331.62729808] + [ 109.99484706 384.60778547] + [ -7.74961445 107.97786082] + [ 193.82103464 316.71111332] + [ -7.79813083 7.15370226] + [ -52.15779501 -96.20796676] + [ 152.86738501 104.18711697] + [ 191.36728076 288.45882916] + [ 20.20018313 27.74645933] + [ 146.58558363 -117.63456814] + [-354.50728717 -533.4900471 ] + [ 14.97567883 -95.0910446 ] + [ -35.43101502 -118.48757456] + [ 5.35705289 42.88613639] + [-161.09291025 -117.90429652] + [ 172.2775084 396.90747784] + [ 162.61929411 209.92836958] + [-182.68456133 -163.2811691 ] + [ 89.07535864 -21.14848815] + [ -46.75916029 -110.53894603] + [ 231.09730211 319.15982778] + [ -40.108541 -84.98166962] + [-166.45390997 -265.05555636] + [ 0.97586946 -214.40604796] + [ 97.63593301 501.80797772] + [ 3.7398609 -72.64375758] + [ 130.65561152 66.64815668] + [ -85.31407057 -168.81530534] + [ -7.2468998 73.28377393] + [ 22.33697872 145.21764028] + [-120.51168929 -342.963189 ] + [ 121.12613888 65.01661617] + [ 124.10868505 354.92584718] + [-147.66348249 -294.81859794] + [ 61.14523063 60.52117341] + [-126.37893383 -334.70135616] + [-111.77099591 -81.93814188] + [-109.83747752 -237.97526597] + [ 8.00415806 91.38676316] + [ -26.37947013 36.09839868] + [ 106.36699275 130.83993429] + [ 69.06778835 125.59665375] + [ 134.03028548 319.28586998] + [ 130.75716498 15.34231243] + [ -86.46672131 -139.61281879] + [ -7.33734137 -226.69848199] + [ 199.71269604 357.97063185] + [ 100.94948846 -32.96835461] + [-257.05342439 -386.6851282 ] + [ -99.42556327 -108.57915827] + [ 224.41784227 425.50742575] + [-269.92957188 -202.28685621] + [-109.21584421 -225.03205094] + [-118.45089966 -420.99745962] + [ -29.83876402 19.58063146] + [ 95.06986687 70.5609531 ] + [ 41.32888453 4.51642366] + [ -10.61243193 289.0884239 ] + [ 73.11234969 158.84947994] + [ 10.45019796 260.51876186] + [-226.04884764 -372.71451196] + [ -17.30979575 -146.3735002 ] + [ -13.07113033 -42.21748842] + [ -59.54942557 -102.03957313]] +``` + 
+> **Note:** The output will vary each time the code is run unless a fixed `random_state` is set in `make_regression()`, ensuring reproducibility as shown in the example. diff --git a/content/sklearn/concepts/probability-calibration/probability-calibration.md b/content/sklearn/concepts/probability-calibration/probability-calibration.md new file mode 100644 index 00000000000..9a3928e20a6 --- /dev/null +++ b/content/sklearn/concepts/probability-calibration/probability-calibration.md @@ -0,0 +1,164 @@ +--- +Title: 'Probability Calibration' +Description: 'Probability calibration improves the reliability of predicted probabilities from machine learning models.' +Subjects: + - 'Data Science' + - 'Machine Learning' +Tags: + - 'Machine Learning' + - 'Scikit-learn' + - 'Supervised Learning' + - 'Unsupervised Learning' +CatalogContent: + - 'learn-python-3' + - 'paths/computer-science' +--- + +In [Sklearn](https://www.codecademy.com/resources/docs/sklearn), **Probability Calibration** is a technique used to improve the reliability of predicted probabilities from machine learning models. When a model outputs a probability, it makes a statement about the likelihood of a specific outcome. + +A well-calibrated model ensures that these probabilities accurately reflect the true likelihoods, meaning the predicted probabilities align closely with observed outcomes. + +Sklearn provides two primary methods for implementing probability calibration: + +- **Platt Scaling**: Fits a logistic regression model to the model's output probabilities. +- **Isotonic Regression**: Fits a non-parametric isotonic regression model to the model's output probabilities. + +## Syntax + +The `CalibratedClassifierCV` class is used to implement probability calibration. + +Platt Scaling uses a `sigmoid` function to map raw model scores to calibrated probabilities, ensuring they better reflect true likelihoods. 
+ +The sigmoid function, σ(x) = 1 / (1 + e^(-x)), maps any real-valued number to a range between 0 and 1. + +In Platt Scaling, this function is parameterized as: + +P(y=1 | x) = 1 / (1 + e^(-(A \* x + B))) + +Where A and B are parameters learned during calibration. + +Following is the syntax for implementing probability calibration using Platt Scaling: + +```pseudo +from sklearn.calibration import CalibratedClassifierCV +from sklearn.linear_model import LogisticRegression + +# Create a logistic regression classifier +model = LogisticRegression() + +# Calibrate the classifier using Platt Scaling +model_calibrated = CalibratedClassifierCV(model, cv=5, method="sigmoid") + +# Fit the calibrated classifier to the training data +# X_train: Features for the training set; y_train: Target labels for the training set +model_calibrated.fit(X_train, y_train) + +# Make predictions using the calibrated classifier +y_pred_prob = model_calibrated.predict_proba(X_test) +``` + +Isotonic regression is a non-parametric regression technique that fits a piecewise constant, monotonic (increasing or decreasing) function to the data. + +In the context of calibration, the isotonic method uses isotonic regression to map the model's raw probabilities to calibrated probabilities while preserving their relative order. 
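The two calibration maps described above can be sketched in plain Python. This is a minimal illustration, not Sklearn's implementation: `platt_sigmoid()` and `isotonic_fit()` are hypothetical helper names, the sigmoid follows the parameterization written above, and the isotonic fit uses the pool-adjacent-violators idea on labels that are assumed to be already sorted by the model's raw score.

```py
import math

def platt_sigmoid(score, A, B):
    """Sigmoid map from the text: 1 / (1 + e^(-(A * x + B))). A and B are assumed already learned."""
    return 1.0 / (1.0 + math.exp(-(A * score + B)))

def isotonic_fit(values):
    """Pool adjacent violators: least-squares non-decreasing fit to `values`."""
    blocks = []  # each block is [sum, count]
    for v in values:
        blocks.append([v, 1])
        # Merge blocks while the previous block's mean exceeds the last block's mean
        while len(blocks) > 1 and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]:
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    fitted = []
    for total, count in blocks:
        fitted.extend([total / count] * count)  # piecewise constant segments
    return fitted

# Observed labels, ordered by increasing raw score (illustrative values)
print(isotonic_fit([0, 1, 0, 1, 1]))  # [0.0, 0.5, 0.5, 1.0, 1.0]
print(platt_sigmoid(0.0, A=1.0, B=0.0))  # 0.5
```

The isotonic result is piecewise constant and never decreases, so the relative order of the raw scores is preserved, while the sigmoid map is a smooth curve controlled by only two parameters.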
+ +Following is the syntax for implementing probability calibration using Isotonic Regression: + +```pseudo +from sklearn.calibration import CalibratedClassifierCV +from sklearn.linear_model import LogisticRegression + +# Create a logistic regression classifier +model = LogisticRegression() + +# Calibrate the classifier using Isotonic Regression +model_calibrated = CalibratedClassifierCV(model, cv=5, method="isotonic") + +# Fit the calibrated classifier to the training data +# X_train: Features for the training set; y_train: Target labels for the training set +model_calibrated.fit(X_train, y_train) + +# Make predictions using the calibrated classifier +y_pred_prob = model_calibrated.predict_proba(X_test) +``` + +- `cv`: The number of cross-validation folds. The default value is `None`. +- `method`: The calibration method. Common options include `sigmoid` (default) and `isotonic`. + +## Example + +The following example demonstrates the implementation of probability calibration using Platt Scaling: + +```py +from sklearn.datasets import load_diabetes +from sklearn.model_selection import train_test_split +from sklearn.linear_model import LogisticRegression +from sklearn.calibration import CalibratedClassifierCV +from sklearn.metrics import brier_score_loss + +# Load the Diabetes Dataset +diabetes = load_diabetes() +X = diabetes.data +y = (diabetes.target > 126).astype(int) # Convert to binary classification + +# Create training and testing sets by splitting the dataset +X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42) + +# Create a logistic regression classifier +model = LogisticRegression() + +# Calibrate the classifier using Platt Scaling +model_calibrated = CalibratedClassifierCV(model, cv=5, method="sigmoid") + +# Fit the calibrated classifier to the training data +model_calibrated.fit(X_train, y_train) + +# Make predictions using the calibrated classifier +y_pred_prob = model_calibrated.predict_proba(X_test)[:, 1] + +# 
Calculate the Brier score +brier_score = brier_score_loss(y_test, y_pred_prob) +print("Brier Score:", brier_score) +``` + +The above code produces the following output: + +```shell +Brier Score: 0.17555317807611756 +``` + +## Codebyte Example + +The following example demonstrates the implementation of probability calibration using Isotonic Regression: + +```codebyte/python +from sklearn.datasets import load_diabetes +from sklearn.model_selection import train_test_split +from sklearn.linear_model import LogisticRegression +from sklearn.calibration import CalibratedClassifierCV +from sklearn.metrics import brier_score_loss + +# Load the Diabetes Dataset +diabetes = load_diabetes() +X = diabetes.data +y = (diabetes.target > 126).astype(int) # Convert to binary classification + +# Create training and testing sets by splitting the dataset +X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42) + +# Create a logistic regression classifier +model = LogisticRegression() + +# Calibrate the classifier using Isotonic Regression +model_calibrated = CalibratedClassifierCV(model, cv=5, method="isotonic") + +# Fit the calibrated classifier to the training data +model_calibrated.fit(X_train, y_train) + +# Make predictions using the calibrated classifier +y_pred_prob = model_calibrated.predict_proba(X_test)[:, 1] + +# Calculate the Brier score +brier_score = brier_score_loss(y_test, y_pred_prob) + +print("Brier Score:", brier_score) +``` diff --git a/content/sklearn/concepts/quadratic-discriminant-analysis/quadratic-discriminant-analysis.md b/content/sklearn/concepts/quadratic-discriminant-analysis/quadratic-discriminant-analysis.md new file mode 100644 index 00000000000..de6d3e83781 --- /dev/null +++ b/content/sklearn/concepts/quadratic-discriminant-analysis/quadratic-discriminant-analysis.md @@ -0,0 +1,105 @@ +--- +Title: 'Quadratic Discriminant Analysis' +Description: 'Quadratic Discriminant Analysis is a technique that models each class with a 
quadratic decision boundary, assuming different covariance matrices for each class.' +Subjects: + - 'Data Science' + - 'Machine Learning' +Tags: + - 'Machine Learning' + - 'Scikit-learn' + - 'Supervised Learning' + - 'Unsupervised Learning' +CatalogContent: + - 'learn-python-3' + - 'paths/computer-science' +--- + +In Sklearn, **Quadratic Discriminant Analysis (QDA)** is a classification technique that assumes that the data points within each class are normally distributed. Unlike **Linear Discriminant Analysis (LDA)**, which assumes a shared covariance matrix for all classes, QDA allows each class to have its own covariance matrix. This flexibility lets QDA model quadratic (curved) decision boundaries, making it suitable when classes have different spreads or are not linearly separable. + +## Syntax + +```pseudo +from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis + +# Create a QDA model +model = QuadraticDiscriminantAnalysis(priors=None, reg_param=0.0, store_covariance=False, tol=0.0001) + +# Fit the model to the training data +model.fit(X_train, y_train) + +# Make predictions on the new data +y_pred = model.predict(X_test) +``` + +- `priors`: The prior probabilities of the classes. If `None`, the class distribution is estimated from the training data. If specified, it should sum to 1. This allows control over the importance of each class. +- `reg_param`: The regularization parameter. A value greater than 0 applies regularization to the covariance estimates. Regularization can help in cases where the covariance matrices might be singular or near-singular. +- `store_covariance`: Whether to store the covariance matrices for each class. If `True`, the class covariance matrices are explicitly computed and stored in the `covariance_` attribute. If `False`, they are not stored, although the model still estimates what it needs for prediction during fitting. +- `tol`: The absolute threshold used to decide whether a class covariance estimate is rank deficient.
This controls when a warning about collinear features is raised and does not affect the predictions. + +## Example + +The following example demonstrates the implementation of QDA: + +```py +from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis +from sklearn.datasets import load_iris +from sklearn.model_selection import train_test_split +from sklearn.metrics import accuracy_score + +# Load the Iris dataset +iris = load_iris() +X = iris.data +y = iris.target + +# Create training and testing sets by splitting the dataset +X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42) + +# Create a QDA model +model = QuadraticDiscriminantAnalysis() + +# Fit the model to the training data +model.fit(X_train, y_train) + +# Make predictions on the new data +y_pred = model.predict(X_test) + +# Evaluate the model +print("Accuracy:", accuracy_score(y_test, y_pred)) +``` + +The above code produces the following output: + +```shell +Accuracy: 1.0 +``` + +## Codebyte Example + +The following codebyte example demonstrates the implementation of QDA: + +```codebyte/python +from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis +from sklearn.datasets import load_iris +from sklearn.model_selection import train_test_split +from sklearn.metrics import accuracy_score + +# Load the Iris dataset +iris = load_iris() +X = iris.data +y = iris.target + +# Create training and testing sets by splitting the dataset +X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.4, random_state=44) + +# Create a QDA model +model = QuadraticDiscriminantAnalysis() + +# Fit the model to the training data +model.fit(X_train, y_train) + +# Make predictions on the new data +y_pred = model.predict(X_test) + +# Evaluate the model +print("Accuracy:", accuracy_score(y_test, y_pred)) +``` diff --git a/content/sklearn/concepts/stochastic-gradient-descent/stochastic-gradient-descent.md 
b/content/sklearn/concepts/stochastic-gradient-descent/stochastic-gradient-descent.md new 
file mode 100644 index 00000000000..ff1fa8b427f --- /dev/null +++ b/content/sklearn/concepts/stochastic-gradient-descent/stochastic-gradient-descent.md @@ -0,0 +1,134 @@ +--- +Title: 'Stochastic Gradient Descent' +Description: 'Stochastic Gradient Descent (SGD) aims to find the best set of parameters for a model that minimizes a given loss function.' +Subjects: + - 'Data Science' + - 'Machine Learning' +Tags: + - 'Machine Learning' + - 'Scikit-learn' + - 'Supervised Learning' + - 'Unsupervised Learning' +CatalogContent: + - 'learn-python-3' + - 'paths/computer-science' +--- + +In [Sklearn](https://www.codecademy.com/resources/docs/sklearn), **Stochastic Gradient Descent (SGD)** is a popular optimization algorithm that focuses on finding the best set of parameters for a model that minimizes a given loss function. + +Unlike traditional [gradient descent](https://www.codecademy.com/resources/docs/ai/search-algorithms/gradient-descent), which calculates the gradient using the entire dataset, SGD computes the gradient using a single training example at a time. This makes it computationally efficient for large datasets. + +Sklearn provides two primary classes for implementing SGD: + +- `SGDClassifier`: Well-suited for classification tasks. Supports various loss functions and penalties for fitting linear classification models. +- `SGDRegressor`: Well-suited for regression tasks. Supports various loss functions and penalties for fitting [linear regression models](https://www.codecademy.com/resources/docs/sklearn/linear-regression-analysis). 
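The single-example update described above can be sketched without Sklearn. This is a minimal illustration under stated assumptions, not how `SGDRegressor` is implemented: `sgd_linear_regression()` is a hypothetical helper, the loss is plain squared error with no penalty, the learning rate is constant, and the toy data is noise-free.

```py
import random

def sgd_linear_regression(X, y, learning_rate=0.05, epochs=500, seed=42):
    """Fit y ~ w . x + b with squared-error loss, updating on one sample at a time."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    b = 0.0
    indices = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(indices)  # visit the samples in a new random order each epoch
        for i in indices:
            prediction = sum(wj * xj for wj, xj in zip(w, X[i])) + b
            error = prediction - y[i]
            # Gradient of 0.5 * error**2 computed from this single sample only
            w = [wj - learning_rate * error * xj for wj, xj in zip(w, X[i])]
            b -= learning_rate * error
    return w, b

# Toy data generated from y = 2x + 1 (illustrative, no noise)
X = [[0.0], [1.0], [2.0], [3.0], [4.0]]
y = [1.0, 3.0, 5.0, 7.0, 9.0]

w, b = sgd_linear_regression(X, y)
print(w, b)  # approximately [2.0] and 1.0
```

Each update touches only one training example, which is why the cost per step stays constant as the dataset grows.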
+ +## Syntax + +Following is the syntax for implementing SGD using `SGDClassifier`: + +```pseudo +from sklearn.linear_model import SGDClassifier + +# Create an SGDClassifier model +model = SGDClassifier(loss="hinge", penalty="l2", max_iter=1000, random_state=42) + +# Fit the classifier to the training data +model.fit(X_train, y_train) + +# Make predictions on the new data +y_pred = model.predict(X_test) +``` + +Following is the syntax for implementing SGD using `SGDRegressor`: + +```pseudo +from sklearn.linear_model import SGDRegressor + +# Create an SGDRegressor model +model = SGDRegressor(loss="squared_error", penalty="l2", max_iter=1000, random_state=42) + +# Fit the regressor to the training data +model.fit(X_train, y_train) + +# Make predictions on the new data +y_pred = model.predict(X_test) +``` + +- `loss`: Specifies the loss function. + - For `SGDClassifier`, the options include `hinge` (default), `log_loss` (named `log` in older Sklearn releases), and `modified_huber`. + - For `SGDRegressor`, the options include `squared_error` (default, named `squared_loss` in older Sklearn releases), `huber`, and `epsilon_insensitive`. +- `penalty`: Specifies the regularization penalty. Common options include `l2` (L2 regularization, default), `l1` (L1 regularization), and `elasticnet` (a combination of L1 and L2 regularization). +- `max_iter`: Specifies the maximum number of passes over the training data (epochs). The default value is `1000`. Excessive values can lead to unnecessary computations. +- `random_state`: Specifies the random seed for reproducibility. The default value is `None`. Setting `random_state` ensures consistent results across runs by fixing the order in which the training data is shuffled. 
+ +## Example + +The following example demonstrates the implementation of SGD using `SGDClassifier`: + +```py +from sklearn.datasets import load_iris +from sklearn.linear_model import SGDClassifier +from sklearn.model_selection import train_test_split +from sklearn.metrics import accuracy_score + +# Load the Iris dataset +iris = load_iris() +X = iris.data +y = iris.target + +# Create training and testing sets by splitting the dataset +X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42) + +# Create an SGDClassifier model +model = SGDClassifier(loss="hinge", penalty="l2", max_iter=1000, random_state=42) + +# Fit the classifier to the training data +model.fit(X_train, y_train) + +# Make predictions on the new data +y_pred = model.predict(X_test) + +# Evaluate the model's accuracy +accuracy = accuracy_score(y_test, y_pred) +print("Accuracy:", accuracy) +``` + +The above code produces the following output: + +```shell +Accuracy: 0.8 +``` + +## Codebyte Example + +The following codebyte example demonstrates the implementation of SGD using `SGDRegressor`: + +```codebyte/python +from sklearn.datasets import load_diabetes +from sklearn.linear_model import SGDRegressor +from sklearn.model_selection import train_test_split +from sklearn.metrics import mean_squared_error + +# Load the Diabetes dataset +diabetes = load_diabetes() +X = diabetes.data +y = diabetes.target + +# Create training and testing sets by splitting the dataset +X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42) + +# Create an SGDRegressor model (the loss was named "squared_loss" in older Sklearn releases) +model = SGDRegressor(loss="squared_error", penalty="l2", max_iter=1000, random_state=42) + +# Fit the regressor to the training data +model.fit(X_train, y_train) + +# Make predictions on the new data +y_pred = model.predict(X_test) + +# Evaluate the model's performance +mse = mean_squared_error(y_test, y_pred) + +print("Mean Squared Error:", mse) +``` diff --git 
a/content/sql/concepts/window-functions/terms/lag/lag.md b/content/sql/concepts/window-functions/terms/lag/lag.md index a1ebf767cba..50c0e56f1fd 100644 --- a/content/sql/concepts/window-functions/terms/lag/lag.md +++ b/content/sql/concepts/window-functions/terms/lag/lag.md @@ -43,19 +43,63 @@ Users Table | kyle | xy | 60 | ```sql -SELECT *, +SELECT + first_name, + last_name, + age, LAG(age, 1) OVER ( - ORDER BY age ASC) AS previous_age + ORDER BY age DESC + ) AS previous_age FROM Users; ``` -The output is a table that features a new column `previous_age`, which holds the values from the previous records. The first record is null because a default was not specified and the previous row would be out of range. - -Output +The output of the above code is a table that features a new column `previous_age`, which holds the values from the previous records. The first record is null because a default was not specified and the previous row would be out of range. | first_name | last_name | age | previous_age | | ---------- | --------- | --- | ------------ | -| kyle | xy | 60 | null | +| kyle | xy | 60 | NULL | | jenna | black | 35 | 60 | | chris | smith | 30 | 35 | | dave | james | 19 | 30 | + +### Using `PARTITION BY` Clause + +This example demonstrates how to use the `LAG()` function to create a new column, `previous_position`. + +The `PARTITION BY employee_id` clause ensures that the `LAG()` function operates within each group of rows that share the same `employee_id`. The `ORDER BY promotion_date` ensures the rows are processed in chronological order. 
+ +`Promotions` Table + +| employee_id | promotion_date | new_position | +| ----------- | -------------- | ------------ | +| 1 | 2020-01-01 | Junior Dev | +| 1 | 2021-06-01 | Mid Dev | +| 1 | 2024-03-01 | Senior Dev | +| 2 | 2019-05-01 | Intern | +| 2 | 2022-11-01 | Analyst | +| 2 | 2024-11-20 | Data Analyst | + +```sql +SELECT + employee_id, + promotion_date, + new_position, + LAG(new_position) OVER ( + PARTITION BY employee_id + ORDER BY promotion_date + ) AS previous_position +FROM Promotions; +``` + +Within each group defined by `employee_id`, the `previous_position` column holds the value from the previous row based on `promotion_date`. The first record in each group is `NULL` because there is no preceding row. + +The above code generates the following output: + +| employee_id | promotion_date | new_position | previous_position | +| ----------- | -------------- | ------------ | ----------------- | +| 1 | 2020-01-01 | Junior Dev | NULL | +| 1 | 2021-06-01 | Mid Dev | Junior Dev | +| 1 | 2024-03-01 | Senior Dev | Mid Dev | +| 2 | 2019-05-01 | Intern | NULL | +| 2 | 2022-11-01 | Analyst | Intern | +| 2 | 2024-11-20 | Data Analyst | Analyst | diff --git a/content/tensorflow/concepts/math/math.md b/content/tensorflow/concepts/math/math.md new file mode 100644 index 00000000000..9f4823a51e1 --- /dev/null +++ b/content/tensorflow/concepts/math/math.md @@ -0,0 +1,144 @@ +--- +Title: 'Math' +Description: 'Mathematical computations on tensors using TensorFlow.' +Subjects: + - 'AI' + - 'Data Science' +Tags: + - 'Arithmetic' + - 'Arrays' + - 'Deep Learning' + - 'TensorFlow' +CatalogContent: + - 'intro-to-tensorflow' + - 'tensorflow-for-deep-learning' +--- + +In TensorFlow, **math operations** are fundamental for performing various mathematical computations on tensors. Tensors are multi-dimensional arrays that can be manipulated using various operations. + +TensorFlow offers a rich set of mathematical operations under the `tf.math` module. 
These operations include arithmetic, trigonometric and exponential functions, and more. + +Some of the key mathematical operations available in TensorFlow are listed below. + +## Arithmetic Operations + +TensorFlow provides a wide range of arithmetic operations that can be performed on tensors, including addition, subtraction, multiplication, division, and more. Here are some examples of arithmetic operations in TensorFlow: + +```py +import tensorflow as tf + +a = tf.constant([1, 2, 3]) +b = tf.constant([4, 5, 6]) + +# Arithmetic operations + +tf.math.add(a, b) # Element-wise addition +tf.math.subtract(a, b) # Element-wise subtraction +tf.math.multiply(a, b) # Element-wise multiplication +tf.math.divide(a, b) # Element-wise division +``` + +## Element-wise Operations + +Element-wise operations are operations applied to each element of a tensor individually. These operations include computing each element's power, calculating each element's square root, and returning the absolute value of each component. Here are some examples of element-wise operations in TensorFlow: + +```py +import tensorflow as tf + +a = tf.constant([1, 2, 3], dtype=tf.float32) + +# Element-wise operations + +tf.math.pow(a, 2) # Element-wise power +tf.math.sqrt(a) # Element-wise square root +tf.math.abs(a) # Element-wise absolute value +``` + +## Trigonometric Functions + +TensorFlow supports trigonometric functions such as sine, cosine, tangent, and their inverses, which have domain constraints. These functions are useful for various mathematical computations. 
Here are some examples of trigonometric functions in TensorFlow: + +```py +import tensorflow as tf + +a = tf.constant([0.0, 1.0, 2.0]) + +# Trigonometric functions + +tf.math.sin(a) # Element-wise sine +tf.math.cos(a) # Element-wise cosine +tf.math.tan(a) # Element-wise tangent +tf.math.asin(a) # Element-wise arcsine (NaN for inputs outside [-1, 1], such as 2.0) +tf.math.acos(a) # Element-wise arccosine (NaN for inputs outside [-1, 1], such as 2.0) +tf.math.atan(a) # Element-wise arctangent +``` + +## Exponential and Logarithmic Functions + +TensorFlow offers functions to compute exponentials and logarithms of tensor elements, widely used in mathematical and scientific computations. Here are some examples of exponential and logarithmic functions in TensorFlow: + +```py +import tensorflow as tf + +a = tf.constant([1.0, 2.0, 3.0]) + +# Exponential and logarithmic functions + +tf.math.exp(a) # Element-wise exponential +tf.math.log(a) # Element-wise natural logarithm +tf.math.log(a) / tf.math.log(10.0) # Element-wise base-10 logarithm (tf.math has no log10 function) +tf.math.log1p(a) # Element-wise natural logarithm of (1 + x) +``` + +## Reduction Operations + +Reduction operations compute a single result from multiple tensor elements. These operations include sum, mean, maximum, minimum, and more. Here are some examples of reduction operations in TensorFlow: + +```py +import tensorflow as tf + +a = tf.constant([[1, 2, 3], [4, 5, 6]]) + +# Reduction operations + +tf.math.reduce_sum(a) # Sum of all elements +tf.math.reduce_mean(a) # Mean of all elements (an integer result for integer tensors) +tf.math.reduce_max(a) # Maximum value +tf.math.reduce_min(a) # Minimum value +``` + +## Comparison Operations + +TensorFlow supports comparison operations that compare tensor elements and return boolean values based on the comparison results. 
Here are some examples of comparison operations in TensorFlow: + +```py +import tensorflow as tf + +a = tf.constant([1, 2, 3]) +b = tf.constant([3, 2, 1]) + +# Comparison operations + +tf.math.equal(a, b) # Element-wise equality +tf.math.less(a, b) # Element-wise less than +tf.math.greater(a, b) # Element-wise greater than +tf.math.not_equal(a, b) # Element-wise inequality +``` + +## Special Functions + +TensorFlow offers a variety of special mathematical functions such as `Bessel` functions, `error` functions, and `gamma` functions. These functions are useful for advanced mathematical computations. Here are some examples of special functions in TensorFlow: + +```py +import tensorflow as tf + +a = tf.constant([1.0, 2.0, 3.0]) + +# Special functions + +tf.math.erf(a) # Element-wise error function +tf.math.lgamma(a) # Element-wise natural logarithm of the absolute value of the gamma function of x +tf.math.bessel_i0(a) # Element-wise modified Bessel function of the first kind of order 0 +``` + +By leveraging these mathematical operations, a wide range of computations on tensors can be performed in TensorFlow, making it a powerful tool for scientific computing, machine learning, and deep learning applications. diff --git a/documentation/catalog-content.md b/documentation/catalog-content.md index 44c6da9ba57..955d04c350f 100644 --- a/documentation/catalog-content.md +++ b/documentation/catalog-content.md @@ -9,63 +9,63 @@ These slugs may vary for different topics. Feel free to add suggestions for new slugs to the lists as part of your PR! Be sure to insert them alphabetically. 
-### C +## C ``` - 'learn-c' - 'paths/computer-science' ``` -### C++ +## C++ ``` - 'learn-c-plus-plus' - 'paths/computer-science' ``` -### Cloud Computing +## Cloud Computing ``` - 'foundations-of-cloud-computing' - 'paths/back-end-engineer-career-path' ``` -### Command Line +## Command Line ``` - 'learn-the-command-line' - 'paths/computer-science' ``` -### CSS +## CSS ``` - 'learn-css' - 'paths/front-end-engineer-career-path' ``` -### Cybersecurity +## Cybersecurity ``` - 'introduction-to-cybersecurity' - 'paths/fundamentals-of-cybersecurity' ``` -### Dart +## Dart ``` - 'learn-dart' - 'paths/computer-science' ``` -### Emojicode +## Emojicode ``` - 'learn-emojicode' - 'paths/computer-science' ``` -### Git +## Git ``` - 'learn-git' @@ -73,7 +73,7 @@ Feel free to add suggestions for new slugs to the lists as part of your PR! Be s - 'paths/computer-science' ``` -### Go +## Go ``` - 'learn-go' @@ -81,77 +81,77 @@ Feel free to add suggestions for new slugs to the lists as part of your PR! Be s - 'paths/computer-science' ``` -### HTML +## HTML ``` - 'learn-html' - 'paths/front-end-engineer-career-path' ``` -### Java +## Java ``` - 'learn-java' - 'paths/computer-science' ``` -### JavaScript +## JavaScript ``` - 'introduction-to-javascript' - 'paths/front-end-engineer-career-path' ``` -### JavaScript:D3 +## JavaScript:D3 ``` - 'learn-d3' - 'paths/data-science' ``` -### Kotlin +## Kotlin ``` - 'learn-kotlin' - 'paths/computer-science' ``` -### Markdown +## Markdown ``` - 'learn-html' - 'paths/front-end-engineer-career-path' ``` -### Open Source +## Open Source ``` - 'introduction-to-open-source' - 'paths/code-foundations' ``` -### PHP +## PHP ``` - 'learn-php' - 'paths/computer-science' ``` -### PowerShell +## PowerShell ``` - 'learn-powershell' - 'paths/computer-science' ``` -### Python +## Python ``` - 'learn-python-3' - 'paths/computer-science' ``` -### Python:Matplotlib +## Python:Matplotlib ``` - 'learn-python-3' @@ -160,7 +160,7 @@ Feel free to add suggestions for 
new slugs to the lists as part of your PR! Be s - 'paths/data-science-foundations' ``` -### Python:Numpy +## Python:Numpy ``` - 'learn-python-3' @@ -169,7 +169,7 @@ Feel free to add suggestions for new slugs to the lists as part of your PR! Be s - 'paths/data-science-foundations' ``` -### Python:Pandas +## Python:Pandas ``` - 'learn-python-3' @@ -178,7 +178,7 @@ Feel free to add suggestions for new slugs to the lists as part of your PR! Be s - 'paths/data-science-foundations' ``` -### Python:Pillow +## Python:Pillow ``` - 'learn-python-3' @@ -187,7 +187,7 @@ Feel free to add suggestions for new slugs to the lists as part of your PR! Be s - 'paths/data-science-foundations' ``` -### Python:Plotly +## Python:Plotly ``` - 'learn-python-3' @@ -196,7 +196,7 @@ Feel free to add suggestions for new slugs to the lists as part of your PR! Be s - 'paths/data-science-foundations' ``` -### Python:PyTorch +## Python:PyTorch ``` - 'intro-to-py-torch-and-neural-networks' @@ -208,7 +208,7 @@ Feel free to add suggestions for new slugs to the lists as part of your PR! Be s - 'paths/machine-learning' ``` -### Python:Seaborn +## Python:Seaborn ``` - 'learn-python-3' @@ -217,13 +217,20 @@ Feel free to add suggestions for new slugs to the lists as part of your PR! Be s - 'paths/data-science-foundations' ``` -### Python:Sklearn +## Python:Sklearn ``` - 'paths/intermediate-machine-learning-skill-path' ``` -### R +## Python:TensorFlow + +``` +- 'intro-to-tensorflow' +- 'tensorflow-for-deep-learning' +``` + +## R ``` - 'learn-r' @@ -232,14 +239,14 @@ Feel free to add suggestions for new slugs to the lists as part of your PR! Be s - 'paths/computer-science' ``` -### React +## React ``` - 'react-101' - 'paths/front-end-engineer-career-path' ``` -### Ruby +## Ruby ``` - 'learn-rails' @@ -247,14 +254,14 @@ Feel free to add suggestions for new slugs to the lists as part of your PR! 
Be s - 'paths/full-stack-engineer-career-path' ``` -### Rust +## Rust ``` - 'rust-for-programmers' - 'paths/computer-science' ``` -### SQL +## SQL ``` - 'learn-sql' @@ -263,28 +270,28 @@ Feel free to add suggestions for new slugs to the lists as part of your PR! Be s - 'paths/data-science-foundations' ``` -### Swift +## Swift ``` - 'learn-swift' - 'paths/build-ios-apps-with-swiftui' ``` -### SwiftUI +## SwiftUI ``` - 'learn-swift' - 'paths/build-ios-apps-with-swiftui' ``` -### TypeScript +## TypeScript ``` - 'learn-typescript' - 'paths/full-stack-engineer-career-path' ``` -### UI/UX +## UI/UX ``` - 'intro-to-ui-ux' diff --git a/media/candlestick-example.png b/media/candlestick-example.png new file mode 100644 index 00000000000..d3aabbeac7b Binary files /dev/null and b/media/candlestick-example.png differ diff --git a/media/histogram2dcontour-example.png b/media/histogram2dcontour-example.png new file mode 100644 index 00000000000..ed21540c927 Binary files /dev/null and b/media/histogram2dcontour-example.png differ diff --git a/media/ols-model-example.png b/media/ols-model-example.png new file mode 100644 index 00000000000..7de8a60a43c Binary files /dev/null and b/media/ols-model-example.png differ