-
Notifications
You must be signed in to change notification settings - Fork 56
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Loan payment prediction #182
Closed
Sakshamjain98
wants to merge
3
commits into
Avdhesh-Varshney:main
from
Sakshamjain98:loan-payment-prediction
Closed
Changes from all commits
Commits
Show all changes
3 commits
Select commit
Hold shift + click to select a range
File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
272 changes: 272 additions & 0 deletions
272
docs/projects/deep-learning/handwritten-digit-classifier-CNN-Model.md
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,272 @@ | ||
# Handwritten Digit Classifier | ||
|
||
### AIM | ||
|
||
To develop a Convolutional Neural Network (CNN) model for classifying handwritten digits with detailed explanations of CNN architecture and implementation using MNIST dataset. | ||
|
||
### DATASET LINK | ||
|
||
[MNIST Dataset](https://www.kaggle.com/code/imdevskp/digits-mnist-classification-using-cnn) | ||
- Training Set: 60,000 images | ||
- Test Set: 10,000 images | ||
- Image Size: 28x28 pixels (grayscale) | ||
|
||
### LIBRARIES NEEDED | ||
|
||
??? quote "LIBRARIES USED" | ||
```python | ||
import numpy as np | ||
import pandas as pd | ||
import matplotlib.pyplot as plt | ||
import seaborn as sns | ||
|
||
from sklearn.model_selection import train_test_split | ||
from sklearn.metrics import confusion_matrix | ||
|
||
import tensorflow as tf | ||
from tensorflow.keras.models import Sequential | ||
from tensorflow.keras.layers import Dense, Conv2D, MaxPool2D, Flatten, Dropout | ||
from tensorflow.keras.preprocessing.image import ImageDataGenerator | ||
from tensorflow.keras.optimizers import Adam | ||
from tensorflow.keras.callbacks import EarlyStopping | ||
``` | ||
|
||
--- | ||
|
||
### DESCRIPTION | ||
|
||
!!! info "What is the requirement of the project?" | ||
- Create a CNN model to classify handwritten digits (0-9) from the MNIST dataset | ||
- Achieve high accuracy while preventing overfitting | ||
- Provide comprehensive visualization of model performance | ||
- Create an educational resource for understanding CNN implementation | ||
|
||
??? info "Technical Implementation Details" | ||
```python | ||
# Load and preprocess data | ||
(X_train, y_train), (X_test, y_test) = tf.keras.datasets.mnist.load_data() | ||
|
||
# Reshape and normalize data | ||
X_train = X_train.reshape(X_train.shape[0], 28, 28, 1) | ||
X_test = X_test.reshape(X_test.shape[0], 28, 28, 1) | ||
|
||
X_train = X_train.astype('float32') | ||
X_test = X_test.astype('float32') | ||
X_train /= 255 | ||
X_test /= 255 | ||
|
||
# One-hot encode labels | ||
y_train = tf.keras.utils.to_categorical(y_train, 10) | ||
y_test = tf.keras.utils.to_categorical(y_test, 10) | ||
``` | ||
|
||
### Model Architecture | ||
```python | ||
model = Sequential([ | ||
# First Convolutional Block | ||
Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)), | ||
MaxPool2D(2, 2), | ||
|
||
# Second Convolutional Block | ||
Conv2D(64, (3, 3), activation='relu'), | ||
MaxPool2D(2, 2), | ||
|
||
# Third Convolutional Block | ||
Conv2D(64, (3, 3), activation='relu'), | ||
|
||
# Flatten and Dense Layers | ||
Flatten(), | ||
Dense(64, activation='relu'), | ||
Dropout(0.5), | ||
Dense(10, activation='softmax') | ||
]) | ||
|
||
# Compile model | ||
model.compile(optimizer='adam', | ||
loss='categorical_crossentropy', | ||
metrics=['accuracy']) | ||
``` | ||
|
||
### Training Parameters | ||
```python | ||
# Data Augmentation | ||
datagen = ImageDataGenerator( | ||
rotation_range=10, | ||
zoom_range=0.1, | ||
width_shift_range=0.1, | ||
height_shift_range=0.1 | ||
) | ||
|
||
# Early Stopping | ||
early_stopping = EarlyStopping( | ||
monitor='val_loss', | ||
patience=3, | ||
restore_best_weights=True | ||
) | ||
|
||
# Training | ||
history = model.fit( | ||
datagen.flow(X_train, y_train, batch_size=32), | ||
epochs=20, | ||
validation_data=(X_test, y_test), | ||
callbacks=[early_stopping] | ||
) | ||
``` | ||
|
||
--- | ||
|
||
#### IMPLEMENTATION STEPS | ||
|
||
=== "Step 1" | ||
|
||
Data Preparation and Analysis | ||
```python | ||
# Visualize sample images | ||
plt.figure(figsize=(10, 10)) | ||
for i in range(25): | ||
plt.subplot(5, 5, i+1) | ||
plt.imshow(X_train[i].reshape(28, 28), cmap='gray') | ||
plt.axis('off') | ||
plt.show() | ||
|
||
# Check data distribution | ||
plt.figure(figsize=(10, 5)) | ||
plt.bar(range(10), [len(y_train[y_train == i]) for i in range(10)]) | ||
plt.title('Distribution of digits in training set') | ||
plt.xlabel('Digit') | ||
plt.ylabel('Count') | ||
plt.show() | ||
``` | ||
|
||
=== "Step 2" | ||
|
||
Model Training and Monitoring | ||
```python | ||
# Plot training history | ||
plt.figure(figsize=(12, 4)) | ||
|
||
plt.subplot(1, 2, 1) | ||
plt.plot(history.history['loss'], label='Training Loss') | ||
plt.plot(history.history['val_loss'], label='Validation Loss') | ||
plt.title('Model Loss') | ||
plt.legend() | ||
|
||
plt.subplot(1, 2, 2) | ||
plt.plot(history.history['accuracy'], label='Training Accuracy') | ||
plt.plot(history.history['val_accuracy'], label='Validation Accuracy') | ||
plt.title('Model Accuracy') | ||
plt.legend() | ||
|
||
plt.show() | ||
``` | ||
|
||
=== "Step 3" | ||
|
||
Model Evaluation | ||
```python | ||
# Make predictions | ||
y_pred = model.predict(X_test) | ||
y_pred_classes = np.argmax(y_pred, axis=1) | ||
y_test_classes = np.argmax(y_test, axis=1) | ||
|
||
# Create confusion matrix | ||
conf_mat = confusion_matrix(y_test_classes, y_pred_classes) | ||
|
||
# Plot confusion matrix | ||
plt.figure(figsize=(10, 8)) | ||
sns.heatmap(conf_mat, annot=True, fmt='d', cmap='Blues') | ||
plt.title('Confusion Matrix') | ||
plt.xlabel('Predicted') | ||
plt.ylabel('True') | ||
plt.show() | ||
``` | ||
|
||
--- | ||
|
||
#### MODEL PERFORMANCE | ||
|
||
=== "Metrics" | ||
|
||
- Training Accuracy: 99.42% | ||
- Validation Accuracy: 99.15% | ||
- Test Accuracy: 99.23% | ||
|
||
=== "Analysis" | ||
|
||
- Model shows excellent performance with minimal overfitting | ||
- Data augmentation and dropout effectively prevent overfitting | ||
- Confusion matrix shows most misclassifications between similar digits (4/9, 3/8) | ||
|
||
#### CHALLENGES AND SOLUTIONS | ||
|
||
=== "Challenge 1" | ||
|
||
**Overfitting Prevention** | ||
- Solution: Implemented data augmentation and dropout layers | ||
```python | ||
datagen = ImageDataGenerator( | ||
rotation_range=10, | ||
zoom_range=0.1, | ||
width_shift_range=0.1, | ||
height_shift_range=0.1 | ||
) | ||
``` | ||
|
||
=== "Challenge 2" | ||
|
||
**Model Optimization** | ||
- Solution: Used early stopping to prevent unnecessary training | ||
```python | ||
early_stopping = EarlyStopping( | ||
monitor='val_loss', | ||
patience=3, | ||
restore_best_weights=True | ||
) | ||
``` | ||
|
||
--- | ||
|
||
### CONCLUSION | ||
|
||
#### KEY LEARNINGS | ||
|
||
!!! tip "Technical Achievements" | ||
- Successfully implemented CNN with 99%+ accuracy | ||
- Effective use of data augmentation and regularization | ||
- Proper model monitoring and optimization | ||
|
||
??? tip "Future Improvements" | ||
- Experiment with different architectures (ResNet, VGG) | ||
- Implement real-time prediction capability | ||
- Add support for custom handwritten input | ||
|
||
#### APPLICATIONS | ||
|
||
=== "Application 1" | ||
|
||
- Postal code recognition systems | ||
```python | ||
# Example prediction code | ||
def predict_digit(image): | ||
image = image.reshape(1, 28, 28, 1) | ||
image = image.astype('float32') / 255 | ||
prediction = model.predict(image) | ||
return np.argmax(prediction) | ||
``` | ||
|
||
=== "Application 2" | ||
|
||
- Educational tools for machine learning | ||
```python | ||
# Example visualization code | ||
def visualize_predictions(images, predictions, actual): | ||
plt.figure(figsize=(15, 5)) | ||
for i in range(10): | ||
plt.subplot(2, 5, i+1) | ||
plt.imshow(images[i].reshape(28, 28), cmap='gray') | ||
plt.title(f'Pred: {predictions[i]}\nTrue: {actual[i]}') | ||
plt.axis('off') | ||
plt.show() | ||
``` | ||
|
||
--- |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://miro.medium.com/v2/resize:fit:1400/0*8YE6hEXyYBjF5bei.jpg