Hi! I have organised all my notebooks on Image Processing, Natural Language Processing and everything else into separate folders in one place, instead of leaving them scattered across GitHub.
-
1. Starting_out: The very basics of image processing, such as colour channels and how to separate them, frequency histograms and similar topics, written while I was starting out.
-
2. U-Net.ipynb: A U-Net trained on the DRIVE dataset. The results are saved in the above folder.
-
3. Dendritic Spine Segmentation: Transfer learning applied to images of dendritic spines. The DRIVE dataset was modified slightly (the images were negated to match the other dataset, in which the objects are of high intensity and the background is of low intensity), and the same U-Net was trained on it to segment the dendritic spines in the other dataset.
-
4. Dendritic Spine Segmentation: The same as the above notebook but with the attention model (reference: Python for Microscopists repo). A consensus-based algorithm was used to binarise the input dendritic spine images, and the model was trained on the image pairs generated with it.
-
5. Disease Prediction from Medical Imaging (non-intrusive tests): Part of my research at the University of Hyderabad, where I tried to build a tool that takes in various images, such as a person's face and drawings, and predicts which diseases they might have. The models used are very simple. This project is no longer actively worked on. Thorough literature reviews and the datasets used are included in Additional Material.
-
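The negation step mentioned for the dendritic spine notebook can be illustrated with a minimal sketch. It assumes 8-bit greyscale images stored as nested lists; `negate_image` and the toy pixel values are hypothetical names and data used for illustration, not code taken from the notebooks.

```python
def negate_image(image, max_value=255):
    """Return the negative of a greyscale image given as rows of pixel intensities."""
    return [[max_value - pixel for pixel in row] for row in image]


# Toy 2x3 "image": dark objects (low values) on a bright background.
toy_image = [
    [250, 30, 245],
    [240, 20, 235],
]

# After negation the objects are bright and the background is dark.
negated = negate_image(toy_image)
print(negated)  # [[5, 225, 10], [15, 235, 20]]
```

With NumPy arrays the same operation would usually be written as `255 - image`.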
SAU-Net.ipynb: An implementation of the Spatial Attention U-Net paper, an attention-based convolutional neural network derived from the conventional U-Net. The model was tried on the DRIVE dataset and reached up to 92% accuracy. Various data augmentation strategies were used, and the test-set images formed from them are in the respective folders. Areas of improvement: better data augmentation strategies and a proper use of DropBlock.
-
Nearest_Neighbour_Algorithms_on_pixelwise_classification_of_DRIVE.ipynb: KNN is used for pixelwise classification of retinal images on a modified version of the DRIVE dataset. The modifications to the dataset are the crux of the project and will be disclosed at a later date. Random undersampling and random oversampling are used to fix the ratio of blood vessel pixels to non-blood-vessel pixels, which is originally about 1:10. SMOTE followed by random undersampling of the over-represented class brings the ratio of blood vessel pixels to non-blood-vessel pixels to 1:5.
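As a rough illustration of the random undersampling step only (SMOTE is not shown), here is a minimal sketch using the standard library. The function name `undersample_majority`, the labels and the toy data are assumptions made for this example; the notebook itself may rely on a library such as imbalanced-learn.

```python
import random


def undersample_majority(samples, labels, minority_label, ratio, seed=0):
    """Randomly drop majority-class samples so that at most `ratio`
    majority samples remain per minority sample."""
    rng = random.Random(seed)
    minority = [i for i, y in enumerate(labels) if y == minority_label]
    majority = [i for i, y in enumerate(labels) if y != minority_label]
    keep = min(len(majority), ratio * len(minority))
    kept = sorted(minority + rng.sample(majority, keep))
    return [samples[i] for i in kept], [labels[i] for i in kept]


# Toy data: 10 vessel pixels (label 1) and 100 background pixels (label 0), i.e. 1:10.
labels = [1] * 10 + [0] * 100
samples = list(range(len(labels)))  # stand-ins for per-pixel feature vectors

new_samples, new_labels = undersample_majority(samples, labels, minority_label=1, ratio=5)
print(new_labels.count(1), new_labels.count(0))  # 10 50
```

All minority samples are kept, so only the majority class shrinks and the ratio moves from 1:10 to 1:5.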
This file contains a neural network built from scratch. The intuition, theory and a little maths are covered in a tutorial-style blog post on my blog site. First a simple single-layer perceptron is coded, then it is generalised to n neurons and layers. It reaches an accuracy of 95.55% on the Iris dataset.
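For readers new to the idea, a single-layer perceptron can be sketched in a few lines. This is a toy version trained on the logical AND function rather than the Iris dataset, and the names `step`, `train_perceptron` and `predict` are illustrative assumptions, not the notebook's actual code.

```python
def step(z):
    """Threshold activation: fire (1) when the weighted sum is non-negative."""
    return 1 if z >= 0 else 0


def predict(weights, bias, x):
    return step(sum(w * xi for w, xi in zip(weights, x)) + bias)


def train_perceptron(inputs, targets, epochs=20, learning_rate=1):
    """Classic perceptron rule: nudge weights by (target - prediction) * input."""
    weights = [0] * len(inputs[0])
    bias = 0
    for _ in range(epochs):
        for x, target in zip(inputs, targets):
            error = target - predict(weights, bias, x)
            weights = [w + learning_rate * error * xi for w, xi in zip(weights, x)]
            bias += learning_rate * error
    return weights, bias


# Logical AND is linearly separable, so the perceptron rule converges on it.
inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
targets = [0, 0, 0, 1]

weights, bias = train_perceptron(inputs, targets)
predictions = [predict(weights, bias, x) for x in inputs]
print(predictions)  # [0, 0, 0, 1]
```

An integer learning rate is used here only to keep the arithmetic exact in the toy example; generalising to several neurons and layers requires a differentiable activation and backpropagation.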
These notebooks collect ideas from blogs, sites and other resources on the internet that got me started in sentiment analysis, along with some of the results I found while playing around with these NLP tools. The inspiration for each notebook is credited as a source inside the notebook. Recreating these notebooks should be helpful for anyone interested. Constructive criticism is also welcome.
I tried to apply what I had learned to sentiment analysis on the IMDB review dataset:
File-1: How different vectorisers affect accuracy
File-2: How the different models in scikit-learn affect the accuracy of predictions
File-3: How ensemble models increase accuracy
File-4: How current methods such as LSTMs and deep learning affect accuracy
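To make the role of a vectoriser concrete, here is a minimal bag-of-words sketch in plain Python. It only mimics the idea behind a count vectoriser; `build_vocabulary`, `count_vectorise` and the two toy reviews are assumptions for illustration, and the notebooks presumably use scikit-learn's vectorisers instead.

```python
from collections import Counter


def build_vocabulary(documents):
    """Map each distinct lower-cased token to a column index (sorted for determinism)."""
    tokens = sorted({tok for doc in documents for tok in doc.lower().split()})
    return {tok: index for index, tok in enumerate(tokens)}


def count_vectorise(document, vocabulary):
    """Turn one document into a vector of token counts; unknown tokens are ignored."""
    counts = Counter(document.lower().split())
    vector = [0] * len(vocabulary)
    for token, count in counts.items():
        if token in vocabulary:
            vector[vocabulary[token]] = count
    return vector


reviews = ["a great great movie", "a dull movie"]
vocabulary = build_vocabulary(reviews)
vectors = [count_vectorise(review, vocabulary) for review in reviews]

print(vocabulary)  # {'a': 0, 'dull': 1, 'great': 2, 'movie': 3}
print(vectors)     # [[1, 0, 2, 1], [1, 1, 0, 1]]
```

Different vectorisers (raw counts, binary presence, TF-IDF weights) change the numbers in these vectors, which is why the choice can affect the accuracy of the downstream classifier.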
The dataset used is here: https://www.kaggle.com/lakshmi25npathi/imdb-dataset-of-50k-movie-reviews
To upload the dataset to Google Colab, see:
https://towardsdatascience.com/3-ways-to-load-csv-files-into-colab-7c14fcbdcb92
Change the runtime to GPU in Colab for better performance.
NPTEL(Applied NLP): https://www.youtube.com/playlist?list=PLH-xYrxjfO2WyR3pOAB006CYMhNt4wTqp
NPTEL(Some theory is good too. :)): https://www.youtube.com/playlist?list=PLJJzI13YAXCHxbVgiFaSI88hj-mRSoMtI
Krish Naik: https://www.youtube.com/playlist?list=PLZoTAELRMXVMdJ5sqbCK2LiM0HhQVWNzm
- Blackadder full episodes: I extracted the scripts of Blackadder from the web to create a dataset for Natural Language Generation. The dataset is available on Kaggle and the tutorial is up on my blog site.
Description: Covers a variety of basic techniques that can be used to predict the next word in a sentence.
- Word_prediction.ipynb: Covers text preprocessing and next-word prediction using bigrams, trigrams and LSTMs.
- Created digit classifiers with the help of simple neural networks, convolutional networks and GANs.
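The bigram approach to next-word prediction mentioned for Word_prediction.ipynb can be sketched as follows. This is a minimal illustration on a made-up toy corpus; `train_bigrams` and `predict_next` are hypothetical names, not functions from the notebook.

```python
from collections import Counter, defaultdict


def train_bigrams(text):
    """Count, for every word, which words follow it in the text."""
    words = text.lower().split()
    following = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        following[current][nxt] += 1
    return following


def predict_next(following, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    candidates = following.get(word.lower())
    if not candidates:
        return None
    return candidates.most_common(1)[0][0]


# Toy corpus, invented for this example.
corpus = "i have a cunning plan . i have a plan . i have no idea"
model = train_bigrams(corpus)

print(predict_next(model, "have"))    # a
print(predict_next(model, "unseen"))  # None
```

A trigram model works the same way but keys on the previous two words, while an LSTM replaces the count table with a learned model of longer contexts.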