Currently, this repository consists of two separate folders of programs:
- Main_Files: houses all the files that make the program run. Different versions are categorised into folders.
- Learning: contains various files of exemplar code relating to OpenCV and Deep Learning practice, allowing for easy reflection. The traffic footage can be found here
Python: install here
PyCharm: this IDE lets you easily pull the Git repository; alternatively, you can simply use it to run the code within a separate project. It can be installed from here
Numpy: after installing Python, run pip install numpy in Command Prompt.
Make sure you have version 1.21.4; check this in the Python console by typing:
import numpy as np
print(np.__version__)
OpenCV: similarly, run pip install opencv-contrib-python
If this doesn't work, use pip install opencv-python instead.
If you struggle with installation, please don't hesitate to email me.
For learning purposes I have structured this document to contain everything I have learned and where it can be seen within the various files. It is split into a number of parts; quick access to these parts can be found here:
Grayscale.py: Creates a 200 x 200 BGR image (OpenCV stores RGB as BGR) with the top half and
bottom half being two separate colours. The image is then converted to grayscale using OpenCV's cvtColor()
function. After converting, the threshold() function from the same library is applied and the output displayed.
The purpose of this program is to test the various types of thresholds and get a better grasp
of how they affect the image. The most common type sets a pixel to the maximum value if its grayscale
value passes the threshold; otherwise it is set to zero or left unchanged. These different
types are explained in more detail within the file.
Image-operations.py: Loads two separate images and joins them together using
various functions from OpenCV's arsenal. The purpose is to gain a better understanding of how masks work and how
the arrays can be manipulated using various tools. Additionally, using the getTickCount()
function, the time taken to perform
these operations was recorded and compared against other tools like NumPy to see which process is ultimately faster. The result
was that OpenCV was consistently faster than NumPy, so where possible prefer OpenCV for image operations.
NOTE: Two image files, messi5.jpg and opencv_logo.png, are used within this program.
imdb_1.py: The initial part of a program that will classify IMDB movie reviews as
either positive or negative. As this is for learning, Keras provides the dataset of reviews and their labels
already sorted into two separate lists: the first contains, for each individual review, the indices of the most
common words used within it; the second contains a label for whether each review was positive or negative (0 or 1).
The dataset holds 25 thousand different reviews plus 10 thousand additional test data and labels.
imdb_1.py shows how to import this dataset and also
explains a very useful technique used with for loops and matrices.
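The for-loop-and-matrix technique referred to is most likely the multi-hot encoding of the word-index lists; a minimal sketch, where the dimension of 10000 matches the usual num_words setting and the small demo sequences are made up:

```python
import numpy as np

def vectorize_sequences(sequences, dimension=10000):
    """Turn lists of word indices into a (num_reviews, dimension) 0/1 matrix."""
    results = np.zeros((len(sequences), dimension))
    for i, sequence in enumerate(sequences):
        # Fancy indexing: sets every column listed in `sequence` to 1 in one step
        results[i, sequence] = 1.0
    return results

# Two toy "reviews" over a tiny 8-word vocabulary
demo = vectorize_sequences([[1, 3, 5], [2, 3]], dimension=8)
```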
imdb_2.py: Continuing from imdb_1.py, this teaches a very useful way to use for loops
with an enumerator, similar to indexed loops in C. It also continues with useful matrix manipulation,
followed by the general process of setting up a model, compiling and validating it within Keras. It uses the
Matplotlib module to visualise how the training and validation of the model went. This specific dataset and training ended
with the model overfitting the data.
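A sketch of the kind of Matplotlib check described, using made-up loss values (a real run would read these from the History object returned by model.fit()). Validation loss rising again while training loss keeps falling is the overfitting signature mentioned above:

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; no display window needed
import matplotlib.pyplot as plt

# Hypothetical per-epoch losses, shaped like an overfitting run
train_loss = [0.50, 0.30, 0.22, 0.17, 0.13, 0.10, 0.08, 0.06]
val_loss   = [0.42, 0.33, 0.29, 0.28, 0.29, 0.31, 0.34, 0.38]

epochs = range(1, len(train_loss) + 1)
plt.plot(epochs, train_loss, "bo-", label="Training loss")
plt.plot(epochs, val_loss, "r^-", label="Validation loss")
plt.xlabel("Epoch")
plt.ylabel("Loss")
plt.legend()
plt.savefig("loss_curves.png")
```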
imdb_3.py: Wraps up the trilogy of files for the IMDB classification task, using the
knowledge acquired from imdb_2.py: reducing the number of epochs to 4 to prevent the model from overfitting, then passing
in new data to predict whether each review is positive or negative. The model is very confident for some reviews,
producing scores of 0.99 or above, or down at 0.1; at other times it is less confident (0.6, 0.4).
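One way to turn sigmoid scores like these into a label plus a confidence value — a hypothetical helper for illustration, not code taken from imdb_3.py:

```python
import numpy as np

def label_predictions(probs, threshold=0.5):
    """Map sigmoid outputs to 0/1 labels and a confidence in the chosen label."""
    probs = np.asarray(probs, dtype=float)
    labels = (probs >= threshold).astype(int)        # 1 = positive review
    # Confidence is the probability assigned to whichever class was chosen
    confidence = np.where(labels == 1, probs, 1.0 - probs)
    return labels, confidence

# The example scores from the text above
labels, conf = label_predictions([0.99, 0.1, 0.6, 0.4])
```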
Ultimately, this was a great fundamental lesson in Deep Learning; however, there were a multitude of
hurdles, both software and hardware based, namely installing CUDA and cuDNN for integration with NVIDIA GPUs. As it stands
on 23/12/21, this might not be a viable approach for Traffic Counting due to the hardware requirements. Further
research into Deep Learning and its uses in Cloud Computing might be the better approach.
opencv_camera_link.py: A very simple program using OpenCV to access
a web-camera and output the webcam's live feed alongside another window showing the grayscale version of the same feed.
It also displays the latency of converting the original feed to grayscale and updating both
windows. Note: the flag cv2.CAP_DSHOW
is used to prevent errors when connecting to the webcam.
basic_counting.py: The first instance of a program that actively counts
traffic, although its process is very basic and has many flaws. This program expects the footage P1060692.MP4 but can
be used on any footage available within the Google Drive.
The program operates in the following manner:
- IMPORTS FOOTAGE (could be replaced by a webcam for real-time processing)
- BACKGROUND SUBTRACTION OF FOOTAGE
- SELECTS REGION OF INTEREST (ROI)
- THRESHOLDS THE IMAGE TO YIELD BINARY FORM
- ERODES AND DILATES BINARY FORM (occurs twice for better results)
- MEDIAN BLURS IMAGE
- FINDS CONTOURS
- BOUNDS WITH RECTANGLE AND TRACKS WITH CENTROID IF ABOVE A CERTAIN SIZE
- COUNTS IF CENTROID CROSSES A CERTAIN LINE
- DISPLAYS LIVE FEED WITH BOUNDING RECTANGLE AND CENTROID (a separate layer with the mask is also shown)