The goals / steps of this project are the following:
- Compute the camera calibration matrix and distortion coefficients given a set of chessboard images.
- Apply a distortion correction to raw images.
- Use color transforms, gradients, etc., to create a thresholded binary image.
- Apply a perspective transform to rectify binary image ("birds-eye view").
- Detect lane pixels and fit to find the lane boundary.
- Determine the curvature of the lane and vehicle position with respect to center.
- Warp the detected lane boundaries back onto the original image.
- Output visual display of the lane boundaries and numerical estimation of lane curvature and vehicle position.
Before doing any lane finding, we need to ensure that the image is not distorted. To do so, the camera must be calibrated. For this purpose, a `Camera` class is defined in the file `camera.py`.
To calibrate the camera, just use the following code:

```python
# create camera object
camera = Camera()
# calibrate camera and show result
camera.calibrate(9, 6, True)
```
... where `9` and `6` are the number of inner chessboard corners per row and column in the source images. If you don't specify a source folder with the images for calibration, the default one is used: `camera_cal/calibration*.jpg`
To change the source folder, use the following code before running the calibration:

```python
camera.__set_source_images('camera_cal/calibration*.jpg')
camera.calibrate(9, 6, True)
```
When you specify `show_result=True` while calling the calibration method, the result will be shown.

Once the camera is calibrated, it is ready for undistorting images.
Lane finding is implemented in the `Lane` class defined in the `lane.py` file. All graphics operations are implemented as static methods of a `Graph` class, which is defined in the `graph.py` file.
The `Lane` class keeps a history of previous frames, so for each standalone image please create a new `Lane()` object to clear the history:

```python
lane = Lane(lane_width_m=3.7, lane_length_m=30.)
lane.set_camera(camera)
```

When creating an instance of the `Lane` object, specify the width of the lane in meters and the visible length of the lane (which shall be 30 m). If you are running the pipeline on a video, where each next frame relates to the same street, keep a single `Lane` object.
When you have a `Lane` object, you can use it for lane detection. All the magic happens just by calling the `pipeline` method:

```python
result = lane.pipeline(image)
```

... where `result` is the image after the entire pipeline.
Let's see what happens inside the `pipeline` method.
The first step is to undistort the image (the camera is already calibrated):

```python
# undistort image
undistorted = self.camera.undistort(image)
```
The next step is to find edges in the image:

```python
# detect edges
combined = self.edge_detection(undistorted)
```
The edge detection takes the following steps:
- calculate the grayscaled image:
```python
gray = Graph.to_grayscale(image)
```
- calculate a binary image from the thresholded saturation channel of the original image:
```python
hls_binary = Graph.to_hls(image, 2, thresh=(90, 255))
```
- calculate a binary image from the thresholded red channel of the original image:
```python
rgb_binary = Graph.color_treshold(image, 0, (200, 255))
```
- combine `hls_binary` and `rgb_binary` into one binary image:
```python
combined[((hls_binary == 1) & (rgb_binary == 1))] = 1
```
As a result, the following edges are found:
When the edges are found, we need to warp the central part of the image to a bird's-eye view. The source region for warping is defined by the following image-relative coefficients, which are defined in the `Graph` class in the `graph.py` file:
```
# ```````````````````````````````````````````
# `                 (0.643)                  `
# `      (0.45)_________________(0.55)       `
# `           |                 \            `
# `           |                  \           `
# `           |                   \          `
# `   (0.143) ---------------------- (0.857) `
# `                  (1.0)                   `
# ```````````````````````````````````````````
```
... which corresponds to the following region on the screen:

To warp the perspective, the following code is used:

```python
warped_edges, M = Graph.get_perspective_transform(combined)
```

and the following result is returned:
We need to ensure that we are working on one channel only:

```python
warped_1channel = warped_edges[:, :, 0]
```
All the magic of finding lane pixels happens inside the method `Graph.find_lane_pixels(binary_warped)`.

When we have the perspective warped to a bird's-eye view, we need to calculate a histogram of the bottom half of the image, summing pixel values column by column:

```python
histogram = np.sum(binary_warped[binary_warped.shape[0]//2:, :], axis=0)
```

However, the histogram is calculated only on the first frame. After the first frame, the positions of the left and right lines are kept, and processing of the next frame starts from the same area without recalculating the histogram.
Within the histogram we need to detect the starting positions of both lane lines:

```python
# These will be the starting point for the left and right lines
midpoint = int(histogram.shape[0] // 2)
# initial position of left line on warped image
leftx_base = np.argmax(histogram[:midpoint])
# initial position of right line on warped image
rightx_base = np.argmax(histogram[midpoint:]) + midpoint
```
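The first-frame-only histogram behavior described above can be sketched as a small tracker (an illustration, not the author's exact implementation):

```python
import numpy as np

class LaneBaseTracker:
    """Find lane base positions from a histogram once, then reuse the cached positions."""

    def __init__(self):
        self.left_base = None
        self.right_base = None

    def find_bases(self, binary_warped):
        if self.left_base is None:
            # histogram of the bottom half, summed column-wise (first frame only)
            histogram = np.sum(binary_warped[binary_warped.shape[0] // 2:, :], axis=0)
            midpoint = histogram.shape[0] // 2
            self.left_base = int(np.argmax(histogram[:midpoint]))
            self.right_base = int(np.argmax(histogram[midpoint:]) + midpoint)
        return self.left_base, self.right_base
```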
The next step is to divide each detected line into N parts (I use 15). In each little part we find the central point. How? Just by calculating a histogram of the little part only and finding its peak value:
```python
# find center of the 'lane region'
left_area = binary_warped[win_y_low:win_y_high, win_xleft_low:win_xleft_high]
right_area = binary_warped[win_y_low:win_y_high, win_xright_low:win_xright_high]
l_area_hist = np.sum(left_area, axis=0)
r_area_hist = np.sum(right_area, axis=0)
l_x_index = np.argmax(l_area_hist) + win_xleft_low
r_x_index = np.argmax(r_area_hist) + win_xright_low
y_index = int((win_y_low + win_y_high) / 2)
```
Those central points will be used to fit the lines; as a result we get the following picture.

For a more precise calculation of the central points, the distance between the lines in each little part is kept in a history list:

```python
prev_lane_dist = (r_x_index - l_x_index)
lane_dist_hist.append(prev_lane_dist)
```
When we have the set of central points for each line, we can fit them to a polynomial describing each line:

```python
# fit 2nd order left line
if len(left_points) >= 3:
    left_fit = np.polyfit(left_points[:, 1], left_points[:, 0], deg=2)
# fit 2nd order right line
if len(right_points) >= 3:
    right_fit = np.polyfit(right_points[:, 1], right_points[:, 0], deg=2)
```
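As a self-contained illustration, `np.polyfit` with `deg=2` recovers the coefficients of x = A·y² + B·y + C from the center points; x is fitted as a function of y because the lane lines are near-vertical:

```python
import numpy as np

# sample center points from a known parabola x = 0.001*y**2 + 0.5*y + 100
ploty = np.linspace(0, 700, 15)
plotx = 0.001 * ploty**2 + 0.5 * ploty + 100.0
points = np.column_stack((plotx, ploty))  # (x, y) pairs, like left_points above

# fit x as a 2nd-order polynomial of y
fit = np.polyfit(points[:, 1], points[:, 0], deg=2)

# evaluate the fitted polynomial at every y to draw the line
fitx = fit[0] * ploty**2 + fit[1] * ploty + fit[2]
```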
When all lines are detected and the polynomials are calculated, we need to warp the original undistorted image to the bird's-eye view in the same way as the binary image before:

```python
warped_frame, M = Graph.get_perspective_transform(undistorted)
```

On this newly warped image we need to draw both lines and the lane itself: `lane_drawn = Graph.draw_lanes(warped_frame, left_fitx, right_fitx)`.
```python
# draw both lines
pts = np.array(l_points, np.int32)
cv2.polylines(warped_frame, [pts], False, red_color, thickness)
pts = np.array(r_points, np.int32)
cv2.polylines(warped_frame, [pts], False, blue_color, thickness)
# fill area between
all_points = np.vstack((l_points, np.flipud(r_points)))
pts = np.array(all_points, np.int32)
cv2.fillConvexPoly(warped_frame, pts, green_color)
```
The next step is to warp the perspective back from the bird's-eye view to the driver's view:

```python
lane_reversed, M = Graph.get_perspective_transform(lane_drawn, reverse=True)
```
... and then draw it back on the original undistorted image. This is done by creating a mask and copying the un-warped image into the masked area:

```python
# create mask for merging original image with drawn lane image got from reverse perspective transform
mask = np.expand_dims(((lane_reversed[:,:,0] == 0) & (lane_reversed[:,:,1] == 0) & (lane_reversed[:,:,2] == 0)), axis=2)
lane_reversed = mask * undistorted + (1 - mask) * lane_reversed
lane_reversed = np.array(lane_reversed, dtype=np.uint8)
```
The last step is to calculate the position of the vehicle on the lane and the radius of curvature of the lane. The position on the lane is calculated in the method:

```python
lane_pos = self.__find_position_on_lane(midpoint, leftx_base, rightx_base)
```
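The method body is not shown in this write-up; here is a plausible sketch of such a computation, assuming the 3.7 m lane width passed to the `Lane` constructor (the sign convention in the original code may differ):

```python
def find_position_on_lane(midpoint, leftx_base, rightx_base, lane_width_m=3.7):
    """Offset of the image center from the lane center, in meters.

    Illustrative sketch only: positive means the camera (vehicle) sits to the
    right of the lane center.
    """
    lane_center = (leftx_base + rightx_base) / 2.0
    # pixel-to-meter scale derived from the known real-world lane width
    xm_per_pix = lane_width_m / (rightx_base - leftx_base)
    return (midpoint - lane_center) * xm_per_pix
```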
Finding the radius of curvature is a bit more complicated. For each line on each frame, the radius is calculated using the radius-of-curvature formula for a second-order polynomial, which is implemented in `l_curvature = self.__find_curvature(left_fit)`. The last 50 calculated radius values are kept in a history, and their average is returned. The radius of curvature is adjusted to the real one by multiplying by a constant value that strongly depends on the shape of the original source region used to warp the perspective.
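The standard radius-of-curvature formula for x = A·y² + B·y + C at a point y is R = (1 + (2Ay + B)²)^(3/2) / |2A|. A minimal sketch of it (the original `__find_curvature` and its meter conversion are not shown in this text):

```python
def find_curvature(fit, y_eval):
    """Radius of curvature of x = A*y**2 + B*y + C at y = y_eval (same units as the fit)."""
    A, B = fit[0], fit[1]
    return (1 + (2 * A * y_eval + B) ** 2) ** 1.5 / abs(2 * A)
```

To report the radius in meters, the fit must be done in meter space (by scaling the pixel coordinates with meters-per-pixel factors) or, as noted above, the pixel-space radius can be corrected by a warp-dependent constant.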
The very last step is just to print the calculated values on the frame:

```python
dist_text = 'Radius of curvature = ' + str("%.2f" % curvature) + ' meters'
cv2.putText(result, dist_text, (50, 50), cv2.FONT_HERSHEY_COMPLEX, 1, (255, 255, 255), 3)
dist_text = 'Distance from mid lane = ' + str("%.2f" % lane_pos) + ' meters' + direction_text
cv2.putText(result, dist_text, (50, 100), cv2.FONT_HERSHEY_COMPLEX, 1, (255, 255, 255), 3)
```
The pipeline was run on the original video --> project_video.mp4
The result is here --> result_project_video
The entire pipeline needs a few improvements:
- I have tried several options to tweak the hyperparameters and to use different edge detections (`Sobel x` & `y` as well as gradient `magnitude`). For the time being the current implementation is just OK, but it does not suit all conditions, such as shadows or over-lit areas of the lane in the `challenge` and `harder_challenge` videos. I've implemented a method (included in the source code) for correcting the histogram of the lightness channel in the HLS color space --> `Graph.adjust_brightness(image)`. It could be used to correct lightness and improve the pipeline, but I didn't manage to finish it due to lack of time.
- Another improvement would be to run hyperparameter optimization (checking ranges of values), verify which values work best in certain lighting conditions, and apply different hyperparameters to each frame based on lighting.
- The entire pipeline needs performance improvements and faster calculation methods. This will be crucial when installed in a real car.