This is a NumPy implementation of a logistic regression pipeline. The accuracy score is 80% and the F1-score is almost 77%. The dataset is skewed: only 24% of the examples are people with an income greater than 50K. That is why the recall for this label is low, at around 44%.
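To illustrate why accuracy alone can look fine on a skewed dataset, here is a small sketch that computes accuracy, precision, recall and F1 for the positive label. The function name `binary_metrics` and the toy labels are my own assumptions for illustration, not code or data from this pipeline. The toy labels only mirror the 24% positive share mentioned above.

```python
import numpy as np

def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for the positive label (1).

    Hypothetical helper for illustration; accepts any array shape and
    flattens it, so (1, n_examples) label rows work as well.
    """
    y_true = np.asarray(y_true).ravel()
    y_pred = np.asarray(y_pred).ravel()

    tp = np.sum((y_true == 1) & (y_pred == 1))  # true positives
    fp = np.sum((y_true == 0) & (y_pred == 1))  # false positives
    fn = np.sum((y_true == 1) & (y_pred == 0))  # false negatives
    tn = np.sum((y_true == 0) & (y_pred == 0))  # true negatives

    accuracy = (tp + tn) / y_true.size
    # Guard against division by zero when a class is never predicted/present.
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) > 0 else 0.0)
    return {"accuracy": float(accuracy), "precision": float(precision),
            "recall": float(recall), "f1": float(f1)}

# Toy labels (made up): 24 positives out of 100 examples.
y_true = np.array([1] * 24 + [0] * 76)
# A useless model that always predicts the majority class.
y_all_negative = np.zeros(100, dtype=int)

metrics = binary_metrics(y_true, y_all_negative)
print(metrics)
```

On these toy labels, a model that never predicts the positive class still reaches 76% accuracy while its recall for the positive label is 0, which is why recall and F1 are worth reporting next to accuracy.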
This pipeline can be used on any dataset once the dataset has been preprocessed. The data must have the shape (no. of features, no. of examples).
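Most tabular data arrives with one example per row, so it usually needs a transpose before it matches the shape above. The sketch below shows one way to do that. The helper name `to_features_by_examples` and the toy values are hypothetical, and the label shape (1, no. of examples) is an assumption on my part that should be checked against what the pipeline actually expects.

```python
import numpy as np

def to_features_by_examples(X_rows, y):
    """Convert row-major data (n_examples, n_features) to the
    (n_features, n_examples) layout.

    Labels are reshaped to (1, n_examples); that label shape is an
    assumption, not something confirmed by the pipeline itself.
    """
    X_rows = np.asarray(X_rows, dtype=float)
    y = np.asarray(y).reshape(1, -1)
    if X_rows.shape[0] != y.shape[1]:
        raise ValueError("X and y disagree on the number of examples")
    return X_rows.T, y

# Made-up toy data: 3 examples with 2 features each.
X_rows = np.array([[39, 13], [50, 13], [38, 9]])
y = np.array([0, 1, 0])

X, Y = to_features_by_examples(X_rows, y)
print(X.shape, Y.shape)  # (2, 3) (1, 3)
```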
I have tried many things to get rid of the skewness of the dataset, but without success. If anyone can help make this better, I will be grateful.
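One commonly used option for skewed labels is to weight the loss so that the minority class counts for more during training. The sketch below is not part of this pipeline and has not been verified on this dataset. All names (`train_weighted_logreg`, `predict`, the learning rate, the iteration count) and the synthetic data are assumptions for illustration. It uses the (no. of features, no. of examples) layout and assumes 0/1 labels of shape (1, no. of examples) with both classes present.

```python
import numpy as np

def sigmoid(z):
    # Clip to avoid overflow warnings in exp for extreme inputs.
    return 1.0 / (1.0 + np.exp(-np.clip(z, -500, 500)))

def train_weighted_logreg(X, Y, lr=0.1, n_iters=1000):
    """Gradient descent on a class-weighted cross-entropy loss.

    X: (n_features, n_examples), Y: (1, n_examples) with 0/1 labels.
    Assumes both classes occur at least once in Y.
    """
    n_features, m = X.shape
    w = np.zeros((n_features, 1))
    b = 0.0

    # "Balanced" weights: each class contributes half of the total weight.
    n_pos = Y.sum()
    n_neg = m - n_pos
    sample_w = np.where(Y == 1, m / (2.0 * n_pos), m / (2.0 * n_neg))

    for _ in range(n_iters):
        A = sigmoid(w.T @ X + b)      # predicted probabilities, (1, m)
        dZ = sample_w * (A - Y)       # weighted gradient w.r.t. the logits
        w -= lr * (X @ dZ.T) / m
        b -= lr * dZ.sum() / m
    return w, b

def predict(X, w, b, threshold=0.5):
    # Lowering the threshold is another way to trade precision for recall.
    return (sigmoid(w.T @ X + b) >= threshold).astype(int)

# Synthetic, made-up data: 300 negatives and 100 positives, 2 features.
rng = np.random.default_rng(0)
X_neg = rng.normal(0.0, 1.0, size=(2, 300))
X_pos = rng.normal(1.5, 1.0, size=(2, 100))
X = np.hstack([X_neg, X_pos])
Y = np.hstack([np.zeros((1, 300)), np.ones((1, 100))])

w, b = train_weighted_logreg(X, Y)
preds = predict(X, w, b)
print(preds.shape, int(preds.sum()))
```

Class weighting typically raises recall on the minority label at the cost of some precision, so it is worth comparing recall, precision and F1 before and after rather than accuracy alone.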