- Sam Lehman
- John Barczynski
- Kyle Eckenstine
- Mayank Makwana
- Sushrut Shringarputale
- Kyle Johnson
- Tyler Abbatico
Capital One continues to grow as a technology company while providing financial services to customers. We continuously look for opportunities to improve existing technology by introducing new ideas and frameworks. Recently, Capital One has made Machine Learning (ML) a major focus. By investing heavily in ML, Capital One plans to build highly accurate models to predict behavior, helping us become more efficient, accurate, and secure. We are continuously looking for ways to incorporate ML into our ecosystem.
This repository contains the code for our Capstone project, created as part of the CMPSC 483 course at The Pennsylvania State University. Our project focuses on machine learning and on presenting insights in a useful and friendly way. We achieved this through two main sub-projects, implemented using Keras in Python for machine learning, the Java Spring Framework for the API, and React for the frontend.
The first sub-project, a Penn State logo recognizer, gave the project team the opportunity to familiarize themselves with the concepts of machine learning and the frameworks chosen for the project. This tool takes in an image and outputs whether the system thinks the image is a Penn State logo or not.
The second sub-project, a credit card approval classifier tool, applied the skills gained during the logo recognizer task to create a prototype for a system that could potentially be used at companies like Capital One to accept or reject credit card applications. This tool takes in relevant data, such as an applicant's credit score and history, and outputs a decision to accept or reject the applicant.
The deployed version of this project can be viewed at:
The frontend of the system is a React project that uses several common libraries, such as React-Router and React Dropzone. It can be run locally using the Node package manager (npm).
React and all of its dependencies are managed by npm. To install npm on your system, please see the link on npm's website.
All other components are installed by npm:
cd frontend
npm install
Once the dependencies are installed, start the frontend locally:
cd frontend
npm start
The Java API is implemented with the Java Spring Framework and uses Speedment to interface with the database. This backend API receives requests from the React frontend, authenticates users (in progress), handles all database operations, and serves as middleware for all communication with the Python ML models.
Running this project requires Java and the Maven software management tool. These are the only two things that must be explicitly installed in order to run the Java API locally.
All other components are installed by Maven:
cd api
mvn clean install
The Java API relies on seven environment variables that store database and AWS credentials. Please contact a member of the production team for their values. These variables are:
Variable | Description |
---|---|
RDS_DB_NAME_CAP | The name of the relational database used by the Java API |
RDS_USERNAME | The username of the relational database system |
RDS_PASSWORD | The password of the RDS system |
RDS_HOSTNAME | The RDS hostname |
RDS_PORT | The port at which to access the RDS |
AWS_ACCESS_KEY_ID | The Amazon Web Services access key ID |
AWS_SECRET_ACCESS_KEY | The Amazon Web Services secret access key |
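Before starting the server, it can help to confirm that all seven variables are actually set in the current shell. The following is a minimal sketch in Python, not part of the repository: the variable names come from the table above, while `missing_variables` is a hypothetical helper introduced only for illustration.

```python
import os

# The seven variable names listed in the table above.
REQUIRED_VARS = (
    "RDS_DB_NAME_CAP",
    "RDS_USERNAME",
    "RDS_PASSWORD",
    "RDS_HOSTNAME",
    "RDS_PORT",
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
)


def missing_variables(environ=None):
    """Return the required variable names that are unset or empty.

    Hypothetical helper for illustration; it only inspects names and
    never prints or returns the credential values themselves.
    """
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARS if not env.get(name)]
```

Calling `missing_variables()` with no arguments inspects the real environment and returns an empty list when everything is configured. Passing a dictionary instead makes the check easy to try without touching real credentials.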
Once these variables have been set, start the server with Spring Boot:
cd api
mvn spring-boot:start
The Python ML models convert input data (images and credit card information) into boolean answers within some level of confidence, returning that level of confidence to the caller. The interface between the Java API and the Python ML models is handled by a lightweight Flask API.
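To make the "boolean answer plus confidence" idea concrete, here is a minimal sketch of how a model's output probability might be turned into a response for the caller. It is an illustration under stated assumptions, not the repository's code: the function name `to_response`, the 0.5 threshold, and the JSON field names `answer` and `confidence` are all hypothetical.

```python
import json


def to_response(probability, threshold=0.5):
    """Turn a model probability into a boolean answer plus a confidence.

    Hypothetical helper: the threshold and the JSON field names are
    assumptions made for this sketch, not the project's actual contract.
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    answer = probability >= threshold
    # Report confidence in the chosen answer: the probability of the
    # positive class when answering True, its complement otherwise.
    confidence = probability if answer else 1.0 - probability
    return json.dumps({"answer": answer, "confidence": round(confidence, 4)})
```

Reporting the confidence of the chosen answer, rather than the raw probability, lets the caller treat the two fields independently. That is one design option among several.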
All major Python components can be installed using pip, Python's package installer. To prevent future versioning conflicts, it is recommended that you install the project components within a Python virtual environment.
pip install -r ml_backend/ml_backend/requirements.txt
pip install scikit-learn
The Flask CLI requires the name of the Flask app to be provided as an environment variable.
Variable | Description |
---|---|
FLASK_APP | The name of the Flask App that needs to be run |
To run the ML service locally (using pre-trained weights), simply start Flask:
flask run
To train the image model, create a training set in images/training and a validation set in images/validation, then run imageml.py. By default, this overwrites the ML weights currently saved in first_try.h5, so be careful what you commit to the repository.
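Because training replaces the saved weights, one cautious option is to copy the existing file aside before running imageml.py. The sketch below is a hedged example, not part of the repository: `backup_weights` is a hypothetical helper, and it assumes first_try.h5 sits in the directory you run it from unless another path is passed.

```python
import shutil
import time
from pathlib import Path


def backup_weights(weights_path="first_try.h5"):
    """Copy the weights file to a timestamped sibling and return the copy's path.

    Returns None when there is no existing file to back up.
    Hypothetical helper; the default path is an assumption.
    """
    source = Path(weights_path)
    if not source.is_file():
        return None
    stamp = time.strftime("%Y%m%d-%H%M%S")
    # e.g. first_try.h5 becomes first_try.<timestamp>.h5 in the same directory
    target = source.with_name(f"{source.stem}.{stamp}{source.suffix}")
    shutil.copy2(source, target)
    return target
```

The original file is left untouched, so training can proceed as usual. The timestamped copies are local artifacts and are probably best kept out of commits.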