Job-Prediction-Data-Mining-Analysis-and-ML

The attached analysis report contains:

  1. Information of the Dataset
      a. Data type, count, and non-null information for each column
      b. Unique values in each non-numeric column
      c. Heatmap
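
The dataset checks in item 1 could be sketched with pandas roughly as follows. The toy DataFrame and its column names (`percentage_math`, `percentage_os`, `interested_subject`, `suggested_job_role`) are hypothetical stand-ins, not the repository's actual data. The correlation matrix is only the numeric input a heatmap would be drawn from; the plotting call itself is left out.

```python
import pandas as pd

# Hypothetical toy data; the real dataset and its column names are assumptions.
df = pd.DataFrame({
    "percentage_math": [72, 85, 64, 90],
    "percentage_os": [68, 91, 70, 88],
    "interested_subject": ["networks", "programming", "networks", "data"],
    "suggested_job_role": ["Network Engineer", "Developer",
                           "Network Engineer", "Data Analyst"],
})

# 1a. Data type and non-null count for each column.
summary = pd.DataFrame({
    "dtype": df.dtypes.astype(str),
    "non_null": df.notna().sum(),
})

# 1b. Unique values in each non-numeric column.
non_numeric = df.select_dtypes(exclude="number").columns
unique_values = {col: sorted(df[col].unique()) for col in non_numeric}

# 1c. Correlation matrix of the numeric columns (the data behind a heatmap).
corr = df.select_dtypes(include="number").corr()

print(summary)
print(unique_values)
print(corr)
```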

  2. Data Preprocessing and Analysis
      a. Handling missing values, if any
      b. Checking that the percentage values lie between 0 and 100
      c. Label Encoding the Categorical/Non-Numeric Values
      d. Feature Scaling the Numeric Values
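
A minimal sketch of the four preprocessing steps in item 2, assuming pandas and scikit-learn. The column names, the median fill, and the choice to drop out-of-range rows are illustrative assumptions; the report may handle these cases differently.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder, StandardScaler

# Hypothetical toy data with one missing value and one out-of-range percentage.
df = pd.DataFrame({
    "percentage_math": [72.0, None, 64.0, 130.0],
    "hours_per_day": [6.0, 8.0, 5.0, 9.0],
    "interested_subject": ["networks", "programming", "networks", "data"],
})

# 2a. Fill missing numeric values with the column median (one possible strategy).
numeric_cols = df.select_dtypes(include="number").columns
df[numeric_cols] = df[numeric_cols].fillna(df[numeric_cols].median())

# 2b. Flag rows where any percentage column falls outside [0, 100].
pct_cols = [c for c in df.columns if c.startswith("percentage")]
out_of_range = ~df[pct_cols].apply(lambda s: s.between(0, 100)).all(axis=1)
# Assumption: invalid rows are simply dropped.
df = df[~out_of_range].reset_index(drop=True)

# 2c. Label-encode each non-numeric column as integer codes.
for col in df.select_dtypes(exclude="number").columns:
    df[col] = LabelEncoder().fit_transform(df[col])

# 2d. Standardize the originally numeric columns to zero mean, unit variance.
df[numeric_cols] = StandardScaler().fit_transform(df[numeric_cols])

print(df)
```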

  3. Combining the Output Labels
      a. Manually Combining the labels
      b. Combining Features using Cosine Similarity between labels
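
One way to read the cosine-similarity technique in item 3b is sketched below, using NumPy only. It assumes each label is represented by the mean feature vector of its samples and that labels are merged when their similarity reaches a threshold. The data, the label names, the profile definition, and the `threshold` value are all assumptions for illustration, not the report's actual procedure.

```python
import numpy as np

# Hypothetical scaled features (rows = samples) and their original labels.
X = np.array([
    [1.0, 0.1, 0.0],
    [0.9, 0.2, 0.1],
    [1.0, 0.0, 0.1],
    [0.0, 1.0, 0.9],
    [0.1, 0.9, 1.0],
])
y = np.array(["Web Developer", "Web Developer", "Software Developer",
              "Network Engineer", "Network Admin"])

labels = sorted(set(y.tolist()))

# One profile vector per label: the mean of that label's samples (assumption).
profiles = np.array([X[y == lab].mean(axis=0) for lab in labels])

# Pairwise cosine similarity between label profiles.
unit = profiles / np.linalg.norm(profiles, axis=1, keepdims=True)
sim = unit @ unit.T

# Merge labels whose similarity reaches the (assumed) threshold, via union-find.
threshold = 0.95
parent = list(range(len(labels)))

def find(i):
    """Follow parent links to the representative of i's group."""
    while parent[i] != i:
        i = parent[i]
    return i

for i in range(len(labels)):
    for j in range(i + 1, len(labels)):
        if sim[i, j] >= threshold:
            parent[find(j)] = find(i)

# Map every original label to its group's representative label.
group_of = {lab: labels[find(i)] for i, lab in enumerate(labels)}
y_combined = np.array([group_of[lab] for lab in y])

print(group_of)
```

With this toy data the two developer labels collapse into one group and the two network labels into another, so the classifier would be trained on two output classes instead of four.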

  4. Feature Selection

  5. Model Evaluation/Experiments and Analysis
      a. ANN Model on both types of Output Clubbing Techniques and Varying Test Sizes
        i. Output Columns Clustering Technique 1 - Manually with Varying Test Size [0.1,0.2,0.3,0.4,0.5]
        ii. Output Columns Clustering Technique 2 - Cosine Similarity with Varying Test Size [0.1,0.2,0.3,0.4,0.5]
      b. ANN Model on both types of Output Clubbing Techniques and Varying Hidden Layers and Neurons
        i. Output Columns Clustering Technique 1 - Manually with varying neurons and layers of the hidden layer: hidden_layers=[(50),(50,50),(25,25),(50,25),(50,50,50),(50,50,25),(50,25,25),(50,50,25,25),(50,50,50,25),(50,50,50,50)]
      c. ANN Model on Feature Selection Data on both the techniques: