|
10 | 10 | In the study of objects within our solar system, there have been many attempts to classify groups of objects in order to estimate their properties. However, classical approaches can miss the subtle correlations that machine learning techniques are well suited to capture. This study aims to improve the prediction of asteroid features using machine learning algorithms. We use a dataset provided by the Jet Propulsion Laboratory of the California Institute of Technology and apply various regression techniques to achieve higher accuracy and lower error rates in feature prediction. The dataset comprises 31 features for 839,714 objects, including their names, semi-major axis, eccentricity, inclination, orbital period, diameter, and other orbital and physical parameters. Our project focuses on feature engineering and on linear and polynomial regression models. Additionally, we use clustering algorithms to attempt to classify asteroids. Our findings contribute to the growing intersection between machine learning and astronomy, with potential applications in space warning systems.
|
11 | 11 |
|
12 | 12 | ## Milestone 2:
|
13 |
| -### Data Preprocessing |
| 13 | +### Overview |
| 14 | +This project analyzes a dataset with 839,714 observations and 31 features. The analysis includes data cleaning, encoding, and visualization to understand correlations and distributions. |
| 15 | + |
| 16 | +### Data Description |
| 17 | +Our data consists of: |
| 18 | +<table> |
| 19 | + <tr> |
| 20 | + <th>Feature Name</th> |
| 21 | + <th>Description</th> |
| 22 | + </tr> |
| 23 | + <tr> |
| 24 | + <td>full_name</td> |
| 25 | + <td>Full Name of Body: Contains full unique name of the body.</td> |
| 26 | + </tr> |
| 27 | + <tr> |
| 28 | + <td>a</td> |
| 29 | + <td>Semi-Major Axis (Unit - au): The average distance between the object and the Sun, measured in astronomical units (au).</td> |
| 30 | + </tr> |
| 31 | + <tr> |
| 32 | + <td>e</td> |
| 33 | + <td>Eccentricity: Describes the shape of the object's orbit, with values ranging from 0 (circular) to close to 1 (highly elliptical).</td> |
| 34 | + </tr> |
| 35 | + <tr> |
| 36 | + <td>G</td> |
| 37 | + <td>Magnitude Slope Parameter: Factor in determining the brightness variation of the object, reflecting how its brightness changes with phase angle.</td> |
| 38 | + </tr> |
| 39 | + <tr> |
| 40 | + <td>i</td> |
| 41 | + <td>Inclination (Unit - deg): Angle of the object's orbital plane relative to the ecliptic plane, measured in degrees.</td> |
| 42 | + </tr> |
| 43 | + <tr> |
| 44 | + <td>om</td> |
| 45 | + <td>Longitude of the Ascending Node (Unit - deg): Angle from the reference direction (usually the vernal equinox) to the point where the object's orbit crosses the ecliptic plane from south to north, measured in degrees.</td> |
| 46 | + </tr> |
| 47 | + <tr> |
| 48 | + <td>w</td> |
| 49 | + <td>Argument of Perihelion: Angle between the ascending node and the point of closest approach to the Sun (perihelion).</td> |
| 50 | + </tr> |
| 51 | + <tr> |
| 52 | + <td>q</td> |
| 53 | + <td>Perihelion Distance (Unit - au): Shortest distance between the object and the Sun during its orbit, measured in astronomical units (au).</td> |
| 54 | + </tr> |
| 55 | + <tr> |
| 56 | + <td>ad</td> |
| 57 | + <td>Aphelion Distance (Unit - au): Farthest distance between the object and the Sun during its orbit, measured in astronomical units (au).</td> |
| 58 | + </tr> |
| 59 | + <tr> |
| 60 | + <td>per_y</td> |
| 61 | + <td>Orbital Period: Time taken for the object to complete one full orbit around the Sun, measured in years.</td> |
| 62 | + </tr> |
| 63 | + <tr> |
| 64 | + <td>data_arc</td> |
| 65 | + <td>Data Arc-Span (Unit - Days): Duration over which observations of the object have been collected, measured in days.</td> |
| 66 | + </tr> |
| 67 | + <tr> |
| 68 | + <td>condition_code</td> |
| 69 | + <td>Orbit Condition Code: Numerical code indicating the quality and reliability of the object's orbital data, with 0 being the most reliable.</td> |
| 70 | + </tr> |
| 71 | + <tr> |
| 72 | + <td>n_obs_used</td> |
| 73 | + <td>Number of Observations Used: Total number of observations used to determine the object's orbital parameters.</td> |
| 74 | + </tr> |
| 75 | + <tr> |
| 76 | + <td>H</td> |
| 77 | + <td>Absolute Magnitude Parameter: Measure of the object's intrinsic brightness, indicating its size and reflectivity.</td> |
| 78 | + </tr> |
| 79 | + <tr> |
| 80 | + <td>diameter</td> |
| 81 | + <td>Diameter of Asteroid (Unit - Km): Physical size of the asteroid, measured in kilometers (km).</td> |
| 82 | + </tr> |
| 83 | + <tr> |
| 84 | + <td>extent</td> |
| 85 | + <td>Object Bi/Tri-Axial Ellipsoid Dimensions (Unit - Km): Dimensions describing the shape and size of the object in terms of its three principal axes, measured in kilometers (km).</td> |
| 86 | + </tr> |
| 87 | + <tr> |
| 88 | + <td>albedo</td> |
| 89 | + <td>Geometric Albedo: Reflectivity of the object's surface, indicating the proportion of sunlight it reflects.</td> |
| 90 | + </tr> |
| 91 | + <tr> |
| 92 | + <td>rot_per</td> |
| 93 | + <td>Rotation Period (Unit - Hours): Time taken for the object to complete one full rotation on its axis, measured in hours.</td> |
| 94 | + </tr> |
| 95 | + <tr> |
| 96 | + <td>GM</td> |
| 97 | + <td>Standard Gravitational Parameter: Product of the gravitational constant and the object's mass, used in gravitational calculations.</td> |
| 98 | + </tr> |
| 99 | + <tr> |
| 100 | + <td>BV</td> |
| 101 | + <td>Color Index B-V Magnitude Difference: Difference in brightness between the object in the B (blue) and V (visual) photometric bands, indicating its color.</td> |
| 102 | + </tr> |
| 103 | + <tr> |
| 104 | + <td>UB</td> |
| 105 | + <td>Color Index U-B Magnitude Difference: Difference in brightness between the object in the U (ultraviolet) and B (blue) photometric bands, providing spectral information.</td> |
| 106 | + </tr> |
| 107 | + <tr> |
| 108 | + <td>IR</td> |
| 109 | + <td>Color Index I-R Magnitude Difference: Difference in brightness between the object in the I (infrared) and R (red) photometric bands, indicating its color at longer wavelengths.</td> |
| 110 | + </tr> |
| 111 | + <tr> |
| 112 | + <td>spec_B</td> |
| 113 | + <td>Spectral Taxonomic Type (Unit - SMASSII): Spectral classification of the object based on the SMASSII scheme, indicating its mineral composition and surface features.</td> |
| 114 | + </tr> |
| 115 | + <tr> |
| 116 | + <td>spec_T</td> |
| 117 | + <td>Spectral Taxonomic Type (Unit - Tholen): Spectral classification of the object based on the Tholen system, indicating its spectral characteristics, composition, and origin.</td> |
| 118 | + </tr> |
| 119 | + <tr> |
| 120 | + <td>neo</td> |
| 121 | + <td>Near Earth Object: Indicates whether the object is classified as a Near Earth Object (NEO), meaning its orbit brings it close to Earth's orbit.</td> |
| 122 | + </tr> |
| 123 | + <tr> |
| 124 | + <td>pha</td> |
| 125 | + <td>Potentially Hazardous Asteroid: Identifies whether the object is classified as a Potentially Hazardous Asteroid (PHA), posing a potential threat to Earth.</td> |
| 126 | + </tr> |
| 127 | + <tr> |
| 128 | + <td>moid</td> |
| 129 | + <td>Earth Minimum Orbit Intersection Distance (Unit - au): Smallest distance between the object's orbit and Earth's orbit, measured in astronomical units (au), indicating potential close encounters.</td> |
| 130 | + </tr> |
| 131 | + <tr> |
| 132 | + <td>class</td> |
| 133 | + <td>Class of Asteroid: Orbit classification of the object (for example, MBA for main-belt asteroid or APO for Apollo), as defined in the JPL Small-Body Database.</td> |
| 134 | + </tr> |
| 135 | + <tr> |
| 136 | + <td>n</td> |
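| 137 | + <td>Mean Motion (Unit - deg/day): Average angular rate at which the object moves along its orbit, measured in degrees per day.</td> |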
| 138 | + </tr> |
| 139 | + <tr> |
| 140 | + <td>per</td> |
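| 141 | + <td>Orbital Period (Unit - Days): Time taken for the object to complete one full orbit around the Sun, measured in days (per_y gives the same quantity in years).</td> |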
| 142 | + </tr> |
| 143 | + <tr> |
| 144 | + <td>ma</td> |
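| 145 | + <td>Mean Anomaly (Unit - deg): Angle describing the object's position along its orbit at the reference epoch, measured in degrees.</td> |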
| 146 | + </tr> |
| 147 | +</table> |
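
Several of the orbital features above are not independent of one another. The sketch below is a minimal illustration, assuming a bound heliocentric orbit and neglecting the object's mass, of how `q`, `ad`, `per_y`, `per`, and `n` follow from `a` and `e`. The function name `derived_orbital_elements` and the input values are hypothetical and are not taken from the project code or the dataset.

```python
def derived_orbital_elements(a_au: float, e: float) -> dict:
    """Derive q, ad, per_y, per and n from a and e (illustrative helper)."""
    if a_au <= 0 or not (0 <= e < 1):
        raise ValueError("expects a bound elliptical orbit: a > 0 and 0 <= e < 1")

    # Kepler's third law for orbits around the Sun: P[years] = a[au] ** 1.5
    per_y = a_au ** 1.5
    # Convert to days using the Julian year of 365.25 days
    per = per_y * 365.25

    return {
        "q": a_au * (1 - e),   # perihelion distance (au)
        "ad": a_au * (1 + e),  # aphelion distance (au)
        "per_y": per_y,        # orbital period (years)
        "per": per,            # orbital period (days)
        "n": 360.0 / per,      # mean motion (degrees per day)
    }


# Made-up inputs resembling a main-belt orbit, for illustration only
elements = derived_orbital_elements(a_au=2.77, e=0.08)
for name, value in elements.items():
    print(f"{name}: {value:.4f}")
```

Because these columns are deterministic functions of `a` and `e`, they can be expected to be strongly correlated, which is worth keeping in mind when selecting regression features.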
| 148 | + |
| 149 | +### Data Distribution |
| 150 | +We used `data.describe()` to get summary statistics for each numeric feature: |
| 151 | + |
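
As a rough sketch of what this call produces, the example below runs `describe()` on a tiny made-up table. It assumes `data` is a pandas `DataFrame`; the three columns and their values are placeholders, not rows from the actual dataset.

```python
import pandas as pd

# Hypothetical miniature stand-in for the asteroid table (values are made up)
data = pd.DataFrame({
    "a": [2.77, 2.36, 3.14, 1.46],
    "e": [0.08, 0.09, 0.19, 0.22],
    "diameter": [939.4, 525.4, None, 16.8],  # None becomes NaN (missing)
})

# describe() reports count, mean, std, min, quartiles and max per numeric column;
# missing values are excluded from each column's statistics
summary = data.describe()
print(summary)
```

The `count` row is a quick way to spot sparsely populated features, since it only counts non-missing values.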
14 | 152 |
|