Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Merge pull request #2 from Jayshah6699/main
PR to update forked repo
Showing 4,157 changed files with 1,479,433 additions and 22,879 deletions.
The diff you're trying to view is too large. We only load the first 3000 changed files.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,31 @@ | ||
# Configuration for welcome - https://github.com/behaviorbot/welcome | ||
|
||
# Configuration for new-issue-welcome - https://github.com/behaviorbot/new-issue-welcome | ||
# Comment to be posted on first-time issues | ||
|
||
newIssueWelcomeComment: > | ||
  Hello there!👋 Welcome to the project!🚀⚡ | ||
  Thank you and congrats🎉 for opening your very first issue in this project. The goal of this project is to have in a single place all data science projects with clean datasets amalgamated with high accuracy models to solve real world problems. | ||
  Please adhere to our [Code of Conduct](https://github.com/Jayshah6699/datascience-mashup/blob/main/CODE_OF_CONDUCT.md). | ||
  Please make sure not to start working on the issue, unless you get assigned to it.😄 | ||
# Configuration for new-pr-welcome - https://github.com/behaviorbot/new-pr-welcome | ||
# Comment to be posted on PRs from first-time contributors in your repository | ||
|
||
newPRWelcomeComment: > | ||
  Hello there!👋 Welcome to the project!💖 | ||
  Thank you and congrats🎉 for opening your first pull request. The goal of this project is to have in a single place all data science projects with clean datasets amalgamated with high accuracy models to solve real world problems. | ||
  Please make sure you have followed our [Contributing Guidelines](https://github.com/Jayshah6699/datascience-mashup/blob/main/CONTRIBUTING.md).🙌🙌 We will get back to you as soon as we can 😄. | ||
# Configuration for first-pr-merge - https://github.com/behaviorbot/first-pr-merge | ||
# Comment to be posted on pull requests merged by a first-time user | ||
|
||
firstPRMergeComment: > | ||
  Congrats on merging your first pull request! 🎉 All the best for your amazing open source journey ahead 🚀. | ||
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,9 @@ | ||
# Dataset | ||
|
||
* CSV files for select bitcoin exchanges for the time period of Jan 2012 to December 2020, with minute to minute updates of OHLC (Open, High, Low, Close), Volume in BTC and indicated currency, and weighted bitcoin price. | ||
|
||
* Timestamps are in Unix time. Timestamps without any trades or activity have their data fields filled with NaNs. | ||
|
||
* Link- https://www.kaggle.com/mczielinski/bitcoin-historical-data | ||
|
||
![Bitcoin-Prediction](https://github.com/AmanSingh0-0/datascience-mashup/raw/main/Bitcoin_Prediction/Bitcoin_prediction.png) |
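
The README above says that timestamps are in Unix time and that minutes without trades have NaN in every data field. A minimal sketch of how such rows might be parsed and filtered follows. The sample rows and the column names `Timestamp` and `Close` are assumptions made for illustration; they are not taken from the README, so check them against the actual CSV header.

```python
import csv
import io
import math
from datetime import datetime, timezone

# Hypothetical sample in the shape the README describes.
# Column names are an assumption, not confirmed by the README.
SAMPLE = """Timestamp,Open,High,Low,Close
1325376000,4.39,4.39,4.39,4.39
1325376060,NaN,NaN,NaN,NaN
1325376120,4.40,4.45,4.40,4.42
"""


def load_rows(text):
    """Parse CSV text, convert Unix seconds to UTC datetimes, skip NaN rows."""
    rows = []
    for rec in csv.DictReader(io.StringIO(text)):
        close = float(rec["Close"])  # float("NaN") parses to nan
        if math.isnan(close):
            continue  # no trades in this minute
        when = datetime.fromtimestamp(int(rec["Timestamp"]), tz=timezone.utc)
        rows.append((when, close))
    return rows


rows = load_rows(SAMPLE)
for when, close in rows:
    print(when.isoformat(), close)
```

Dropping the NaN minutes is only one option; forward-filling the last known price is another common choice for time series, and which one suits the prediction task is a modelling decision the README does not settle.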
2,599 changes: 2,599 additions & 0 deletions
Butterfly Classification/Butterfly Classification.ipynb
Large diffs are not rendered by default.
2,599 changes: 2,599 additions & 0 deletions
Butterfly Classification/ButterflyClassification.ipynb
Large diffs are not rendered by default.
Binary file not shown.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
# Dataset | ||
* This dataset contains images and textual descriptions for ten categories (species) of butterflies. | ||
|
||
* The image dataset comprises 832 images in total, with the distribution ranging from 55 to 100 images per category. Images were collected from Google Images by querying with the scientific (Latin) name of the species, for example "Danaus plexippus", and manually filtered for those depicting the butterfly of interest. | ||
|
||
* Link - http://www.josiahwang.com/dataset/leedsbutterfly/leedsbutterfly_dataset_v1.0.zip |
2,599 changes: 2,599 additions & 0 deletions
Butterfly Classification/butterfly_classification.ipynb
Large diffs are not rendered by default.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,7 @@ | ||
### Dataset Details | ||
* The dataset contains transactions made by credit cards in September 2013 by European cardholders. | ||
* This dataset presents transactions that occurred in two days, where we have 492 frauds out of 284,807 transactions. The dataset is highly unbalanced: the positive class (frauds) accounts for 0.172% of all transactions. | ||
|
||
* It contains only numerical input variables which are the result of a PCA transformation. Unfortunately, due to confidentiality issues, we cannot provide the original features and more background information about the data. Features V1, V2, … V28 are the principal components obtained with PCA; the only features which have not been transformed with PCA are 'Time' and 'Amount'. Feature 'Time' contains the seconds elapsed between each transaction and the first transaction in the dataset. The feature 'Amount' is the transaction amount; this feature can be used for example-dependent cost-sensitive learning. Feature 'Class' is the response variable and it takes value 1 in case of fraud and 0 otherwise. | ||
|
||
* Link- https://www.kaggle.com/mlg-ulb/creditcardfraud |
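
The class counts quoted in the README (492 frauds out of 284,807 transactions) make the imbalance easy to quantify. The short sketch below works through that arithmetic, using only those two figures, and also computes the accuracy a trivial model would reach by labelling every transaction as non-fraud. The variable names are illustrative.

```python
# Figures taken from the README above.
FRAUD = 492
TOTAL = 284_807

# Share of the positive class (fraud) among all transactions.
fraud_rate = FRAUD / TOTAL

# Accuracy of a model that always predicts Class == 0 (no fraud).
baseline_accuracy = (TOTAL - FRAUD) / TOTAL

print(f"fraud rate:        {fraud_rate:.4%}")
print(f"baseline accuracy: {baseline_accuracy:.4%}")
```

Because the always-negative baseline already scores above 99.8%, accuracy on its own says little about how well frauds are detected on this dataset.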
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
{"cells":[{"metadata":{"_uuid":"d629ff2d2480ee46fbb7e2d37f6b5fab8052498a","_cell_guid":"79c7e3d0-c299-4dcb-8224-4455121ee9b0","trusted":true},"cell_type":"code","source":"import pandas as pd \nfrom sklearn.model_selection import train_test_split \nfrom sklearn.ensemble import RandomForestClassifier ","execution_count":14,"outputs":[]},{"metadata":{"trusted":true},"cell_type":"code","source":"data = pd.read_csv(\"../input/creditcardfraud/creditcard.csv\") ","execution_count":15,"outputs":[]},{"metadata":{"trusted":true},"cell_type":"code","source":"data.head(5) ","execution_count":16,"outputs":[{"output_type":"execute_result","execution_count":16,"data":{"text/plain":" Time V1 V2 V3 V4 V5 V6 V7 \\\n0 0.0 -1.359807 -0.072781 2.536347 1.378155 -0.338321 0.462388 0.239599 \n1 0.0 1.191857 0.266151 0.166480 0.448154 0.060018 -0.082361 -0.078803 \n2 1.0 -1.358354 -1.340163 1.773209 0.379780 -0.503198 1.800499 0.791461 \n3 1.0 -0.966272 -0.185226 1.792993 -0.863291 -0.010309 1.247203 0.237609 \n4 2.0 -1.158233 0.877737 1.548718 0.403034 -0.407193 0.095921 0.592941 \n\n V8 V9 ... V21 V22 V23 V24 V25 \\\n0 0.098698 0.363787 ... -0.018307 0.277838 -0.110474 0.066928 0.128539 \n1 0.085102 -0.255425 ... -0.225775 -0.638672 0.101288 -0.339846 0.167170 \n2 0.247676 -1.514654 ... 0.247998 0.771679 0.909412 -0.689281 -0.327642 \n3 0.377436 -1.387024 ... -0.108300 0.005274 -0.190321 -1.175575 0.647376 \n4 -0.270533 0.817739 ... 
-0.009431 0.798278 -0.137458 0.141267 -0.206010 \n\n V26 V27 V28 Amount Class \n0 -0.189115 0.133558 -0.021053 149.62 0 \n1 0.125895 -0.008983 0.014724 2.69 0 \n2 -0.139097 -0.055353 -0.059752 378.66 0 \n3 -0.221929 0.062723 0.061458 123.50 0 \n4 0.502292 0.219422 0.215153 69.99 0 \n\n[5 rows x 31 columns]","text/html":"<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>Time</th>\n <th>V1</th>\n <th>V2</th>\n <th>V3</th>\n <th>V4</th>\n <th>V5</th>\n <th>V6</th>\n <th>V7</th>\n <th>V8</th>\n <th>V9</th>\n <th>...</th>\n <th>V21</th>\n <th>V22</th>\n <th>V23</th>\n <th>V24</th>\n <th>V25</th>\n <th>V26</th>\n <th>V27</th>\n <th>V28</th>\n <th>Amount</th>\n <th>Class</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>0</th>\n <td>0.0</td>\n <td>-1.359807</td>\n <td>-0.072781</td>\n <td>2.536347</td>\n <td>1.378155</td>\n <td>-0.338321</td>\n <td>0.462388</td>\n <td>0.239599</td>\n <td>0.098698</td>\n <td>0.363787</td>\n <td>...</td>\n <td>-0.018307</td>\n <td>0.277838</td>\n <td>-0.110474</td>\n <td>0.066928</td>\n <td>0.128539</td>\n <td>-0.189115</td>\n <td>0.133558</td>\n <td>-0.021053</td>\n <td>149.62</td>\n <td>0</td>\n </tr>\n <tr>\n <th>1</th>\n <td>0.0</td>\n <td>1.191857</td>\n <td>0.266151</td>\n <td>0.166480</td>\n <td>0.448154</td>\n <td>0.060018</td>\n <td>-0.082361</td>\n <td>-0.078803</td>\n <td>0.085102</td>\n <td>-0.255425</td>\n <td>...</td>\n <td>-0.225775</td>\n <td>-0.638672</td>\n <td>0.101288</td>\n <td>-0.339846</td>\n <td>0.167170</td>\n <td>0.125895</td>\n <td>-0.008983</td>\n <td>0.014724</td>\n <td>2.69</td>\n <td>0</td>\n </tr>\n <tr>\n <th>2</th>\n <td>1.0</td>\n <td>-1.358354</td>\n <td>-1.340163</td>\n <td>1.773209</td>\n <td>0.379780</td>\n <td>-0.503198</td>\n 
<td>1.800499</td>\n <td>0.791461</td>\n <td>0.247676</td>\n <td>-1.514654</td>\n <td>...</td>\n <td>0.247998</td>\n <td>0.771679</td>\n <td>0.909412</td>\n <td>-0.689281</td>\n <td>-0.327642</td>\n <td>-0.139097</td>\n <td>-0.055353</td>\n <td>-0.059752</td>\n <td>378.66</td>\n <td>0</td>\n </tr>\n <tr>\n <th>3</th>\n <td>1.0</td>\n <td>-0.966272</td>\n <td>-0.185226</td>\n <td>1.792993</td>\n <td>-0.863291</td>\n <td>-0.010309</td>\n <td>1.247203</td>\n <td>0.237609</td>\n <td>0.377436</td>\n <td>-1.387024</td>\n <td>...</td>\n <td>-0.108300</td>\n <td>0.005274</td>\n <td>-0.190321</td>\n <td>-1.175575</td>\n <td>0.647376</td>\n <td>-0.221929</td>\n <td>0.062723</td>\n <td>0.061458</td>\n <td>123.50</td>\n <td>0</td>\n </tr>\n <tr>\n <th>4</th>\n <td>2.0</td>\n <td>-1.158233</td>\n <td>0.877737</td>\n <td>1.548718</td>\n <td>0.403034</td>\n <td>-0.407193</td>\n <td>0.095921</td>\n <td>0.592941</td>\n <td>-0.270533</td>\n <td>0.817739</td>\n <td>...</td>\n <td>-0.009431</td>\n <td>0.798278</td>\n <td>-0.137458</td>\n <td>0.141267</td>\n <td>-0.206010</td>\n <td>0.502292</td>\n <td>0.219422</td>\n <td>0.215153</td>\n <td>69.99</td>\n <td>0</td>\n </tr>\n </tbody>\n</table>\n<p>5 rows × 31 columns</p>\n</div>"},"metadata":{}}]},{"metadata":{"trusted":true},"cell_type":"code","source":"valid = data[data['Class'] == 0]\nprint('Valid Transactions: {}'.format(len(data[data['Class'] == 0])))","execution_count":17,"outputs":[{"output_type":"stream","text":"Valid Transactions: 284315\n","name":"stdout"}]},{"metadata":{"trusted":true},"cell_type":"code","source":"fraud = data[data['Class'] == 1] \nprint('Fraud Cases: {}'.format(len(data[data['Class'] == 1]))) ","execution_count":18,"outputs":[{"output_type":"stream","text":"Fraud Cases: 492\n","name":"stdout"}]},{"metadata":{"trusted":true},"cell_type":"code","source":"X = data.drop(['Class'], axis = 1) \nY = data[\"Class\"] \nx = X.values \ny = 
Y.values","execution_count":21,"outputs":[]},{"metadata":{"trusted":true},"cell_type":"code","source":"X_train, X_test, Y_train, Y_Test = train_test_split(x, y, test_size = 0.25, random_state = 128)","execution_count":26,"outputs":[]},{"metadata":{"trusted":true},"cell_type":"code","source":"model = RandomForestClassifier() \nmodel.fit(X_train, Y_train) ","execution_count":27,"outputs":[{"output_type":"execute_result","execution_count":27,"data":{"text/plain":"RandomForestClassifier()"},"metadata":{}}]},{"metadata":{"trusted":true},"cell_type":"code","source":"Y_Pred = model.predict(X_test)","execution_count":29,"outputs":[]},{"metadata":{"trusted":true},"cell_type":"code","source":"from sklearn.metrics import accuracy_score","execution_count":30,"outputs":[]},{"metadata":{"trusted":true},"cell_type":"code","source":"acc = accuracy_score(Y_Test, Y_Pred) \nprint(\"The accuracy is {}\".format(acc)) ","execution_count":33,"outputs":[{"output_type":"stream","text":"The accuracy is 0.9995646189713772\n","name":"stdout"}]}],"metadata":{"kernelspec":{"name":"python3","display_name":"Python 3","language":"python"},"language_info":{"name":"python","version":"3.7.6","mimetype":"text/x-python","codemirror_mode":{"name":"ipython","version":3},"pygments_lexer":"ipython3","nbconvert_exporter":"python","file_extension":".py"}},"nbformat":4,"nbformat_minor":4} |
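
The notebook above evaluates the random forest with `accuracy_score` only. On a dataset this imbalanced, precision and recall for the fraud class are usually more informative. The sketch below shows how they are derived from the confusion counts, in plain Python so it runs without extra packages. The labels are hypothetical toy data, not output from the notebook, and `precision_recall` is an illustrative helper rather than part of the commit.

```python
def precision_recall(y_true, y_pred, positive=1):
    """Return (precision, recall) for the positive class from two label lists."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical labels for illustration only: 2 frauds among 10 transactions.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0, 0, 1, 1, 0]

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
precision, recall = precision_recall(y_true, y_pred)
print(accuracy, precision, recall)
```

In the notebook itself, the same quantities could be obtained by passing `Y_Test` and `Y_Pred` to `precision_score` and `recall_score` from `sklearn.metrics`. Passing `stratify=y` to `train_test_split` would also keep the fraud ratio the same in the train and test splits.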
1,015 changes: 1,015 additions & 0 deletions
Diabetes-Prediction-master/Diabetes_Prediction.ipynb
Large diffs are not rendered by default.