bankruptcy-data-exp

The dataset was provided by Sebastian Tomczak and collected from the Emerging Markets Information Service (EMIS): https://archive.ics.uci.edu/ml/datasets/Polish+companies+bankruptcy+data

STARTER BELOW

The dataset is about bankruptcy prediction for Polish companies. EMIS is a database of information on emerging markets around the world; the companies in these datasets are Polish. Each dataset is composed of thousands of rows, where each row corresponds to a company. The attributes of these companies are described in the data/description.txt file. Here is a sample of what a dataset contains:

bankrupt
0	0.034279	0.42448	-0.075832	0.67532	-77.334	-0.01497	0.044048	1.3558	1.1287	0.57552	0.044048	0.1886	0.11021	0.044048	2069.8	0.17635	2.3558	0.044048	0.064853	22.179	1.0305	0.077574	0.050469	-0.016044	0.57552	0.15333	1.2892	-0.090033	5.1839	0.61859	0.064853	141.67	2.5764	0.18275	0.077574	0.67974	0.60997	0.76644	0.11421	0.04225	0.12876	0.11421	79.459	57.28	0.83056	0.49861	25.035	0.046766	0.068854	0.37158	0.23356	0.38815	0.6833	0.90997	-11581.0	0.11406	0.059561	0.88594	0.33173	16.457	6.3722	125.51	2.908	0.80639
0	0.096308	0.50574	0.48163	1.9523	229.04	0	0.096308	0.97731	3.7981	0.49426	0.15378	0.19043	0.42351	0.096308	114.76	3.1806	1.9773	0.096308	0.025357	6.514	0.60105	0	0.025357	0.32281	0.45095	3.1806	0	38.13	3.0624	0.026525	0.059985	85.534	4.2673	4.2673	0.0045052	3.7981	?	0.49426	0.0011862	0.80652	0.011148	0	55.688	49.174	1.4208	1.8183	11.464	-1.5122	-0.39815	1.9523	0.50574	0.23434	39.13	39.13	556.01	0.43179	0.19485	0.58486	0	56.033	7.4227	48.601	7.5101	300.69
0	-0.20902	1.2022	-0.2562	0.053378	-108.75	-0.38107	-0.20902	-0.16822	0.82685	-0.20224	-0.14916	-0.77232	-0.098138	-0.20902	-5407.8	-0.067495	0.83178	-0.20902	-0.25279	0	?	-0.14916	-0.25279	-0.20902	-0.59009	-0.067495	-2.4917	-0.25995	2.8692	1.454	-0.1804	101.21	3.6062	0.81183	-0.14916	0.82685	0.015507	0.72936	-0.1804	9.9866e-006	-1.883	-0.1804	6.376	6.376	?	0.053378	0	-0.27704	-0.33505	0.012016	0.27064	0.2773	-0.20521	0.74005	-189.58	-0.1804	1.0335	1.2528	-4.6064	?	57.246	119.47	3.0551	0.83897
0	0.20097	0.19291	0.23709	2.229	93.472	0	0.20097	4.1836	2.8936	0.80709	0.20097	1.0418	0.41244	0.20097	59.001	6.1864	5.1836	0.20097	0.069453	9.6467	0.85696	0	0.069453	0.51819	0.76878	6.1864	?	0.41595	3.7389	0.040859	0.15536	43.706	8.3512	8.3512	0.0066855	2.8936	?	0.80709	0.0023105	0.38714	0.0064793	0	44.82	35.174	2.6279	1.8326	17.326	-0.99247	-0.34299	2.229	0.19291	0.11974	1.416	1.416	1299.7	0.44323	0.24901	0.55688	0	37.837	10.377	24.334	14.999	5.0765
0	-0.11132	0.64559	0.0041018	1.0071	-38.084	0	-0.11132	0.54897	2.5568	0.35441	-0.026645	-0.19222	-0.022736	-0.11132	-4053.6	-0.090045	1.549	-0.11132	-0.043539	37.421	1.0872	-0.09603	-0.043539	-0.1175	-0.085977	-0.090045	-1.1341	0.0098424	3.3675	0.25123	-0.033709	81.987	4.4519	3.9938	-0.021489	2.5568	7.0439	0.4	-0.0084045	0.021281	-0.50233	-0.037558	81.502	44.081	-0.42467	0.55446	37.109	-0.14922	-0.058361	0.90344	0.57915	0.22462	0.85041	0.9598	9.56	-0.0084045	-0.31411	1.042	0.12863	9.7539	8.2802	82.676	4.4148	6.1352
1	-0.40937	0.58325	0.20188	1.3461	-0.7769	0.0	-0.40937	0.71453	9.8193	0.41675	-0.25112	-0.70189	-0.024009	-0.40937	-903.02	-0.4042	1.7145	-0.40937	-0.041691	2.7925	?	-0.37487	-0.041691	-0.40937	-0.20825	-0.4042	-2.3689	0.94005	1.9031	0.016142	-0.041691	20.883	17.478	17.478	-0.37487	9.943	?	0.41675	-0.038178	0.98264	-0.096605	-0.038178	7.8804	5.0879	-5.4493	1.1114	2.6898	-0.5485	-0.05586	1.3461	0.58325	0.057214	1.9406	1.9406	16.15	-0.03819	-0.9823	1.0253	0.0	130.71	71.739	21.681	16.835	45.724
1	-0.19899	0.42164	0.57836	2.3717	50.094	-0.20152	-0.19899	1.3717	3.9931	0.57836	-0.19797	-0.47193	-0.041069	-0.19899	-938.46	-0.38893	2.3717	-0.19899	-0.049833	0	?	-0.19831	-0.049833	-0.19899	-0.38529	-0.38893	-195.5	?	1.772	-0.054405	-0.049833	36.719	9.9407	9.9407	-0.19831	3.9932	?	0.57836	-0.049663	1.5152	-0.086059	-0.049663	33.009	33.009	?	1.5152	0	-0.23331	-0.058428	2.3717	0.42164	0.1006	?	?	34.21	-0.049621	-0.34405	1.0496	0.0	?	11.058	38.541	9.4703	?
1	0.14806	0.83471	-0.050636	0.8059	-43.448	-0.34617	0.16452	0.19804	0.82432	0.16531	0.22872	0.63064	0.26792	0.16452	1379.5	0.26459	1.198	0.16452	0.19958	25.982	?	0.2287	0.17961	0.16452	-0.19827	0.24487	3.5621	-0.064112	3.5149	0.95298	0.20139	94.239	3.8957	1.2175	-0.18627	1.2451	0.28541	0.69632	-0.22597	0.21346	0.097614	0.27744	68.433	42.451	2.5232	0.43839	21.074	0.17236	0.2091	0.25187	0.26087	0.25669	0.2093	0.88164	-165.73	-0.22572	0.89565	0.81625	3.2123	14.048	8.5981	115.51	3.1599	1.0437
1	-0.092469	0.8223	-0.18051	0.75948	-200.23	0.0	-0.096245	0.2161	1.1797	0.1777	-0.059312	-0.12824	-0.046759	-0.096245	-5441.4	-0.067079	1.2161	-0.096245	-0.081588	126.65	0.98898	-0.07112	-0.078387	0.91179	0.088247	-0.062487	-1.9257	-0.41978	4.4644	0.68549	-0.058165	258.64	1.4744	1.3456	-0.079529	1.1797	5.0715	0.20938	-0.067417	0.021861	-0.91265	-0.060289	171.28	44.637	-0.22591	0.21409	135.02	-0.11221	-0.095118	0.69316	0.7505	0.67826	0.41323	0.48691	-5259	0.10219	-0.52038	0.9693	0.17829	2.882	8.177	232.21	1.5718	2.7433
1	-0.006009	0.87154	-0.30285	0.65251	-20.725	-2.1532	-0.006009	0.14746	5.5023	0.12852	0.12882	-0.0068951	0.018794	-0.006009	3076.2	0.11865	1.1474	-0.006009	-0.0010922	0.0078939	?	0.12882	-0.0010922	-0.006009	-2.1592	0.11865	0.95543	-0.70217	2.2255	0.10544	-0.0010381	57.891	6.305	6.305	0.007199	5.6238	?	0.12852	0.0013084	0.34244	0.12194	0.023411	17.927	17.919	-50.5	0.34257	0.0079043	0.019397	0.0035252	0.65251	0.87154	0.15861	0.29797	0.29797	-50.9	0.0013192	-0.046759	0.97709	0.0	46239	20.369	57.815	6.3133	12.757
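A row like the ones in the sample above can be read with the Python standard library alone. The following is a minimal sketch, not the project's own loader: it assumes rows are tab-separated, that the first field is the `bankrupt` label (taken here to mean 0 = not bankrupt, 1 = bankrupt), and that "?" marks a missing value. The function name `parse_row` and the shortened example row are hypothetical.

```python
def parse_row(line):
    """Split one tab-separated row into (label, features).

    Assumption: the first field is the 0/1 bankrupt label and every
    other field is either a number or the "?" missing-value marker.
    """
    fields = line.rstrip("\n").split("\t")
    label = int(fields[0])
    # Missing values ("?") become None; everything else becomes a float.
    features = [None if f == "?" else float(f) for f in fields[1:]]
    return label, features


# Shortened, hypothetical row (a real row carries 64 attribute values).
label, features = parse_row("1\t-0.40937\t0.58325\t?\t9.9866e-006")
print(label, features)
```

Keeping missing values as `None` rather than silently replacing them with 0 leaves the choice of an imputation strategy to a later step.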

This is a supervised learning problem: a classification task on labelled data, where the label indicates whether the company went bankrupt or not.

The goal of this project is to explore, analyse and visualise the data before applying machine learning algorithms to classify each company's situation as accurately as possible.

In total, this dataset contains the status of more than 43,000 companies, unevenly distributed over 5 years. The columns represent the 64 variables we will use to predict bankruptcy. Many of these variables are built from a few underlying quantities, such as total assets, total liabilities and net profit, which recur across them; we can use these quantities directly to include fewer variables in our model without loss of information. There are also many missing values, represented by a "?", in many fields.
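To get a feel for how widespread the "?" placeholders are, one option is to count them per attribute position before choosing how to handle them. The sketch below uses only the standard library and a few made-up rows; `missing_counts` and `toy_rows` are hypothetical names, and the tab-separated layout with the label in the first column is an assumption based on the sample shown earlier.

```python
from collections import Counter


def missing_counts(rows):
    """Count "?" placeholders per attribute position.

    Positions are 1-based and skip the label column, so position 1 is
    the first attribute after the bankrupt flag.
    """
    counts = Counter()
    for row in rows:
        fields = row.rstrip("\n").split("\t")
        for position, value in enumerate(fields[1:], start=1):
            if value.strip() == "?":
                counts[position] += 1
    return counts


# Made-up rows for illustration only (real rows have 64 attributes).
toy_rows = [
    "0\t0.03\t?\t1.35",
    "1\t?\t?\t0.71",
    "0\t0.20\t0.19\t4.18",
]
counts = missing_counts(toy_rows)
for position in sorted(counts):
    print(f"attribute {position}: {counts[position]} missing of {len(toy_rows)}")
```

Attributes with a high share of missing values may be candidates for removal, while sparsely missing ones are usually easier to impute.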

To launch the API

Start by cloning the project:

git clone https://github.com/a-brice/bankruptcy-data-exp.git
cd bankruptcy-data-exp

To launch the API, you must first install some dependencies. On Windows, from a shell in the root directory, enter the following:

python -m venv env
.\env\Scripts\activate
pip install -r requirement.txt

Once all installations are complete (this can take a while), you can run the API:

cd api
python manage.py runserver

After that, go to http://127.0.0.1:8000 and explore!
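If the page does not load, it can help to check from a second shell whether anything is answering at that address. This is a small sketch using only the standard library; the helper name `api_is_up` is hypothetical and not part of the project, and it only tests that some HTTP server responds, not that the API behaves correctly.

```python
import urllib.error
import urllib.request


def api_is_up(url="http://127.0.0.1:8000", timeout=2.0):
    """Return True if an HTTP server answers at `url`, False otherwise."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered, even if with an error status such as 404.
        return True
    except OSError:
        # Connection refused, timeout, etc.: nothing is listening there.
        return False


if __name__ == "__main__":
    print("API reachable:", api_is_up())
```

A `False` result right after starting the server may only mean that it has not finished starting up; retrying after a few seconds is reasonable.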

