Version 2.0.3
New Version documentation!
Metalkiler committed Nov 18, 2020
1 parent f14b19f commit 865bf41
Showing 10 changed files with 87 additions and 72 deletions.
46 changes: 22 additions & 24 deletions README.md
Expand Up @@ -21,51 +21,45 @@ This method results in 689 binary inputs, which is much less than the 10690 bina

--> Implementation of a simpler One-Hot-Encoding method.

--> Min-max and standard scalers (based on sklearn functions) with column selection and multicore support. The transformations can be applied to specific columns only instead of the full dataset (follow the example). However, they only work with numerical data (e.g., MSE, decision scores).

--> You can also provide a custom scaler version of your own! (check example)

# Alpha-Release
--> Minmax and Standard scaler (based on sklearn functions) with column selection and multicore support

It is possible to apply these transformations to specific columns only instead of the full dataset (follow the example).

New Feature :

[x] - Introduced a scaler function based on the sklearn package that lets you choose which columns to scale and supports multiprocessing. You can also provide a custom function of your own! (check Example)
Future Function ideas:
--
MultiColumn scale (based on the implementation of IDF and PCP)
Scaling of IDF values (normalized IDF)




Future Function:

[Future] - MultiColumn scale (based on the implementation of IDF and PCP)




# Installation

## Stable Version
To install this package please run the following command

``` cmd
pip install cane
```

## Alpha Version
To install this package please run the following command
# New
Version 2.0.3:

[x] - Introduced a scaler function based on the sklearn package that lets you choose which columns to scale and supports multiprocessing.

[x] - You can also provide a custom function of your own! (check Example)

``` cmd
pip install cane==2.0.3b1
```

# New
Version 2.0.2 - This version has the requirements updated

# Suggestions and feedback

Any feedback will be appreciated.
For questions and other suggestions contact luis.matos@dsi.uminho.pt
Found any bugs? Post them on the GitHub page of the project! (https://github.com/Metalkiler/Cane-Categorical-Attribute-traNsformation-Environment)

Thanks for the support!

# Example

Expand Down Expand Up @@ -160,15 +154,19 @@ print("PCP Time Multicore:",PTM)

```

# Alpha Release Example
# Scaler Example with cane

These examples show how to use cane with the standard methods (standard scaler and min-max scaler).
They also show how to implement a custom scaler function of your own with cane!
``` python
#New Scaler Function



dfNumbers = pd.DataFrame(np.random.randint(0,100000,size=(100000, 12)), columns=list('ABCDEFGHIJKL'))
cane.scale_data(dfNumbers, n_cores = 3, scaleFunc="min_max")
cane.scale_data(dfNumbers, column=["A","B"], n_cores = 3, scaleFunc="min_max")
cane.scale_data(dfNumbers, n_cores = 3, scaleFunc="min_max") # all columns using 3 cores
cane.scale_data(dfNumbers, column=["A","B"], n_cores = 3, scaleFunc="min_max") # scale specific columns
cane.scale_data(dfNumbers, column=["A","B"], n_cores = 3, scaleFunc="std") #standard Scaler



Expand All @@ -180,7 +178,7 @@ import numpy as np
import cane

def customFunc(val):
return pd.DataFrame([round((i - 1) / 3, 2) for i in val],columns=[val.name + "_custom_scalled_min_max"])
return pd.DataFrame([round((i - 1) / 3, 2) for i in val],columns=[val.name + "_custom_scalled_function"])



Expand Down
46 changes: 22 additions & 24 deletions cane/README.md
Expand Up @@ -21,51 +21,45 @@ This method results in 689 binary inputs, which is much less than the 10690 bina

--> Implementation of a simpler One-Hot-Encoding method.

--> Min-max and standard scalers (based on sklearn functions) with column selection and multicore support. The transformations can be applied to specific columns only instead of the full dataset (follow the example). However, they only work with numerical data (e.g., MSE, decision scores).

--> You can also provide a custom scaler version of your own! (check example)

# Alpha-Release
--> Minmax and Standard scaler (based on sklearn functions) with column selection and multicore support

It is possible to apply these transformations to specific columns only instead of the full dataset (follow the example).

New Feature :

[x] - Introduced a scaler function based on the sklearn package that lets you choose which columns to scale and supports multiprocessing. You can also provide a custom function of your own! (check Example)
Future Function ideas:
--
MultiColumn scale (based on the implementation of IDF and PCP)
Scaling of IDF values (normalized IDF)




Future Function:

[Future] - MultiColumn scale (based on the implementation of IDF and PCP)




# Installation

## Stable Version
To install this package please run the following command

``` cmd
pip install cane
```

## Alpha Version
To install this package please run the following command
# New
Version 2.0.3:

[x] - Introduced a scaler function based on the sklearn package that lets you choose which columns to scale and supports multiprocessing.

[x] - You can also provide a custom function of your own! (check Example)

``` cmd
pip install cane==2.0.3b1
```

# New
Version 2.0.2 - This version has the requirements updated

# Suggestions and feedback

Any feedback will be appreciated.
For questions and other suggestions contact luis.matos@dsi.uminho.pt
Found any bugs? Post them on the GitHub page of the project! (https://github.com/Metalkiler/Cane-Categorical-Attribute-traNsformation-Environment)

Thanks for the support!

# Example

Expand Down Expand Up @@ -160,15 +154,19 @@ print("PCP Time Multicore:",PTM)

```

# Alpha Release Example
# Scaler Example with cane

These examples show how to use cane with the standard methods (standard scaler and min-max scaler).
They also show how to implement a custom scaler function of your own with cane!
``` python
#New Scaler Function



dfNumbers = pd.DataFrame(np.random.randint(0,100000,size=(100000, 12)), columns=list('ABCDEFGHIJKL'))
cane.scale_data(dfNumbers, n_cores = 3, scaleFunc="min_max")
cane.scale_data(dfNumbers, column=["A","B"], n_cores = 3, scaleFunc="min_max")
cane.scale_data(dfNumbers, n_cores = 3, scaleFunc="min_max") # all columns using 3 cores
cane.scale_data(dfNumbers, column=["A","B"], n_cores = 3, scaleFunc="min_max") # scale specific columns
cane.scale_data(dfNumbers, column=["A","B"], n_cores = 3, scaleFunc="std") #standard Scaler



Expand All @@ -180,7 +178,7 @@ import numpy as np
import cane

def customFunc(val):
return pd.DataFrame([round((i - 1) / 3, 2) for i in val],columns=[val.name + "_custom_scalled_min_max"])
return pd.DataFrame([round((i - 1) / 3, 2) for i in val],columns=[val.name + "_custom_scalled_function"])



Expand Down
8 changes: 6 additions & 2 deletions cane/build/lib/cane/__init__.py
Expand Up @@ -320,6 +320,10 @@ def scale_data(df, column=[], n_cores=1, scaleFunc="", customfunc=None):
if scaleFunc == 'custom':
assert (callable(customfunc)), "Please provide a function for the custom function you want to use"

if column is not None:
assert all(flag in df.columns for flag in
column), "Use columns specific to the dataset given the columns provided are not found " \
+ ' '.join([j for j in column])
valArgs = []
if len(column) == 0:
columns = df.columns.values
Expand All @@ -330,7 +334,7 @@ def scale_data(df, column=[], n_cores=1, scaleFunc="", customfunc=None):
columns = df.columns.values
for i in column:
valArgs.append(df[i])
diff = list(set(columns) - set(column))
diff = columns
if scaleFunc == "min_max":
func = partial(scale_single_min_max)
elif scaleFunc == "std":
Expand Down Expand Up @@ -361,4 +365,4 @@ def scale_single_std(val):


def __version__():
print("2.0.3a3")
print("2.0.3")
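Note on the hunk above: the new assert in `scale_data` rejects column names that are not in the DataFrame. The same check can be sketched in isolation as below. This is only an illustration under stated assumptions: `resolve_columns` is a hypothetical helper, not part of cane, and it works on plain lists of column names rather than on a DataFrame so that it runs without dependencies. It also uses `None` as the default instead of a mutable `[]`, and reports only the names that are actually missing.

```python
def resolve_columns(available, requested=None):
    """Return the column names to scale.

    Hypothetical helper (not cane's API): an empty or missing request means
    "use every column"; unknown names raise instead of being silently ignored.
    """
    if not requested:  # None or [] selects all available columns
        return list(available)
    missing = [name for name in requested if name not in available]
    if missing:
        raise ValueError("Columns not found in the dataset: " + " ".join(missing))
    return list(requested)


print(resolve_columns(["A", "B", "C"]))              # all columns
print(resolve_columns(["A", "B", "C"], ["A", "C"]))  # only the requested ones
```

Raising `ValueError` rather than using `assert` is a design choice of this sketch: asserts are skipped when Python runs with `-O`, so an explicit exception keeps the check active in all modes.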
48 changes: 31 additions & 17 deletions cane/cane.egg-info/PKG-INFO
@@ -1,6 +1,6 @@
Metadata-Version: 2.1
Name: cane
Version: 2.0.3a3
Version: 2.0.3.1
Summary: Cane - Categorical Attribute traNsformation Environment
Home-page: https://github.com/Metalkiler/Cane-Categorical-Attribute-traNsformation-Environment
Author: Luís Miguel Matos, Paulo Cortez, Rui Mendes
Expand Down Expand Up @@ -29,42 +29,45 @@ Description: # Cane - Categorical Attribute traNsformation Environment

--> Implementation of a simpler One-Hot-Encoding method.

--> Minmax and Standard scaler (based on sklearn functions) with column selection and multicore support
--> Min-max and standard scalers (based on sklearn functions) with column selection and multicore support. The transformations can be applied to specific columns only instead of the full dataset (follow the example). However, they only work with numerical data (e.g., MSE, decision scores).

It is possible to apply these transformations to specific columns only instead of the full dataset (follow the example).
--> You can also provide a custom scaler version of your own! (check example)

New Feature :

[x] - Introduced a scaler function based on the sklearn package that lets you choose which columns to scale and supports multiprocessing. You can also provide a custom function of your own! (check Example)


Future Function ideas:
--
MultiColumn scale (based on the implementation of IDF and PCP)
Scaling of IDF values (normalized IDF)


Future Function:

[Future] - MultiColumn scale (based on the implementation of IDF and PCP)




# Installation

## Stable Version
To install this package please run the following command

``` cmd
pip install cane


```

# New
Version 2.0.2 - This version has the requirements updated
Version 2.0.3:

[x] - Introduced a scaler function based on the sklearn package that lets you choose which columns to scale and supports multiprocessing.

[x] - You can also provide a custom function of your own! (check Example)



# Suggestions and feedback

Any feedback will be appreciated.
For questions and other suggestions contact luis.matos@dsi.uminho.pt
Found any bugs? Post them on the GitHub page of the project! (https://github.com/Metalkiler/Cane-Categorical-Attribute-traNsformation-Environment)

Thanks for the support!

# Example

Expand Down Expand Up @@ -155,13 +158,23 @@ Description: # Cane - Categorical Attribute traNsformation Environment
print("IDF Time Multicore:",ITM)
print("PCP Time Multicore:",PTM)



```

# Scaler Example with cane

These examples show how to use cane with the standard methods (standard scaler and min-max scaler).
They also show how to implement a custom scaler function of your own with cane!
``` python
#New Scaler Function



dfNumbers = pd.DataFrame(np.random.randint(0,100000,size=(100000, 12)), columns=list('ABCDEFGHIJKL'))
cane.scale_data(dfNumbers, n_cores = 3, scaleFunc="min_max")
cane.scale_data(dfNumbers, column=["A","B"], n_cores = 3, scaleFunc="min_max")
cane.scale_data(dfNumbers, n_cores = 3, scaleFunc="min_max") # all columns using 3 cores
cane.scale_data(dfNumbers, column=["A","B"], n_cores = 3, scaleFunc="min_max") # scale specific columns
cane.scale_data(dfNumbers, column=["A","B"], n_cores = 3, scaleFunc="std") #standard Scaler



Expand All @@ -173,7 +186,7 @@ Description: # Cane - Categorical Attribute traNsformation Environment
import cane

def customFunc(val):
return pd.DataFrame([round((i - 1) / 3, 2) for i in val],columns=[val.name + "_custom_scalled_min_max"])
return pd.DataFrame([round((i - 1) / 3, 2) for i in val],columns=[val.name + "_custom_scalled_function"])



Expand All @@ -182,6 +195,7 @@ Description: # Cane - Categorical Attribute traNsformation Environment
from functions import *
# with a custom function to apply to data:
if __name__ == "__main__":
dfNumbers = pd.DataFrame(np.random.randint(0,100000,size=(100000, 12)), columns=list('ABCDEFGHIJKL'))
cane.scale_data(dfNumbers, column=["A","B"], n_cores = 3, scaleFunc="custom", customfunc = customFunc)


Expand Down
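Note on the custom scaler in the README hunks above: `customFunc` receives one column, applies `(x - 1) / 3` rounded to two decimals, and returns the result under a new column name. A dependency-free sketch of that per-column contract might look like the following. It is an assumption-laden illustration: `custom_scale` is a hypothetical name, and a plain dict stands in for the one-column DataFrame that the README example returns.

```python
def custom_scale(values, name):
    """Apply (x - 1) / 3, rounded to 2 decimals, to one column of values.

    Hypothetical stand-in for the README's customFunc: the dict key plays the
    role of the renamed DataFrame column.
    """
    scaled = [round((v - 1) / 3, 2) for v in values]
    return {name + "_custom_scalled_function": scaled}


result = custom_scale([1, 4, 7, 10], "A")
print(result)
```

Because the function only sees one column at a time, each column can be handed to a separate worker, which is what makes the `n_cores` option in `scale_data` possible.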
4 changes: 2 additions & 2 deletions cane/cane/__init__.py
Expand Up @@ -334,7 +334,7 @@ def scale_data(df, column=[], n_cores=1, scaleFunc="", customfunc=None):
columns = df.columns.values
for i in column:
valArgs.append(df[i])
diff = list(set(columns) - set(column))
diff = columns
if scaleFunc == "min_max":
func = partial(scale_single_min_max)
elif scaleFunc == "std":
Expand Down Expand Up @@ -365,4 +365,4 @@ def scale_single_std(val):


def __version__():
print("2.0.3b1")
print("2.0.3")
Binary file modified cane/cane/__pycache__/__init__.cpython-38.pyc
Binary file not shown.
Binary file added cane/dist/cane-2.0.3.1-py3-none-any.whl
Binary file not shown.
Binary file added cane/dist/cane-2.0.3.1.tar.gz
Binary file not shown.
2 changes: 1 addition & 1 deletion cane/setup.py
Expand Up @@ -8,7 +8,7 @@


setuptools.setup(name='cane',
version='2.0.3a3',
version='2.0.3.1',
description='Cane - Categorical Attribute traNsformation Environment',
author='Luís Miguel Matos, Paulo Cortez, Rui Mendes',
license='MIT',
Expand Down
5 changes: 3 additions & 2 deletions cane/tests/MultiCoreScale.py
Expand Up @@ -7,7 +7,8 @@

from cane import scale_data, scale_single_min_max, scale_single_std

class TestPCP(unittest.TestCase):

class Testscalers(unittest.TestCase):

@staticmethod
def create_series():
Expand Down Expand Up @@ -36,7 +37,6 @@ def create_expected():
columns=["A_scalled_std"])
return xA_new


@staticmethod
def create_expected_min_max():
"""
Expand All @@ -56,6 +56,7 @@ def create_expected_min_max():
xA_new = pd.DataFrame([0.0, 0.5, 0.5, 0.5, 0.0, 0.0, 0.0, 0.5, 1.0, 0.5],
columns=["A_scalled_min_max"])
return xA_new

def teststd(self):
df = self.create_series()
a = scale_single_std(df["A"])
Expand Down
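Note on the tests above: the expected frames are named `A_scalled_min_max` and `A_scalled_std`, so the two scalers presumably rescale one column and rename it. The arithmetic behind them can be sketched with the standard library alone. Everything here is an assumption made for illustration: `min_max_scale` and `standardize` are hypothetical names rather than cane functions, the input list is made up (the real fixture in `create_series` is not shown in this diff), and it was chosen only because it reproduces the expected min-max values listed in `create_expected_min_max`.

```python
import statistics


def min_max_scale(values):
    """Rescale values linearly to [0, 1]; a constant column maps to zeros."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        return [0.0 for _ in values]
    return [(v - lo) / span for v in values]


def standardize(values):
    """Centre on the mean and divide by the population standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


# Made-up column, not the fixture used by the test suite.
sample = [1, 2, 2, 2, 1, 1, 1, 2, 3, 2]
min_max_result = {"A_scalled_min_max": min_max_scale(sample)}
std_result = {"A_scalled_std": standardize(sample)}
print(min_max_result)
print(std_result)
```

The population standard deviation (`pstdev`) is used because sklearn's `StandardScaler` also divides by the biased estimate; returning zeros for a constant column is a choice of this sketch to avoid dividing by zero.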
