Releases: googleapis/python-bigquery-dataframes
v0.7.0
0.7.0 (2023-10-11)
Features
- Add aliases for several series properties (#80) (c0efec8)
- Add equals methods to series/dataframe (#76) (636a209)
- Add iat and iloc accessing by tuples of integers (#90) (228aeba)
- Add level param to DataFrame.stack (#88) (97b8bec)
- Allow df.drop to take an index object (#68) (740c451)
- Use default session connection (#87) (4ae4ef9)
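A minimal sketch of what some of the 0.7.0 additions look like in use. It runs against plain pandas as a stand-in, on the assumption that `bigframes.pandas` mirrors these calls; the sample frame and its labels are invented for illustration, and behavior on a real BigQuery session may differ.

```python
import pandas as pd  # stand-in; bigframes.pandas is assumed to accept the same calls

# Hypothetical sample data, not taken from the release notes.
df = pd.DataFrame(
    {"a": [1, 2, 3], "b": [4, 5, 6]},
    index=["x", "y", "z"],
)

# equals: compares two frames as a whole and returns a single boolean.
same = df.equals(df.copy())

# iat / iloc addressed with (row position, column position).
cell_iat = df.iat[1, 0]
cell_iloc = df.iloc[(2, 1)]

# drop: pass an Index object instead of a plain list of labels.
dropped = df.drop(index=pd.Index(["x", "z"]))
```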
Bug Fixes
Documentation
v0.6.0
0.6.0 (2023-10-04)
Features
- Add df.unstack (#63) (4a84714)
- Add idxmin, idxmax to series, dataframe (#74) (781307e)
- Add ml.preprocessing.KBinsDiscretizer (#81) (24c6256)
- Add multi-column dataframe merge (#73) (c9fa85c)
- Add update and align methods to dataframe (#57) (bf050cf)
- Support STRUCT data type with `Series.struct.field` to extract child fields (#71) (17afac9)
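The following sketch illustrates three of the 0.6.0 features (`idxmin`/`idxmax`, multi-column merge, and `unstack`). It uses plain pandas as a stand-in, assuming `bigframes.pandas` follows the same semantics; all data and column names are hypothetical.

```python
import pandas as pd  # stand-in; bigframes.pandas is assumed to behave the same way

# idxmin / idxmax return the index label of the smallest / largest value.
s = pd.Series([3, 1, 2], index=["a", "b", "c"])
lowest_label = s.idxmin()
highest_label = s.idxmax()

# Multi-column merge: rows match only when both key columns agree.
left = pd.DataFrame({"k1": [1, 1, 2], "k2": ["x", "y", "x"], "v": [10, 20, 30]})
right = pd.DataFrame({"k1": [1, 2], "k2": ["y", "x"], "w": [0.5, 1.5]})
merged = left.merge(right, on=["k1", "k2"], how="inner")

# unstack: move the innermost row index level into the columns.
long = pd.DataFrame(
    {"val": [1, 2, 3, 4]},
    index=pd.MultiIndex.from_product([["r1", "r2"], ["c1", "c2"]]),
)
wide = long.unstack()
```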
Bug Fixes
v0.5.0
0.5.0 (2023-09-28)
Features
- Add `DataFrame.kurtosis` / `DF.kurt` method (c1900c2)
- Add `DataFrame.rolling` and `DataFrame.expanding` methods (c1900c2)
- Add `items`, `apply` methods to `DataFrame`. (#43) (3adc1b3)
- Add axis param to simple df aggregations (#52) (9cf9972)
- Add index `dtype`, `astype`, `drop`, `fillna`, aggregate attributes. (#38) (1a254a4)
- Add ml.preprocessing.LabelEncoder (#50) (2510461)
- Add ml.preprocessing.MaxAbsScaler (#56) (14b262b)
- Add ml.preprocessing.MinMaxScaler (#64) (392113b)
- Add more index methods (#54) (a6e32aa)
- Support `calculate_p_values` parameter in `bigframes.ml.linear_model.LinearRegression` (c1900c2)
- Support `class_weights="balanced"` in `LogisticRegression` model (c1900c2)
- Support `df[column_name] = df_only_one_column` (c1900c2)
- Support `early_stop` parameter in `bigframes.ml.linear_model.LinearRegression` (c1900c2)
- Support `enable_global_explain` parameter in `bigframes.ml.linear_model.LinearRegression` (c1900c2)
- Support `l2_reg` parameter in `bigframes.ml.linear_model.LinearRegression` (c1900c2)
- Support `learn_rate_strategy` parameter in `bigframes.ml.linear_model.LinearRegression` (c1900c2)
- Support `ls_init_learn_rate` parameter in `bigframes.ml.linear_model.LinearRegression` (c1900c2)
- Support `max_iterations` parameter in `bigframes.ml.linear_model.LinearRegression` (c1900c2)
- Support `min_rel_progress` parameter in `bigframes.ml.linear_model.LinearRegression` (c1900c2)
- Support `optimize_strategy` parameter in `bigframes.ml.linear_model.LinearRegression` (c1900c2)
- Support casting string to integer or float (#59) (3502f83)
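As a rough illustration of the DataFrame-side features in 0.5.0 (window methods, `kurt`, the `axis` parameter on aggregations, and string casting), here is a sketch written against plain pandas. It assumes `bigframes.pandas` exposes the same calls; the data is made up, and the dtype names accepted by BigQuery DataFrames may differ from the pandas ones used here.

```python
import pandas as pd  # stand-in; bigframes.pandas is assumed to accept the same calls

# Hypothetical sample data.
df = pd.DataFrame({"a": [1.0, 2.0, 3.0, 4.0], "b": [10.0, 20.0, 30.0, 40.0]})

# rolling: aggregate over a fixed-size trailing window (first row has no full window).
rolling_sum = df.rolling(window=2).sum()

# expanding: aggregate over all rows seen so far.
expanding_max = df.expanding().max()

# kurt / kurtosis: per-column excess kurtosis.
kurt = df.kurt()

# axis=1 aggregates across columns, giving one value per row.
row_totals = df.sum(axis=1)

# Casting strings to integer or float.
as_int = pd.Series(["1", "2", "3"]).astype("int64")
as_float = pd.Series(["1.5", "2.5"]).astype("float64")
```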
Bug Fixes
- Fix header skipping logic in `read_csv` (#49) (d56258c)
- Generate unique ids on join to avoid id collisions (#65) (7ab65e8)
- LabelEncoder params consistent with Sklearn (#60) (632caec)
- Loosen filter items tests to accommodate shifting pandas impl (#41) (edabdbb)
Performance Improvements
- Add ability to cache dataframe and series to session table (#51) (416d7cb)
- Inline small `Series` and `DataFrames` in query text (#45) (5e199ec)
- Reimplement unpivot to use cross join rather than union (#47) (f9a93ce)
- Simplify join order to use multiple order keys instead of string. (#36) (5056da6)
Documentation
- Link to Remote Functions code samples from README and API reference (c1900c2)
v0.4.0
0.4.0 (2023-09-16)
Features
- Add `axis` parameter to `droplevel` and `reorder_levels` (7c6b0dd)
- Add `bfill` and `ffill` to `DataFrame` and `Series` (7c6b0dd)
- Add `DataFrame.combine` and `DataFrame.combine_first` (#27) (7c6b0dd)
- Add `DataFrame.nlargest`, `nsmallest` (7c6b0dd)
- Add `DataFrame.pct_change` and `Series.pct_change` (7c6b0dd)
- Add `DataFrame.skew` and `GroupBy.skew` (7c6b0dd)
- Add `DataFrame.to_dict`, `to_excel`, `to_latex`, `to_records`, `to_string`, `to_markdown`, `to_pickle`, `to_orc` (7c6b0dd)
- Add `diff` method to `DataFrame` and `GroupBy` (7c6b0dd)
- Add `filter` and `reindex` to `Series` and `DataFrame` (7c6b0dd)
- Add `reindex_like` to `DataFrame` and `Series` (7c6b0dd)
- Add `swaplevel` to `DataFrame` and `Series` (7c6b0dd)
- Add partial support for `Series.replace` (7c6b0dd)
- Support `DataFrame.loc[bool_series, column] = scalar` (7c6b0dd)
- Support a persistent `name` in `remote_function` (7c6b0dd)
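A short sketch of several 0.4.0 additions, again using plain pandas as a stand-in under the assumption that `bigframes.pandas` matches its behavior. The frame, column names, and values are hypothetical.

```python
import pandas as pd  # stand-in; bigframes.pandas is assumed to accept the same calls

# Hypothetical sample data.
df = pd.DataFrame({"price": [100.0, 110.0, 99.0], "qty": [5, 1, 3]})

# pct_change: fractional change relative to the previous row.
change = df["price"].pct_change()

# nlargest: the n rows with the largest values in a column.
top2 = df.nlargest(2, "qty")

# diff: difference from the previous row.
step = df["qty"].diff()

# ffill / bfill: fill gaps from the previous / next valid value.
sparse = pd.Series([1.0, None, None, 4.0])
forward = sparse.ffill()
backward = sparse.bfill()

# combine_first: take values from the caller, falling back to the other frame.
primary = pd.DataFrame({"v": [1.0, None]})
fallback = pd.DataFrame({"v": [9.0, 2.0]})
filled = primary.combine_first(fallback)

# loc[bool_series, column] = scalar: conditional assignment.
df.loc[df["qty"] > 2, "price"] = 0.0
```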
Bug Fixes
- `remote_function` uses same credentials as other APIs (7c6b0dd)
- Add type hints to models (7c6b0dd)
- Raise error when ARIMAPlus is used with Pipeline (7c6b0dd)
- Remove `transforms` parameter in `model.fit` (breaking change) (7c6b0dd)
- Support column joins with "None indexer" (7c6b0dd)
- Use `Int64Dtype` for literals in `cut` (7c6b0dd)
- Use lowercase strings for parameter literals in `bigframes.ml` (breaking change) (7c6b0dd)
Performance Improvements
- `bigframes-api` label to I/O query jobs (7c6b0dd)
Documentation
v0.3.2
v0.3.1
v0.3.0
0.3.0 (2023-09-02)
Features
- Add `bigframes.get_global_session()` and `bigframes.reset_session()` aliases (a32b747)
- Add `bigframes.pandas.read_pickle` function (a32b747)
- Add `components_`, `explained_variance_`, and `explained_variance_ratio_` properties to `bigframes.ml.decomposition.PCA` (89b9503)
- Add `fit_transform` to `bigquery.ml` transformers (a32b747)
- Add `Series.dropna` and `DataFrame.fillna` (8fab755)
- Add `Series.str` methods `isalpha`, `isdigit`, `isdecimal`, `isalnum`, `isspace`, `islower`, `isupper`, `zfill`, `center` (a32b747)
- Support `bigframes.pandas.merge()` (8fab755)
- Support `DataFrame.isin` with list and dict inputs (8fab755)
- Support `DataFrame.pivot` (a32b747)
- Support `DataFrame.stack` (89b9503)
- Support `DataFrame`-`DataFrame` binary operations (8fab755)
- Support `df[my_column] = [a python list]` (89b9503)
- Support `Index.is_monotonic` (8fab755)
- Support `np.arcsin`, `np.arccos`, `np.arctan`, `np.sinh`, `np.cosh`, `np.tanh`, `np.arcsinh`, `np.arccosh`, `np.arctanh`, `np.exp` with Series argument (89b9503)
- Support `np.sin`, `np.cos`, `np.tan`, `np.log`, `np.log10`, `np.sqrt`, `np.abs` with Series argument (89b9503)
- Support `pow()` and power operator in `DataFrame` and `Series` (8fab755)
- Support `read_json` with `engine=bigquery` for newline-delimited JSON files (89b9503)
- Support `Series.corr` (89b9503)
- Support `Series.map` (8fab755)
- Support for `np.add`, `np.subtract`, `np.multiply`, `np.divide`, `np.power` (8fab755)
- Support MultiIndex for DataFrame columns (a32b747)
- Use `pandas.Index` for column labels (a32b747)
- Use default session and connection in `ml.llm` and `ml.imported` (8fab755)
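To give a feel for a few of the 0.3.0 features (NumPy functions on a Series, the power operator, `DataFrame.isin` with a dict, `pivot`, `Series.str` methods, and `Series.map`), here is a sketch on plain pandas and NumPy. It assumes `bigframes.pandas` objects respond to these calls the same way; all data is hypothetical.

```python
import numpy as np
import pandas as pd  # stand-in; bigframes.pandas is assumed to accept the same calls

# NumPy functions applied to a Series return a Series.
s = pd.Series([1.0, 4.0, 9.0])
roots = np.sqrt(s)
summed = np.add(s, s)

# pow() and the ** operator are equivalent.
squared = s ** 2
also_squared = s.pow(2)

# DataFrame.isin with a dict checks each named column against its own list;
# columns missing from the dict come back all False.
df = pd.DataFrame(
    {"city": ["A", "A", "B"], "year": [2022, 2023, 2022], "pop": [1, 2, 3]}
)
mask = df.isin({"city": ["A"], "year": [2022]})

# pivot: reshape long data into a city-by-year table.
table = df.pivot(index="city", columns="year", values="pop")

# Series.str.zfill pads with leading zeros; Series.map applies a lookup.
codes = pd.Series(["7", "42"]).str.zfill(3)
mapped = pd.Series(["a", "b"]).map({"a": 1, "b": 2})
```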
Bug Fixes
- Add error message to `set_index` (a32b747)
- Align column names with pandas in `DataFrame.agg` results (89b9503)
- Allow (but still not recommended) `ORDER BY` in `read_gbq` input when an `index_col` is defined (89b9503)
- Check for IAM role on the BigQuery connection when initializing a `remote_function` (89b9503)
- Check that types are specified in `read_gbq_function` (a32b747)
- Don't use query cache for Session construction (a32b747)
- Include survey link in abstract `NotImplementedError` exception messages (89b9503)
- Label temp table creation jobs with `source=bigquery-dataframes-temp` label (89b9503)
- Make `X_train` argument names consistent across methods (8fab755)
- Raise AttributeError for unimplemented pandas methods (89b9503)
- Raise exception for invalid function in `read_gbq_function` (a32b747)
- Support spaces in column names in `DataFrame` initializer (89b9503)
Performance Improvements
- Add local cache for `__repr_*__` methods (a32b747)
- Lazily instantiate client library objects (89b9503)
- Use `row_number()` filter for `head` / `tail` (8fab755)
Documentation
- Add ML section under Overview (a32b747)
- Add release status to table of contents (a32b747)
- Add samples and best practices to `read_gbq` docs (a32b747)
- Correct the return types of Dataframe and Series (a32b747)
- Create subfolders for notebooks (a32b747)
- Fix link to GitHub (89b9503)
- Highlight bigframes is open-source (a32b747)
- Sample ML Drug Name Generation notebook (a32b747)
- Set `options.bigquery.project` in sample code (89b9503)
- Transform remote function user guide into sample code (a32b747)
- Update remote function notebook with read_gbq_function usage (8fab755)
Version 0.2.0
0.2.0 (2023-08-17)
Features
- Add KMeans.cluster_centers_.
- Allow column labels to be any type handled by bq df, column labels can be integers now.
- Add dataframegroupby.agg().
- Add Series Property is_monotonic_increasing and is_monotonic_decreasing.
- Add match, fullmatch, get, pad str methods.
- Add series isin function.
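The 0.2.0 additions can be sketched as follows, using plain pandas as a stand-in and assuming `bigframes.pandas` follows the same semantics. The team names, scores, and strings are invented for illustration.

```python
import pandas as pd  # stand-in; bigframes.pandas is assumed to accept the same calls

# DataFrameGroupBy.agg: several aggregations per group in one call.
df = pd.DataFrame({"team": ["x", "x", "y"], "score": [1, 3, 5]})
stats = df.groupby("team").agg({"score": ["min", "max"]})

# is_monotonic_increasing is non-strict: repeated values are allowed.
s = pd.Series([1, 2, 2, 5])
rising = s.is_monotonic_increasing
falling = s.is_monotonic_decreasing

# Series.isin: membership test against a list of values.
member = s.isin([2, 5])

# String methods: match anchors at the start, get picks a character,
# pad fills to a fixed width.
words = pd.Series(["apple", "bob"])
starts_with_a = words.str.match("a")
first_char = words.str.get(0)
padded = words.str.pad(6, side="left", fillchar="*")
```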
Bug Fixes
- Update ML package to use sessions for queries.
- Optimize `read_gbq` with `index_col` set to cluster by `index_col`.
- Raise ValueError if the location mismatched.
- `read_gbq` no longer uses 'time travel' with query inputs.
Documentation
- Add docstring to _uniform_sampling to avoid user using it.
Version 0.1.1
Documentation
- Correct link to code repository in `setup.py` and use correct terminology for `console.cloud.google.com` links.
Version 0.1.0
0.1.0 (2023-08-11)
Features
- Add `bigframes.pandas` package with an API compatible with pandas. Supported data sources include: BigQuery SQL queries, BigQuery tables, CSV (local and GCS), Parquet (local and Cloud Storage), and more.
- Add `bigframes.ml` package with an API inspired by scikit-learn. Train machine learning models and run batch prediction, powered by BigQuery ML.