Description
I'm using the `write_dataframe` function to write a pandas DataFrame. My DataFrame contains columns of three different types:
- object (string) columns, where missing data is None
- Int64 columns, where missing data is np.nan
- float64 columns, where missing data is np.nan
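For reference, a minimal DataFrame reproducing the three column types above (note that a plain int64 column cannot hold np.nan, so the integer column here uses the nullable Int64 extension dtype, where missing data shows up as pd.NA):

```python
import numpy as np
import pandas as pd

# Three columns matching the types described above, each with one missing value.
df = pd.DataFrame({
    "text_col": ["a", None, "c"],                      # object: missing -> None
    "int_col": pd.array([1, None, 3], dtype="Int64"),  # nullable Int64: missing -> pd.NA
    "float_col": [1.5, np.nan, 3.5],                   # float64: missing -> np.nan
})

print(df.dtypes)
print(df.isna().sum())
```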
When writing to Redshift, these values are converted as follows:
- None becomes NULL (varchar(20) with bytedict encoding)
- np.nan becomes -9223372036854775808 (BIGINT with az64 encoding)
- np.nan becomes "NaN" (DOUBLE PRECISION with RAW encoding)
When I query with SQL, I have to filter differently depending on the column:
- IS NULL
- = -9223372036854775808
- ::text = 'NaN'
Is this intended? I would like every pandas None/NaN value to map to a SQL NULL. Is this possible?
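A possible client-side workaround (a sketch only, not confirmed behavior of `write_dataframe`): normalize every kind of pandas missing value to None before writing, since most database drivers serialize a Python None as SQL NULL. The helper name `normalize_missing` is hypothetical.

```python
import numpy as np
import pandas as pd

def normalize_missing(df: pd.DataFrame) -> pd.DataFrame:
    """Replace every pandas missing value (None, np.nan, pd.NA) with None."""
    # Casting to object first lets every cell hold a Python None;
    # .where keeps non-missing values and substitutes None elsewhere.
    return df.astype(object).where(df.notna(), None)

df = pd.DataFrame({
    "text_col": ["a", None],
    "int_col": pd.array([1, None], dtype="Int64"),
    "float_col": [1.5, np.nan],
})
clean = normalize_missing(df)
print(clean)
```

Whether the writer then emits NULL for all three columns would still depend on how the library serializes object columns.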
Activity
LuchiLucs commented on Feb 4, 2025
When reading back into a pandas DataFrame, the int64 columns contain -9223372036854775808 in place of missing data instead of np.nan, so the round trip is not reproducible.
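On the read side, one way to recover proper missing values (again a sketch; `restore_missing` is a hypothetical helper, and -9223372036854775808 is assumed to be the int64-min sentinel the writer used):

```python
import pandas as pd

SENTINEL = -9223372036854775808  # int64 minimum, used here as the NULL sentinel

def restore_missing(s: pd.Series) -> pd.Series:
    """Map the int64-min sentinel back to a real missing value (pd.NA)."""
    # Cast to nullable Int64 so the column can hold pd.NA without
    # being upcast to float64, then mask the sentinel positions.
    return s.astype("Int64").mask(s == SENTINEL)

s = pd.Series([1, SENTINEL, 3])
restored = restore_missing(s)
print(restored)
```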