decimal logical type : TypeError: can only concatenate str (not "int") to str #43

balaby25 · 2022-11-14T08:02:51Z

I am trying to convert an existing CSV to Avro using pandavro.

I am not able to resolve the error below:
File "fastavro/_logical_writers.pyx", line 130, in fastavro._logical_writers.prepare_bytes_decimal
File "fastavro/_logical_writers.pyx", line 143, in fastavro._logical_writers.prepare_bytes_decimal
TypeError: can only concatenate str (not "int") to str

I checked my CSV, the .avsc schema, and the pandavro code multiple times, and I cannot find the problem. I am not savvy enough to call it a bug.
Can anyone give me some pointers?

Data in the CSV: 999.879
The p_cost column in the .avsc:
{ "name": "p_cost", "type": {"name": "decimalEntry", "type": "bytes", "logicalType": "decimal", "precision": 15, "scale": 3} },
The code:

# Imports are assumed; the original post did not show them.
from decimal import Decimal, getcontext

import pandas as pd
import pandavro as pdx
from fastavro.schema import load_schema


def convert_to_decimal(val):
    """
    Convert the string number value to a Decimal
    - Must set precision and scale beforehand
    """
    return Decimal(val)


schema_promotion = load_schema("promotion.avsc")
df_promotion = pd.read_csv(
    '/scratch/tpcds_1/promotion/promotion.dat',
    delimiter='|',
    header=None,
    usecols=[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18],
    names=['p_promo_sk',....,'p_cost',...,'p_discount_active'],
    dtype={'p_cost': 'str'},  # read p_cost as text so Decimal() parses it exactly
)

getcontext().prec = 15  # set precision of all future decimals
type(df_promotion['p_cost'])

df_promotion['p_cost'] = df_promotion['p_cost'].apply(convert_to_decimal)
pdx.to_avro('test_promotion.avro', df_promotion, schema=schema_promotion)

This throws the error below:

Traceback (most recent call last):
  File "perfectlyrandom.py", line 313, in <module>
    promotion()
  File "perfectlyrandom.py", line 262, in promotion
    pdx.to_avro('test_promotion.avro', df_promotion, schema=schema_promotion )
  File "/home/opc/.local/lib/python3.8/site-packages/pandavro/__init__.py", line 322, in to_avro
    fastavro.writer(f, schema=schema,
  File "fastavro/_write.pyx", line 727, in fastavro._write.writer
  File "fastavro/_write.pyx", line 680, in fastavro._write.Writer.write
  File "fastavro/_write.pyx", line 432, in fastavro._write.write_data
  File "fastavro/_write.pyx", line 422, in fastavro._write.write_data
  File "fastavro/_write.pyx", line 366, in fastavro._write.write_record
  File "fastavro/_write.pyx", line 387, in fastavro._write.write_data
  File "fastavro/_logical_writers.pyx", line 130, in fastavro._logical_writers.prepare_bytes_decimal
  File "fastavro/_logical_writers.pyx", line 143, in fastavro._logical_writers.prepare_bytes_decimal
TypeError: can only concatenate str (not "int") to str

If the full schema definition and the pandas DataFrame definition are needed, I can provide them.
Output of pip list:
avro-python3 1.10.2
fastavro 1.5.1
numpy 1.23.3
pandas 1.5.0
pandavro 1.7.1
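
One possible cause, not confirmed anywhere in this thread: a missing p_cost value in the CSV. pandas reads an empty field as a float NaN even with dtype 'str', and Decimal(nan) becomes Decimal('NaN'), whose as_tuple() exponent is the string 'n' rather than an integer. The sketch below assumes that fastavro's decimal writer adds the schema scale to that exponent; the helper names are hypothetical and only illustrate the arithmetic and a null-safe converter.

```python
from decimal import Decimal


def exponent_plus_scale(value, scale):
    """Mimic the `exponent + scale` arithmetic assumed to run in the decimal writer."""
    _sign, _digits, exponent = value.as_tuple()
    return exponent + scale


def convert_to_decimal_or_none(val):
    """Hypothetical null-safe converter: map missing values to None instead of Decimal('NaN')."""
    # `val != val` is only true for NaN; also treat None and blank strings as missing.
    if val is None or val != val or str(val).strip() == "":
        return None
    return Decimal(str(val))


# A normal value has an integer exponent, so the addition succeeds.
normal_result = exponent_plus_scale(Decimal("999.879"), 3)

# A NaN Decimal has the string exponent 'n', so the addition raises TypeError.
try:
    exponent_plus_scale(Decimal(float("nan")), 3)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(normal_result, error_message)
print(convert_to_decimal_or_none(float("nan")), convert_to_decimal_or_none("999.879"))
```

If this turns out to be the cause, the p_cost field in the schema would also need to accept nulls, for example as a union of "null" and the decimal type, before None values can be written.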


jbvaningen · 2022-11-28T11:07:58Z

I'm not sure about the details of your code, but I do have a workaround suggestion:

You can avoid pandavro by using fastavro (which pandavro itself uses) to write to Avro directly. You will have to figure out how to read and parse the CSV yourself. To me it looks like you can convert the CSV to JSON and then use fastavro.json_read. You will also have to define the Avro schema yourself, which is something pandavro currently does for you.
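
A minimal sketch of that workaround, skipping the JSON step and parsing the pipe-delimited file with the standard library. The two-column schema, the column positions, and the function names are assumptions made for illustration; the real promotion.avsc has more fields. Only fastavro.parse_schema and fastavro.writer are used from fastavro, and empty cost fields are written as null.

```python
import csv
import io
from decimal import Decimal

# Hypothetical, reduced schema: p_cost is nullable so empty CSV fields can be written.
SCHEMA = {
    "type": "record",
    "name": "promotion",
    "fields": [
        {"name": "p_promo_sk", "type": "long"},
        {
            "name": "p_cost",
            "type": [
                "null",
                {"type": "bytes", "logicalType": "decimal", "precision": 15, "scale": 3},
            ],
        },
    ],
}


def parse_rows(text_stream):
    """Yield one record dict per pipe-delimited line (assumed layout: key|cost)."""
    for row in csv.reader(text_stream, delimiter="|"):
        if not row:
            continue  # skip blank lines
        cost = row[1].strip()
        yield {
            "p_promo_sk": int(row[0]),
            "p_cost": Decimal(cost) if cost else None,
        }


def write_avro(path, records):
    """Write the records with fastavro directly, without pandavro."""
    import fastavro  # imported here so the parsing step runs without fastavro installed

    with open(path, "wb") as out:
        fastavro.writer(out, fastavro.parse_schema(SCHEMA), records)


# Parse a small in-memory sample; the second row has an empty cost.
sample = io.StringIO("1|999.879\n2|\n")
records = list(parse_rows(sample))
print(records)
```

Calling write_avro("test_promotion.avro", records) would then produce the file. Parsing the cost straight from text into Decimal avoids any float round-trip.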