Dataframe columns of lists of strings render the strings as ASCII numbers instead of strings #884

Open
shrianshChari opened this issue Apr 28, 2024 · 2 comments

Comments

@shrianshChari

When I render a dataframe with a column whose values are lists of strings (my dataset is called spl and the column is named team), the strings display correctly in Streamlit (run locally on my machine) but not in stlite (run using stlite sharing):

spl['team'].iloc[0:5]

Output in Streamlit:

image

Output in stlite:

image

When I take the first row of the stlite output and convert each number to its corresponding ASCII character, I get the original list back as a serialized string:

>>> s = '91,34,83,110,111,114,108,97,120,34,44,34,71,111,108,101,109,34,44,34,71,101,110,103,97,114,34,44,34,90,97,112,100,111,115,34,44,34,70,111,114,114,101,116,114,101,115,115,34,44,34,83,116,97,114,109,105,101,34,93'
>>> s = s.split(',')
>>> c = list(map(lambda x: chr(int(x)), s))
>>> ''.join(c)
'["Snorlax","Golem","Gengar","Zapdos","Forretress","Starmie"]'
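The same check can be wrapped in a small helper, shown here as a minimal sketch. It assumes the numbers are the UTF-8 bytes of a JSON array, which is what the decoded row above suggests but is not confirmed. The function name `decode_byte_listing` and the short sample value are hypothetical and used only for illustration.

```python
import json


def decode_byte_listing(listing: str) -> list:
    """Turn a comma-separated listing of byte values back into a Python list.

    Assumption: the numbers are the UTF-8 bytes of a JSON array.
    """
    raw = bytes(int(part) for part in listing.split(","))
    return json.loads(raw.decode("utf-8"))


# Short hypothetical cell value: the bytes of '["foo","bar"]'
cell = "91,34,102,111,111,34,44,34,98,97,114,34,93"
print(decode_byte_listing(cell))
```

Decoding as bytes rather than calling chr() on each number keeps the helper correct if a string contains non-ASCII characters.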

stlite does seem to recognize that spl['team'] is a column containing lists of strings, because when I run:

spl['team'].iloc[0]

I get the same output for both Streamlit and stlite:
image

@whitphx
Owner

whitphx commented Apr 29, 2024

Thank you for reporting this!

@whitphx
Owner

whitphx commented Jun 14, 2024

df.to_parquet() here completes without any error.
https://github.com/whitphx/streamlit/blob/stlite-1.35.0/lib/streamlit/type_util.py#L1131

Maybe the problem comes from fastparquet and/or parquet-wasm?
-> It looks like fastparquet sets the column metadata "pandas_type": "mixed" in this case, whereas pyarrow sets "pandas_type": "list[unicode]".
The relevant code may be here:
https://github.com/dask/fastparquet/blob/1891a4a55fbe2ac23b29c064258c9c2eba480d28/fastparquet/util.py#L407-L408
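To compare what each engine records, the "pandas_type" values can be pulled out of the pandas metadata JSON that is stored in a Parquet file's key-value metadata. The sketch below is only an illustration: the helper name `pandas_types` is hypothetical, and the two metadata strings are hand-written, trimmed samples based on the values quoted above, not output captured from real files.

```python
import json


def pandas_types(pandas_metadata) -> dict:
    """Map each column name to the "pandas_type" recorded in the metadata."""
    meta = json.loads(pandas_metadata)
    return {col["name"]: col["pandas_type"] for col in meta["columns"]}


# Hand-written, trimmed samples (illustrative only, not captured from real files).
fastparquet_like = json.dumps(
    {"columns": [{"name": "names", "pandas_type": "mixed", "numpy_type": "object"}]}
)
pyarrow_like = json.dumps(
    {"columns": [{"name": "names", "pandas_type": "list[unicode]", "numpy_type": "object"}]}
)

for label, meta in [("fastparquet", fastparquet_like), ("pyarrow", pyarrow_like)]:
    print(label, pandas_types(meta))
```

With pyarrow installed, the real JSON for a written file should be available as `pyarrow.parquet.read_schema(path).metadata[b"pandas"]`, which can be passed to the helper directly.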


Sample code:

import streamlit as st
import pandas as pd

df = pd.DataFrame({
    "names": [["foo", "bar"], ["baz", "quz"]]
})

st.dataframe(df)
