Occasionally, the Sling output includes a message containing an emoji. I'm not sure why it only appears some of the time, but it happens often enough that it has been causing failures in my pipelines.
The error seen is: UnicodeEncodeError: 'charmap' codec can't encode character '\U0001f449' in position 25: character maps to <undefined>
What did you expect to happen?
Could emojis be stripped from the output so that this failure doesn't occur?
I assume an approach like this in the SlingResource would resolve the issue (note: I've not tested this, it's just a suggestion):
import re
from typing import IO, AnyStr, Iterator

def _process_stdout(self, stdout: IO[AnyStr], encoding="utf8") -> Iterator[str]:
    """Process stdout from the Sling CLI."""
    emoji_pattern = re.compile(
        "["
        "\U0001F600-\U0001F64F"  # emoticons
        "\U0001F300-\U0001F5FF"  # symbols & pictographs
        "\U0001F680-\U0001F6FF"  # transport & map symbols
        "\U0001F700-\U0001F77F"  # alchemical symbols
        "\U0001F780-\U0001F7FF"  # Geometric Shapes Extended
        "\U0001F800-\U0001F8FF"  # Supplemental Arrows-C
        "\U0001F900-\U0001F9FF"  # Supplemental Symbols and Pictographs
        "\U0001FA00-\U0001FA6F"  # Chess Symbols
        "\U0001FA70-\U0001FAFF"  # Symbols and Pictographs Extended-A
        "\U00002702-\U000027B0"  # Dingbats
        # The single range \U000024C2-\U0001F251 would also match CJK and
        # most other text above U+24C2, so it is split into narrower ranges.
        "\U000024C2"  # circled M
        "\U00002600-\U000026FF"  # Miscellaneous Symbols
        "\U0001F200-\U0001F251"  # Enclosed Ideographic Supplement
        "]+"
    )
    for line in stdout:
        assert isinstance(line, bytes)
        # Decode first, then drop any emoji before the line is cleaned and yielded.
        fmt_line = line.decode(encoding, errors="replace")
        fmt_line = emoji_pattern.sub("", fmt_line)
        yield self._clean_line(fmt_line)
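For anyone who wants to try the idea outside of Dagster, here is a minimal standalone sketch of the same decode-then-strip step. The names `strip_emoji` and `decode_lines` are hypothetical and not part of dagster-sling, and the pattern uses a deliberately narrow set of ranges (an assumption, not a complete emoji definition) so that ordinary non-ASCII text such as CJK is left alone.

```python
import re

# Hypothetical, simplified pattern: the main emoji planes plus
# Miscellaneous Symbols and Dingbats. Not an exhaustive emoji list.
_EMOJI_RE = re.compile("[\U0001F300-\U0001FAFF\U00002600-\U000027BF]+")


def strip_emoji(text: str) -> str:
    """Remove every character that falls in the ranges above."""
    return _EMOJI_RE.sub("", text)


def decode_lines(raw_lines, encoding="utf8"):
    """Decode byte lines and strip emoji, mirroring the suggested loop."""
    for raw in raw_lines:
        yield strip_emoji(raw.decode(encoding, errors="replace"))


sample = [
    "\U0001F449 some message\n".encode("utf8"),
    "日本語 rows written\n".encode("utf8"),
]
cleaned = list(decode_lines(sample))
```

With this pattern the pointing-hand character from the error is removed while the Japanese text in the second line passes through unchanged, which is the behaviour a very wide range would not give.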
How to reproduce?
When running a large number of Sling resources, I found that roughly 1 in 20 runs produced the message.
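The failure itself can probably be reproduced deterministically without Sling. The sketch below assumes the `'charmap'` codec in the traceback is a Windows code page such as cp1252 (the issue does not state the console encoding), and the sample line is made up for illustration. It also shows an alternative to an emoji regex: re-encoding with `errors="replace"`, which swaps any unencodable character for `?`.

```python
# Made-up line containing the character from the traceback (U+1F449).
line = "\U0001F449 message from Sling"

# Encoding to cp1252 (assumed target encoding) fails on the emoji.
try:
    line.encode("cp1252")
    failed = False
    reason = ""
except UnicodeEncodeError as exc:
    failed = True
    reason = str(exc)

# Alternative to stripping: replace anything the target encoding cannot
# represent, then decode back to str.
safe = line.encode("cp1252", errors="replace").decode("cp1252")
```

The round trip through `errors="replace"` does not depend on a maintained list of emoji ranges, at the cost of leaving a `?` placeholder in the log line.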
What's the issue?
Discussed here in Dagster Slack - https://dagster.slack.com/archives/C06LQ9H1064/p1737370879891879
https://github.com/slingdata-io/sling-cli/blob/main/cmd/sling/sling_media.go
Dagster version
1.9.8
Deployment type
Local
Deployment details
Only observed in local, didn't try pushing to dagster+
Additional information
No response
Message from the maintainers
Impacted by this issue? Give it a 👍! We factor engagement into prioritization.