Steps to reproduce
Extract records from Marketo that contain UTF-8 characters
View results in an editor that supports UTF-8
Expected Results
UTF-8 characters are displayed properly
Actual Results
UTF-8 characters are corrupted
Proposed Solution
IMO this is actually a bug with the Marketo API call https://.mktorest.com/bulk/v1//export/<job_id>/file.json
The above call returns a response with the encoding set to ISO-8859-1, even if the response contains UTF-8 characters.
This causes Python's requests.Response.iter_content(decode_unicode=True) to use the incorrect decoder. (Since the response says the encoding is ISO-8859-1, requests decodes the bytes with that codec instead of UTF-8, so every multi-byte character comes out as two or more wrong characters.)
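The corruption described above can be reproduced without touching the Marketo API at all. The following is a minimal sketch using only the standard library; the sample string is made up for illustration and is not taken from any real export.

```python
# Hypothetical sample text with non-ASCII characters (not real Marketo data).
original = "Zoë Müller, São Paulo"

# The export file's bytes are UTF-8, regardless of what the response claims.
raw_bytes = original.encode("utf-8")

# Decoding with the codec the response advertises produces mojibake.
wrong = raw_bytes.decode("iso-8859-1")

# Decoding with the codec the bytes were actually written in recovers the text.
right = raw_bytes.decode("utf-8")

print(wrong)  # ZoÃ« MÃ¼ller, SÃ£o Paulo
print(right)  # Zoë Müller, São Paulo
```

Each accented character occupies two bytes in UTF-8, and ISO-8859-1 maps every byte to its own character, which is why each one turns into a two-character sequence.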
My proposed fix is to set the encoding to 'utf-8' on the response before calling iter_content. This change would go in tap-marketo.sync.py, right after the call "resp = client.stream_export(stream_type, export_id)".
tap-marketo.sync.py
def stream_rows(client, stream_type, export_id):
    with tempfile.NamedTemporaryFile(mode="w+", encoding="utf8", delete=False) as csv_file:
        singer.log_info("Download starting.")
        resp = client.stream_export(stream_type, export_id)
        # Force response encoding to 'utf-8' since Marketo doesn't set this properly
        resp.encoding = 'utf-8'
        for chunk in resp.iter_content(chunk_size=CHUNK_SIZE_BYTES, decode_unicode=True):
            if chunk:
                # Replace CR
                chunk = chunk.replace('\r', '')
                csv_file.write(chunk)
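To show why forcing the encoding is safe for a chunked download, here is a rough standard-library sketch. It assumes that requests decodes streamed chunks with an incremental decoder for whatever codec resp.encoding names; decode_chunks is a hypothetical stand-in for that behaviour, not part of requests or tap-marketo, and the payload and the 7-byte chunk size are invented so that a chunk boundary falls in the middle of a two-byte character.

```python
import codecs


def decode_chunks(chunks, encoding):
    """Hypothetical helper: decode byte chunks one at a time with one codec."""
    decoder = codecs.getincrementaldecoder(encoding)(errors="replace")
    for chunk in chunks:
        text = decoder.decode(chunk)
        if text:
            yield text
    # Flush any bytes the decoder was still holding at the end of the stream.
    tail = decoder.decode(b"", final=True)
    if tail:
        yield tail


# Made-up CSV payload; the two bytes of "ë" sit at offsets 13 and 14.
payload = "id,name\r\n1,Zoë\r\n".encode("utf-8")

# 7-byte chunks put the boundary between those two bytes.
chunks = [payload[i:i + 7] for i in range(0, len(payload), 7)]

as_latin1 = "".join(decode_chunks(chunks, "iso-8859-1"))
as_utf8 = "".join(decode_chunks(chunks, "utf-8"))

print(repr(as_latin1))  # the "ë" comes out as two unrelated characters
print(repr(as_utf8))    # the "ë" survives even though it was split
```

The UTF-8 incremental decoder holds back the first byte of the split character until the next chunk arrives, so the chunk size does not need to line up with character boundaries.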
abrittis changed the title from "UTF-8 characters from marketo characters are not handled properly" to "UTF-8 characters from marketo are improperly decoded" on Jul 29, 2021
abrittis pushed a commit to abrittis/tap-marketo that referenced this issue on Jul 29, 2021