EncodingRs is a character encoding library for converting between UTF-8 and legacy encodings (Shift_JIS, GBK, Windows-1252, etc.). It uses a Rust NIF powered by Mozilla's encoding_rs crate.
Use the one-shot functions (`decode/2`, `encode/2`) for complete binaries where all data is available at once.

    {:ok, string} = EncodingRs.decode(binary, "shift_jis")
    {:ok, binary} = EncodingRs.encode(string, "windows-1252")

Use the batch functions when processing many separate items for better throughput. Batch operations always use dirty schedulers.

    items = [{binary1, "shift_jis"}, {binary2, "gbk"}]
    results = EncodingRs.decode_batch(items)

Use the streaming decoder for chunked data (file streams, network data) where multibyte characters may be split across chunks.

    File.stream!("data.txt", [], 4096)
    |> EncodingRs.Decoder.stream("shift_jis")
    |> Enum.join()

Important: One-shot `decode/2` on chunked data will corrupt multibyte characters split across chunk boundaries, producing replacement characters (�).
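The boundary problem can be reproduced without any legacy encoding. The sketch below uses UTF-8 and only the Elixir standard library as a stand-in for Shift_JIS (an assumption, chosen so the snippet runs without EncodingRs). It splits a string in the middle of a two-byte character, the way a fixed-size chunk boundary can.

```elixir
# "café" is 5 bytes in UTF-8; "é" occupies the last two (0xC3 0xA9).
text = "café"

# Cut after 4 bytes, which lands in the middle of "é".
<<chunk1::binary-size(4), chunk2::binary>> = text

# Neither chunk is valid text on its own...
false = String.valid?(chunk1)
false = String.valid?(chunk2)

# ...but the concatenation is. A chunk-aware decoder therefore has to
# carry the trailing bytes of one chunk over to the next.
true = String.valid?(chunk1 <> chunk2)
```

Decoding `chunk1` and `chunk2` independently is what produces the replacement characters; a stateful decoder avoids this by buffering the dangling byte.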
All functions return tagged tuples. Always pattern match on results:

    case EncodingRs.decode(binary, encoding) do
      {:ok, string} -> process(string)
      {:error, :unknown_encoding} -> handle_error()
    end

Use bang variants (`decode!/2`, `encode!/2`) only when you're certain the encoding is valid.
- Use WHATWG encoding labels: `"shift_jis"`, `"gbk"`, `"windows-1252"`, `"utf-8"`
- Labels are case-insensitive
- Use `EncodingRs.encoding_exists?/1` to validate user-provided encodings
- Use `EncodingRs.canonical_name/1` to normalize aliases (e.g., `"latin1"` → `"windows-1252"`)
For files that may have a Byte Order Mark:

    case EncodingRs.detect_and_strip_bom(data) do
      {:ok, encoding, data_without_bom} ->
        EncodingRs.decode(data_without_bom, encoding)
      {:error, :no_bom} ->
        EncodingRs.decode(data, default_encoding)
    end

- Operations on binaries larger than 64KB automatically use dirty schedulers (configurable via `config :encoding_rs, dirty_threshold: bytes`)
- Batch operations always use dirty schedulers regardless of size
- For streaming large files, use `EncodingRs.Decoder.stream/2` with reasonable chunk sizes (64KB recommended)
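As an illustration of the threshold setting, a config entry might look like the following. The 128KB value is an arbitrary example rather than a recommendation, and placing it in `config/config.exs` is an assumption about where your application keeps its configuration.

```elixir
# config/config.exs
import Config

# Move the dirty-scheduler cutoff from the 64KB default to 128KB.
# The value is given in bytes.
config :encoding_rs, dirty_threshold: 128 * 1024
```

A higher threshold keeps more mid-sized conversions on normal schedulers; whether that helps depends on your workload, so measure before changing it.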
Common mistakes to avoid:

- Using `decode/2` on streamed chunks: use `EncodingRs.Decoder` for chunked data
- Not handling `:error` tuples: unknown encodings return `{:error, :unknown_encoding}`
- Sharing a decoder across processes: each `EncodingRs.Decoder` maintains mutable state, so create one per process
- Forgetting `is_last: true`: always pass `true` for the final chunk to flush buffered bytes
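To show why the final-chunk flag matters, here is a minimal sketch of the buffering idea in plain Elixir. `ChunkSketch` and its `decode_chunk/3` are hypothetical names, not part of EncodingRs, and the sketch handles UTF-8 only (via Erlang's `:unicode` module) as a stand-in for a legacy encoding. It keeps incomplete trailing bytes as pending state and turns them into U+FFFD only when told that no more data is coming.

```elixir
defmodule ChunkSketch do
  # Hypothetical helper, NOT part of EncodingRs. UTF-8 only; it exists to
  # illustrate the buffering a stateful decoder performs.
  @replacement "\uFFFD"

  # Returns {decoded_text, pending_bytes}.
  def decode_chunk(pending, chunk, is_last) do
    case :unicode.characters_to_binary(pending <> chunk, :utf8, :utf8) do
      decoded when is_binary(decoded) ->
        {decoded, <<>>}

      {:incomplete, decoded, _rest} when is_last ->
        # No more input will arrive, so flush the dangling bytes as U+FFFD.
        {decoded <> @replacement, <<>>}

      {:incomplete, decoded, rest} ->
        # Keep the partial character until the next chunk arrives.
        {decoded, rest}

      {:error, decoded, <<_bad, rest::binary>>} ->
        # Invalid byte: emit U+FFFD, skip it, and continue with the remainder.
        {more, still_pending} = decode_chunk(<<>>, rest, is_last)
        {decoded <> @replacement <> more, still_pending}
    end
  end
end

# A character split across two chunks decodes cleanly once both arrive.
{"caf", pending} = ChunkSketch.decode_chunk(<<>>, <<"caf", 0xC3>>, false)
{"é", <<>>} = ChunkSketch.decode_chunk(pending, <<0xA9>>, true)

# Truncated input: without the final-chunk flag the stray byte stays
# buffered and never reaches the output.
{"caf", <<0xC3>>} = ChunkSketch.decode_chunk(<<>>, <<"caf", 0xC3>>, false)
{"caf\uFFFD", <<>>} = ChunkSketch.decode_chunk(<<>>, <<"caf", 0xC3>>, true)
```

The pending bytes live in the caller's hands here; a decoder that stores them internally is stateful in the same way, which is also why one decoder should not be shared between processes.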