Commit ea238d6

Strip special characters out of header names. Excel likes to leave odd Unicode items, including the Unicode BOM, lying around. This causes havoc. By stopping it right from the start, we should prevent saving invisible characters to raw_metadata and other places they get stuck.
1 parent f703621 commit ea238d6

1 file changed

+2
-1
lines changed

app/models/bulkrax/csv_entry.rb

Lines changed: 2 additions & 1 deletion
@@ -16,11 +16,12 @@ def self.fields_from_data(data)
     class_attribute(:csv_read_data_options, default: {})

     # there's a risk that this reads the whole file into memory and could cause a memory leak
+    # we strip any special characters out of the headers. looking at you Excel
     def self.read_data(path)
       raise StandardError, 'CSV path empty' if path.blank?
       options = {
         headers: true,
-        header_converters: ->(h) { h.to_s.strip.to_sym },
+        header_converters: ->(h) { h.to_s.gsub(/[^\w\d -]+/, '').strip.to_sym },
         encoding: 'utf-8'
       }.merge(csv_read_data_options)
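
To see the effect of the new converter outside the application, here is a minimal sketch. It is not part of the commit: the sample header and CSV text are hypothetical, and the two lambdas are copied from the removed and added lines of the diff. It compares how each converter treats a header that starts with a UTF-8 BOM.

```ruby
require 'csv'

# Converters copied from the diff: the removed line and the added line.
old_converter = ->(h) { h.to_s.strip.to_sym }
new_converter = ->(h) { h.to_s.gsub(/[^\w\d -]+/, '').strip.to_sym }

# Hypothetical header as Excel might export it: a BOM (U+FEFF) before the name.
raw_header = "\uFEFFtitle "

# String#strip only removes whitespace and null characters, so the BOM survives.
old_symbol = old_converter.call(raw_header)

# The gsub drops everything except word characters, spaces, and hyphens first.
new_symbol = new_converter.call(raw_header)

# Hypothetical CSV text parsed with the new converter.
csv_text = "\uFEFFtitle,creator\nA Book,Someone\n"
table = CSV.parse(csv_text, headers: true, header_converters: new_converter)

puts old_symbol == :title   # BOM still attached, so the lookup key does not match
puts new_symbol == :title
p table.headers
```

Note that in Ruby, `\w` matches only ASCII word characters by default, so this pattern also removes accented or other non-ASCII letters from header names, not just invisible characters.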
