Strip special characters out of header names. Excel likes to leave odd Unicode items, including the Unicode BOM, lying around. This causes havoc. By stripping them right from the start, we should prevent invisible characters from being saved to raw_metadata and the other places they get stuck.
orangewolf committed Feb 8, 2024
1 parent f50128c commit 1adeff0
Showing 1 changed file with 2 additions and 1 deletion.
3 changes: 2 additions & 1 deletion app/models/bulkrax/csv_entry.rb
@@ -16,11 +16,12 @@ def self.fields_from_data(data)
     class_attribute(:csv_read_data_options, default: {})
 
     # there's a risk that this reads the whole file into memory and could cause a memory leak
+    # we strip any special characters out of the headers. looking at you Excel
     def self.read_data(path)
       raise StandardError, 'CSV path empty' if path.blank?
       options = {
         headers: true,
-        header_converters: ->(h) { h.to_s.strip.to_sym },
+        header_converters: ->(h) { h.to_s.gsub(/[^\w\d -]+/, '').strip.to_sym },
         encoding: 'utf-8'
       }.merge(csv_read_data_options)
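
To see what the changed converter does, here is a minimal standalone sketch that runs the old and new lambdas from the diff against the same input. The sample CSV string, the constant names, and the `headers_with` helper are hypothetical and only serve the illustration. It uses the standard `csv` library directly rather than Bulkrax's `read_data`.

```ruby
require 'csv'

# The two converters, copied from the removed and added lines of the diff.
OLD_CONVERTER = ->(h) { h.to_s.strip.to_sym }
NEW_CONVERTER = ->(h) { h.to_s.gsub(/[^\w\d -]+/, '').strip.to_sym }

# Hypothetical sample data: a UTF-8 BOM in front of the first header (as an
# Excel export may produce), a header with trailing whitespace, and a header
# containing a non-ASCII letter (U+00ED).
SAMPLE_CSV = "\uFEFFtitle,source identifier ,t\u00EDtulo\nMy Work,abc-123,x\n"

# Hypothetical helper: parse the sample and return the converted headers.
def headers_with(converter)
  CSV.parse(SAMPLE_CSV, headers: true, header_converters: converter).headers
end

old_headers = headers_with(OLD_CONVERTER)
new_headers = headers_with(NEW_CONVERTER)

# String#strip does not remove U+FEFF, so the old first header still
# starts with the invisible BOM character.
puts old_headers.first.to_s.start_with?("\uFEFF")

# The new converter keeps only ASCII word characters, spaces, and hyphens.
p new_headers
```

One thing worth keeping in mind: in Ruby, `\w` matches only `[a-zA-Z0-9_]`, so the new pattern also drops non-ASCII letters and punctuation from header names, not just invisible characters. The `\d` in the character class is redundant, since `\w` already covers digits.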
