feat: Languages.search now only works with Regexp
bbenno committed Nov 21, 2024
1 parent 91adb7d commit d6cba77
Showing 5 changed files with 29 additions and 36 deletions.
3 changes: 3 additions & 0 deletions CHANGELOG.md
@@ -10,6 +10,9 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 ### Changed

 - Update ISO 639 data incorporating approved changes of [2023 series](https://iso639-3.sil.org/sites/iso639-3/files/reports/2023%20Summary%20of%20Outcomes.pdf), [2024 Quarter 2](https://iso639-3.sil.org/sites/iso639-3/files/reports/2024%20Quarter%202%20639%20MA%20newsletter.pdf), and [2024 Quarter 3](https://iso639-3.sil.org/sites/iso639-3/files/reports/2024%20Quarter%203%20639%20MA%20newsletter.pdf)
+- Interface of `Languages.search` changed. See <https://github.com/bbenno/languages/pull/123> for more details.
+  - Argument `case_sensitive` has been removed.
+  - Argument `pattern` can no longer be String. Its type has to be `Regexp`
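
Callers that still hold a String pattern and a `case_sensitive` flag could bridge to the new interface roughly as follows. This is a minimal sketch, not part of the gem: `legacy_pattern_to_regexp` is a hypothetical helper, and it assumes the old String patterns were meant as regular-expression source (as in the earlier README example `"^Germ"`).

```ruby
# Hypothetical migration helper (an assumption for illustration, not gem API).
# Converts the arguments of the old search interface into the single Regexp
# that the new interface expects.
def legacy_pattern_to_regexp(pattern, case_sensitive: true)
  # A Regexp already carries its own flags, so it is passed through unchanged.
  return pattern if pattern.is_a?(Regexp)

  options = case_sensitive ? 0 : Regexp::IGNORECASE
  # The String is treated as regexp source, NOT as literal text.
  Regexp.new(pattern, options)
end

sensitive   = legacy_pattern_to_regexp('^Germ')
insensitive = legacy_pattern_to_regexp('germ', case_sensitive: false)

# With the gem loaded, the result would then be passed on, e.g.:
#   Languages.search(sensitive)
```

Because the String is compiled as regexp source, this helper is only appropriate for trusted patterns; untrusted input needs escaping or validation first.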

### Deprecated

11 changes: 5 additions & 6 deletions README.adoc
@@ -58,19 +58,18 @@ invalid = Languages[:invalid] # invalid or unknown names or ISO codes returns n
 [source]
 Languages.all

-.Get languages by name
+.Get languages by name (regexp search)
 [source]
 ----
-Languages.search "^Germ"
 Languages.search /Germ/i
+Languages.search /\AJapan/
 ----

 [CAUTION]
 --
-Passing a string to `Languages.search` results in case-sensitive search.
-If case-insensitive search is intended, use ignorecase regexp like `/search_pattern/i` or pass optional `case_sensitive` parameter.
-[source]
-Languages.search('search_pattern', case_sensitive: false)
+Searching languages by name is only allowed via `Regexp` that has been prepared and validated (if it comes from an untrusted user) in terms of case sensitivity and security / timeout.
+The support of passing search pattern of type String has been removed in v0.9.0.
+See https://github.com/bbenno/languages/pull/123[#123] for more details.
 --
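
One way to prepare such a `Regexp` from untrusted input is to escape it so that it only ever matches literally. The following is a minimal sketch under that assumption: `literal_name_pattern` and the sample names are hypothetical and not part of the gem. On Ruby 3.2 or later, a match timeout (for example via `Regexp.timeout=`) can be set in addition; the sketch does not rely on it.

```ruby
# Hypothetical helper (an assumption for illustration, not gem API).
# Builds a Regexp that matches the user's text literally, so regexp
# metacharacters typed by the user cannot change the pattern's behaviour.
def literal_name_pattern(user_input, ignore_case: true)
  options = ignore_case ? Regexp::IGNORECASE : 0
  Regexp.new(Regexp.escape(user_input.to_s), options)
end

pattern = literal_name_pattern('Germ.*')

# Stand-in data; the second entry is a made-up name containing the literal text.
names   = ['German', 'Germ.* (made-up name)', 'Japanese']
matches = names.grep(pattern)

# With the gem loaded, the pattern would be passed on instead:
#   Languages.search(pattern)
```

Escaping trades flexibility for safety: users can no longer write their own regular expressions, but the caller keeps full control over case sensitivity and pattern complexity.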

.Since ISO 639-3 categorizes the languages by scope and type, one can filter by them
5 changes: 3 additions & 2 deletions lib/languages.rb
@@ -36,8 +36,9 @@ def [](key)
   end
 end

-def search(pattern, case_sensitive: true)
-  pattern = Regexp.new(pattern, Regexp::IGNORECASE).freeze unless case_sensitive
+def search(pattern)
+  raise(ArgumentError, 'Pattern must be a Regexp') unless pattern.is_a?(Regexp)
+
   all.select { |l| l.name.match? pattern }
 end
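
The effect of the new guard can be tried without the gem using a small stand-in. This is only a sketch: `SketchLanguage`, `LanguagesSketch` and its three entries are assumptions made for illustration, and only the body of `search` mirrors the change in this commit.

```ruby
# Stand-in types for illustration only; the real gem ships its own data.
SketchLanguage = Struct.new(:name)

module LanguagesSketch
  DATA = [
    SketchLanguage.new('German'),
    SketchLanguage.new('Japanese'),
    SketchLanguage.new('Tibetan')
  ].freeze

  # Same guard and selection as the new Languages.search.
  def self.search(pattern)
    raise(ArgumentError, 'Pattern must be a Regexp') unless pattern.is_a?(Regexp)

    DATA.select { |l| l.name.match?(pattern) }
  end
end

found = LanguagesSketch.search(/\AGerm/).map(&:name)

# A String pattern is now rejected instead of being matched.
error_message =
  begin
    LanguagesSketch.search('Germ')
    nil
  rescue ArgumentError => e
    e.message
  end
```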

2 changes: 1 addition & 1 deletion sig/languages.rbs
@@ -8,7 +8,7 @@ module Languages
   @@data: Array[Language]

   def self?.[]: (String | Symbol) -> Language?
-  def self?.search: (String|Regexp pattern, ?bool case_sensitive) -> Array[Language]
+  def self?.search: (Regexp pattern) -> Array[Language]

   def self?.all: () -> Array[Language]
   def self?.names: () -> Array[String]
44 changes: 17 additions & 27 deletions test/test_languages.rb
@@ -3,6 +3,10 @@
 require 'test_helper'

 class TestLanguages < Minitest::Test
+  def setup
+    @search_pattern = /Germ/
+  end
+
   def test_that_it_has_a_version_number
     refute_nil ::Languages::VERSION
   end
@@ -86,15 +90,13 @@ def test_single_language_lookup_key_is_case_insensitive
   end

   def test_search_provides_enumerable
-    assert_kind_of Enumerable, ::Languages.search('Japanese')
+    assert_kind_of Enumerable, ::Languages.search(@search_pattern)
   end

-  def test_search_with_string_pattern
-    pattern = 'Japanese'
-    search_result = ::Languages.search(pattern)
+  def test_search_with_string_pattern_fails
+    pattern = @search_pattern.source

-    assert(search_result.map(&:name).all? { |n| n.match?(pattern) })
-    refute((Languages.all - search_result).map(&:name).any? { |n| n.match?(pattern) })
+    assert_raises(ArgumentError) { ::Languages.search(pattern) }
   end

   def test_search_with_regex_pattern
@@ -105,30 +107,18 @@ def test_search_with_regex_pattern
     refute((Languages.all - search_result).map(&:name).any? { |n| n.match?(pattern) })
   end

-  def test_search_is_case_sensitive
-    pattern1 = 'Germ'
-    pattern2 = pattern1.downcase
-    search_result1 = ::Languages.search(pattern1)
-    search_result2 = ::Languages.search(pattern2)
+  def test_search_can_be_case_insensitive
+    case_sensitive_pattern = /tib/
+    case_insensitive_pattern = Regexp.new(case_sensitive_pattern.source, Regexp::IGNORECASE)

-    refute_equal(search_result1.count, search_result2.count)
-  end
-
-  def test_search_can_be_case_sensitive_if_specified
-    pattern1 = 'Germ'
-    pattern2 = /germ/i
-    search_result1 = ::Languages.search(pattern1)
-    search_result2 = ::Languages.search(pattern2)
-
-    assert_equal(search_result1.count, search_result2.count)
-  end
+    case_sensitive_result = ::Languages.search(case_sensitive_pattern)
+    assert_equal(case_sensitive_result.count, 1) # Celtiberian

-  def test_search_is_case_insensitive_if_specified
-    pattern = 'Germ'
-    search_result1 = ::Languages.search(pattern)
-    search_result2 = ::Languages.search(pattern, case_sensitive: false)
+    case_insensitive_search_result = ::Languages.search(case_insensitive_pattern)
+    refute_equal(case_insensitive_search_result.count, 1) # also Tibet

-    assert_equal(search_result1.count, search_result2.count)
+    assert(case_insensitive_search_result.map(&:name).all? { |n| n.match?(case_insensitive_pattern) })
+    refute((Languages.all - case_insensitive_search_result).map(&:name).any? { |n| n.match?(case_insensitive_pattern) })
   end

   def test_reference_to_macrolanguage
