feat(starr): expand dual audio regex #1979

Open
adapowers wants to merge 8 commits into master

Conversation


@adapowers adapowers commented Jun 17, 2024

Pull Request

Purpose

Two common conventions for indicating dual audio releases are not currently captured by the regex:

  • The release group VARYG appends it directly to their name: DUAL-VARYG
  • Other release groups will use DUAL {Resolution} or {Resolution} DUAL

I attempted to add a pattern which was flexible, while having a low risk of false positives.

Approach

The following patterns have been added into the regex:

(?-i)DUAL-VARYG(?i)

  • (?-i) disables case-insensitivity for the rest of the pattern
  • DUAL-VARYG matches literally
  • (?i) turns case-insensitivity back on for the rest of the pattern
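
A rough sketch of the toggle, not the custom format itself. Sonarr and Radarr evaluate these patterns with the .NET regex engine, where (?-i) and (?i) may appear mid-pattern. Python's re module does not accept that, so the sketch uses the scoped form (?-i:...) as a stand-in. The helper name `has_varyg_dual` and the release names are made up for the example.

```python
import re

# Assumption: the surrounding custom-format regex is case-insensitive,
# modeled here with re.IGNORECASE. The scoped group (?-i:...) switches
# case-insensitivity off only for DUAL-VARYG, standing in for the
# (?-i)...(?i) pair used in the PR.
VARYG_DUAL = re.compile(r"(?-i:DUAL-VARYG)", re.IGNORECASE)


def has_varyg_dual(release_name: str) -> bool:
    """Return True if the name contains the exact, uppercase DUAL-VARYG tag."""
    return VARYG_DUAL.search(release_name) is not None


# Hypothetical release names, for illustration only.
print(has_varyg_dual("Some.Show.S01E01.1080p.WEB-DL.DUAL-VARYG.mkv"))  # True
print(has_varyg_dual("some.show.s01e01.1080p.web-dl.dual-varyg.mkv"))  # False
```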

dual[ ._-]?(\d{3,4}p|ultrahd|4k)

  • dual[ ._-]? matches dual (case-insensitive), optionally followed by one common separator character (space, period, underscore, or hyphen)
  • (\d{3,4}p|ultrahd|4k) matches three or four digits followed by p (1080p, 720p, etc.), or ultrahd, or 4k
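
To make the behavior concrete, here is a minimal Python sketch of this pattern. It assumes the custom format matches case-insensitively (modeled with re.IGNORECASE); the helper name `resolution_after_dual` and the release names are hypothetical.

```python
import re

# Pattern copied from the PR; re.IGNORECASE models the case-insensitive
# matching the custom format is assumed to use.
DUAL_THEN_RES = re.compile(r"dual[ ._-]?(\d{3,4}p|ultrahd|4k)", re.IGNORECASE)


def resolution_after_dual(release_name):
    """Return the resolution token that directly follows 'dual', or None."""
    match = DUAL_THEN_RES.search(release_name)
    return match.group(1) if match else None


# Hypothetical release names, for illustration only.
print(resolution_after_dual("Some.Show.S01E01.DUAL.1080p.WEB-DL"))  # 1080p
print(resolution_after_dual("Some Movie 2020 Dual 4K HDR"))         # 4K
print(resolution_after_dual("Some.Show.S01E01.DUAL1080p"))          # 1080p (separator is optional)
print(resolution_after_dual("Some.Show.S01E01.DUAL.AAC.1080p"))     # None (resolution not adjacent)
```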

(\d{3,4}p|ultrahd|4k)[ ._-]?dual

  • Same match as above, but in reverse order: 1080p.DUAL, 4K UltraHD DUAL, etc.
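
The reversed order can be sketched the same way in Python. As before, re.IGNORECASE is an assumption about how the custom format is evaluated, and the helper name `resolution_before_dual` and the release names are hypothetical.

```python
import re

# Pattern copied from the PR, with the resolution token first.
RES_THEN_DUAL = re.compile(r"(\d{3,4}p|ultrahd|4k)[ ._-]?dual", re.IGNORECASE)


def resolution_before_dual(release_name):
    """Return the resolution token that directly precedes 'dual', or None."""
    match = RES_THEN_DUAL.search(release_name)
    return match.group(1) if match else None


# Hypothetical release names, for illustration only.
print(resolution_before_dual("Some.Show.S01E01.1080p.DUAL.WEB-DL"))  # 1080p
print(resolution_before_dual("Some Movie 2020 4K UltraHD DUAL"))     # UltraHD
print(resolution_before_dual("Some.Show.S01E01.DUAL.1080p.WEB-DL"))  # None (this order needs the other pattern)
```

In the second name, 4K is followed by UltraHD rather than DUAL, so the match comes from the UltraHD DUAL pair.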

Notes:

  • I did not choose to mess with case sensitivity lightly; as dual is a dictionary word, we want to prevent matching on any uppercase, hyphen-delimited filename that contains it. This is the same reason I chose to encode the release group directly. Without both, there could be too many false positives.
  • In general, this approach attempts to leverage the fact that DUAL (as a single word indicating dual audio) is often placed directly before or after the video resolution.

Regex

https://regex101.com/r/p1Rt67/6

Open Questions and Pre-Merge TODOs

Requirements

@github-actions github-actions bot added the labels Area: Sonarr (Sonarr Related), Area: Radarr (Radarr Related), Area: Backend (Backend Changes, not related to a specific section), Area: Starr Custom Formats (Issue is related to custom formats) Jun 17, 2024
@adapowers adapowers changed the title from "fix(starr anime) add regex for DUAL-VARYG pattern" to "fix(starr anime): add regex for DUAL-VARYG pattern" Jun 17, 2024
@adapowers adapowers changed the title from "fix(starr anime): add regex for DUAL-VARYG pattern" to "fix(starr anime): add dual audio regex for DUAL-VARYG pattern" Jun 17, 2024
@adapowers adapowers changed the title from "fix(starr anime): add dual audio regex for DUAL-VARYG pattern" to "fix(starr): add dual audio regex for DUAL-VARYG pattern" Jun 17, 2024
@adapowers adapowers changed the title from "fix(starr): add dual audio regex for DUAL-VARYG pattern" to "feat(starr): expand dual audio regex" Jun 17, 2024
@adapowers
Author

adapowers commented Jun 17, 2024

Another thought: it would be nice if this could also be captured in the {custom formats} for renaming, to better track which files have dual audio and which don't. But that would require it having a shorter name like DA or Dual, which I imagine we wouldn't want. Any thoughts, besides simply changing the name on our own instances?

@adapowers
Author

> Another thought: it would be nice if this could also be captured in the {custom formats} for renaming, to better track which files have dual audio and which don't. But that would require it having a shorter name like DA or Dual, which I imagine we wouldn't want. Any thoughts, besides simply changing the name on our own instances?

Ah, never mind. I realized that's already accounted for in the {MediaInfo AudioLanguages} piece of the recommended anime naming scheme.

@bakerboy448 bakerboy448 requested a review from a team June 17, 2024 10:39
@FonduemangVI
Contributor

@rg9400 could I get you to take a look at this, given the regex changes?

@nuxencs nuxencs force-pushed the fix/add-varyg-anime-dual-audio branch from 5783172 to 97a0780 June 28, 2024 14:22
Contributor

@zakkarry zakkarry left a comment

could probably be improved further but this seems to do the same job.

not sure why the varyg is case sensitive, that seems like it's unnecessary - when would this be lowercase and not be a match?

docs/json/radarr/cf/anime-dual-audio.json (outdated, resolved)
docs/json/sonarr/cf/anime-dual-audio.json (outdated, resolved)
@adapowers
Author

> could probably be improved further but this seems to do the same job.
>
> not sure why the varyg is case sensitive, that seems like it's unnecessary - when would this be lowercase and not be a match?

Fair enough—I think I was just trying to be super safe, but you make a good point. And I appreciate the additional refactor! Changes accepted.

@zakkarry zakkarry dismissed their stale review June 29, 2024 05:40

still needs review from anime team - my changes have been committed

@nuxencs nuxencs force-pushed the fix/add-varyg-anime-dual-audio branch from 370a931 to 41b551e July 9, 2024 14:18
@rg9400
Contributor

rg9400 commented Jul 17, 2024

Sorry, was on vacation for the last few weeks. Can you share some test cases against this regex? The VARYG format has Dual-Audio in their naming near the end which matches even if DUAL-VARYG does not. The dual regex you shared matches scene naming, but anime is rarely consistent, so it's not always the case that the resolution follows DUAL. For example, it's not matching NanDesuKa's format: Helck.S01E16.1080p.HIDI.WEB-DL.DUAL.AAC2.0.H.264-NanDesuKa.mkv. Another random example: [OhDeer] Shikanoko Nokonoko Koshitantan - 01 (WEB 1080p Multi Audio) | (Dual) (My Deer Friend Nokotan). I get that if we are too lenient, it can match episode and anime titles that use the word dual, but I am not sure how many additional releases it is capturing.

The regex itself looks good, but sometimes it's hard to identify potential issues, so just trying to do a bit of due diligence to validate the changes are solving a problem.
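
One way to produce such test cases is a small harness like the Python sketch below. It is only an approximation: the regex as it stood after the later refactor is not shown in this thread, so the sketch joins the three patterns from the PR description, models case-insensitive matching with re.IGNORECASE, and uses (?-i:...) in place of the .NET-style (?-i)...(?i) toggle, which Python's re module does not accept mid-pattern. The name `matches_proposed` is hypothetical.

```python
import re

# The three patterns as originally proposed in the PR description, joined
# into one alternation. Assumption: the custom format is case-insensitive.
PROPOSED = re.compile(
    r"(?-i:DUAL-VARYG)"
    r"|dual[ ._-]?(\d{3,4}p|ultrahd|4k)"
    r"|(\d{3,4}p|ultrahd|4k)[ ._-]?dual",
    re.IGNORECASE,
)

# The two release names quoted in the comment above.
TEST_CASES = [
    "Helck.S01E16.1080p.HIDI.WEB-DL.DUAL.AAC2.0.H.264-NanDesuKa.mkv",
    "[OhDeer] Shikanoko Nokonoko Koshitantan - 01 (WEB 1080p Multi Audio) | (Dual) (My Deer Friend Nokotan)",
]


def matches_proposed(release_name):
    """Return True if any of the three proposed patterns matches the name."""
    return PROPOSED.search(release_name) is not None


for name in TEST_CASES:
    print(matches_proposed(name), name)
```

Both names print False under these assumptions, because DUAL is not adjacent to the resolution in either one.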

@github-actions github-actions bot added the Status: Conflicted (Pull Request is Conflicted) label Sep 6, 2024
@FonduemangVI
Contributor

@rg9400 given your recent changes is this still valid/needed?

@github-actions github-actions bot added Status: Conflicted (Pull Request is Conflicted) and removed Status: Conflicted (Pull Request is Conflicted) labels Sep 6, 2024
Labels
Area: Backend (Backend Changes, not related to a specific section), Area: Radarr (Radarr Related), Area: Sonarr (Sonarr Related), Area: Starr Custom Formats (Issue is related to custom formats), Status: Conflicted (Pull Request is Conflicted)
4 participants