Hacked together a replacement of the elliptic filters with the tabuli #73
Draft
zond wants to merge 35 commits into main from tabuli
zond commented Jun 14, 2024
cpp/zimt/masking.cc
Outdated
@@ -64,7 +64,7 @@ void HwyComputeEnergy(const hwy::AlignedNDArray<float, 2>& sample_channels,
       samples =
           Load(d, sample_channels[{sample_index + downscale_index}].data() +
                       channel_index);
-      accumulator = MulAdd(samples, samples, accumulator);
+      accumulator = Add(samples, accumulator);
Why did you remove the square here?
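For context on the question above: energy is conventionally accumulated as a sum of squared samples, which is what `MulAdd(samples, samples, accumulator)` computes lane by lane. A plain sum lets positive and negative samples cancel, unless the inputs are already non-negative (for example, squared upstream; whether that holds in this PR is not stated here). A minimal scalar C++ sketch of the difference, with hypothetical helper names `SumOfSquares` and `PlainSum` that are not part of zimtohrli:

```cpp
#include <vector>

// Hypothetical scalar stand-in for accumulating with
// MulAdd(samples, samples, accumulator): adds s * s for every sample.
inline float SumOfSquares(const std::vector<float>& samples) {
  float accumulator = 0.0f;
  for (float s : samples) {
    accumulator += s * s;
  }
  return accumulator;
}

// Hypothetical scalar stand-in for accumulating with
// Add(samples, accumulator): adds the raw sample values.
inline float PlainSum(const std::vector<float>& samples) {
  float accumulator = 0.0f;
  for (float s : samples) {
    accumulator += s;
  }
  return accumulator;
}
```

For a signed input such as `{1, -1, 2, -2}` the plain sum is 0 while the sum of squares is 10, so the two only agree in meaning when the input already holds energies.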
zond commented Jun 14, 2024
@@ -203,14 +203,14 @@ func (c CorrelationTable) String() string {
   for _, scores := range c {
     row := Row{string(scores[0].ScoreTypeA)}
     for _, score := range scores {
-      row = append(row, fmt.Sprintf("%.2f", score.Score))
+      row = append(row, fmt.Sprintf("%.15f", score.Score))
If you only use `-leaderboard`, this won't have any effect.
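To illustrate why the extra digits matter for the tables in this thread: two Zimtohrli MSE values that differ only from the eighth decimal onward print identically with `%.2f`. The sketch below uses C++ `std::snprintf`, which shares the `%.Nf` precision syntax with Go's `fmt.Sprintf`; `FormatScore` is a hypothetical helper, not code from this PR.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: formats a score with a fixed number of decimals,
// mirroring the "%.2f" and "%.15f" verbs in the diff above.
inline std::string FormatScore(double score, int decimals) {
  char buffer[64];
  // "%.*f" takes the precision from the int argument before the value.
  std::snprintf(buffer, sizeof(buffer), "%.*f", decimals, score);
  return std::string(buffer);
}
```

With two decimals, `FormatScore(0.101162755033022, 2)` and `FormatScore(0.101162746942526, 2)` both give `0.10`; with fifteen decimals the two strings differ, which is what makes small per-commit changes visible.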
|Score type |MSE               |Min score         |Max score         |Mean score        |
|-----------|------------------|------------------|------------------|------------------|
|Zimtohrli  |0.101162755033022 |0.563709113431054 |0.773872237318597 |0.691086927997167 |
|ViSQOL     |0.115330916105424 |0.520833375452983 |0.801480831107469 |0.675101633981268 |
|2f         |0.129541391104905 |0.484687555319526 |0.797475783883375 |0.661870345773127 |
|PESQ       |0.147425552045669 |0.342342966279351 |0.841271127756762 |0.647128996775172 |
|CDPAM      |0.153471222942756 |0.441558428344727 |0.728779141125759 |0.620699318941738 |
|PARLAQ     |0.185057687192323 |0.445261140223642 |0.784370761057963 |0.587162756572532 |
|AQUA       |0.223207996944378 |0.331645933512413 |0.739286336419790 |0.547804951221731 |
|PEAQB      |0.225217321572038 |0.278744167467764 |0.851011116004117 |0.553935720513487 |
|DPAM       |0.315810440183130 |0.186717781679534 |0.690564701717118 |0.460415212267967 |
|WARP-Q     |0.339686211572685 |0.067600137543649 |0.777119464646524 |0.475793617709890 |
|GVPMOS     |0.412937133868407 |0.006851162794410 |0.783946603687895 |0.412912222208318 |

real 3m37.704s
user 184m0.443s
sys 42m52.675s
less memory traffic, 3x speedup for tabuli
|Score type |MSE               |Min score         |Max score         |Mean score        |
|-----------|------------------|------------------|------------------|------------------|
|Zimtohrli  |0.101162746942526 |0.563709113431054 |0.773872237318597 |0.691086968125479 |
|ViSQOL     |0.115330916105424 |0.520833375452983 |0.801480831107469 |0.675101633981268 |
|2f         |0.129541391104905 |0.484687555319526 |0.797475783883375 |0.661870345773127 |
|PESQ       |0.147425552045669 |0.342342966279351 |0.841271127756762 |0.647128996775172 |
|CDPAM      |0.153471222942756 |0.441558428344727 |0.728779141125759 |0.620699318941738 |
|PARLAQ     |0.185057687192323 |0.445261140223642 |0.784370761057963 |0.587162756572532 |
|AQUA       |0.223207996944378 |0.331645933512413 |0.739286336419790 |0.547804951221731 |
|PEAQB      |0.225217321572038 |0.278744167467764 |0.851011116004117 |0.553935720513487 |
|DPAM       |0.315810440183130 |0.186717781679534 |0.690564701717118 |0.460415212267967 |
|WARP-Q     |0.339686211572685 |0.067600137543649 |0.777119464646524 |0.475793617709890 |
|GVPMOS     |0.412937133868407 |0.006851162794410 |0.783946603687895 |0.412912222208318 |

real 3m39.800s
user 186m13.002s
sys 42m43.169s
less memory access by streaming the log10s for dbs
|Score type |MSE               |Min score         |Max score         |Mean score        |
|-----------|------------------|------------------|------------------|------------------|
|Zimtohrli  |0.098637219950560 |0.567187736693103 |0.777010861372363 |0.694493861132456 |
|ViSQOL     |0.115330916105424 |0.520833375452983 |0.801480831107469 |0.675101633981268 |
|2f         |0.129541391104905 |0.484687555319526 |0.797475783883375 |0.661870345773127 |
|PESQ       |0.147425552045669 |0.342342966279351 |0.841271127756762 |0.647128996775172 |
|CDPAM      |0.153471222942756 |0.441558428344727 |0.728779141125759 |0.620699318941738 |
|PARLAQ     |0.185057687192323 |0.445261140223642 |0.784370761057963 |0.587162756572532 |
|AQUA       |0.223207996944378 |0.331645933512413 |0.739286336419790 |0.547804951221731 |
|PEAQB      |0.225217321572038 |0.278744167467764 |0.851011116004117 |0.553935720513487 |
|DPAM       |0.315810440183130 |0.186717781679534 |0.690564701717118 |0.460415212267967 |
|WARP-Q     |0.339686211572685 |0.067600137543649 |0.777119464646524 |0.475793617709890 |
|GVPMOS     |0.412937133868407 |0.006851162794410 |0.783946603687895 |0.412912222208318 |

real 4m19.323s
user 222m49.575s
sys 43m16.317s
more rotators (128->150) improves MSE by a few %
|Score type |MSE               |Min score         |Max score         |Mean score        |
|-----------|------------------|------------------|------------------|------------------|
|Zimtohrli  |0.097175453861064 |0.581050442393818 |0.776654974298645 |0.696445877155239 |
|ViSQOL     |0.115330916105424 |0.520833375452983 |0.801480831107469 |0.675101633981268 |
|2f         |0.129541391104905 |0.484687555319526 |0.797475783883375 |0.661870345773127 |
|PESQ       |0.147425552045669 |0.342342966279351 |0.841271127756762 |0.647128996775172 |
|CDPAM      |0.153471222942756 |0.441558428344727 |0.728779141125759 |0.620699318941738 |
|PARLAQ     |0.185057687192323 |0.445261140223642 |0.784370761057963 |0.587162756572532 |
|AQUA       |0.223207996944378 |0.331645933512413 |0.739286336419790 |0.547804951221731 |
|PEAQB      |0.225217321572038 |0.278744167467764 |0.851011116004117 |0.553935720513487 |
|DPAM       |0.315810440183130 |0.186717781679534 |0.690564701717118 |0.460415212267967 |
|WARP-Q     |0.339686211572685 |0.067600137543649 |0.777119464646524 |0.475793617709890 |
|GVPMOS     |0.412937133868407 |0.006851162794410 |0.783946603687895 |0.412912222208318 |
more optimization of heuristics, ~1 % improvement
not quite sure why this improved so much
changes are too complicated to trace it exactly

much of the improvement was related to how I refactored
the log10 calculations to be simpler and just log based

|Score type |MSE               |Min score         |Max score         |Mean score        |
|-----------|------------------|------------------|------------------|------------------|
|Zimtohrli  |0.090239608814795 |0.534520229950808 |0.772224479921073 |0.707988288758192 |
|ViSQOL     |0.115330916105424 |0.520833375452983 |0.801480831107469 |0.675101633981268 |
|2f         |0.129541391104905 |0.484687555319526 |0.797475783883375 |0.661870345773127 |
|PESQ       |0.147425552045669 |0.342342966279351 |0.841271127756762 |0.647128996775172 |
|CDPAM      |0.153471222942756 |0.441558428344727 |0.728779141125759 |0.620699318941738 |
|PARLAQ     |0.185057687192323 |0.445261140223642 |0.784370761057963 |0.587162756572532 |
|AQUA       |0.223207996944378 |0.331645933512413 |0.739286336419790 |0.547804951221731 |
|PEAQB      |0.225217321572038 |0.278744167467764 |0.851011116004117 |0.553935720513487 |
|DPAM       |0.315810440183130 |0.186717781679534 |0.690564701717118 |0.460415212267967 |
|WARP-Q     |0.339686211572685 |0.067600137543649 |0.777119464646524 |0.475793617709890 |
|GVPMOS     |0.412937133868407 |0.006851162794410 |0.783946603687895 |0.412912222208318 |

real 4m18.248s
user 221m18.093s
sys 46m19.418s
~8 % improvement
this incrementalizes/simplifies the tabuli path

|Score type |MSE               |Min score         |Max score         |Mean score        |
|-----------|------------------|------------------|------------------|------------------|
|Zimtohrli  |0.089327353427112 |0.559850956043716 |0.780744060923064 |0.708638004179837 |
|ViSQOL     |0.115330916105424 |0.520833375452983 |0.801480831107469 |0.675101633981268 |
|2f         |0.129541391104905 |0.484687555319526 |0.797475783883375 |0.661870345773127 |
|PESQ       |0.147425552045669 |0.342342966279351 |0.841271127756762 |0.647128996775172 |
|CDPAM      |0.153471222942756 |0.441558428344727 |0.728779141125759 |0.620699318941738 |
|PARLAQ     |0.185057687192323 |0.445261140223642 |0.784370761057963 |0.587162756572532 |
|AQUA       |0.223207996944378 |0.331645933512413 |0.739286336419790 |0.547804951221731 |
|PEAQB      |0.225217321572038 |0.278744167467764 |0.851011116004117 |0.553935720513487 |
|DPAM       |0.315810440183130 |0.186717781679534 |0.690564701717118 |0.460415212267967 |
|WARP-Q     |0.339686211572685 |0.067600137543649 |0.777119464646524 |0.475793617709890 |
|GVPMOS     |0.412937133868407 |0.006851162794410 |0.783946603687895 |0.412912222208318 |

real 4m19.928s
user 223m47.057s
sys 46m23.955s
simple loudness in fourier_bank
previous tabuli zimtohrli MSE score is 14 % higher than this

after this PR ViSQOL will have 47 % higher MSE score than Zimtohrli

|Score type |MSE               |Min score         |Max score         |Mean score        |
|-----------|------------------|------------------|------------------|------------------|
|Zimtohrli  |0.078112668596219 |0.607368005163725 |0.828363954766286 |0.727804879676665 |
|ViSQOL     |0.115330916105424 |0.520833375452983 |0.801480831107469 |0.675101633981268 |
|2f         |0.129541391104905 |0.484687555319526 |0.797475783883375 |0.661870345773127 |
|PESQ       |0.147425552045669 |0.342342966279351 |0.841271127756762 |0.647128996775172 |
|CDPAM      |0.153471222942756 |0.441558428344727 |0.728779141125759 |0.620699318941738 |
|PARLAQ     |0.185057687192323 |0.445261140223642 |0.784370761057963 |0.587162756572532 |
|AQUA       |0.223207996944378 |0.331645933512413 |0.739286336419790 |0.547804951221731 |
|PEAQB      |0.225217321572038 |0.278744167467764 |0.851011116004117 |0.553935720513487 |
|DPAM       |0.315810440183130 |0.186717781679534 |0.690564701717118 |0.460415212267967 |
|WARP-Q     |0.339686211572685 |0.067600137543649 |0.777119464646524 |0.475793617709890 |
|GVPMOS     |0.412937133868407 |0.006851162794410 |0.783946603687895 |0.412912222208318 |

real 4m15.939s
user 220m7.740s
sys 46m34.613s
temporal masking
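The percentages quoted in the commit message above follow directly from the MSE column. A small sketch of the arithmetic, where `PercentHigher` and the constant names are hypothetical and the values are copied from the tables in this thread:

```cpp
// MSE values copied from the tables in this thread.
constexpr double kPreviousZimtohrliMse = 0.089327353427112;  // previous tabuli commit
constexpr double kZimtohrliMse = 0.078112668596219;          // this commit
constexpr double kVisqolMse = 0.115330916105424;             // unchanged across commits

// Hypothetical helper: how many percent `value` lies above `baseline`.
constexpr double PercentHigher(double value, double baseline) {
  return (value / baseline - 1.0) * 100.0;
}
```

`PercentHigher(kPreviousZimtohrliMse, kZimtohrliMse)` comes out at about 14.4 and `PercentHigher(kVisqolMse, kZimtohrliMse)` at about 47.6, consistent with the rounded 14 % and 47 % in the message.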
loudness happens first after this commit
this will help to reduce computation slightly later

|Score type |MSE               |Min score         |Max score         |Mean score        |
|-----------|------------------|------------------|------------------|------------------|
|Zimtohrli  |0.078443138275192 |0.602595657239603 |0.824102489042876 |0.727318570435013 |
|ViSQOL     |0.115330916105424 |0.520833375452983 |0.801480831107469 |0.675101633981268 |
|2f         |0.129541391104905 |0.484687555319526 |0.797475783883375 |0.661870345773127 |
|PESQ       |0.147425552045669 |0.342342966279351 |0.841271127756762 |0.647128996775172 |
|CDPAM      |0.153471222942756 |0.441558428344727 |0.728779141125759 |0.620699318941738 |
|PARLAQ     |0.185057687192323 |0.445261140223642 |0.784370761057963 |0.587162756572532 |
|AQUA       |0.223207996944378 |0.331645933512413 |0.739286336419790 |0.547804951221731 |
|PEAQB      |0.225217321572038 |0.278744167467764 |0.851011116004117 |0.553935720513487 |
|DPAM       |0.315810440183130 |0.186717781679534 |0.690564701717118 |0.460415212267967 |
|WARP-Q     |0.339686211572685 |0.067600137543649 |0.777119464646524 |0.475793617709890 |
|GVPMOS     |0.412937133868407 |0.006851162794410 |0.783946603687895 |0.412912222208318 |

it is a 0.5 % degradation for now
switching loudness and masking
|Score type |MSE               |Min score         |Max score         |Mean score        |
|-----------|------------------|------------------|------------------|------------------|
|Zimtohrli  |0.077647743787295 |0.608127552926780 |0.829014091535645 |0.728679787505096 |
|ViSQOL     |0.115330916105424 |0.520833375452983 |0.801480831107469 |0.675101633981268 |
|2f         |0.129541391104905 |0.484687555319526 |0.797475783883375 |0.661870345773127 |
|PESQ       |0.147425552045669 |0.342342966279351 |0.841271127756762 |0.647128996775172 |
|CDPAM      |0.153471222942756 |0.441558428344727 |0.728779141125759 |0.620699318941738 |
|PARLAQ     |0.185057687192323 |0.445261140223642 |0.784370761057963 |0.587162756572532 |
|AQUA       |0.223207996944378 |0.331645933512413 |0.739286336419790 |0.547804951221731 |
|PEAQB      |0.225217321572038 |0.278744167467764 |0.851011116004117 |0.553935720513487 |
|DPAM       |0.315810440183130 |0.186717781679534 |0.690564701717118 |0.460415212267967 |
|WARP-Q     |0.339686211572685 |0.067600137543649 |0.777119464646524 |0.475793617709890 |
|GVPMOS     |0.412937133868407 |0.006851162794410 |0.783946603687895 |0.412912222208318 |
~0.5 % improvement
further solidifies Zimtohrli as the best correlating metric

|Score type |MSE               |Min score         |Max score         |Mean score        |
|-----------|------------------|------------------|------------------|------------------|
|Zimtohrli  |0.072593165859890 |0.619257037510440 |0.814230081078870 |0.737751119792009 |
|ViSQOL     |0.115330916105424 |0.520833375452983 |0.801480831107469 |0.675101633981268 |
|2f         |0.129541391104905 |0.484687555319526 |0.797475783883375 |0.661870345773127 |
|PESQ       |0.147425552045669 |0.342342966279351 |0.841271127756762 |0.647128996775172 |
|CDPAM      |0.153471222942756 |0.441558428344727 |0.728779141125759 |0.620699318941738 |
|PARLAQ     |0.185057687192323 |0.445261140223642 |0.784370761057963 |0.587162756572532 |
|AQUA       |0.223207996944378 |0.331645933512413 |0.739286336419790 |0.547804951221731 |
|PEAQB      |0.225217321572038 |0.278744167467764 |0.851011116004117 |0.553935720513487 |
|DPAM       |0.315810440183130 |0.186717781679534 |0.690564701717118 |0.460415212267967 |
|WARP-Q     |0.339686211572685 |0.067600137543649 |0.777119464646524 |0.475793617709890 |
|GVPMOS     |0.412937133868407 |0.006851162794410 |0.783946603687895 |0.412912222208318 |
7-8 % improvement
after something likely went wrong in previous comment
git clumsiness

|Score type |MSE               |Min score         |Max score         |Mean score        |
|-----------|------------------|------------------|------------------|------------------|
|Zimtohrli  |0.072654586073989 |0.619257037510440 |0.814288552764735 |0.737622587952872 |
|ViSQOL     |0.115330916105424 |0.520833375452983 |0.801480831107469 |0.675101633981268 |
|2f         |0.129541391104905 |0.484687555319526 |0.797475783883375 |0.661870345773127 |
|PESQ       |0.147425552045669 |0.342342966279351 |0.841271127756762 |0.647128996775172 |
|CDPAM      |0.153471222942756 |0.441558428344727 |0.728779141125759 |0.620699318941738 |
|PARLAQ     |0.185057687192323 |0.445261140223642 |0.784370761057963 |0.587162756572532 |
|AQUA       |0.223207996944378 |0.331645933512413 |0.739286336419790 |0.547804951221731 |
|PEAQB      |0.225217321572038 |0.278744167467764 |0.851011116004117 |0.553935720513487 |
|DPAM       |0.315810440183130 |0.186717781679534 |0.690564701717118 |0.460415212267967 |
|WARP-Q     |0.339686211572685 |0.067600137543649 |0.777119464646524 |0.475793617709890 |
|GVPMOS     |0.412937133868407 |0.006851162794410 |0.783946603687895 |0.412912222208318 |
quality fix
|Score type |MSE               |Min score         |Max score         |Mean score        |
|-----------|------------------|------------------|------------------|------------------|
|Zimtohrli  |0.076254461155707 |0.589172424723762 |0.806090520228513 |0.731427610098147 |
|ViSQOL     |0.115330916105424 |0.520833375452983 |0.801480831107469 |0.675101633981268 |
|2f         |0.129541391104905 |0.484687555319526 |0.797475783883375 |0.661870345773127 |
|PESQ       |0.147425552045669 |0.342342966279351 |0.841271127756762 |0.647128996775172 |
|CDPAM      |0.153471222942756 |0.441558428344727 |0.728779141125759 |0.620699318941738 |
|PARLAQ     |0.185057687192323 |0.445261140223642 |0.784370761057963 |0.587162756572532 |
|AQUA       |0.223207996944378 |0.331645933512413 |0.739286336419790 |0.547804951221731 |
|PEAQB      |0.225217321572038 |0.278744167467764 |0.851011116004117 |0.553935720513487 |
|DPAM       |0.315810440183130 |0.186717781679534 |0.690564701717118 |0.460415212267967 |
|WARP-Q     |0.339686211572685 |0.067600137543649 |0.777119464646524 |0.475793617709890 |
|GVPMOS     |0.412937133868407 |0.006851162794410 |0.783946603687895 |0.412912222208318 |

reducing the gammatone filters from 150 to 128

reducing NSIM size from 32x16 to 16x8 -- 32 bands would be more than 2 octaves of blurring!?!

revamping masking into something simpler, but also not axis separating temporal and frequency masking effects
~5% worse, but faster and simpler
taking correct frequency buckets from ISO loudness

reducing structural similarity support area

|Score type |MSE               |Min score         |Max score         |Mean score        |
|-----------|------------------|------------------|------------------|------------------|
|Zimtohrli  |0.074958337059633 |0.609170013051075 |0.804501676659602 |0.732603836275400 |
|ViSQOL     |0.115330916105424 |0.520833375452983 |0.801480831107469 |0.675101633981268 |
|2f         |0.129541391104905 |0.484687555319526 |0.797475783883375 |0.661870345773127 |
|PESQ       |0.147425552045669 |0.342342966279351 |0.841271127756762 |0.647128996775172 |
|CDPAM      |0.153471222942756 |0.441558428344727 |0.728779141125759 |0.620699318941738 |
|PARLAQ     |0.185057687192323 |0.445261140223642 |0.784370761057963 |0.587162756572532 |
|AQUA       |0.223207996944378 |0.331645933512413 |0.739286336419790 |0.547804951221731 |
|PEAQB      |0.225217321572038 |0.278744167467764 |0.851011116004117 |0.553935720513487 |
|DPAM       |0.315810440183130 |0.186717781679534 |0.690564701717118 |0.460415212267967 |
|WARP-Q     |0.339686211572685 |0.067600137543649 |0.777119464646524 |0.475793617709890 |
|GVPMOS     |0.412937133868407 |0.006851162794410 |0.783946603687895 |0.412912222208318 |
further simplifications
adjusting the nsim stabilization parameters

|Score type |MSE               |Min score         |Max score         |Mean score        |
|-----------|------------------|------------------|------------------|------------------|
|Zimtohrli  |0.073622255247300 |0.611416768760542 |0.797709696340402 |0.734967593303992 |
|ViSQOL     |0.115330916105424 |0.520833375452983 |0.801480831107469 |0.675101633981268 |
|2f         |0.129541391104905 |0.484687555319526 |0.797475783883375 |0.661870345773127 |
|PESQ       |0.147425552045669 |0.342342966279351 |0.841271127756762 |0.647128996775172 |
|CDPAM      |0.153471222942756 |0.441558428344727 |0.728779141125759 |0.620699318941738 |
|PARLAQ     |0.185057687192323 |0.445261140223642 |0.784370761057963 |0.587162756572532 |
|AQUA       |0.223207996944378 |0.331645933512413 |0.739286336419790 |0.547804951221731 |
|PEAQB      |0.225217321572038 |0.278744167467764 |0.851011116004117 |0.553935720513487 |
|DPAM       |0.315810440183130 |0.186717781679534 |0.690564701717118 |0.460415212267967 |
|WARP-Q     |0.339686211572685 |0.067600137543649 |0.777119464646524 |0.475793617709890 |
|GVPMOS     |0.412937133868407 |0.006851162794410 |0.783946603687895 |0.412912222208318 |
minor improvement
|Score type |MSE               |Min score         |Max score         |Mean score        |
|-----------|------------------|------------------|------------------|------------------|
|Zimtohrli  |0.072503005846665 |0.613305088893694 |0.801783318140190 |0.737440109899061 |
|ViSQOL     |0.115330916105424 |0.520833375452983 |0.801480831107469 |0.675101633981268 |
|2f         |0.129541391104905 |0.484687555319526 |0.797475783883375 |0.661870345773127 |
|PESQ       |0.147425552045669 |0.342342966279351 |0.841271127756762 |0.647128996775172 |
|CDPAM      |0.153471222942756 |0.441558428344727 |0.728779141125759 |0.620699318941738 |
|PARLAQ     |0.185057687192323 |0.445261140223642 |0.784370761057963 |0.587162756572532 |
|AQUA       |0.223207996944378 |0.331645933512413 |0.739286336419790 |0.547804951221731 |
|PEAQB      |0.225217321572038 |0.278744167467764 |0.851011116004117 |0.553935720513487 |
|DPAM       |0.315810440183130 |0.186717781679534 |0.690564701717118 |0.460415212267967 |
|WARP-Q     |0.339686211572685 |0.067600137543649 |0.777119464646524 |0.475793617709890 |
|GVPMOS     |0.412937133868407 |0.006851162794410 |0.783946603687895 |0.412912222208318 |
simplifications and more elegance
tuning the existing structures

|Score type |MSE               |Min score         |Max score         |Mean score        |
|-----------|------------------|------------------|------------------|------------------|
|Zimtohrli  |0.069100905174817 |0.628756659185876 |0.804728638916679 |0.744043538017276 |
|ViSQOL     |0.115330916105424 |0.520833375452983 |0.801480831107469 |0.675101633981268 |
|2f         |0.129541391104905 |0.484687555319526 |0.797475783883375 |0.661870345773127 |
|PESQ       |0.147425552045669 |0.342342966279351 |0.841271127756762 |0.647128996775172 |
|CDPAM      |0.153471222942756 |0.441558428344727 |0.728779141125759 |0.620699318941738 |
|PARLAQ     |0.185057687192323 |0.445261140223642 |0.784370761057963 |0.587162756572532 |
|AQUA       |0.223207996944378 |0.331645933512413 |0.739286336419790 |0.547804951221731 |
|PEAQB      |0.225217321572038 |0.278744167467764 |0.851011116004117 |0.553935720513487 |
|DPAM       |0.315810440183130 |0.186717781679534 |0.690564701717118 |0.460415212267967 |
|WARP-Q     |0.339686211572685 |0.067600137543649 |0.777119464646524 |0.475793617709890 |
|GVPMOS     |0.412937133868407 |0.006851162794410 |0.783946603687895 |0.412912222208318 |

real 3m31.783s
user 177m9.966s
sys 40m50.441s
several quality improvements
rotator filters.