Resampling engine - edges processing error #431

Open
DTL2020 opened this issue Mar 15, 2025 · 38 comments

DTL2020 commented Mar 15, 2025

Test script:

LoadPlugin("avsresize.dll")

BlankClip(100, 200, 100, color=$7F7F7F, pixel_type="YV12")

AddBorders(0, 2, 0, 2, r=2)

avs=LanczosResize(width*2, height*2, taps=16).Subtitle("AVS+ LanczosResize taps=16", align=5)
avsr=z_ConvertFormat(width=width*2, height=height*2, resample_filter="lanczos", filter_param_a=16).Subtitle("avsresize lanczosResize taps=16", align=5)

StackVertical(avs, avsr)


It looks like the resampler engine behind the Resize() filters in the AVS+ core does not handle edge conditions as expected (for example by duplicating the last/edge sample far enough to cover the 'support' of the resize kernel at the start of each line or column, or by some other method). Tested with r4246.

Image

pinterf commented Mar 15, 2025

Sorry, I don't understand this duplication issue. The only difference I see is that there are more shades in AVS+.

I have a few questions if you could check them:

How does fmtconv behave?
Does it need to be similar at the bottom of the image?
How do other bit depths look?

DTL2020 commented Mar 16, 2025

The essence of the issue: the AVS+ resampler produces ringing at the frame edge if a (good/perfect) anti-ringing-conditioned transient is located 'too close' to the edge of the frame buffer.

Other resamplers (avsresize, and presumably other correct ones) give a perfect result without ringing in this case.

The issue is the ringing in the resize result. The 'duplication' is one possible workaround for the edge special cases (where the resampler engine sees a discontinuity). If we pad the frame buffer with its edge samples to a length of about support_size/2, we push this discontinuity out of the output. It is one possible solution to the issue.

Possible source of the issue: incorrect handling of the 'edge special case' in the resampler, where there are not enough input samples to cover the full filter support. The expected solution for this case is to duplicate the last sample to fill all samples required by the filter support at the start (and end) of frame buffer processing. It can be simulated with AddBorders(filter_support/2, r=0), where filter_support is the support of LanczosResize(taps=16), i.e. 16.
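To make the 'not enough input samples' case concrete, here is a small Python sketch (not AVS+ code) that counts how many kernel taps fall outside the frame for a given output sample. It assumes a centre-aligned 2x upscale and reads 'support' as the kernel half-width in source samples (equal to taps for Lanczos when upscaling); the function names are made up for illustration.

```python
import math

def lanczos(x, taps):
    """Lanczos kernel: sinc(x) * sinc(x / taps) for |x| < taps, else 0."""
    if x == 0.0:
        return 1.0
    if abs(x) >= taps:
        return 0.0
    px = math.pi * x
    return taps * math.sin(px) * math.sin(px / taps) / (px * px)

def taps_outside_frame(out_index, scale, taps, src_len):
    """Count source positions under the kernel that lie outside [0, src_len)."""
    # Assumed centre-aligned mapping from output sample to source coordinate.
    center = (out_index + 0.5) / scale - 0.5
    first = math.floor(center) - taps + 1
    last = math.floor(center) + taps
    return sum(1 for i in range(first, last + 1) if i < 0 or i >= src_len)

print(taps_outside_frame(0, 2.0, 16, 100))    # first output sample: 16
print(taps_outside_frame(100, 2.0, 16, 100))  # mid-frame output sample: 0
```

Under these assumptions the first output sample needs 16 source positions that do not exist, so how much padding is 'enough' depends on whether support is counted as the half-width or the full width of the kernel.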

I expect this special edge case to be handled when the 'resampling program' is created, but I did not find anything that looks like a correct solution, starting from here: https://github.com/pinterf/AviSynthPlus/blob/33fa8d6a085b37a74b61a5b298517e75b3cd18fc/avs_core/filters/resample_functions.cpp#L372

It must then be used in the resampler's processing; in the C implementation that looks like here: https://github.com/pinterf/AviSynthPlus/blob/33fa8d6a085b37a74b61a5b298517e75b3cd18fc/avs_core/filters/resample.cpp#L104 . But I do not see any visible attempt at 'virtual frame expansion' at the start and end of the processing loop to handle these special cases (for example by duplicating the first and last samples).

We need to look at how other resampler implementations handle this edge case, such as the avsresize sources (or maybe the even 'simpler' JincResize sources, which are smaller and whose 'resampling program' is somewhat simpler), starting from
https://github.com/Asd-g/AviSynth-JincResize/blob/9d813f95e2aee549800e64470ddd6b2841a51c84/src/JincResize.cpp#L341 - it has a special bool 'is_border' to switch from 'inter_frame' processing to the special border case.

Since the convolution+resampling loops have no special border-case handling - https://github.com/Asd-g/AviSynth-JincResize/blob/9d813f95e2aee549800e64470ddd6b2841a51c84/src/JincResize.cpp#L500 - it looks like everything is somehow solved at the coefficient table generation step. Maybe the JincResize engine solves the issue in a different way, such as altering the kernel function so that it does not ring near borders (?), though that may not be the only possible solution. I need to find the sources of the avsresize resampling engine to see how it is designed and try to show the difference from the AVS+ resampler.

DTL2020 commented Mar 16, 2025

Testing with fmtconv: it shows some residual ringing (p=10 for Gauss is not perfect but blurs somewhat less; p=8 gives less ringing in 8-bit). But the total ringing is much lower. I expanded the levels in Photoshop to show it better.

Script:

LoadPlugin("avsresize.dll")
LoadPlugin("fmtconv.dll")

BlankClip(100, 200, 100, color=$7F7F7F, pixel_type="YV12")

AddBorders(0, 2, 0, 2, r=2)

avs=LanczosResize(width*2, height*2, taps=16).Subtitle("AVS+ LanczosResize taps=16", align=5)
avsr=z_ConvertFormat(width=width*2, height=height*2, resample_filter="lanczos", filter_param_a=16).Subtitle("avsresize lanczosResize taps=16", align=5)
fmtconv=fmtc_resample(w=width*2, h=height*2, kernel="lanczos", taps=16).Subtitle("fmt_conv lanczosResize taps=16", align=5)

#return fmtconv.ConvertToRGB24()
return avs.ConvertToRGB24()

#StackVertical(avs, fmtconv)

Image

If the transient is processed with p=8, the difference in ringing is even larger:

LoadPlugin("avsresize.dll")
LoadPlugin("fmtconv.dll")

BlankClip(100, 200, 100, color=$7F7F7F, pixel_type="YV12")

AddBorders(0, 2, 0, 2, r=2, param1=8)

avs=LanczosResize(width*2, height*2, taps=16).Subtitle("AVS+ LanczosResize taps=16", align=5)
avsr=z_ConvertFormat(width=width*2, height=height*2, resample_filter="lanczos", filter_param_a=16).Subtitle("avsresize lanczosResize taps=16", align=5)
fmtconv=fmtc_resample(w=width*2, h=height*2, kernel="lanczos", taps=16).Subtitle("fmt_conv lanczosResize taps=16", align=5)

#return fmtconv.ConvertToRGB24()
return avs.ConvertToRGB24()

Image

I am even thinking of recommending a change of the default p parameter of the default Gauss kernel in AddBorders/LetterBox from 10 to 8, as it gives a result with close to no ringing, at least in 8-bit. It does create more blurring, though.
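A rough Python sketch of why a lower p blurs more. It assumes the Gauss kernel has the form 2^(-(p/10)·x²); that formula is an assumption made here for illustration and should be checked against the AviSynth+ sources before relying on it.

```python
def gauss_weight(x, p):
    """Assumed Gaussian kernel shape: 2 ** (-(p / 10) * x * x)."""
    return 2.0 ** (-(p * 0.1) * x * x)

# Compare how quickly the two kernels fall off with distance from the centre.
for x in (0.0, 1.0, 2.0):
    print(x, gauss_weight(x, 10), gauss_weight(x, 8))
```

With this form, p=8 keeps more weight on neighbouring samples than p=10 at every non-zero distance, which is consistent with 'less ringing but more blurring'.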

pinterf commented Mar 16, 2025

Here, an edge condition is mentioned in fmtconv:
https://gitlab.com/EleonoreMizo/fmtconv/-/blob/master/src/fmtcl/Scaler.cpp#L864
and in Avisynth (as I can see, it's not handled):
https://github.com/AviSynth/AviSynthPlus/blob/master/avs_core/filters/resample_functions.cpp#L419
using coeffs from first pass the normalizing sum loop
https://gitlab.com/EleonoreMizo/fmtconv/-/blob/master/src/fmtcl/Scaler.cpp#L818
It's worth investigating.

DTL2020 commented Mar 16, 2025

Does it need to be similar at the bottom of the image?

Yes - it must be equal at all 4 frame edges. Top and bottom were only the first simple example.

DTL2020 commented Mar 16, 2025

Possible workaround without changing the current resampling kernel (and maybe the best quality, since all 'active frame samples' are processed far from the buffer edges with an unchanged filter kernel):

BlankClip(100, 200, 100, color=$7F7F7F, pixel_type="YV12")

AddBorders(2, 2, 2, 2, r=2, param1=8)

pad=50 #should be big enough like resizer's filter support (not less than support/2) or larger

AddBorders(pad,pad,pad,pad, r=0)
avs=LanczosResize(width*2, height*2, taps=16).Subtitle("AVS+ LanczosResize taps=16", align=5)
avs=Crop(avs,pad*2,pad*2,-pad*2,-pad*2)

return avs

Other solutions may be compared with this method for quality. If adjusting the kernel near the edge gives lower quality, this method may be recommended in the Notes of the Resize documentation, or implemented as a second way of handling edge conditions (with lower performance because there are more samples to process, and with more RAM usage if it is implemented by enlarging an intermediate buffer instead of duplicating edge samples while reading rows/columns into SIMD registers with edge-sample broadcasting).

I also tested 'negative crop' as a Resize parameter (to replace the additional AddBorders(pad,pad,pad,pad) before Resize). It does not help (it looks like edge samples are duplicated after resampling instead of before).

DTL2020 commented Mar 16, 2025

Testing of padded-AVS+ resize vs avsresize and fmtconv with internal edge handling:

LoadPlugin("avsresize.dll")
LoadPlugin("fmtconv.dll")

Function Padded2xLanczosResize(clip c, int pad)
{
  padded=AddBorders(c,pad,pad,pad,pad, r=0)
  res_2x=LanczosResize(padded, padded.width*2, padded.height*2, taps=16)
  return Crop(res_2x,pad*2,pad*2,-pad*2,-pad*2)
}

Function Diff(clip src1, clip src2)
{
  return Subtract(src1.ConvertBits(8),src2.ConvertBits(8)).Levels(120, 1, 255-120, 0, 255, coring=false)
}

BlankClip(100, 200, 100, color=$7F7F7F, pixel_type="YV12")

AddBorders(2, 2, 2, 2, r=2, param1=8)


pad=50

std=LanczosResize(width*2, height*2, taps=16).Subtitle("AVS+ Std 2xLanczosResize taps=16", align=5)

avs_p=Padded2xLanczosResize(last, pad).Subtitle("AVS+ Padded2xLanczosResize taps=16", align=5)
avsr=z_ConvertFormat(width=width*2, height=height*2, resample_filter="lanczos", filter_param_a=16).Subtitle("avsresize lanczosResize taps=16", align=5)
fmtconv=fmtc_resample(w=width*2, h=height*2, kernel="lanczos", taps=16).Subtitle("fmt_conv lanczosResize taps=16", align=5)

d1 = Diff(avs_p,std)
#d2 = Diff(avs_p,avsr)
d2 = Diff(avsr,fmtconv)
d3 = Diff(avs_p,fmtconv)

StackHorizontal(StackVertical(std, avs_p, avsr), Stackvertical(d1, d2, d3))

Image

avsresize and fmtconv are about equal (with about 1 LSB difference at some corner samples), but both differ slightly from the 'padded resize'. The worst at r4246 is the unpadded AVS+ (std) resize.
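For reading the difference panels, it helps to know how strongly the Levels(120, 1, 255-120, 0, 255) call in Diff() amplifies small errors. The Python sketch below applies the usual Levels mapping to single 8-bit values. The neutral Subtract output of 126 is an assumption here; check the Subtract documentation for the format in use.

```python
def levels(value, in_low, gamma, in_high, out_low, out_high):
    """Levels-style mapping for one 8-bit sample (no coring), clamped to range."""
    x = (value - in_low) / (in_high - in_low)
    x = min(max(x, 0.0), 1.0)
    y = x ** (1.0 / gamma) * (out_high - out_low) + out_low
    return int(round(y))

NEUTRAL = 126  # assumed Subtract output where both clips are identical
for diff in (-1, 0, 1, 2):
    print(diff, levels(NEUTRAL + diff, 120, 1.0, 255 - 120, 0, 255))
```

With an input window of 15 codes stretched to 255, each 1 LSB of difference moves the displayed value by 17 levels, so even single-LSB mismatches are clearly visible in the panels.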

DTL2020 commented Mar 16, 2025

The workarounds used in avsresize and fmtc for handling edge conditions are also not perfect and cannot give the resampler equal quality in the center of the frame and at the edges.

LoadPlugin("avsresize.dll")
LoadPlugin("fmtconv.dll")

Function Padded2xLanczosResize(clip c, int pad)
{
  padded=AddBorders(c,pad,pad,pad,pad, r=0)
  res_2x=LanczosResize(padded, padded.width*2, padded.height*2, taps=16)
  return Crop(res_2x,pad*2,pad*2,-pad*2,-pad*2)
}

Function Padded2xLanczosResizeAVSR(clip c, int pad)
{
  padded=AddBorders(c,pad,pad,pad,pad, r=0)
  res_2x=z_ConvertFormat(padded, width=padded.width*2, height=padded.height*2, resample_filter="lanczos", filter_param_a=16)
  return Crop(res_2x,pad*2,pad*2,-pad*2,-pad*2)
}

Function Padded2xLanczosResizeFMTC(clip c, int pad)
{
  padded=AddBorders(c,pad,pad,pad,pad, r=0)
  res_2x=fmtc_resample(padded, w=padded.width*2, h=padded.height*2, kernel="lanczos", taps=16)
  return Crop(res_2x,pad*2,pad*2,-pad*2,-pad*2)
}


Function Diff(clip src1, clip src2)
{
  return Subtract(src1.ConvertBits(8),src2.ConvertBits(8)).Levels(120, 1, 255-120, 0, 255, coring=false)
}

BlankClip(100, 200, 100, color=$7F7F7F, pixel_type="YV12")

AddBorders(2, 2, 2, 2, r=2, param1=8)


pad=50

std=LanczosResize(width*2, height*2, taps=16).Subtitle("AVS+ Std 2xLanczosResize taps=16", align=5)

avs_p=Padded2xLanczosResize(last, pad).Subtitle("AVS+ Padded2xLanczosResize taps=16", align=5)
avsr_p=Padded2xLanczosResizeAVSR(last, pad).Subtitle("avsresize Padded2xLanczosResize taps=16", align=5)
fmtc_p=Padded2xLanczosResizeFMTC(last, pad).Subtitle("FMTC Padded2xLanczosResize taps=16", align=5).ConvertBits(8)
avsr=z_ConvertFormat(width=width*2, height=height*2, resample_filter="lanczos", filter_param_a=16).Subtitle("avsresize lanczosResize taps=16", align=5)
fmtconv=fmtc_resample(w=width*2, h=height*2, kernel="lanczos", taps=16).Subtitle("fmt_conv lanczosResize taps=16", align=5)

d1 = Diff(avs_p,std)
#d2 = Diff(avs_p,avsr)
#d2 = Diff(avsr,fmtconv)
#d2 = Diff(avsr,avs_p)
#d3 = Diff(avs_p,fmtconv)

d2 = Diff(avsr,avsr_p)
d3 = Diff(fmtconv, fmtc_p)


StackHorizontal(StackVertical(std, avs_p, avsr), Stackvertical(d1, d2, d3))


Image

The padded versions of both avsresize and fmtconv also produce somewhat different output compared with the unpadded resize by the same engines. So the different edge-condition workarounds (in different resize engines) still give different results (vs the padded method). It looks like changing the resize filter kernel in the 'resampling program' to handle edge conditions is not a perfect way (but it may give the best performance because the resampler processes the smallest amount of data).

DTL2020 commented Mar 17, 2025

This also looks like another bug in Resize:

BlankClip(100, 200, 100, color=$7F7F7F, pixel_type="YV12")
pad=50
PointResize(width, height, src_left=-pad, src_top=-pad, src_width=-(width+2*pad), src_height=-(height+2*pad))

This returns a green frame after PointResize().

It showed up while trying to do a padded resize of arbitrary frame content, repeating the edge samples via negative-crop padding, like:

Function Padded2xLanczosResize(clip c, int pad)
{
  padded=PointResize(c, c.width, c.height, src_left=-pad, src_top=-pad, src_width=-(c.width+2*pad), src_height=-(c.height+2*pad))
  res_2x=LanczosResize(padded, padded.width*2, padded.height*2, taps=16)
  return Crop(res_2x,pad*2,pad*2,-pad*2,-pad*2)
}

pinterf commented Mar 17, 2025

Yes, PointResize fails. I tried to fix it, but it's better to revamp our resizers altogether. There are also quite a few hacks that I'd like to stop using. So, thank you, I have everything needed for the reproductions and the other artifacts as well, and I'll start working on it in the next week(s).

DTL2020 commented Mar 17, 2025

I am really not sure what the correct values are for the 'negative crop' for Resize in
src_width= and src_height=

Are the values
src_width=-(width+2*pad), src_height=-(height+2*pad)
correct at all?

Maybe only the new border size should be given as a negative value? Like
src_width=-pad, src_height=-pad
?

pinterf commented Mar 18, 2025

Meanwhile, there have been many achievements in fixing existing resizer issues.

The green pointresize result is fixed.

I understood the edge condition from fmtconv and adopted its logic. This led to the need to handle variable kernel sizes at the beginning and end, allowing the resizing process to no longer restrict actual frame sizes. Thanks to this, I successfully removed all limitations that resulted in errors like "this width/height is too small for this resizer/support/whatever."

Then, significant code simplification is in progress (finally!) by dropping some SSSE3 and SSE4.1 8-bit resizer variants, which sacrificed accuracy for a slightly quicker mulhrs. I believe this was a mid-2000s hack. I retained SSE2, which is correct. Similarly, the AVX2 version was rewritten to avoid using mulhrs. I came across these while looking into the old resizer sources.
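The thread does not say exactly how the dropped variants used mulhrs, so the following Python sketch only shows, in general terms, how rounding each 16-bit product separately can differ from accumulating exactly and rounding once. The emulation follows the documented PMULHRSW lane arithmetic; the 4-tap Q15 coefficients are made up for illustration.

```python
def mulhrs(a, b):
    """Emulate one lane of PMULHRSW: (((a * b) >> 14) + 1) >> 1."""
    return ((a * b >> 14) + 1) >> 1

def filter_per_tap_rounding(pixels, coeffs_q15):
    # Each product is rounded to an integer before it is summed.
    return sum(mulhrs(p, c) for p, c in zip(pixels, coeffs_q15))

def filter_single_rounding(pixels, coeffs_q15):
    # Products are accumulated exactly and rounded once at the end.
    acc = sum(p * c for p, c in zip(pixels, coeffs_q15))
    return (acc + (1 << 14)) >> 15

pixels = [127, 127, 127, 127]          # flat $7F grey
coeffs = [8192, 8192, 8192, 8192]      # four taps of 0.25 in Q15
print(filter_per_tap_rounding(pixels, coeffs))  # 128
print(filter_single_rounding(pixels, coeffs))   # 127
```

In this toy case a flat 127 field comes out as 128 when every product is rounded on its own, which is the kind of small accuracy loss a quicker per-product instruction can introduce.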

Having different effective kernel sizes made the fixed-filter-size template-tsunami unnecessary as well (simplifying this is in progress).

Now, there is a (hidden) ability to give the resizers an option to not work in "keep-the-image-center" mode (which will probably be needed for more accurate convolution use). For now this is fixed to how Avisynth has worked so far, but it may be exposed at parameter level. The "// TODO this look wrong, gotta check" comment is finally deleted. We still don't know if it was wrong; anyway, it's now done in another logical way.

Other hacks that were needed to protect the end-of-scanline access for the edge conditions can likely be removed as well.

I guess, I'd need this week for doing the cleanup, depending on a possible ongoing release process, this may or may not fit into the time frame.

DTL2020 commented Mar 18, 2025

This led to the need to handle variable kernel sizes at the beginning and end,

It is great news that the old AVS resampling engine is being updated. But changing the kernel in some areas of the frame (line/row) will still degrade quality. As you can see, even the workarounds in avsresize and fmtconv cannot provide equal quality between the inner part of the frame and the edges of the still 'active frame area'.

So for a better future it would be nice to have a performance-optimized version of the 'padded-resize' method, which gives equal quality over the whole 'active frame area'.

Currently, in scripting form, it is a 3-stage process:

  1. Add borders to the input frame using edge sample duplication.
  2. Apply a resize filter (with or without handling of special edge cases).
  3. Crop out the supplementary borders.

It is not complex as a script function, but it costs performance and RAM for lots of cached frames between the 3 filters.
If it can be implemented inside a single resize filter as an optional mode, that would be the best solution. It may be slower than the 'typical resize with the edge-condition workaround by kernel altering', so it could be enabled by a user-controlled parameter when the best quality over the whole active frame area is required.
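The three stages can be sketched on a single row in Python. This is only an illustration of the indexing: a toy centre-aligned linear 2x upscaler that clamps at the edges stands in for the real resize filter, and all names are hypothetical.

```python
import math

def pad_edges(row, pad):
    """Stage 1: extend the row by repeating its first and last samples."""
    return [row[0]] * pad + list(row) + [row[-1]] * pad

def upscale2x_linear(row):
    """Stage 2 stand-in: centre-aligned linear 2x upscale, edge clamped."""
    n = len(row)
    out = []
    for j in range(2 * n):
        c = (j + 0.5) / 2 - 0.5
        i0 = math.floor(c)
        f = c - i0
        a = row[min(max(i0, 0), n - 1)]
        b = row[min(max(i0 + 1, 0), n - 1)]
        out.append(a * (1 - f) + b * f)
    return out

def padded_resize2x(row, pad, resize=upscale2x_linear):
    """Stages 1-3: pad, resize, then crop 2 * pad output samples per side."""
    big = resize(pad_edges(row, pad))
    return big[2 * pad: len(big) - 2 * pad]

row = [0.0, 2.0, 40.0, 127.0, 127.0, 90.0]
print(padded_resize2x(row, 50))
print(upscale2x_linear(row))
```

For a resizer that already behaves as if the edge sample were repeated, both printed rows match; a resizer with different edge handling would differ in the samples near each end, which is what the Diff() panels in this thread visualise.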

Or maybe we can wrap stages 1 and 2 using the 'negative crop' option? But as I remember from my tests, it is applied after resampling, not before, so it cannot help in this way.

pinterf commented Mar 18, 2025

Here is an actual build, x64 only, it may crash here and there, but in recent builds I was not able to crash it and did not see unwanted artifacts.
https://drive.google.com/uc?export=download&id=1OKDAE4Sc-QWzOZvn0BFD5vu_ZSv2nw74
If it crashed or gave ugly artifacts somewhere, try setting SetMaxCpu and always note the parameters. I wonder how it behaves in your tests. Thanks.

Btw I mentioned changing kernel size, really, the kernel size does not change, but if there is no valid position on left (top) or right (bottom) side, it is using the accumulated coefficients on the first/last valid pixel. The same as if it was padded on the left and the right with the edge pixel value.
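The equivalence described here can be checked numerically. The Python sketch below (a model of the idea, not the AVS+ implementation) folds every out-of-frame coefficient onto the first or last valid pixel and compares the result with resampling a row that was explicitly padded with its edge values. The 4-tap Lanczos kernel, the centre-aligned 2x mapping and the function names are assumptions for illustration.

```python
import math

def lanczos(x, taps):
    """Lanczos kernel: sinc(x) * sinc(x / taps) for |x| < taps, else 0."""
    if x == 0.0:
        return 1.0
    if abs(x) >= taps:
        return 0.0
    px = math.pi * x
    return taps * math.sin(px) * math.sin(px / taps) / (px * px)

def weights(center, taps):
    """Normalised (source index, coefficient) pairs; indices may be out of frame."""
    first = math.floor(center) - taps + 1
    idx = range(first, first + 2 * taps)
    w = [lanczos(i - center, taps) for i in idx]
    s = sum(w)
    return [(i, wi / s) for i, wi in zip(idx, w)]

def resample_folded(src, center, taps):
    """Accumulate out-of-frame coefficients on the first/last valid pixel."""
    acc = {}
    for i, w in weights(center, taps):
        j = min(max(i, 0), len(src) - 1)
        acc[j] = acc.get(j, 0.0) + w
    return sum(src[j] * w for j, w in acc.items())

def resample_padded(src, center, taps, pad):
    """Same kernel applied to a row padded with edge values (pad >= taps)."""
    padded = [src[0]] * pad + list(src) + [src[-1]] * pad
    return sum(padded[i + pad] * w for i, w in weights(center, taps))

src = [0.0, 2.0, 40.0, 127.0, 127.0, 127.0, 127.0, 127.0]
for out in range(2 * len(src)):
    c = (out + 0.5) / 2 - 0.5
    a = resample_folded(src, c, 4)
    b = resample_padded(src, c, 4, pad=8)
    print(out, round(a, 6), round(b, 6))
```

Both columns agree up to floating-point rounding for every output sample, including those whose kernel reaches past either end of the row.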

DTL2020 commented Mar 19, 2025

First test with old script -

BlankClip(100, 200, 100, color=$7F7F7F, pixel_type="YV12")
AddBorders(2, 2, 2, 2, r=2, param1=8)
pad=50
PointResize(width, height, src_left=-pad, src_top=-pad, src_width=-(width+2*pad), src_height=-(height+2*pad))
Info()

VirtualDub returns a grey frame and reports "System exception - Access violation".

It looks like memory corruption still exists somewhere: complex scripts report access violations on simple lines:

Function Padded2xLanczosResize(clip c, int pad)
{
  padded=AddBorders(c,pad,pad,pad,pad, r=0)
  res_2x=LanczosResize(padded, padded.width*2, padded.height*2, taps=16)
  return Crop(res_2x,pad*2,pad*2,-pad*2,-pad*2)
}

Function Diff(clip src1, clip src2)
{
  return Subtract(src1.ConvertBits(8),src2.ConvertBits(8)).Levels(120, 1, 255-120, 0, 255, coring=false)
}

BlankClip(100, 200, 100, color=$7F7F7F, pixel_type="YV12")

AddBorders(2, 2, 2, 2, r=2, param1=8)

pad=50

std=LanczosResize(width*2, height*2, taps=16).Subtitle("AVS+ Std 2xLanczosResize taps=16", align=5)

avs_p=Padded2xLanczosResize(last, pad).Subtitle("AVS+ Padded2xLanczosResize taps=16", align=5)

d1 = Diff(avs_p,std)

StackHorizontal(StackVertical(std, avs_p), Stackvertical(d1, d1))

This returns the error: Access violation at line 10 and line 23.

Simple AddBorders also stopped working as expected: either no borders are added (the frame size is larger, as expected) and only a grey frame is returned, or an access violation occurs and a green frame (with some random pixels at the top) is returned:

BlankClip(100, 200, 100, color=$7F7F7F, pixel_type="YV12")

AddBorders(2, 2, 2, 2, r=2, param1=8)

return last

Image

It is still unstable, and I cannot even get a source to start testing the resizer.

Setting SetMaxCPU("none") looks like it starts to help.
Good news: the script with AddBorders(4, 4, 4, 4, r=2, param1=8) and a 2x LanczosResize (taps=16) finally looks like it returns a result equal to the 'padded resize':

SetMaxCPU("none")

Function Padded2xLanczosResize(clip c, int pad)
{
  padded=AddBorders(c,pad,pad,pad,pad, r=0)
  res_2x=LanczosResize(padded, padded.width*2, padded.height*2, taps=16)
  return Crop(res_2x,pad*2,pad*2,-pad*2,-pad*2)
}

Function Diff(clip src1, clip src2)
{
  return Subtract(src1.ConvertBits(8),src2.ConvertBits(8)).Levels(120, 1, 255-120, 0, 255, coring=false)
}

BlankClip(100, 200, 100, color=$7F7F7F, pixel_type="YV12")

#AddBorders(2, 2, 2, 2, r=2, param1=8)
AddBorders(4, 4, 4, 4, r=2, param1=8)

std=LanczosResize(width*2, height*2, taps=16).Subtitle("AVS+ Std 2xLanczosResize taps=16", align=5)

pad=50

avs_p=Padded2xLanczosResize(last, pad).Subtitle("AVS+ Padded2xLanczosResize taps=16", align=5)

d1 = Diff(avs_p,std)

StackHorizontal(StackVertical(std, avs_p), Stackvertical(d1, d1))

Image

The version with AddBorders(2, 2, 2, 2, r=2, param1=8) is not totally clean, but I am not sure whether a transient this short near the edge is 'ideal' enough. It is anyway much better than with the old AVS resampler.

Image

For 4:4:4 formats it allows creating a transient with r=3, and that also looks clean (in 8-bit) - very good. As noted in that ARIB document, a good-quality transient must be 6 samples long or more. The 4-sample size of r=2 may not be enough.

This also gives a result equal to the 'padded resize':

BlankClip(100, 200, 100, color=$7F7F7F, pixel_type="YV24")
AddBorders(3, 3, 3, 3, r=3, param1=8)

Image

Now we need the SIMD-enabled versions of the functions to be stable. I will try to test more in the next days.

pinterf commented Mar 19, 2025

Yep, C vs SIMD is work in progress; minor adjustments (like loop end limits) are needed, but in many places. I just wanted you to see the results when it works fine.

DTL2020 commented Mar 19, 2025

Well - what is the working syntax for negative src_width=, src_height= for PointResize?

I tested different values, but all failed. Only left and top padding-duplication works.

Right and bottom either start to crop or return a black frame:

  padded=PointResize(c, c.width, c.height, src_left=-pad, src_top=-pad, src_width=-(c.width+2*pad), src_height=-(c.height+2*pad))
  padded=PointResize(c, c.width, c.height, src_left=-pad, src_top=-pad, src_width=-pad, src_height=-pad)

With any values for src_width=, src_height=, the frame size is never even expanded by 2 x pad in width and height.

It looks like PointResize(c, c.width, c.height, ...) always returns an image of size width x height, and src_left/top/width/height cannot change the image size; they only shift the input image inside the same frame size. It looks like this way is not applicable if we need to expand the frame size by duplicating edge samples, and we need to use other plugins like FillBorders/FillMargins? Though those cannot increase the frame size either. Sad. This makes testing more complex: I need to check the edge sample value in some editor and use a manual AddBorders() with the same color value. Could we have a special mode for AddBorders with edge color (sample value) duplication? Maybe as a special command word for the 'color' parameter.

After some testing: it looks like the processing of the AddBorders(2,2,2,2,r=2) source is good enough. The difference arises because r=2 produces an edge sample with a value of about 2, which is not equal to the value=0 (color=black) used for the virtual border expansion in the current 'padded resize'. So it is a limitation of both the current test tools and the too-short transient. AddBorders(1,1,1,1,r=1) will make an even larger difference. It is probably worth adding to the documentation of AddBorders/LetterBox: the r=1 and r=2 transient half-lengths are not enough for best quality. This can be treated as a usual digital imaging issue: 'not enough data for the upsampling/decoding engine to understand the initial shape of a digitally encoded transient between 2 levels'. The same applies to unfiltered transients causing 'Gibbs ringing'. Here we have a smaller version of the same cause.

Currently users have a limitation for 4:2:0 and 4:2:2 formats: processing with an r=3 transient length is not supported and looks like it is truncated to 2:

ab_r3=AddBorders(2, 2, 2, 2, r=3, param1=8)
ab_r2=AddBorders(2, 2, 2, 2, r=2, param1=8)

return StackVertical(ab_r3, ab_r2, Diff(ab_r3, ab_r2))

Though we have a full-band Y plane, and r=3 could be applied to this plane (and r=2 to the UV planes) to make the quality somewhat better before resorting to r=4, which blurs more and/or changes the frame size more. Converting to 4:4:4 and back to get r=3 quality for the edge transients only would add more distortion over the whole frame area and would also be slower.

Current documentation of AddBorders (https://avisynthplus.readthedocs.io/en/latest/avisynthdoc/corefilters/addborders.html) says: the r radius is automatically adjusted with the video format subsampling requirements. AddBorders won't give an error if e.g. r=1 is given for a YV12 clip: due to the chroma subsampling it will be automatically promoted to 2.

But it does not mention the limiting of r=3 to r=2 for both Y and UV planes for YV12? For better quality, even with AddBorders(2, r=3) for YV12, it is better to use r=3 for the Y plane and r=2 for UV (though it looks like only r=1 is possible for the UV planes?).

The current documentation notes: "The exact dimensions can be a bit different, e.g. if r is larger than the actually added left border. Then the radius is reduced accordingly."

This may not be the best way, and we could allow an asymmetrical filtering result to be copied to the output frame for slightly better quality (needs checking) when a very small number of samples, like 1, 2 or 3, is used. So AddBorders(2, r=3) with r not truncated to 2 may give somewhat better quality than AddBorders(2, r=2). The same goes for AddBorders(1,r=2) vs AddBorders(1,r=1).

DTL2020 commented Mar 19, 2025

" This led to the need to handle variable kernel sizes at the beginning and end, allowing the resizing process to no longer restrict actual frame sizes. Thanks to this, I successfully removed all limitations that resulted in errors like "this width/height is too small for this resizer/support/whatever.""

Now you can try to remove the 'magic numbers' of +-10 from the parts of the input image that AddBorders crops and passes to the filtering engine, and replace them with +-(r+support/2). Or, if that is not enough, the maximum expected is (r+support), where support is the 'support' value returned by the currently selected and configured filtering kernel, via the member function

virtual double support() = 0;

With these changes the user can set the r parameter of AddBorders/LetterBox to sufficiently high values (also the p parameter of the Gauss kernel, and other kernels with large enough supports are possible) without getting a truncated kernel and/or a truncated, distorted convolution result when larger r values are required.
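As a minimal sketch of the proposed replacement for the fixed +-10, the helper below computes the margin from r and the kernel support. The function name and the full_support switch are hypothetical; in the real code the support value would come from the kernel's support() member.

```python
import math

def filter_margin(r, support, full_support=False):
    """Samples to hand to the filter beyond the border: r + support/2, or r + support."""
    extra = support if full_support else support / 2.0
    return int(math.ceil(r + extra))

print(filter_margin(2, 4.0))                     # r + support/2
print(filter_margin(3, 4.0, full_support=True))  # r + support
```

Rounding up keeps the margin an integer number of samples even when the kernel reports a fractional support.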

pinterf commented Mar 20, 2025

Please, try this one. I don't know why I said that the previous version didn't crash; I had lucky dimensions or so, hah. Almost nothing worked after that except the C, and even the C did not work in one scenario. I had a black vertical line in C mode and spent two hours on it; then it turned out that ConvertBits 16->8 in C mode was buggy ((0xFFFF + 128 rounder) >> 8 was 256, so it gave me a nice white line in 8 bit), which made me think the 1000x tested C resizer was buggy.
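The rounding overflow mentioned here is easy to reproduce in a few lines of Python. The clamped variant is only one possible fix, shown as an assumption; the thread does not say how the actual ConvertBits code was corrected.

```python
def to8_naive(v16):
    """16-bit to 8-bit with a +128 rounder and no saturation."""
    return (v16 + 128) >> 8

def to8_clamped(v16):
    """Same conversion, saturated to the 8-bit maximum (hypothetical fix)."""
    return min((v16 + 128) >> 8, 255)

print(to8_naive(0xFFFF))    # 256, does not fit in 8 bits
print(to8_clamped(0xFFFF))  # 255
print(to8_naive(0xFF7F))    # 255, the largest input that is still safe
```

Every 16-bit input from 0xFF80 upwards rounds to 256 in the unsaturated version, so the problem only appears on near-white samples.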

C, SSExx or AVX2, 8 bit, 10-16 bit and float, all must be OK now. Unless you find something.
https://drive.google.com/uc?export=download&id=1JSBPKsZLO7b0VIaE0fxirsBXR-CUrWCe

It's late here, so I'm gonna read the AddBorder stuff later.

DTL2020 commented Mar 20, 2025

Yes - it has become much more stable and can run with the default SetMaxCPU() (all SIMD enabled). At least in some short tests with YV12 and YV24 at a few frame sizes.

I found a better-working combination: a YV24 source and
AddBorders(3, 3, 3, 3, r=2, param1=8).ConvertToYV12()
for the r=2 filtered area size, and my first idea was to allow the transient placement for the Y plane to be measured in integer samples even for 4:2:x formats. But the next idea may be simpler and more universal: we can add a sub-sample shift parameter to AddBorders to control the transient placement in float coordinates (as we are in the 'moving pictures domain', and any object can have any float/real coordinate in the integer digital space of the frame without changing its shape/view, even as it moves).

Now, after filtering, the new transient is 'centered' between 2 samples. But we can add some shift to the kernel, which will shift the transient (by up to +-0.5 in integer coordinates) towards the inner or outer part of the frame. It may be the same as using src_top and src_left crop-shift params below 1.0. With a 'simple' implementation, though, it will also shift part of the frame, and that may (or may not) be visible. In a more 'complex' implementation we can shift only the transient's shape, leaving the original frame content in the filtered area unshifted. But that requires some thinking - maybe first do a sub-sample crop of the frame part of the transient to shift it in the opposite direction. And this cropping must be performed with an 'interpolating' filter kernel like sinc. It requires some experiments and thinking about the math. The simple implementation may be to only add a 'src_left/top' sub-shift param to the standard resampler and see whether it causes a more or less visible shift (and a possibly visible border at the edge where the filtered area is cut and pasted into the original frame). This may be a 'nice to have' feature for the next versions of AddBorders/LetterBox.
It is expected to make the quality of very narrow new borders like
AddBorders(2,2,2,2,r=2) for 4:2:x formats a bit better (if we sub-shift the newly created transient a bit towards the center of the frame).

DTL2020 commented Mar 20, 2025

Here is a drawing of what we have now with integer-only coordinates (and the coordinate granularity of 2 for 4:2:0 makes this even worse):

Image
The r=2 and some kernel shapes with the p-param (Gauss) cause the new edge sample to be non-zero, and this also adds to the discontinuity (if we expect to add a zero=0 border color).

If we can control the position of the transient in float/real coordinates, we can add some sub-sample shift to the transient shaped with the same kernel, and its edge sample (at coordinate 0) will be closer to zero (the required border color) without going to the much larger AddBorders(4,r=3) for 4:2:0 formats that we need now.

Image
It may be easier to understand (and control) in the LetterBox() filter - we simply need the ability to use float/real border widths. So no new params for LetterBox are required - only a change of left/top/right/bottom from integer to float. It looks simple in the description but may require more complex programming. And it will finally allow transients to be positioned as in full-blooded digital moving pictures, where objects can have any real-numbered coordinate in the integer digitally sampled frame. We will also be able to run Animate(LetterBox()) to move the newly added border transient slowly (at any slow speed) without distortions or the visibly stepped motion we have now with integer-only coordinates.
For the AddBorders() filter it may be more complex to control because it has a variable integer frame size. So the programmer needs to calculate the integer and fractional parts of the coordinates in some way.

The numbers in the example drawings (like LetterBox(2.5)) may not be completely correct, because in current versions it looks like we already have a -0.5 shift of the mid-level of the transient towards the edges of the frame. It may be equal to a 'real' LetterBox(1.5) instead of the 'requested' LetterBox(2.0).
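A toy Python model of the idea, not of the AddBorders implementation: the border transient is modelled as an ideal 0 to 1 step blurred by a Gaussian, and its mid-level position is allowed to be fractional. The sigma value is illustrative and is not derived from the p parameter; the function names are hypothetical.

```python
import math

def smoothed_step(x, edge, sigma):
    """Sample x of a 0 -> 1 step whose mid-level sits at `edge`, Gaussian blurred."""
    return 0.5 * (1.0 + math.erf((x - edge) / (sigma * math.sqrt(2.0))))

def border_samples(edge, sigma, count=6):
    """Values at integer sample positions 0..count-1, counted from the frame edge."""
    return [smoothed_step(x, edge, sigma) for x in range(count)]

def split_border(width):
    """Split a float border width into its whole-sample and fractional parts."""
    frac, whole = math.modf(width)
    return int(whole), frac

print(border_samples(2.0, 1.0)[0])  # outermost sample, integer placement
print(border_samples(2.5, 1.0)[0])  # outermost sample, shifted half a sample inward
print(split_border(2.5))            # (2, 0.5)
```

In this model, moving the mid-level half a sample towards the frame center brings the outermost sample noticeably closer to the zero border level without widening the border by a whole sample (or by two, for 4:2:0).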

The test script for the nicer future - smooth transient motion is expected instead of big steps in integer-only coordinates (also mod2 for 4:2:0):

BlankClip(100, 200, 100, color=$7F7F7F, pixel_type="YV12")

Animate(0, 100, "LetterBox",\
4,4,4,4,0,$008080,"gauss", 10, 2.71, 0, 2,\
10,10,10,10,0,$008080,"gauss", 10, 2.71, 0, 2)

LanczosResize(width*20, height*20, taps=16)

Currently it throws an error at frame 8: "LetterBox: YUV images must be divideable by 2 (left side)."

@DTL2020
Author

DTL2020 commented Mar 20, 2025

I read on the forum one more complaint about the AVS+ core resampler: it seems to apply an additional chroma sub-sample shift at downscale (or at any rescale?). https://forum.doom9.org/showthread.php?p=2016643#post2016643
Maybe it has the same source as the -0.5 offset of the center of the filtered transient that we have here? Though the 'offset' is relative to the assumed 'luma sample position in the sampling grid': is it 'center' or 'left' (for example)? I will try to compare with the fmtconv (and maybe avsresize) resampling engine, using a script-based emulation of AddBorders(2, r=2), to see how the center of the transient is positioned relative to the initial two-level transient from 1.0 to 0.0. If it is also shifted by 0.5 of a sampling step, it means the AVS+ resampler has (or had in 3.7.3 and older) such a shift relative to other plugins.

While starting to check the sample values, I seem to have found an issue with PointResize: it looks like it duplicates the left and top samples twice (shifting the picture to the right and bottom):

BlankClip(100, 200, 10, color=$7F7F7F, pixel_type="YV24")

flt=AddBorders(2, 2, 2, 2, r=2, param1=8).Crop(1,0,0,0)
uf=AddBorders(2, 2, 2, 2, r=0, param1=8).Crop(1,0,0,0)
StackVertical(flt, uf)

PointResize(width*16, height*16)

Output as seen in VirtualDub at 400% zoom:

Image

@pinterf

pinterf commented Mar 21, 2025

I've prepared another build for you to try. This version includes significant code cleanup and optimization. It has been changed a lot, mainly the 8 bit horizontals.

https://drive.google.com/uc?export=download&id=16erBvpSd6SynlLMtyGBwjGHTitgqhUy_

From now on, I'll focus on making the code ready for commits.
And of course the fixes, if needed.

Btw you mentioned the magic 10 (+/-10) which I set as a minimum for the transient filtering width, do I have to do anything about that?

@DTL2020
Author

DTL2020 commented Mar 21, 2025

Thank you for the next build. I will try it in the next few days.

Btw you mentioned the magic 10 (+/-10) which I set as a minimum for the transient filtering width, do I have to do anything about that?

It is too small if the user wants to set a larger gauss filter radius/support/size (like AddBorders(20, r=10, param1=1)) to get long soft borders. You can try setting it either to (r+support) or at least to a conditional max(10, r+support), where 'support' is the return value from the ResamplingFunction class (after the constructor has initialized its members and the support for the gauss kernel has been calculated in auto mode). Needs testing.

@DTL2020
Author

DTL2020 commented Mar 21, 2025

https://drive.google.com/uc?export=download&id=16erBvpSd6SynlLMtyGBwjGHTitgqhUy_

Google shows a 404 error for the link. Have you deleted that file?

@pinterf

pinterf commented Mar 22, 2025

Don't know why. Reuploaded a new build, (with the max(10, r, support()) addition)
https://drive.google.com/uc?export=download&id=1uzGK_ek0oFrpXKrVF1pKxMmLBcoFp_Aj

@DTL2020
Author

DTL2020 commented Mar 22, 2025

One more bug found (with previous build while testing):

BlankClip(100, 200, 20, color=$7F7F7F, pixel_type="YV12")
r1=2
left=20
std=AddBorders(left, 0, 0, 0, r=r1).SubTitle("AVS 3.7.4", align=5)
return std

returns a grey field. No added border and no subtitle.

With BlankClip(100, 200, 20, color=$7F7F7F, pixel_type="YV24") (YV24 input format) it works.

@DTL2020
Author

DTL2020 commented Mar 22, 2025

(with the max(10, r, support()) addition)

The minimum number of input samples the filter needs in order to produce +-(r) filtered output samples is expected to be +-(r+support). So the half-width of the cropped area to filter should be
max(10, r+support()) or simply r+support

not the max of (10, r, support), which is how I read max(10, r, support()).

Since the returned support is a double, it is better to round it up to the next integer, as is done in the 'filter_size' calculation, so as not to miss anything possibly useful:

int fir_filter_size = int(ceil(filter_support * 2));

So it is better to use r + int(ceil(support)).
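
To make the difference between the two readings concrete, here is a small numeric sketch in Python. The function names are made up for illustration and are not from the AVS+ source; the minimum of 10 is the 'magic 10' discussed above.

```python
import math

def filter_half_width(r: int, support: float, minimum: int = 10) -> int:
    """Proposed half-width of the area to filter: r plus the support
    rounded up to the next integer, but never less than `minimum`."""
    return max(minimum, r + math.ceil(support))

def filter_half_width_as_read(r: int, support: float, minimum: int = 10) -> int:
    """The max(10, r, support()) reading, kept only for comparison."""
    return max(minimum, r, math.ceil(support))

# Large radius and support: the two variants differ by the rounded-up support.
print(filter_half_width(20, 4.0), filter_half_width_as_read(20, 4.0))
# Small radius and support: both fall back to the minimum.
print(filter_half_width(2, 2.01), filter_half_width_as_read(2, 2.01))
```

For small r both variants return the same minimum, so the difference only shows up with large radii or large kernels.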

Reuploaded a new build, (with the max(10, r, support()) addition)
https://drive.google.com/uc?export=download&id=1uzGK_ek0oFrpXKrVF1pKxMmLBcoFp_Aj

This download link works now. Downloaded, will test more now.

@pinterf

pinterf commented Mar 22, 2025

Thanks, I get it now. I will check your report as well in the evening session.

@DTL2020
Author

DTL2020 commented Mar 22, 2025

For 'very low' supports like 2.01, int(ceil(2.01)) returns 3, which gives better protection against missing something useful in the input to the filtering process. As I understand it (currently), the convolution does not read input samples farther than +-support (rounded up to the next integer) from the current output sample, so having more samples in the input does not change the output. And for a filter radius r, the maximum distance to the input samples used is r+support (on each side of the processing).

Addition: tested with the build from 210325. The script

BlankClip(100, 200, 20, color=$7F7F7F, pixel_type="YV12")
r1=2
left=20
std=AddBorders(left, 0, 0, 0, r=r1).SubTitle("AVS 3.7.4", align=5)
return std

now works as expected with both YV12 and YV24 formats.

@DTL2020
Author

DTL2020 commented Mar 22, 2025

While testing transient samples output with:

LoadPlugin("fmtconv.dll")

Function AddBordersHF(clip c, int left, int right, int flt_rad)
{
  unflt=AddBorders(c, left, 0, right, 0)
#  flt=GaussResize(unflt, unflt.width, unflt.height, p=10, b=2.71828, s=0, force=1)
  flt=GaussResize(unflt, unflt.width, unflt.height, p=10, b=2.0, s=4, force=1)
  uf_internal=Crop(c, flt_rad, 0, c.width-flt_rad*2, c.height)
  return Overlay(flt, uf_internal, x=left+flt_rad, y=0)
}

Function AddBordersHF_FMTC(clip c, int left, int right, int flt_rad)
{
  unflt=AddBorders(c, left, 0, right, 0)
  flt=fmtc_resample(unflt, w=unflt.width, h=unflt.height, kernel="gauss", taps=10, a1=10, sx=0.000001).ConvertBits(8)
  uf_internal=Crop(c, flt_rad, 0, c.width-flt_rad*2, c.height)
  return Overlay(flt, uf_internal, x=left+flt_rad, y=0)
}

Function Diff(clip src1, clip src2)
{
  return Subtract(src1.ConvertBits(8),src2.ConvertBits(8)).Levels(120, 1, 255-120, 0, 255, coring=false)
}

BlankClip(100, 200, 20, color=$7F7F7F, pixel_type="YV12")

r1=2

left=20

#std=AddBorders(left, 0, 0, 0, r=r1).SubTitle("AVS 3.7.4", align=5)
std=AddBorders(left, 0, 0, 0, r=r1, param1=10, param2=2.0, param3=4.0).SubTitle("AVS 3.7.4", align=5)

#return std

s1=AddBordersHF(last, left, 0, r1).SubTitle("AVS s1", align=5)

s2=AddBordersHF_FMTC(last, left, 0, r1).SubTitle("FMTC p=10", align=5)

StackVertical(std, s1, s2)

PointResize(width*10, height*10)

Good results:

  1. All 3 methods of transient filtering (with b=2.0 and support=4), i.e. AVS+ scripting, AVS+ core AddBorders, and scripting with the FMTC gauss resize with equal kernel params, give an equal sample sequence at the transient from 0 to 127:
    0, 3, 34, 93, 123, 127
    This also means the assumed 'Y-sample position' is equal between AVS+ and FMTC, and between builds r4246 and 21.03.25. The transient is centered between the 0 and 127 input samples (the 0.5 level of the transient is located in between samples).

A result of uncertain quality:
With build 21.03.25 the whole image looks shifted toward the bottom and right in comparison with the r4246 resampler result:

Image

Image
Also the chroma position relative to luma does not seem to match. If this shift is not an expected consequence of the updated resampler engine, it needs to be fixed for compatibility with old versions.

@pinterf

pinterf commented Mar 22, 2025

OK, it must be the image center calculation; there is a +0.5 inside until I figure out how it matches exactly with old AVS and the others. I will check it so that it matches exactly as expected.

@DTL2020
Author

DTL2020 commented Mar 22, 2025

Now, what we want in the future for free float-precision control of the transient position (relative to the integer sampling grid): the example AddBordersHF_Shift() handles only the left border, but all 4 borders are required.

Function AddBordersHF(clip c, int left, int right, int flt_rad)
{
  unflt=AddBorders(c, left, 0, right, 0)
  flt=GaussResize(unflt, unflt.width, unflt.height, p=10, b=2.71828, s=0, force=1)
  uf_internal=Crop(c, flt_rad, 0, c.width-flt_rad*2, c.height)
  return Overlay(flt, uf_internal, x=left+flt_rad, y=0)
}

Function AddBordersHF_Shift(clip c, int left, int right, int flt_rad, float h_shift)
{
  unflt=AddBorders(c, left, 0, right, 0)
  flt=GaussResize(unflt, unflt.width, unflt.height, src_left=h_shift, p=10, b=2.71828, s=0, force=1)
  uf_internal=Crop(c, flt_rad, 0, c.width-flt_rad*2, c.height)
  return Overlay(flt, uf_internal, x=left+flt_rad, y=0)
}

BlankClip(100, 200, 20, color=$7F7F7F, pixel_type="YV12")

r1=2

left=20

s1=AddBordersHF(last, left, 0, r1).SubTitle("AVS s1", align=5)

s2=AddBordersHF_Shift(last, left, 0, r1, -0.5).SubTitle("AVS sh -0.5", align=5)

StackVertical(s1, s2)

#PointResize(width*10, height*10)
LanczosResize(width*10, height, taps=16)
PointResize(width, height*10)

The current non-shifted transient sequence between 0 and 127 (with p=10 and b=e for the gauss kernel) is
0, 1, 28, 99, 126, 127
The sequence shifted by 0.5 of a sampling grid step is
0, 0, 7, 63, 119, 127 - here the 0.5 level is centered on the middle sample, which gets the value (127+0)/2≈63

Results with the sinc-based LanczosResize are about equal in shape, as expected (there is a small deviation in residual ringing because p=10 for the gauss kernel is not perfect even for 8 bit; lower values will give less ringing):

Image
Using a combination of such shifted transient shaping and integer coordinate stepping, we can finally get full float control over the position of the transient, both for the border added by AddBorders() and for the new border created inside the original frame by LetterBox(). But AddBorders looks like it needs an additional sub-sample shift param (or some attempt to make the size params floats?). In LetterBox() it is completely natural to measure the new border as a float from the outer frame edges inward, so it only requires changing the integer left/right/top/bottom params to floats. But how easy it will be to implement a better LetterBox as a combination of Crop+AddBorders requires some thinking. Maybe it is even easier to make a completely separate LetterBox() with float coordinates first.
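
The effect of the half-sample shift on the two sequences quoted above can be checked with a small floating-point model in Python. This is only a sketch under stated assumptions, not the AVS+ resampler: the kernel shape b ** (-(p/10) * x**2), the support of 4 and the edge replication are assumptions, and the names gauss_weight and resample_1to1 are made up. The real engine works with quantized coefficients, so its 8-bit output can differ by a code value.

```python
import math

def gauss_weight(x, p=10.0, b=math.e):
    # Assumed kernel shape, not taken from the AVS+ source.
    return b ** (-(p / 10.0) * x * x)

def resample_1to1(src, shift=0.0, support=4.0, p=10.0, b=math.e):
    """1:1 'resize' of one row: output n samples the source at n + shift."""
    out = []
    last = len(src) - 1
    for n in range(len(src)):
        center = n + shift
        acc = 0.0
        norm = 0.0
        for k in range(math.floor(center - support), math.ceil(center + support) + 1):
            if abs(center - k) > support:
                continue                          # outside the kernel support
            w = gauss_weight(center - k, p, b)
            acc += w * src[min(max(k, 0), last)]  # replicate edge samples
            norm += w
        out.append(acc / norm)                    # normalized weighted sum
    return out

row = [0] * 20 + [127] * 20         # two-level transient between index 19 and 20
plain = resample_1to1(row)           # no shift
shifted = resample_1to1(row, -0.5)   # same role as src_left=-0.5 in the script

print([round(v) for v in plain[17:23]])       # unshifted: 0, 1, 28, 99, 126, 127 in this model
print([round(v, 2) for v in shifted[17:23]])  # shifted: middle sample sits at the 0.5 level
```

In this model the unshifted row rounds to the same sequence as reported, and the shifted row puts exactly 63.5 on the middle sample, so the reported 63 and 7 are consistent with integer rounding inside the real resampler.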

@pinterf

pinterf commented Mar 22, 2025

One more bug found (with previous build while testing):

BlankClip(100, 200, 20, color=$7F7F7F, pixel_type="YV12")
r1=2
left=20
std=AddBorders(left, 0, 0, 0, r=r1).SubTitle("AVS 3.7.4", align=5)
return std

returns a grey field. No added border and no subtitle.

With BlankClip(100, 200, 20, color=$7F7F7F, pixel_type="YV24") (YV24 input format) it works.

In 20250318 it was wrong, since 20250320 it's OK.

@DTL2020
Author

DTL2020 commented Mar 22, 2025

Reuploaded a new build, (with the max(10, r, support()) addition)

It really helps with big gauss kernels and big blur radii.

Function AddBordersHF(clip c, int left, int right, int flt_rad, float param1)
{
  unflt=AddBorders(c, left, 0, right, 0)
  flt=GaussResize(unflt, unflt.width, unflt.height, p=param1, b=2.71828, s=0, force=1)
  uf_internal=Crop(c, flt_rad, 0, c.width - flt_rad*2, c.height)
  return Overlay(flt, uf_internal, x=left+flt_rad, y=0)
}

Function AddBordersVF(clip c, int top, int bottom, int flt_rad, float param1)
{
  unflt=AddBorders(c, 0, top, 0, bottom)
  flt=GaussResize(unflt, unflt.width, unflt.height, p=param1, b=2.71828, s=0, force=2)
  uf_internal=Crop(c, 0, flt_rad, 0, c.height - flt_rad*2)
  return Overlay(flt, uf_internal, x=0, y=top+flt_rad)
}

Function Diff(clip src1, clip src2)
{
  return Subtract(src1.ConvertBits(8),src2.ConvertBits(8)).Levels(120, 1, 255-120, 0, 255, coring=false)
}

ColorBarsHD(2000,2000)
UserDefined2Resize(width/10, height/10)

#TurnRight()

# filtering area, means +/- around the border boundaries
r1=20
r2=20

left=20
top=20
right=20
bottom=20

param=0.25

std=AddBorders(left, top, right, bottom, r=r1, param1=param)
a=last
a=AddBordersHF(a, left, right, r1, param)
a=AddBordersVF(a, top, bottom, r1, param).SubTitle("Scriptbased", align=5)

b=last
b=AddBorders(b, left, top, right, bottom, param1=param, param2=2.71828, param3=0, r=r2).SubTitle("AVS 3.7.4", align=5)

d1 = Diff(a,b)
d2 = Diff(std,a)
d3 = Diff(std,b)

StackHorizontal(StackVertical(std, a, b), Stackvertical(d1, d2, d3))

LanczosResize(width*4, height*2, taps=16)

Image
Good thing: the latest build with 'auto-size' of the filtered area works better.

Not so good finding: the script-based filtering turned out to be worse in quality than the internal AddBorders() when r and the added width are big enough. At least for a ColorBarsHD() source.

The difference is most visible with params set like
r1=20
r2=20

left=20
top=20
right=20
bottom=20

param=0.25

Image

And it does not depend on the direction (the colorbars are not equal in H and V: if we TurnRight the colorbars, the distortion will be at the top and bottom borders).
Let's assume this method is good enough for small transients and may not be smooth enough for large-size filtering with a large low-pass kernel.

@pinterf

pinterf commented Mar 22, 2025

Next build. If it really works, it was worth digging into the resizer update project, namely taking the chroma position into account. Personally I have not run into the problem, but I remembered it, and your link shows one such issue. Check the readme_history inside for details.

In brief:

Now that I understand the reason for the chroma shifts and the "keep center" behavior, they can be parameterized or applied automatically.

Until now Avisynth resized the chroma with the same center position as luma (0.5, 0.5).

New Parameters:

  • "placement": Specifies chroma placement, with options such as "auto", "mpeg2", "center", etc.,
    similar to ConvertToXXXX and Text. The default is "auto", which reads the frame property
    _ChromaLocation for 420, 422, and 411 formats.

  • "keepc" (boolean, default: true): Determines whether to "keep center".
    If true, the chroma shift from "placement" is now considered when resizing chroma.

    GaussResize(width*3, height*3) # keepc=true, placement="auto"
    GaussResize(width*3, height*3, placement="auto") # the new default, read frame props
    GaussResize(width*3, height*3, placement="bottom") # center, top, etc.: visible differences in chroma
    GaussResize(width*3, height*3, placement="center") # legacy Avisynth worked like this
    GaussResize(width*3, height*3, keepc=false) # don't keep pixel center, not even for the luma
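
Why the placement matters for resizing can be illustrated with a short coordinate derivation in Python. This is a sketch of the general geometry only, not the code in this build: it assumes the horizontal axis, 2x chroma subsampling and "left" (mpeg2) placement, where chroma sample j is co-sited with luma pixel 2j. All function names are made up.

```python
def center_map(j_out, scale):
    """Pixel-center mapping (legacy behavior): output index -> source index."""
    return (j_out + 0.5) / scale - 0.5

def chroma_left_map(j_out, scale):
    """Mapping for left-sited, 2x subsampled chroma, derived via luma units."""
    pos_out = 2 * j_out + 0.5   # position of the output chroma sample in luma units
    pos_src = pos_out / scale   # the same picture point in the source frame
    return (pos_src - 0.5) / 2  # back to a (fractional) source chroma index

def extra_chroma_shift(src_w, dst_w):
    """Closed form of chroma_left_map - center_map, in chroma samples."""
    return 0.25 * (1.0 - src_w / dst_w)

# 2x upscale: left-sited chroma needs a small extra source offset.
print(chroma_left_map(5, 2.0) - center_map(5, 2.0), extra_chroma_shift(100, 200))
# 6x downscale: the offset is more than one chroma sample.
print(chroma_left_map(3, 1 / 6) - center_map(3, 1 / 6), extra_chroma_shift(600, 100))
```

The offset is zero at 1:1 and grows with the downscale ratio, which would fit the observation that the problem is reported mostly at downscale.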

AddBorders, LetterBox

The internal blurring resizers take the chroma placement into account (_ChromaLocation frame prop).
I don't know if it has a visible effect, but I think it's better and more correct than without it.

The resizer area is +/- (r + ceil(filter.support()))

https://drive.google.com/uc?export=download&id=1TCrWka9pI3GmAW81aoyGOFtZDZ5678QV

@pinterf

pinterf commented Mar 22, 2025

And yes, there is a difference. I chose a format whose default placement is not center.

ColorBarsHD(1024,768)
ConvertToYV16() # frame prop is set to "left"
# PropSet("_ChromaLocation", 0) # 0 left / mpeg2
propShow()
a=GaussResize(width/6, height/6, placement="center") # former Avs mode, center only. e.g. Right blue bar is washed out
b=GaussResize(width/6, height/6, placement="left") # left and auto are the same, since frame property is "left"
c=GaussResize(width/6, height/6, placement="auto") # or omit placement, its default is "auto"
Interleave(a,b,c)
/*
  AVS_CHROMA_UNUSED = -1,
  AVS_CHROMA_LEFT = 0,
  AVS_CHROMA_CENTER = 1,
  AVS_CHROMA_TOP_LEFT = 2,
  AVS_CHROMA_TOP = 3,
  AVS_CHROMA_BOTTOM_LEFT = 4,
  AVS_CHROMA_BOTTOM = 5,
  AVS_CHROMA_DV = 6 // Special to Avisynth
*/

@DTL2020
Author

DTL2020 commented Mar 23, 2025

The resizer area is +/- (r + ceil(filter.support()))

Looks like it works well. Even the output of the script-based AddBordersHF() with filtered transients is now equal to AddBorders() with very large r and a low p param. It looks like something else was changed as well.
