Rework GPU #58

SwissalpS · 2024-04-28T18:19:10Z

Refactor of the GPU code to avoid code duplication and make the code generally cleaner.
Removes some bugs and changes behaviour only slightly.

Fewer aborts, as sizes and coordinates are now clamped.
E.g. when copy/pasting sections, instead of aborting because of overflow, the copied area is adjusted to fit the real estate of both source and destination.
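
As a rough illustration of the clamping described above, here is a minimal sketch. It is not the code from gpu.lua: the helper names `clamp` and `clamp_copy` are hypothetical, and forcing a zero or negative size up to 1 is an assumption. Only the `xsize`/`ysize` field names are borrowed from the buffers in this PR.

```lua
-- Hypothetical helpers, not taken from gpu.lua.
local function clamp(v, lo, hi)
  return math.max(lo, math.min(hi, v))
end

-- Adjust a copy request so that it fits inside both buffers.
-- src and dst only need xsize and ysize fields here.
local function clamp_copy(src, dst, sx, sy, dx, dy, w, h)
  -- Pull both origins into their buffer (coordinates are 1-based).
  sx = clamp(sx, 1, src.xsize)
  sy = clamp(sy, 1, src.ysize)
  dx = clamp(dx, 1, dst.xsize)
  dy = clamp(dy, 1, dst.ysize)
  -- Shrink the area to whatever room is left in source AND destination.
  -- Assumption: a size below 1 is raised to 1 instead of aborting.
  w = clamp(w, 1, math.min(src.xsize - sx + 1, dst.xsize - dx + 1))
  h = clamp(h, 1, math.min(src.ysize - sy + 1, dst.ysize - dy + 1))
  return sx, sy, dx, dy, w, h
end

-- A 12x12 copy from (10,10) in a 16x16 buffer to (5,5) in an 8x8 buffer
-- is reduced to 4x4, limited by the room left in the destination.
local src = { xsize = 16, ysize = 16 }
local dst = { xsize = 8, ysize = 8 }
print(clamp_copy(src, dst, 10, 10, 5, 5, 12, 12))
```

With clamping of this kind, an oversized request degrades to a smaller copy rather than being rejected, which matches the "fewer aborts" behaviour described above.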

Fixes #45
Hopefully also fixes #44

SwissalpS · 2024-04-30T22:44:56Z

Output from observer mooncontroller:

In the observer mooncontroller I have this code:

local sET = event.type

if 'digiline' == sET then

  local sO

  mem.sBM = mem.sBM .. '\n'
  for y, t in ipairs(event.msg) do
    mem.sBM = mem.sBM .. '\n'
    for x, s in ipairs(t) do

      sO = 'f' == s:sub(1, 1) and '0' or '1'
      mem.sBM = mem.sBM .. sO

    end
  end

elseif 'terminal' == sET then

  print(mem.sBM)

elseif 'program' == sET then

  mem.sBM = ''

end

In master lua/moon-controller I have:

if 'program' == event.type then

  local d = digiline_send

  local tDB = {
    { 1, 1, 5, 5 },
    { 1, 1, 1, 5 },
    { 1, 1, 7, 9 },
    { 1, 1, 1, 11 },
    { 1, 1, 4, 5 },
  }

  local sI, iB
  for i, t in ipairs(tDB) do

    sI = tostring(i)
    iB = i - 1
    d('g', {
      {
        command = 'createbuffer',
        buffer = iB,
        xsize = t[3],
        ysize = t[4],
        fill = 'ffffff',
      },
      {
        command = 'drawline', --rect',
        buffer = iB,
        x2 = t[3],
        y2 = t[4],
        x1 = t[1],
        y1 = t[2],
        color = 'aa00aa'
        --edge = "332211",
        --fill = "010101",
        --antialias = true
      },
      {
        command = 'sendregion',
        buffer = iB,
        x2 = t[3],
        y2 = t[4],
        x1 = t[1],
        y1 = t[2],
        channel = sI,
      },
    })
    
  end
end

---- Original message ----
How does the new drawline implementation handle these 5 simple test cases:
1,1 to 5,5
1,1 to 1,5
1,1 to 7,9
1,1 to 1,11
1,1 to 4,5
... when those are also the corners of the buffer?
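
For comparing against the observer output, the five cases can also be reproduced outside the game. The sketch below uses a generic integer Bresenham line with both endpoints drawn; it is not the algorithm from gpu.lua, and `line_points` and `render` are hypothetical names. It prints '1' for line pixels and '0' for untouched pixels, like the observer code above.

```lua
-- Generic integer Bresenham line, endpoints inclusive.
-- NOT the implementation from gpu.lua, only a reference for comparison.
local function line_points(x1, y1, x2, y2)
  local points = {}
  local dx, dy = math.abs(x2 - x1), math.abs(y2 - y1)
  local sx = x1 < x2 and 1 or -1
  local sy = y1 < y2 and 1 or -1
  local err = dx - dy
  local x, y = x1, y1
  while true do
    points[#points + 1] = { x, y }
    if x == x2 and y == y2 then break end
    local e2 = 2 * err
    if e2 > -dy then err = err - dy; x = x + sx end
    if e2 < dx then err = err + dx; y = y + sy end
  end
  return points
end

-- Draw the points into a w x h bitmap of '0' and '1' characters.
local function render(w, h, points)
  local rows = {}
  for y = 1, h do
    local row = {}
    for x = 1, w do row[x] = '0' end
    rows[y] = row
  end
  for _, p in ipairs(points) do rows[p[2]][p[1]] = '1' end
  for y = 1, h do rows[y] = table.concat(rows[y]) end
  return table.concat(rows, '\n')
end

-- Same five cases as tDB: the line runs corner to corner of each buffer.
local cases = {
  { 1, 1, 5, 5 },
  { 1, 1, 1, 5 },
  { 1, 1, 7, 9 },
  { 1, 1, 1, 11 },
  { 1, 1, 4, 5 },
}
for _, t in ipairs(cases) do
  print(render(t[3], t[4], line_points(t[1], t[2], t[3], t[4])) .. '\n')
end
```

A different but equally valid line algorithm may choose different intermediate pixels, so only the endpoints and the pixel count along the longer axis are expected to agree.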

TheEt1234 · 2024-05-18T15:07:43Z

From issue #44:

The original case that started the issue is fixed, and so is the (65, 1, 1, 1) case when the buffer size is 63x63.

So the line bug is most likely patched

SwissalpS · 2024-05-18T20:22:51Z

Thanks for testing.

It has been three weeks and nobody has asked for any other fixes, so I've marked this PR as ready for review/merge.

BuckarooBanzay

code changes look good but i haven't tested anything (i haven't used that part of the mod that much anyway :D)

cheapie · 2024-09-19T01:31:45Z

I tested this with a few of my more demanding programs and it seems to be working fine, but I didn't explicitly try to break it so I can't comment on the validation parts... not that the validation was much good in my original.

I think it might also be a little faster, although I've updated MT since the last time I did much performance testing - I was able to hit 56 FPS in my donut ad (32x32, copies about three 16x16 areas per frame) and 25 FPS in my wireframe 3D cube (64x64, 12 lines drawn per frame) when outputting to digiscreen, of course with Luacontroller overheating disabled.

TheEt1234 · 2024-10-24T18:28:37Z

gpu.lua

-			local packeddata = ""
-			for y=1,buffer.ysize,1 do
-				for x=1,buffer.xsize,1 do
-					packeddata = packeddata..packpixel(buffer[y][x])


i think it would be laggy to do this with string concatenation (though that remains to be tested)

edit: i put up the comment in the wrong (but still relevant) place, sorry

I suggest (pseudocode):

local packeddata = {}
for y = 1, buffer.ysize do
  for x = 1, buffer.xsize do
    table.insert(packeddata, packpixel(buffer[y][x]))
  end
end
packeddata = table.concat(packeddata)
-- rest of the code can work as normal

I doubt there would be much difference either way, not that I've observed a sendpacked command taking non-negligible time anyway.

I agree, the loops are so short I doubt a difference can be noticed. That's partly why I hadn't touched it.

However, I am considering adding the suggested change for "good practice" reasons. It might be faster without string concatenation, but there will probably be more overhead from constructing the table and then looping over it again in table.concat :D

packpixel() also uses string concatenation. I don't want to go and change that too without actual proof that there is a benefit. If we really want to optimize at that level, we'd better not call packpixel() in the loop but put that code directly in the loop.

(I'm assuming you ran that benchmark multiple times, because single runs of this type aren't sufficient)

I did originally test it only a single time, i wasn't aware i needed to test multiple times but i have now tested it multiple times:

(Everything is in seconds)
The repeated concatenation (tested 8 times):
lowest - 0.000522
highest - 0.001384

The table concatenation (tested 7 times):
lowest - 0.000055
highest - 0.000289

So i think it's pretty clear that table.concat is faster than string concatenation if you have a lot of stuff you want to concat

Me explaining the tests

The table i was testing it on was a 1D array with 1000 elements.
i was basically testing if:

local ret = {}
for i = 1, 1000 do ret[i] = t[i] end
return table.concat(ret)

was faster than

local ret = ""
for i = 1, 1000 do ret = ret .. t[i] end
return ret

Thanks. One slight difference:
your test does:

t[i] = x

while the actual code uses a function:

table.insert(t, x)

Yeah, table.concat is clearly faster for 1000 strings, even with the overhead for the table, creating and destroying one table is faster than creating and destroying all those intermediate strings.

It might be worth benchmarking the actual code though, because that should perform at most 255 string concatenations (max 64x64 image size), which might be a lot closer in performance, or even possibly slower.
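
A benchmark along those lines could look like the sketch below. It is a standalone harness for a plain Lua or LuaJIT interpreter, not a measurement of gpu.lua itself. The function names are hypothetical, and the 4-character dummy element standing in for a packed pixel is an assumption. The sizes 255 and 1000 come from the discussion above; 4096 is simply 64 * 64.

```lua
-- Build n dummy strings; 'abcd' is an assumed stand-in for packpixel() output.
local function build_input(n)
  local t = {}
  for i = 1, n do t[i] = 'abcd' end
  return t
end

-- Variant 1: repeated concatenation with the .. operator.
local function with_operator(t)
  local ret = ''
  for i = 1, #t do ret = ret .. t[i] end
  return ret
end

-- Variant 2: collect by index, then table.concat.
local function with_index(t)
  local ret = {}
  for i = 1, #t do ret[i] = t[i] end
  return table.concat(ret)
end

-- Variant 3: collect with table.insert, then table.concat.
local function with_insert(t)
  local ret = {}
  for i = 1, #t do table.insert(ret, t[i]) end
  return table.concat(ret)
end

-- Run fn(t) several times and report the lowest and highest CPU time.
local function bench(fn, t, runs)
  local lowest, highest = math.huge, 0
  for _ = 1, runs do
    local start = os.clock()
    fn(t)
    local elapsed = os.clock() - start
    lowest = math.min(lowest, elapsed)
    highest = math.max(highest, elapsed)
  end
  return lowest, highest
end

local variants = {
  { 'operator ..', with_operator },
  { 'index + table.concat', with_index },
  { 'table.insert + table.concat', with_insert },
}
for _, n in ipairs({ 255, 1000, 64 * 64 }) do
  local t = build_input(n)
  for _, v in ipairs(variants) do
    local lowest, highest = bench(v[2], t, 100)
    print(string.format('n=%d %s: lowest %.6f highest %.6f', n, v[1], lowest, highest))
  end
end
```

Since os.clock() can be coarse at these durations, individual runs may report 0; timing a batch of calls per sample would give steadier numbers.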

If the longest the old method took was half a millisecond with 4x the maximum data size it would usually handle and it's for a command this infrequently-used... I think you'd be hard-pressed to call any variant here "slow".

(random thought experiment, though: at what point does the discussion here end up using more CPU time than the change would actually ever save?)

Some players build big. Arrays upon arrays of RGB-glass etc. ;)
But yes, we are splitting small hairs here.

SwissalpS · 2024-11-04T18:41:23Z

Please re-test. I didn't have the time to do so.

SwissalpS added 7 commits April 28, 2024 19:58

whitespace changes

de616b4

drop DIR_DELIM usage

6cb3d3d

read and write buffer functions

23fdce5

reducing repetitive code and some more whitespace changes

on_receive_fields refactor

5233b04

TODO: check formname JIC

add validate_color method

07bd563

to reduce repetative code

slightly different color behaviour on read

6f81f96

more validation methods and code repetition removals

48c95ef

fix signature error

607df7a

This was referenced Apr 28, 2024

2nd crash w/ gpu #45

Open

crash w/ gpu #44

Open

SwissalpS added 3 commits April 28, 2024 22:57

draw lines inclusive (first and last)

40de591

zero or negative length lines don't exist anymore

337be49

comment on line algo

ba1e2a2

SwissalpS added the needs testing Possible compatibility issues label Apr 28, 2024

update comment

f91b6dc

SwissalpS marked this pull request as ready for review May 18, 2024 20:20

BuckarooBanzay approved these changes May 23, 2024

TheEt1234 reviewed Oct 24, 2024

SwissalpS added 2 commits October 26, 2024 10:04

Merge remote-tracking branch 'mtm/master' into reworkGPU

13cc1fd

use table.concat instead of string.concat

cc0e860

when packing packed pixels

SwissalpS force-pushed the reworkGPU branch from e021570 to cc0e860 Compare November 4, 2024 18:38
