Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

optimize createFpAndTrim with shrinkMutableByteArray# #576

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

oberblastmeister
Copy link

Addresses #549

@clyring
Copy link
Member

clyring commented Feb 16, 2023

It looks like GHC indeed behaves poorly when shrinking large MutableByteArray#s. I've created GHC issue 22994 to track that upstream.

@@ -1023,6 +1028,12 @@ memcpy p q s = void $ c_memcpy p q (fromIntegral s)
memcpyFp :: ForeignPtr Word8 -> ForeignPtr Word8 -> Int -> IO ()
memcpyFp fp fq s = unsafeWithForeignPtr fp $ \p ->
unsafeWithForeignPtr fq $ \q -> memcpy p q s

shrinkFp :: ForeignPtr Word8 -> Int -> IO ()
shrinkFp (ForeignPtr _ (PlainPtr marr)) (I# l#) =
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally I would rather not depend on the implementation details of base that mean we always get PlainPtr here, even though the alternative is touching the MutableByteArray# primitives ourselves locally.

Thoughts? @Bodigrim @sjakobi

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd write shrinkFp _ _ = pure () instead of error, but otherwise looks sane to me as is.

if assert (0 <= l' && l' <= l) $ l' >= l
then return $! BS fp l
else
if l < 4096
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This condition is the main tricky bit.

  • Large MutableByteArray#s do not return their memory to the RTS when we shrink them in-place. But they do return their memory to the RTS when we make a copy.
  • Small pinned MutableByteArray#s may well retain all of their memory regardless of whether we shrink in place or create a copy, because of how they are currently allocated in GHC. (There are probably several GHC issues related to this topic. See for example this one and this one. It's not clear if or when this will be improved.)

So we want to make a copy when:

  • We expect the underlying buffer to be a large heap object, and
  • the amount of space shrinking would leak is large enough to care about.

If I recall correctly, the exact threshold for when a buffer is allocated as a large heap object depends on both the platform and the GHC version, but has been somewhere around 3K for 64-bit systems and half that for 32-bit systems for some time now.

(This reasoning should be documented in the code.)

@Bodigrim
Copy link
Contributor

@oberblastmeister could you please rebase?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants