Renamed to ASTSize, changed to Maybe CoverageIndex #6081

Open
wants to merge 1 commit into
base: master

@bezirg bezirg commented May 22, 2024

My initial intention was to stop using the AST size in Code.sizePlc, because it may be confusing,
and to use the flat byte size instead.

Instead, I renamed the Size-related functions and modules to ASTSize to make it clearer what this size is supposed to be.

I will do the real switch to flat/CBOR size in another PR.

Also CoverageIndex had 2 problems:

  1. It was not wrapped in Maybe (for when the coverage index is missing), so an empty CoverageIndex was ambiguous: it could mean either that coverage yielded empty results or that coverage was never run.
  2. It was not serialised in the case of SerializedCode.
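
The ambiguity in point 1 can be sketched as follows. This is only an illustration: `CoverageIndex` here is a hypothetical stand-in (a newtype over a list of strings, not the real PlutusTx type), and `describeCoverage` is a made-up helper.

```haskell
-- Stand-in for the real CoverageIndex; hypothetical, for illustration only.
newtype CoverageIndex = CoverageIndex [String]
  deriving (Eq, Show)

instance Semigroup CoverageIndex where
  CoverageIndex a <> CoverageIndex b = CoverageIndex (a <> b)

instance Monoid CoverageIndex where
  mempty = CoverageIndex []

-- With a bare CoverageIndex, 'mempty' has to stand for both
-- "coverage was never run" and "coverage ran and found nothing".
-- Wrapping it in Maybe separates the two cases.
describeCoverage :: Maybe CoverageIndex -> String
describeCoverage Nothing = "coverage was never run"
describeCoverage (Just (CoverageIndex [])) = "coverage ran, empty result"
describeCoverage (Just (CoverageIndex xs)) =
  "coverage ran, " <> show (length xs) <> " annotation(s)"
```

With this shape, `Just mempty` and `Nothing` are distinct values, so a caller no longer has to guess which situation an empty index represents.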

Pre-submit checklist:

  • Branch
    • Tests are provided (if possible)
    • Commit sequence broadly makes sense
    • Key commits have useful messages
    • Changelog fragments have been written (if appropriate)
    • Relevant tickets are mentioned in commit messages
    • Formatting, PNG optimization, etc. are updated
  • PR
    • (For external contributions) Corresponding issue exists and is linked in the description
    • Targeting master unless this is a cherry-pick backport
    • Self-reviewed the diff
    • Useful pull request description
    • Reviewer requested

@bezirg bezirg changed the title from "wip" to "Renamed to ASTSize, changed to Maybe CoverageIndex" May 22, 2024
@bezirg bezirg marked this pull request as ready for review May 22, 2024 13:33
@bezirg bezirg requested a review from zliu41 May 22, 2024 13:33
@bezirg bezirg self-assigned this May 22, 2024
@bezirg bezirg requested a review from Unisay May 28, 2024 09:47
@bezirg bezirg requested review from kwxm and removed request for zliu41 May 28, 2024 12:28
Contributor Author

I changed this because I was compiling with -O0, and it complained that there was no unfolding for GHC.++ in quickSort. It turns out this module was using too much of GHC's standard library. Is it OK that I changed it, @kwxm?

Contributor

Hmmm. The Haskell programs that these are based on are about 30 years old and are probably a bit old-fashioned to modern eyes: a lot of stuff that's commonly used these days just didn't exist when they were written. I just ported them with minimal changes and resisted the temptation to start "improving" them. My inclination would be to leave them as they are because changing the source makes it harder to see what changes are due to changes in our compiler: the differences in this file (which are quite small!) seem to have made the PIR for the whole program quite significantly different (current version, new version), which presumably accounts for the budget changes that show up later. Unless the programs stop working completely it's probably better not to change them.

@@ -99,9 +101,9 @@ randomIntegers s1 s2 =
     if 1 <= s2 && s2 <= 2147483398 then
       rands s1 s2
     else
-      error "randomIntegers: Bad second seed."
+      error () -- "randomIntegers: Bad second seed."

Contributor

Is there a reason not to use traceError "randomIntegers: Bad second seed."?

Contributor Author

Good idea

Contributor

> Is there a reason not to use traceError "randomIntegers: Bad second seed."?

I think the reason for that is that traceError didn't exist when these were written (or more accurately, were stolen from here). I don't think that error will ever occur anyway.

({cpu: 2556922462
| mem: 8409416})

Contributor

Given the changes above, do you have any idea why the budget went up?

@@ -0,0 +1,66 @@
module PlutusCore.ASTSize

Contributor

Since this module is new, I suggest using:

  • 2 space indentation, as per the CODESTYLE.adoc
  • Explicit imports.

Contributor Author

I agree about the 2-space indentation, but I kind of disagree about the explicit imports.

Back in the day I was also a proponent of explicit imports, but thanks to mpj I learned that managing and constantly editing the import list is a burden, even with HLS helping you out (when it works).

Contributor

> I learned that managing and constantly editing the import list is a burden

I agree that it's a burden for the writer, but it's a relief for the reader. Wildcard imports shift the effort from writers to readers. Since code is read more often than it is written, I'd argue that we should optimise for reading.
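
As a small illustration of the reader-side argument (the `shout` function is hypothetical and unrelated to the modules in this PR), explicit import lists let the import section alone tell you where each name comes from:

```haskell
import Data.Char (toUpper)
import Data.List (sortOn)

-- From the import lists above it is clear, without tooling, that
-- 'toUpper' comes from Data.Char and 'sortOn' from Data.List.
shout :: [String] -> [String]
shout = map (map toUpper) . sortOn length
```

With `import Data.Char` and `import Data.List` instead, the same function compiles, but a reader needs HLS or a search to find where `sortOn` is defined.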

@@ -0,0 +1,25 @@
module PlutusIR.ASTSize

Contributor

Same comment about 2-space indentation and explicit imports.

SerializedCode
BS.ByteString -- ^ UPLC.Program flat-encoded
(Maybe BS.ByteString) -- ^ PlutusIR.Program flat-encoded
(Maybe BS.ByteString) -- ^ CoverageIndex flat-encoded

Contributor

Thanks for the comments!

 getCovIdx wrapper = case wrapper of
-    SerializedCode _ _ idx -> idx
+    SerializedCode _ _ idx -> unsafeFromEither . unflat <$> idx

Contributor

The error emitted by unsafeFromEither does not say which invariant was violated; it's too generic.
I suggest pattern matching and throwing a more explanatory error, e.g. "SerializedCode was expected to contain information about CodeCoverage because of ... but it doesn't"
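
A minimal sketch of that suggestion, under stated assumptions: `decodeCovIdx` is a toy decoder standing in for `unflat`, `CoverageIndex` is a stand-in type, the input is a `String` rather than a `ByteString`, and the error wording is only one possibility.

```haskell
-- Hypothetical stand-ins; the real code uses 'unflat' and the real
-- CoverageIndex type.
newtype CoverageIndex = CoverageIndex [String]
  deriving (Eq, Show)

type DecodeError = String

-- Toy decoder standing in for 'unflat': accepts only input starting with "ok:".
decodeCovIdx :: String -> Either DecodeError CoverageIndex
decodeCovIdx s = case splitAt 3 s of
  ("ok:", rest) -> Right (CoverageIndex (words rest))
  _             -> Left ("unexpected input: " <> show s)

-- Pattern match on the decoding result and name the violated invariant,
-- instead of relying on a generic 'unsafeFromEither' failure.
getCovIdx :: Maybe String -> Maybe CoverageIndex
getCovIdx Nothing = Nothing
getCovIdx (Just bytes) = case decodeCovIdx bytes of
  Right idx -> Just idx
  Left err ->
    error $
      "getCovIdx: SerializedCode carries a coverage index, "
        <> "but it could not be decoded: "
        <> err
```

A missing index still yields `Nothing`; only a present but undecodable index fails, and the message says which assumption was broken.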

@@ -29,7 +29,7 @@ loadFromFile fp = TH.liftSplice $ do
     -- We don't have a 'Lift' instance for 'CompiledCode' (we could but it would be tedious),
     -- so we lift the bytestring and construct the value in the quote.
     bs <- liftIO $ BS.readFile fp
-    TH.examineSplice [|| SerializedCode bs Nothing mempty ||]
+    TH.examineSplice [|| SerializedCode bs Nothing Nothing ||]

Contributor

At this point I am wondering whether there is any reason not to make SerializedCode a record with fields and use them to increase readability.

Contributor

> At this point I am wondering whether there is any reason not to make SerializedCode a record with fields and use them to increase readability.

That might indeed be helpful.
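
For illustration, a record version might look roughly like this. It is a simplified sketch: the real SerializedCode is a constructor of a larger type, and the field names and the `fromUplcBytes` helper are hypothetical.

```haskell
import qualified Data.ByteString as BS

-- Field names are hypothetical; the PR uses a positional constructor.
data SerializedCode = SerializedCode
  { scUplcFlat   :: BS.ByteString        -- ^ UPLC.Program flat-encoded
  , scPirFlat    :: Maybe BS.ByteString  -- ^ PlutusIR.Program flat-encoded
  , scCovIdxFlat :: Maybe BS.ByteString  -- ^ CoverageIndex flat-encoded
  }
  deriving (Eq, Show)

-- The construction site names each field, so a reader does not have to
-- remember which 'Nothing' is the PIR and which is the coverage index.
fromUplcBytes :: BS.ByteString -> SerializedCode
fromUplcBytes bs =
  SerializedCode
    { scUplcFlat = bs
    , scPirFlat = Nothing
    , scCovIdxFlat = Nothing
    }
```

Compared with `SerializedCode bs Nothing Nothing`, the two `Nothing` arguments can no longer be silently swapped.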

@Unisay Unisay left a comment

I left some suggestions but they aren't critical.

kwxm commented Jun 3, 2024

/benchmark nofib

github-actions bot commented Jun 3, 2024

Click here to check the status of your benchmark.

kwxm commented Jun 3, 2024

> Click here to check the status of your benchmark.

Failed :(

@kwxm kwxm left a comment

This looks basically OK, but I'm not too keen on modifying the nofib Knights example if we can avoid it.

import Control.Lens
import Data.Monoid

newtype ASTSize = ASTSize

Contributor

Just to be annoying, maybe ASTSize isn't as informative as it could be: there are lots of different ways you could measure the size of an AST, and the name doesn't make it obvious what's actually being measured. Something like NumberOfASTNodes might be more informative. I don't mind all that much though.
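
To make "number of AST nodes" concrete, here is a sketch on a toy expression type. `Expr` and `exprSize` are hypothetical and not the Plutus Core AST or its size function; only the `ASTSize` newtype name comes from the PR, and its definition here is assumed.

```haskell
-- Toy expression type; hypothetical, not the Plutus Core AST.
data Expr
  = Lit Integer
  | Var String
  | App Expr Expr
  | Lam String Expr

-- A size measured as the number of AST nodes.
newtype ASTSize = ASTSize { unASTSize :: Integer }
  deriving (Eq, Ord, Show)

instance Semigroup ASTSize where
  ASTSize a <> ASTSize b = ASTSize (a + b)

instance Monoid ASTSize where
  mempty = ASTSize 0

-- Every constructor counts as one node, plus the sizes of its children.
exprSize :: Expr -> ASTSize
exprSize e = ASTSize 1 <> case e of
  Lit _   -> mempty
  Var _   -> mempty
  App f x -> exprSize f <> exprSize x
  Lam _ b -> exprSize b
```

Other measures (serialised byte size, depth, number of binders) would give different numbers for the same term, which is the ambiguity the naming comment points at.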

3 participants