Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore: use string gaps more consistently #4918

Closed
wants to merge 7 commits into from

Conversation

kmill
Copy link
Collaborator

@kmill kmill commented Aug 5, 2024

Found all missing string gaps by compiling Lean with a modified parser that rejects newlines within strings. Made changes almost everywhere but in cases where the strings were embedded documents (like some Javascript, or the multiline strings throughout Lake).

@kmill kmill marked this pull request as draft August 5, 2024 15:21
@kmill
Copy link
Collaborator Author

kmill commented Aug 5, 2024

@tydeu There are some large string literals in lake that don't look great with string gaps. Want me to revert?

@tydeu
Copy link
Member

tydeu commented Aug 5, 2024

@kmill Yeah, I don't think string gaps work well stylistically for strings that are meant to be multiple lines.

I think those need indented heredocs or a related form of multiline string literal,

@github-actions github-actions bot added the toolchain-available A toolchain is available for this PR, at leanprover/lean4-pr-releases:pr-release-NNNN label Aug 5, 2024
@kmill
Copy link
Collaborator Author

kmill commented Aug 5, 2024

@tydeu Perhaps you don't like how they look (which is understandable, they're ugly), but I was using our version of multiline string literal notation. Each new line starts with \n.

def s := s!"\
    this is a multi-\
  \nline string using\
  \nindented 'heredoc' notation"

That's fine if you don't want to use it in Lake, and I reverted it, though I added string gaps to eliminate newlines in paragraphs — you weren't flowing the help text to a particular number of columns, right?

@leanprover-community-mathlib4-bot
Copy link
Collaborator

leanprover-community-mathlib4-bot commented Aug 5, 2024

Mathlib CI status (docs):

  • ❗ Mathlib CI can not be attempted yet, as the nightly-testing-2024-08-05 tag does not exist there yet. We will retry when you push more commits. If you rebase your branch onto nightly-with-mathlib, Mathlib CI should run now. (2024-08-05 15:51:58)
  • ❗ Batteries/Mathlib CI will not be attempted unless your PR branches off the nightly-with-mathlib branch. Try git rebase 7bea3c150813c545b3db318f39fbf922b02b1dca --onto 647a5e94925791f6a16b629b6c16ffad60a7f478. (2024-08-06 22:55:32)

@tydeu
Copy link
Member

tydeu commented Aug 6, 2024

@kmill

h I added string gaps to eliminate newlines in paragraphs — you weren't flowing the help text to a particular number of columns, right?

Yes, the help text was wrapped at 80 characters. Unfortunately, some terminals like to scroll rather than flow text, so relying on the terminal wrapping is unreliable.

@kmill kmill changed the title chore: use string gaps chore: use string gaps more consistently Aug 6, 2024
@kmill kmill marked this pull request as ready for review August 6, 2024 22:40
@kmill
Copy link
Collaborator Author

kmill commented Aug 6, 2024

@tydeu I've removed all the changes to Lake. (One reason I was asking is that when I spot checked, I found some could be rewrapped.)

@Kha
Copy link
Member

Kha commented Aug 7, 2024

I'm not sure what to think about these changes, I can't say they increase readability

@tydeu
Copy link
Member

tydeu commented Aug 7, 2024

@kmill

Each new line starts with \n.

One oddity I noticed is that there appears to be a mix of lines starting with \n and lines ending with \n\ in the PR. Do we have a standard style? Visually, I would prefer the later. (As it is more intuitive to me for the newline to appear at end of the line.) However, I see the former is useful when you want indentation in the text of the following line (i.e., \n -), All in all, this further reinforces to me that wee need a different style of multiline strings for such text. String gaps are still very good for other use cases (e.g., wrapping strings in the source that are not wrapped in the output).

@kmill
Copy link
Collaborator Author

kmill commented Aug 7, 2024

@Kha My motivation here is primarily to be able to properly indent everything and make it indentation-independent. There are also some places that remove some unintended newlines, though only in attribute descriptions if I remember correctly. I think that if there's any harm to readability of the string contents, it's made up for by making program structure clearer.

@tydeu The idea of starting lines with \n is something that came from Rust, and it was brought up in the Zulip discussions as being a reason to go for the current lexical syntax since it supports multiline strings without any additional support. What would a different style of multiline strings fundamentally give you? Wouldn't it effectively just be a way to avoid \ at the ends of lines? This is a matter of taste, right?

I don't want to prematurely choose a particular style, other than agreeing that it's better avoiding strings contents spilling into column 1 (except for perhaps strings that are global constants, like the ones in Lake). I agree that it looks better having \n\ at the ends of lines, but it's not always possible. A future PR could clean some of this up, for example to decide to make - lists consistently indented or not.

@tydeu
Copy link
Member

tydeu commented Aug 8, 2024

@kmill

@tydeu What would a different style of multiline strings fundamentally give you? Wouldn't it effectively just be a way to avoid \ at the ends of lines? T

A indention-sensitive heredoc (<<~ ) would avoid both \n and \ while enabling indentation. For example

<<~EOF
  This line is indented two spaces.
    This line is indented four spaces.
      This line is indented six spaces.
  EOF

becomes

This line is indented two spaces.
  This line is indented four spaces.
    This line is indented six spaces.

That is, is strips the initial indentation from each line. Thus it addresses the primary use case you mentioned. Furthermore, instead of using the heredoc delimiters, we could use a simpler delimiter like """ for this syntax (as is used for multiline strings in TOML).

This is a matter of taste, right?

Admittedly, yes. However, I think the \n approach is a pretty clear readability downgrade. Furthermore, as there exist well-established syntaxes to avoid this, I think such alternatives should at least be considered.

@kmill kmill closed this Oct 29, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
toolchain-available A toolchain is available for this PR, at leanprover/lean4-pr-releases:pr-release-NNNN
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants