Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow parsing just the vault name without requiring the secret path #305

Open
wants to merge 8 commits into
base: staging
Choose a base branch
from

Conversation

aryanjassal
Copy link
Contributor

@aryanjassal aryanjassal commented Oct 14, 2024

Description

When specifying a secret path, currently the format needs to be <vaultName>:<secretPath>, otherwise a parser error is thrown. This should not be the case, and instead paths like <vaultName> should also be allowed, which would point to the root of the directory.

To do this, a parser for vaultName will also be implemented, ensuring consistency in vault's allowed characters. This is important as currently, there are no parsers for vault name, but there is a regex validation for vault names when they are being used in the secret commands. This means that inadvertently vaults can be created which cannot be actually used in any commands like matrix.ai.

Issues Fixed

Tasks

  • 1. Create parser for vaultName
  • 2. Use the vaultName parser to parse the vault name consistently in vault and secret commands
  • 3. Make the default parser accept missing secret names
  • 4. Convert each command to use this new parser and substitute '/' if the secret path is undefined
  • 5. Run tests to ensure no regression
  • 6. Add tests for parsers
  • [ ] 7. Split tests for vaults/scanNode.test.ts not sure if it is even needed
  • 8. Add tests for every handler to confirm no secret to be an alias for the root directory
    • cat
    • create
    • dir
    • edit
    • env
    • list
    • mkdir
    • remove
    • rename
    • stat
    • write

Final checklist

  • Domain specific tests
  • Full tests
  • Updated inline-comment documentation
  • Lint fixed
  • Squash and rebased
  • Sanity check the final build

@aryanjassal aryanjassal self-assigned this Oct 14, 2024
@CMCDragonkai
Copy link
Member

CMCDragonkai commented Oct 14, 2024 via email

@tegefaulkes
Copy link
Contributor

I think I prefer null in those scenarios? Undefined is usually meant to mean void.

Undefined usually means it wasn't provided. Null would mean it was intentionally not provided. We always use undefined if it's optional.

@CMCDragonkai
Copy link
Member

I mean in TS, undefined is for void situations. I prefer null as an explicit Maybe type.

@CMCDragonkai
Copy link
Member

It depends, you need to review other places in the code where this has occurred.

@CMCDragonkai
Copy link
Member

Look at the other parsers.

@aryanjassal aryanjassal marked this pull request as ready for review October 18, 2024 03:04
@aryanjassal
Copy link
Contributor Author

While working on this PR, I actually discovered a bug in secrets write, where it doesn't properly handle writing to directories. I have made an appropriate bug report for that.

https://github.com/MatrixAI/Polykey-CLI/issues/311

There could have been other issues like this which might have slipped past testing and reviewing. I will go through all the currently implemented commands and ensure that the tests properly test everything so similar bugs can't catch us by surprise.

Comment on lines +75 to +76
// Make sure we don't accidentally return garbage data
return vaultName.match(vaultNameRegex)![0];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You shouldn't modify the vaultname here. For this parser we're just checking if its correct. It needs to remain as a string and unmodified.

Comment on lines +79 to +82
// E.g. If 'vault1:a/b/c', ['vault1', 'a/b/c'] is returned
// If 'vault1', ['vault1, undefined] is returned
// If 'vault1:', an error is thrown
// If 'a/b/c', an error is thrown
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This commentary needs to be updated.

Comment on lines +103 to 107
if (splitSecretPath != null && !secretPathRegex.test(splitSecretPath)) {
throw new commander.InvalidArgumentError(
`${secretPath} is not of the format <vaultName>:<directoryPath>[=<value>]`,
`${secretPath} is not of the format <vaultName>[:<secretPath>][=<value>]`,
);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When it comes to if statements that check a few boolean conditions at once, you always want to add a comment explaining in clear terms what you are checking for. It makes it easier to follow the code. If there was a bug in the logic which is very easy to do with boolean logic then it would be much easier to fix if the intention was known.

? secretPathPart
: secretPathPart.substring(0, equalIndex);
const value =
equalIndex === -1 ? undefined : secretPathPart.substring(equalIndex + 1);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

General formatting of ternary operators is that the main thing comes first then the fallback or secondary value comes second.

This isn't a hard rule. Just how I usually structure it.

@aryanjassal
Copy link
Contributor Author

I just realised a potential issue that would crop up in the future. Lets say that I have a vault called test, and I have a local directory called test. If we add local filesystem support, then by running pk secrets rm test --recursive, it would be ambiguous what is being referred to, the local directory test, or all the vault contents under test (basically being an alias to test:/).

This same issue applies to basically all secrets commands which are expected to operate on the user's local file system. This needs to be addressed before we go about merging this PR.

@CMCDragonkai

@aryanjassal
Copy link
Contributor Author

I have observed an interesting behaviour with secrets cat. If we use secrets write to write a secret without a trailing newline (like how UNIX dictates it's standard should be), then the last line written using secrets write will fail to render.

[aryanj@matrix-34xx:~]$ pk secrets write vault:test
some content
more content
i will press ctrl d on the next line
this line will not be shown

[aryanj@matrix-34xx:~]$ pk secrets cat vault:test
some content
more content
i will press ctrl d on the next line

[aryanj@matrix-34xx:~]$ pk secrets cat vault:test > tempfile

[aryanj@matrix-34xx:~]$ cat tempfile
some content
more content
i will press ctrl d on the next line
this line will not be shown%

Terminal emulators add a % to indicate that a file wasn't ended properly (no newline at end of file). Node's process.stdout actually doesn't interact with the terminal emulator's standard in and out. Rather, it writes directly to /dev/stdout (or something else. I'm not exactly sure on what it does, but it interacts more closely with IO than commands running on terminal emulators do). If a command ends without a newline on a terminal emulator, it knows that a new line needs to be added at the end to display the prompt and other things elegantly, so it outputs a % to let the users know that the command exited without a proper newline. This is strictly visual and doesn't impact the output or anything else, and is basically standard behaviour across all terminal emulators.

This is not supported by Node's process.stdout as it directly interacts with the standard in/out streams. As such, the terminal emulator doesn't know when the application finishes it's streams. As such, when that is the case, then the last line gets overwritten by the prompt. In my case, the prompt starts with a newline character to separate the output of the previous command better. This clears the previous line's contents and overwrites it with a blank newline.

As a workaround, currently we are manually writing a \n character to stderr so this issue doesn't happen, and doesn't break piping by providing incorrect output as compared to regular cat.

This still needs to be changed to be more intelligent, and detect if the last character was a newline, and do this only if it wasn't.

Another interesting discovery was made in regards to secrets write. When we press <Ctrl-D> on the keyboard, if characters have already been written on that line of stdin, then it would actually break to allow writing into a newline without actually inserting a newline. Then, when we press <Ctrl-D> again, that would signal the end of stream and close the stream. So, in many cases, we would need to press <Ctrl-D> twice in order to actually close the stream. This is not a bug with my code, but rather the behaviour of terminal emulators themselves. However, I'm not 100% certain about this, so more testing is actually needed here.

We should also consider writing a message to stderr informing the user that the message will now be written to the file, as writing to the file takes about a second, and the users won't be sure if the program needs another <Ctrl-D> or is just writing the file. In the future, when stdin will be directly streamed to the handler (currently it's a unary handler. this is being tracked in MatrixAI/Polykey#822), this would become largely irrelevant, but would be good feedback to have right now.

@aryanjassal
Copy link
Contributor Author

aryanjassal commented Oct 18, 2024

This PR will take longer to finish, so the relevant part required for #289 has been broken off and migrated to it's own PR, which will be completed much sooner than this one.

#312

@CMCDragonkai
Copy link
Member

CMCDragonkai commented Oct 18, 2024

I just realised a potential issue that would crop up in the future. Lets say that I have a vault called test, and I have a local directory called test. If we add local filesystem support, then by running pk secrets rm test --recursive, it would be ambiguous what is being referred to, the local directory test, or all the vault contents under test (basically being an alias to test:/).

This same issue applies to basically all secrets commands which are expected to operate on the user's local file system. This needs to be addressed before we go about merging this PR.

@CMCDragonkai

I think firstly there are only some commands that operate on both local and vault namespaces.

That's the mv and cp commands, all other commands only operate in vault namespace only. Therefore test, test: test:/ all refer to the vault, vault and vault root directory path respectively.

So in the case of mv and cp you do have the situation of cp test test or mv test test where there's ambiguity of moving the test vault to the test local path or the test local path to the test vault.

The solutions:

  1. Syntax separation: vault: and vault:/ always mean the vault, I think though I dislike vault: and should not be parsed as vault path, but should be a syntax error (that is vault is ok, and vault:/ pr vault:abc is ok but not vault)
  2. Assume that the parameter is a vault path, unless it is explicit that it is not. That is secrets cp test test should be assumed that test and test are both vault paths, and therefore this would an error, cannot copy vault into vault.

Do note that you don't currently have PCC nor OCC with vault operations via RPC, until you've developed the stream-lifecycle representing an RPC conversation where the stream lifetime means 1 PCC lock. OCC requires MVCC and that's not possible without snapshots of the vault state which we cannot do at the moment.


Please ensure that:

pk secrets cp test test # this is an error
pk secrets cp -r testA testB # this means copying testA into testB, so `testA:/testB`
pk secrets mv test ./test # means local path
pk secrets mv test test/test # means moving test into test/test

Note that mv of a vault itself is equivalent to cp -r because you are not allowed to delete the vault entirely. That has to be a vaults * command and not a secrets command.

There's actually alot of constraints here. @tegefaulkes you should be working with @aryanjassal to ensure that all scenarios are being fast checked here and clearly in one big list.

I want to have a clear list here in the spec.

@CMCDragonkai
Copy link
Member

Terminal emulators add a % to indicate that a file wasn't ended properly (no newline at end of file). Node's process.stdout actually doesn't interact with the terminal emulator's standard in and out. Rather, it writes directly to /dev/stdout (or something else. I'm not exactly sure on what it does, but it interacts more closely with IO than commands running on terminal emulators do). If a command ends without a newline on a terminal emulator, it knows that a new line needs to be added at the end to display the prompt and other things elegantly, so it outputs a % to let the users know that the command exited without a proper newline. This is strictly visual and doesn't impact the output or anything else, and is basically standard behaviour across all terminal emulators.

The % isn't necessarily true, it depends on the terminal emulator.

@CMCDragonkai
Copy link
Member

As a workaround, currently we are manually writing a \n character to stderr so this issue doesn't happen, and doesn't break piping by providing incorrect output as compared to regular cat.

This is a bad idea.

@CMCDragonkai
Copy link
Member

Another interesting discovery was made in regards to secrets write. When we press <Ctrl-D> on the keyboard, if characters have already been written on that line of stdin, then it would actually break to allow writing into a newline without actually inserting a newline. Then, when we press <Ctrl-D> again, that would signal the end of stream and close the stream. So, in many cases, we would need to press <Ctrl-D> twice in order to actually close the stream. This is not a bug with my code, but rather the behaviour of terminal emulators themselves. However, I'm not 100% certain about this, so more testing is actually needed here.

I've never seen this with other unix commands, you should check how you're taking in STDIN here.

@CMCDragonkai
Copy link
Member

I have observed an interesting behaviour with secrets cat. If we use secrets write to write a secret without a trailing newline (like how UNIX dictates it's standard should be), then the last line written using secrets write will fail to render.

This is definitely a bug and not something terminal emulators fail at doing.

The reason for this is due to line buffering. You just need to flush the buffer explicitly at the end when the \n doesn't exist.

@CMCDragonkai
Copy link
Member

I have observed an interesting behaviour with secrets cat. If we use secrets write to write a secret without a trailing newline (like how UNIX dictates it's standard should be), then the last line written using secrets write will fail to render.

This is definitely a bug and not something terminal emulators fail at doing.

The reason for this is due to line buffering. You just need to flush the buffer explicitly at the end when the \n doesn't exist.

In fact you SHOULD ALWAYS be flushing the buffer explicitly at the end of every CLI command.

@CMCDragonkai
Copy link
Member

Assume that the parameter is a vault path, unless it is explicit that it is not.

This a key constraint. You need to fast check this well.

The point is abc is just going to be interpreted as a vault path ALWAYS.

But abc/ is not, it's going to be a local path.

Let's get some variants:

Here’s the updated table based on the clarified rules that vault paths must follow the :/ pattern, with local paths distinguished by their explicit indicators (e.g., ./, /, etc.):

Path Interpretation Rationale
abc Vault path Simple string, assumed to be a vault path (abc).
abc:/ Vault root Explicitly refers to the root of the vault abc.
abc:/abc Vault path Refers to subpath abc within the vault abc.
abc:/subdir/file Vault path Refers to a file within a subdirectory in the vault abc.
abc:/xyz Vault path Refers to xyz subpath in the vault abc.
abc:/xyz/ Vault path Refers to the xyz/ directory inside vault abc.
./abc Local path Prefixed with ./, making it explicitly local.
abc/ Local path Ends with /, indicating it's a local directory.
/abc Local path Absolute path starting with /, clearly a local path.
abc:xyz Local path : allowed in Unix filenames, no :/ pattern detected.
./abc/xyz Local path Prefixed with ./, clearly local with subdirectory xyz.
/xyz/file Local path Absolute path, explicitly local, referring to a file.

Key Rules Recap:

  1. Vault Paths:

    • Any path with :/ is a vault path.
    • A simple path like abc is also treated as a vault path.
  2. Local Paths:

    • Paths prefixed with ./, /, or ending with / are explicitly local.
    • Paths with colons (:) but without the :/ format are treated as local, given that Unix allows colons in filenames.

This table and rule set should now provide a clear, unambiguous interpretation of paths in the context of vault and local operations.

@aryanjassal
Copy link
Contributor Author

Here’s the updated table based on the clarified rules that vault paths must follow the :/ pattern, with local paths distinguished by their explicit indicators (e.g., ./, /, etc.):

This would mean that vault names cannot have a / symbol, as otherwise we wouldn't be able to differentiate between the local file path or a vault name.

Also, what about abc:.xyz? That is technically an allowed unix path, but also an allowed vault path. How should names like this be captured?

Should we allow vault names to start with .? .abc is a local path, but .abc:/xyz is a valid vault path.

@CMCDragonkai
Copy link
Member

Here’s the updated table based on the clarified rules that vault paths must follow the :/ pattern, with local paths distinguished by their explicit indicators (e.g., ./, /, etc.):

This would mean that vault names cannot have a / symbol, as otherwise we wouldn't be able to differentiate between the local file path or a vault name.

Also, what about abc:.xyz? That is technically an allowed unix path, but also an allowed vault path. How should names like this be captured?

Should we allow vault names to start with .? .abc is a local path, but .abc:/xyz is a valid vault path.

Yes / is a reserved symbol anyway.

Under the rules then it will need to be /.abc to always start the resulting vault path.

Well we have to make some rules on a vault name - it would be more constrained than any path. At least 2 symbols are not allowed here: : and /.

We may also want to disallow non-printable characters and control characters but utf-8 symbols are ok.

Use a regex rule - ask chatgpt to help generate and use fastcheck and regex101 to check.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

3 participants