fixpath: generalize prefix stripping instead of hardcoded PS/quote checks
Problem
fixpath currently has specific checks for the PowerShell prompt prefix (PS ) and for surrounding quotes. But the real pattern is more general: extract the actual path from whatever wraps it.
Current code:
    # Strip PowerShell prompt prefix
    if path.upper().startswith("PS "):
        rest = path[3:].lstrip()
        if rest and (rest[0] in "/\\~" or (len(rest) > 1 and rest[1] == ":")):
            path = rest
    # Strip surrounding quotes and backticks
    if len(path) >= 2:
        if (path[0] == '"' and path[-1] == '"') or ...
These are special cases of a generic problem: "there's a real path embedded in this string, extract it."
Proposed solution
Replace the hardcoded prefix/suffix checks with a generic path extractor that finds the first valid path pattern in the input string:
    import re

    # Match a drive-letter path (C:\...) anywhere in the string
    m = re.search(r'([a-zA-Z]):\\', path)
    if m and m.start() > 0:
        # Everything before the drive letter is prefix noise
        path = path[m.start():]
    # Also match Unix-style paths (/home/..., ~/..., /c/...)
    # and UNC (\\server\...)
This would handle:
PS C:\code\file.md -- PowerShell prompt
>>> C:\code\file.md -- Python REPL
In [1]: C:\code\file.md -- IPython/Jupyter
user@host: C:\code\file.md -- SSH prompt copy-paste
"C:\code\file.md" -- quotes (the drive letter is inside)
`C:\code\file.md` -- backticks
The surrounding-character stripping (quotes, backticks) should still happen first as a cheap check, but the prefix stripping should be the generic regex approach.
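To make the ordering concrete, here is a minimal sketch of the two passes: a cheap wrapper strip first, then the drive-letter regex as a fallback. The names strip_wrappers and extract_windows_path are hypothetical and not taken from the fixpath source. Treating single quotes as a wrapper is also an assumption, since the current code's full quote check is elided above.

```python
import re

# Hypothetical names for illustration; not the actual fixpath helpers.
# Assumption: single quotes are stripped as well as double quotes and backticks.
_WRAPPERS = {'"': '"', "'": "'", "`": "`"}
_DRIVE_RE = re.compile(r"[a-zA-Z]:\\")


def strip_wrappers(path: str) -> str:
    """Cheap pre-pass: remove one layer of matching quotes or backticks."""
    if len(path) >= 2 and _WRAPPERS.get(path[0]) == path[-1]:
        return path[1:-1]
    return path


def extract_windows_path(path: str) -> str:
    """Fallback pass: drop whatever precedes the first drive-letter pattern."""
    path = strip_wrappers(path.strip())
    m = _DRIVE_RE.search(path)
    if m and m.start() > 0:
        # Everything before the drive letter is treated as prefix noise.
        path = path[m.start():]
    return path
```

One limitation of this sketch: a quoted path that sits behind a prompt, such as PS "C:\code\file.md", keeps its trailing quote, because the pre-pass only inspects the outermost characters of the whole string. Running the wrapper strip again after the regex pass would be one way to cover that case.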
Design considerations
- The regex [a-zA-Z]:\\ is safe for Windows paths -- a single letter followed by :\ is unambiguous
- For Unix paths, finding /home/ or ~/ or /mnt/ in a string is less reliable (could be part of a URL or sentence)
- Keep quote/backtick stripping as a fast pre-pass (handles the common case without regex)
- The generic extractor should be a fallback, not the first thing that runs
- Consider: what if the "prefix" IS part of the path? e.g., PS C:\ could theoretically be a folder named PS C:\ -- but that's not a realistic scenario
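For the Unix side, one possible way to reduce false positives is to accept a root marker only at the start of the string or directly after whitespace. The sketch below assumes that rule. The function name extract_unix_path and the short list of roots are illustrative choices, not part of fixpath.

```python
import re

# Assumption: only these roots are treated as "obviously a path".
# (?<!\S) requires start-of-string or preceding whitespace, so a URL such as
# https://example.com/home/page does not match.
_UNIX_RE = re.compile(r"(?<!\S)(?:~/|/home/|/mnt/)")


def extract_unix_path(text: str) -> str:
    """Drop prefix noise before a whitespace-delimited Unix-style root."""
    m = _UNIX_RE.search(text)
    if m and m.start() > 0:
        return text[m.start():]
    return text
```

This still cannot tell a path from a sentence that happens to mention one: any text after the path is kept, so the extractor is best limited to the fallback role described above.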
Acceptance criteria
- Generic prefix stripping replaces the hardcoded PS check
- Handles Python REPL (>>>), IPython (In [N]:), SSH prompts
- Unix-style paths (~/, /home/, /mnt/)