✨ feat(markdown): add standardize function for Telegram MarkdownV2 #70

Merged
sudoskys merged 15 commits into main from dev on Mar 11, 2025
sudoskys (Owner) commented Mar 7, 2025

No description provided.

Ste1io and others added 4 commits March 6, 2025 19:08
…anges made after library initialization are reflected

bug: `strict_markdown` obsolete, format entity syntax outdated #68
Refactor `customize` module into singleton pattern
…ntax

- Introduced new `standardize()` function in `__init__.py` to convert unstandardized Telegram MarkdownV2 syntax
- Created `TelegramMarkdownFormatter` in `render.py` to support standardization
- Added new test case for standardization in `exp_test.py`
- Supports custom rendering of spoilers, strikethrough, and task list items
- Provides a more flexible approach to markdown formatting for Telegram
sudoskys linked an issue Mar 7, 2025 that may be closed by this pull request
sudoskys added 3 commits March 7, 2025 18:46
Added a new test case to ensure the `standardize` function
converts markdown consistently, improving coverage and
reliability. 🧪

🎨 chore: format and refine playground markdown examples

Refined markdown examples in `simple_case.py` and
`standardize_case.py` by using raw strings for clarity and
consistency. Removed unnecessary imports for cleaner code.
Removed the strict_markdown property and its setter method.
Set strict_markdown to True by default. This simplifies the code
and removes redundancy, enhancing maintainability.
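
A minimal sketch of the singleton idea behind the `customize` refactor, assuming a hypothetical `Settings` class (the names `Settings`, `task_done_symbol`, and `render_task` are illustrative and not taken from the library). Every caller gets the same instance and reads it at call time, so a change made after initialization shows up on the next render. `strict_markdown` is a plain attribute that defaults to True, with no property or setter.

```python
class Settings:
    """Hypothetical stand-in for a shared customization object."""

    _instance = None

    def __new__(cls):
        # Create the shared instance once; later calls return the same object.
        if cls._instance is None:
            instance = super().__new__(cls)
            instance.strict_markdown = True    # plain attribute, default True
            instance.task_done_symbol = "[x]"  # hypothetical rendering option
            cls._instance = instance
        return cls._instance


def render_task(label: str) -> str:
    # Look the settings up at call time instead of caching them at import time.
    settings = Settings()
    return f"{settings.task_done_symbol} {label}"


print(render_task("write tests"))
Settings().task_done_symbol = "[v]"  # changed after "initialization"
print(render_task("write tests"))
```

The second call picks up the new symbol because `render_task` never holds its own copy of the settings.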
sudoskys linked an issue Mar 7, 2025 that may be closed by this pull request
sudoskys assigned sudoskys and unassigned sudoskys Mar 7, 2025
sudoskys (Owner, Author) commented Mar 7, 2025

@Ste1io An invite has been sent.

You can test the new changes. After initial testing, it works fine on my PC.

sudoskys and others added 3 commits March 7, 2025 19:04
- Added documentation for the new `standardize` function in README
- Expanded function descriptions for `markdownify` and `telegramify`
- Included a code example for the `standardize` function
- Clarified the purpose of each markdown conversion function
see bug: `strict_markdown` obsolete, format entity syntax outdated #68
Ste1io (Collaborator) commented Mar 7, 2025

> @Ste1io An invite has been sent.
>
> You can test the new changes. After initial testing, it works fine on my PC.

Looks good to me. Good idea with using two functions. 👍 I rebased and added the bold/italics fix, see #71

✨ fix(render): emphasis/strong rendering
sudoskys (Owner, Author) commented Mar 8, 2025

@Ste1io

telebot.apihelper.ApiTelegramException: A request to the Telegram API was unsuccessful. Error code: 400. Description: Bad Request: can't parse entities: Can't find end of Bold entity at byte offset 3676

Something seems to have gone wrong, let me check...
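
One way to track down an error like this is to look at the text around the reported offset. The sketch below assumes the offset counts bytes of the UTF-8 encoded message (an assumption, not something confirmed in this thread). `context_at_byte_offset` is a hypothetical helper and is not part of telebot or this library.

```python
def context_at_byte_offset(text: str, offset: int, radius: int = 20) -> str:
    """Return the text surrounding a byte offset in the UTF-8 encoding of `text`."""
    data = text.encode("utf-8")
    start = max(0, offset - radius)
    end = min(len(data), offset + radius)
    # errors="replace" keeps the helper safe if a slice cuts a multi-byte character.
    return data[start:end].decode("utf-8", errors="replace")


# "é" takes two bytes, so the "*" sits at byte offset 7, not character index 6.
print(context_at_byte_offset("héllo *bold", 7, radius=3))  # prints: lo *bo
```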

sudoskys (Owner, Author) commented Mar 8, 2025

Sample:

*bold _italic bold ~italic bold strikethrough ||italic bold strikethrough spoiler||~ __underline italic bold___ bold*

After `standardize`:

*bold _italic bold ~italic bold strikethrough ||italic bold strikethrough spoiler||~ _underline italic bold__ bold*

`__underline italic bold___` -> `_underline italic bold__`

The result is refused by the Telegram server.
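
The difference between the two strings above is easy to miss by eye. A small diagnostic sketch that lists the runs of underscores in each string makes it visible; `underscore_runs` is a hypothetical helper written for illustration only.

```python
import re
from typing import List


def underscore_runs(text: str) -> List[str]:
    """Return every maximal run of underscores, in order of appearance."""
    return re.findall(r"_+", text)


before = (
    "*bold _italic bold ~italic bold strikethrough "
    "||italic bold strikethrough spoiler||~ __underline italic bold___ bold*"
)
after = (
    "*bold _italic bold ~italic bold strikethrough "
    "||italic bold strikethrough spoiler||~ _underline italic bold__ bold*"
)

print(underscore_runs(before))  # ['_', '__', '___']
print(underscore_runs(after))   # ['_', '_', '__']
```

The opening `__` and the closing `___` each lose one underscore, so the delimiters in the output no longer match the original nesting.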

Ste1io (Collaborator) commented Mar 8, 2025

> telebot.apihelper.ApiTelegramException: A request to the Telegram API was unsuccessful. Error code: 400. Description: Bad Request: can't parse entities: Can't find end of Bold entity at byte offset 3676

Ah yes, that was a pre-existing bug, but it hadn't surfaced before because the md string variable was never processed or sent to Telegram; it was only printed to the console (preprocessed). I encountered it when I first cloned the project and ran the test, after noticing that the string was not actually being parsed and adding that step myself. It is related to the bug mentioned in my comment in #68 regarding nested tokens:

> Finally, the documented behavior of closing multiple nested entities is not being handled when parsing the syntax tree (e.g., using the empty bold entity to separate the closing underline from the closing italic entities: __**_*).

I'll take a look at it today. I have a PR I'm about to push anyway that fixes the ||spoiler|| rendering (which was inadvertently removed from markdown rendering during the refactor) and the __underline__ rendering, which was still using the previous token character check for markdown, resulting in italics formatting in the new Telegram rendering class.

I have created a separate issue for this (#72), as the fix may be a little more involved than I had hoped due to Telegram's parsing grammar rules. Either they didn't put a lot of thought into their MarkdownV2 format spec when they wrote it, or there's a very good reason Markdown hasn't pursued adding it to theirs. Or both. 😅
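
The nested-entity closing rule referred to above can be sketched as follows. The Telegram Bot API documentation notes that `__` is always read greedily as underline, so closing italic and then underline has to be written `_**__`, with an empty bold entity as a separator. `join_closers` is a hypothetical helper that handles only the italic/underline case; it is not the fix that went into the library.

```python
from typing import List, Optional

ITALIC = "_"
UNDERLINE = "__"
EMPTY_BOLD = "**"  # empty bold entity, used purely as a separator


def join_closers(closers: List[str]) -> str:
    """Concatenate closing delimiters, innermost entity first."""
    parts: List[str] = []
    previous: Optional[str] = None
    for closer in closers:
        # "_" followed by "__" would be read greedily as "__" then "_",
        # so keep the two apart with an empty bold entity.
        if previous == ITALIC and closer == UNDERLINE:
            parts.append(EMPTY_BOLD)
        parts.append(closer)
        previous = closer
    return "".join(parts)


print(join_closers([UNDERLINE, ITALIC]))  # underline nested in italic: ___
print(join_closers([ITALIC, UNDERLINE]))  # italic nested in underline: _**__
```

The first case matches the `___` in the sample string, where the underline is closed before the surrounding italic, so no separator is needed there.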

Ste1io added 2 commits March 8, 2025 08:28
…ering:

- move `Spoiler` renderer to markdown rendering class, and delegate telegram rendering there
Ste1io and others added 2 commits March 8, 2025 13:40
Fix Rendering Bugs, Refactor Spoiler Logic in Telegram Markdown Formatter
sudoskys merged commit b9aa71a into main Mar 11, 2025
2 checks passed
Development

Successfully merging this pull request may close these issues:

- bug: `strict_markdown` obsolete, format entity syntax outdated
- Mixed number fractions

2 participants