Introduce RichText type #526

gagarski · 2024-02-11T22:11:35Z

Disclaimer

I understand that I am suggesting a breaking change, but the breaking changes are sometimes being implemented in telegram-bot-api (maintaining compatibility for some time). The introduction of ReplyParameters type kinda resembles what I am suggesting, yet it may be done for better reasons.

Problem

There are a lot of places in bot API where fields like text+entities+parse_mode come together. Together they represent formatting for the text. There are 4 kinds of these groups in different methods and the naming of the fields is barely consistent:

text, entities, parse_mode
caption, caption_entities, parse_mode
quote, quote_entities, quote_parse_mode
explanation, explanation_entities, explanation_parse_mode

Cons

As mentioned before, it's a breaking change. The old formats of course can be supported for some time. This is also a change to a very core feature of the API
More boilerplate nesting for simple cases (without formatting). This can be softened a bit by allowing a field caption in above example be of type RichText or String

Pros

Less duplicates in documentation and client-library code.
Client-library developers can provide type-safe means to produce this rich text and prevent library client form passing HTML text with Markdown parse mode. Nothing prevents library developers from doing so now, but in that case they're responsible for converting data from their input format to telegram-bot-api format which can cause bugs and also confusing for library clients (the more library API is aligned with Telegram API the better)

Additional concern: output rich text

Besides the method arguments, there are data returned as a response or update, which can also contain rich text. In that case there is always no parse_mode. This can be solved by splitting RichText into subtypes:

RichTextWithEntities
RichTextMarkdown
RichTextMarkdownV2
RichTextHTML

In this case, the input type would be RichText (or String?) and the output type would be RichTextWithEntities.

The text was updated successfully, but these errors were encountered:

levlam · 2024-02-12T11:34:19Z

You correctly and precisely described consequences of that API change. I have nothing to disprove, but let me add some comments on the mentioned pros and cons.

As mentioned before, it's a breaking change. The old formats of course can be supported for some time. This is also a change to a very core feature of the API

This isn't an issue, given the old way will be supported forever and will be removed only from the documentation as happens with all Bot API changes.

More boilerplate nesting for simple cases (without formatting).

This is a real issue and the main reason why API looks like it is. Bot API very often flattens API to simplify common usages, like it is done with all methods for messages sending, or was done with "reply_to_message_id" or "disable_web_page_preview".

This can be softened a bit by allowing a field caption in above example be of type RichText or String

I like this proposal, but there are a lot of popular programming languages that will not be able to implement the same. Also, it could be harder to understand, how to use such fields.

Less duplicates in documentation and client-library code.

This is also true, but changing many similar fields in the documentation is only slightly harder than changing one. Also, it affects mostly writers of the documentation and client-library code.

Client-library developers can provide type-safe means to produce this rich text and prevent library client form passing HTML text with Markdown parse mode. Nothing prevents library developers from doing so now, but in that case they're responsible for converting data from their input format to telegram-bot-api format which can cause bugs and also confusing for library clients (the more library API is aligned with Telegram API the better)

Yes, this can already be done by client libraries. I don't agree that parameters flattening is hard or likely to have bugs. I also don't agree that API of client libraries should be perfectly aligned with Telegram API. They should provide simple access to most raw methods, but libraries advantage comes from the provided syntactic sugar, shorthands, language-specific features, update processing pipelines, the ability to keep user state. Otherwise, it would be simpler to use raw API directly.

Summarizing, this can be implemented someday, but currently I prefer to have the simplest API for common usages and leave all syntactic sugar and language-specific type safety to the developers of client libraries. The issue can be kept open, given this can be changed in the future.

gagarski · 2024-02-12T19:45:50Z

Fair points and nothing to argue with, however I'll anser some of your comments as well.

or was done with "reply_to_message_id" or "disable_web_page_preview"

That's what I meant by "better reasons", you added a couple of new fields to that and made it more structured.

I like this proposal, but there are a lot of popular programming languages that will not be able to implement the same. Also, it could be harder to understand, how to use such fields.

That's true that a lot of languages do not support union types. However, we already have a union type in API: chat_id usually has a type Integer or String. That's how that can be worked out:

// @JsonDeserialize(using = ChatIdDeserializer::class) // Deserialization is out of scope for this example, 
// however can still work
sealed interface ChatId

@JvmInline
value class LongChatId(
    @JsonValue // this is a hint to serialize it as a number
    val long: Long
) : ChatId

@JvmInline
value class StringChatId(
    @JsonValue // this is a hint to serialize it as a string
    val string: String
) : ChatId

// Helper methods to create a ChatId from primitive types
fun String.toChatId() = StringChatId(this)
fun Long.toChatId() = LongChatId(this)
fun Int.toChatId() = LongChatId(this.toLong())

Same approach can be applied for String or RichText, however, since you still need to have a wrapper for String, you can give up on plain string wrapper and just use RichTextWithEntities, because if will be effectively the same. The option with plain String though might be useful in the languages with support for union types (or without types at all) and for quick testing using browser.

I don't agree that parameters flattening is hard or likely to have bugs

Well, it's hard to argue that it's not hard, yet it can look like that:

suspend fun Telegram.copyMessage(
    chatId: ChatId,
    messageThreadId: Long? = null,
    fromChatId: ChatId,
    messageId: Long,
    richCaption: RichCaption? = null,
    disableNotification: Boolean = false,
    protectContent: Boolean = false,
    replyParameters: ReplyParameters? = null,
    replyMarkup: ReplyMarkup? = null,
): MessageId = call(
    CopyMessage(
        chatId = chatId, 
        messageThreadId = messageThreadId, 
        fromChatId = fromChatId, 
        messageId = messageId, 
        caption = richCaption?.caption, 
        parseMode = richCaption?.parseMode, 
        captionEntities = richCaption?.captionEntities, 
        disableNotification = disableNotification, 
        protectContent = protectContent, 
        replyParameters = replyParameters, 
        replyMarkup = replyMarkup
    )
)

From my experience, this kind of code is exactly the type of code you can make mistakes in (okay, I involved code-generation to make it)! Given the changing nature of Telegram API and a wide use of optional parameters, It's easy to make a mistake in such mapping, bot when the method is first defined and during updates to catch up wtih Telegram API.

The other approach might be to do some post-processing after serializing to JSON (or Multipart). This involves less field moving-around and allows you to apply the patch directly to the patched part (in our case RichText part). However, in that case these manipulations are no longer type-safe (and therefore error-prone).

I also don't agree that API of client libraries should be perfectly aligned with Telegram API. They should provide simple access to most raw methods, but libraries advantage comes from the provided syntactic sugar, shorthands, language-specific features, update processing pipelines, the ability to keep user state.

Agreed. However, no matter how good are your update processing pipelines and state-keeping, the library user will probably eventually want to send a message. First thing to do is to define structures for such operations (either by an explicit type or by the structure of function params). Even having that (even dumbly reproducing Telegram Bot API docs) is better than directly using HTTP client: at least it is harder for a library user to make a typo in a parameter name. Now the next temptation for a library developer is to write in the documentation something like This is a sendMessage Telegram method. For more info on the parameters please refer to https://core.telegram.org/bots/api#sendmessage. Of course some improvements can be done on the library side: dates can be represented as dates (not an integer), One other improvement might be to split some types (which are not split to subtypes in Telegram docs) to sub-types. So far the library user can jump to Telegram API docs and read some more about fields/parameters. Wrapping/unwrapping can also be done, but that makes things much harder (yet I understand that this "much harder" border is subjective).

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Introduce RichText type #526

Introduce RichText type #526

gagarski commented Feb 11, 2024 •

edited

Loading

levlam commented Feb 12, 2024

gagarski commented Feb 12, 2024

Introduce RichText type #526

Introduce RichText type #526

Comments

gagarski commented Feb 11, 2024 • edited Loading

Disclaimer

Problem

Suggested solution

Cons

Pros

Additional concern: output rich text

levlam commented Feb 12, 2024

gagarski commented Feb 12, 2024

gagarski commented Feb 11, 2024 •

edited

Loading