improve: aggregate user infos as standalone user context within prompt #47
Comments
@nekomeowww I am interested in working on this issue. However, I have a few questions before getting started. Would you mind explaining these terms in more detail?
Thanks.
Thank you! Yes, you can pick up this issue and work on it, but I am afraid it might be too hard to understand and make changes to the OpenAI prompt. This issue is meant to improve the chat history summarization feature (aka recap); the main goal is to reduce the token usage of the OpenAI prompt. Before diving into this feature, please allow me to summarize how recap works. There is a middleware called RecordMessage that extracts and formats the chat messages coming from Telegram and stores them in Postgres. When a user sends the /recap command, or when it is time to send an automatic recap message to a chat group, the stored messages are formatted into a pattern like this:
msgId:1 UserName1 sent: ```Hello!```
msgId:2 UserName2 replying to [UserName1 sent msgId:1]: ```Hello! How are you today?```

The bot then injects this pattern into the OpenAI prompt template so that OpenAI's GPT-3.5 model can summarize the chat histories for us. You may notice that UserName1 and UserName2 are explicitly stated each time they appear. This is inefficient and consumes a lot of prompt tokens when multiple users each appear multiple times while chatting in the same group. Therefore I came up with an idea: why don't we aggregate the usernames that appear in the chat histories, place a formatted username map before the chat histories, and use a userId or array index to represent each username, like this:

Users:"""
1: UserName1
2: UserName2
...
10: UserName10
"""
Chat histories:"""
msgId:1 user:1 sent: """Hello!"""
msgId:2 user:2 replying to [user:1 sent msgId:1]: """Hello! How are you today?"""
msgId:3 user:3 sent: """Nice to meet you guys!"""
msgId:4 user:10 sent """The party is just about to start!!!"""
""" The terms can be explained now.
I think the best way for you to jump into this issue is to wait for me to implement the i18n support we talked about previously (#67); then we can write the OpenAI prompt in English for better understanding. How does that sound?
OK, no worries. Thanks for the explanation. Do you have any recommendations for issues I can help with? Maybe something that needs to be done sooner and has fewer dependencies (e.g. credentials).
What do you mean by that?
TBH, there aren't any simple, easy issues, or upcoming issues with fewer dependencies, for you to work on right now. Due to the lack of i18n support, and because insights-bot is a project that relies heavily on languages and GPT models, we initially developed it with Chinese support only. You may have to wait for us to support i18n.
Something like access to GPT products or the Telegram bot.
OK, no worries. Thanks.