@@ -6,17 +6,13 @@ import TabItem from '@theme/TabItem';
Get alerts for:

- - Hanging LLM api calls
- - Slow LLM api calls
- - Failed LLM api calls
- - Budget Tracking per key/user
- - Spend Reports - Weekly & Monthly spend per Team, Tag
- - Failed db read/writes
- - Model outage alerting
- - Daily Reports:
-     - **LLM** Top 5 slowest deployments
-     - **LLM** Top 5 deployments with most failed requests
-     - **Spend** Weekly & Monthly spend per Team, Tag
+ | Category | Alert Type |
+ |----------|------------|
+ | **LLM Performance** | Hanging API calls, Slow API calls, Failed API calls, Model outage alerting |
+ | **Budget & Spend** | Budget tracking per key/user, Soft budget alerts, Weekly & Monthly spend reports per Team/Tag |
+ | **System Health** | Failed database read/writes |
+ | **Daily Reports** | Top 5 slowest LLM deployments, Top 5 LLM deployments with most failed requests, Weekly & Monthly spend per Team/Tag |
+


Works across:
@@ -93,6 +89,51 @@ litellm_settings:
  redact_messages_in_exceptions: True
```

+ ### Soft Budget Alerts for Virtual Keys
+
+ Use this to send an alert when a key/team is close to exhausting its budget.
+
+ Step 1. Create a virtual key with a soft budget
+
+ Set the `soft_budget` to 0.001
+
+ ```shell
+ curl -X 'POST' \
+   'http://localhost:4000/key/generate' \
+   -H 'accept: application/json' \
+   -H 'x-goog-api-key: sk-1234' \
+   -H 'Content-Type: application/json' \
+   -d '{
+     "key_alias": "prod-app1",
+     "team_id": "113c1a22-e347-4506-bfb2-b320230ea414",
+     "soft_budget": 0.001
+   }'
+ ```
+
+ Step 2. Send a request to the proxy with the virtual key
+
+ ```shell
+ curl http://0.0.0.0:4000/chat/completions \
+   -H "Content-Type: application/json" \
+   -H "Authorization: Bearer sk-Nb5eCf427iewOlbxXIH4Ow" \
+   -d '{
+     "model": "openai/gpt-4",
+     "messages": [
+       {
+         "role": "user",
+         "content": "this is a test request, write a short poem"
+       }
+     ]
+   }'
+
+ ```
+
+ Step 3. Check Slack for the expected alert
+
+ <Image img={require('../../img/soft_budget_alert.png')}/>
+
+
+
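The key-creation request from Step 1 can also be built from Python. The following is a minimal sketch using only the standard library, not an official client: the helper name `build_soft_budget_key_request` is hypothetical, and the endpoint, headers, and payload values are copied from the curl example in Step 1. It only constructs the request; sending it assumes a proxy is running at the given base URL.

```python
import json
import urllib.request


def build_soft_budget_key_request(base_url, admin_key, key_alias, team_id, soft_budget):
    """Build (but do not send) a POST /key/generate request with a soft budget.

    Hypothetical helper: it mirrors the curl call from Step 1 field by field.
    """
    payload = {
        "key_alias": key_alias,
        "team_id": team_id,
        "soft_budget": soft_budget,
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/key/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "accept": "application/json",
            "x-goog-api-key": admin_key,  # same auth header as the curl example
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_soft_budget_key_request(
    base_url="http://localhost:4000",
    admin_key="sk-1234",
    key_alias="prod-app1",
    team_id="113c1a22-e347-4506-bfb2-b320230ea414",
    soft_budget=0.001,
)

# Inspect the request without touching the network.
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))

# To create the key against a running proxy, uncomment:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```

Keeping construction separate from sending makes it possible to check the payload before any key is created.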

### Add Metadata to alerts

@@ -123,7 +164,7 @@ response = client.chat.completions.create(
<Image img={require('../../img/alerting_metadata.png')}/>

- ### Opting into specific alert types
+ ### Select specific alert types

Set `alert_types` if you want to opt into only specific alert types. When `alert_types` is not set, all Default Alert Types are enabled.

@@ -145,7 +186,7 @@ general_settings:
]
```

- ### Set specific slack channels per alert type
+ ### Map Slack channels to alert type

Use this if you want to set specific channels per alert type

@@ -243,7 +284,7 @@ curl -i http://localhost:4000/v1/chat/completions \
```


- ### Using MS Teams Webhooks
+ ### MS Teams Webhooks

MS Teams provides a Slack-compatible webhook URL that you can use for alerting.

@@ -285,7 +326,7 @@ curl --location 'http://0.0.0.0:4000/health/services?service=slack' \
<Image img={require('../../img/ms_teams_alerting.png')}/>

- ### Using Discord Webhooks
+ ### Discord Webhooks

Discord provides a Slack-compatible webhook URL that you can use for alerting.
