feat: expose GPU energy consumption (mJ) in responses #3315

JulienDelavande · 2025-08-28T13:49:16Z

What does this PR do?

This PR adds an energy consumption measurement to each request served by the router.
A new field, `energy_mj` (in millijoules), is added to the `details` section of the response.

Example usage:

curl 127.0.0.1:3000/generate \
    -X POST \
    -d '{"inputs":"What is Deep Learning?","parameters":{"max_new_tokens":20, "details":true}}' \
    -H 'Content-Type: application/json'

Response (excerpt):

{
  "generated_text": "Deep learning is a subset of machine learning...",
  "details": {
    "finish_reason": "length",
    "generated_tokens": 20,
    "tokens": [...],
    "energy_mj": 108836
  }
}
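
For client code that consumes this field, a minimal sketch in Python could look like the following. It parses the response excerpt above and converts the value to joules and watt-hours. The helper names (`read_energy_mj`, `mj_to_joules`, `mj_to_watt_hours`) are hypothetical and not part of this PR, and the sketch assumes `energy_mj` is only present when `details` is returned.

```python
import json
from typing import Optional

# Response excerpt from the PR description. "tokens" is left empty here
# because the original excerpt elides its contents.
SAMPLE_RESPONSE = json.loads("""
{
  "generated_text": "Deep learning is a subset of machine learning...",
  "details": {
    "finish_reason": "length",
    "generated_tokens": 20,
    "tokens": [],
    "energy_mj": 108836
  }
}
""")


def read_energy_mj(response: dict) -> Optional[int]:
    """Return details.energy_mj, or None if details or the field is absent."""
    details = response.get("details") or {}
    return details.get("energy_mj")


def mj_to_joules(energy_mj: float) -> float:
    """1 J = 1000 mJ."""
    return energy_mj / 1000.0


def mj_to_watt_hours(energy_mj: float) -> float:
    """1 Wh = 3600 J."""
    return mj_to_joules(energy_mj) / 3600.0


energy_mj = read_energy_mj(SAMPLE_RESPONSE)
if energy_mj is not None:
    print(f"{mj_to_joules(energy_mj):.3f} J")
    print(f"{mj_to_watt_hours(energy_mj):.6f} Wh")
```

Returning `None` rather than `0` when the field is missing lets a frontend distinguish "not measured" from "measured as zero".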

Motivation

Provide users with direct insight into the energy usage of their requests.
Enable frontends (e.g. [Chat UI Energy Demo](https://huggingface.co/spaces/jdelavande/chat-ui-energy)) to display this metric alongside generated text.
Support sustainability efforts by making energy costs more transparent.

Limitations

The measurement is taken at the GPU level during the full processing of the request by the router.
Energy is not attributed per user when multiple requests are batched: the value approximates total GPU energy consumption during the request, so it can include work done for other requests in the same batch.
Despite this, the metric remains useful to give users an idea of the energy footprint of their queries without deploying TGI themselves.
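
To illustrate the batching limitation, here is a small simulation. It is a sketch under stated assumptions, not the PR's implementation: it assumes the measurement is the difference of a cumulative GPU energy counter read at the start and end of each request, and it uses a made-up constant power draw. The names `Request`, `counter_mj`, and `window_energy_mj` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Request:
    name: str
    start_s: float  # time the request starts being processed
    end_s: float    # time the request finishes


def counter_mj(t_s: float, power_w: float) -> float:
    """Hypothetical cumulative GPU energy counter at a constant power draw."""
    return power_w * t_s * 1000.0  # W * s = J, and 1 J = 1000 mJ


def window_energy_mj(req: Request, power_w: float) -> float:
    """Energy reported for one request: counter delta over its own window."""
    return counter_mj(req.end_s, power_w) - counter_mj(req.start_s, power_w)


POWER_W = 300.0  # made-up constant GPU power, for illustration only

# Two requests that overlap for one second, as they would in a shared batch.
requests = [Request("a", 0.0, 2.0), Request("b", 1.0, 3.0)]

reported = {r.name: window_energy_mj(r, POWER_W) for r in requests}
total_gpu_mj = counter_mj(3.0, POWER_W) - counter_mj(0.0, POWER_W)

print("reported per request:", reported)
print("sum of reported values:", sum(reported.values()))
print("actual GPU total:", total_gpu_mj)
```

Under these assumptions the overlapping second is counted in both windows, so the per-request values add up to more than the GPU actually consumed. This is why the metric is best read as an approximate footprint rather than an exact per-user share.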

Before submitting

This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
Did you read the [contributor guideline](https://github.com/huggingface/transformers/blob/main/CONTRIBUTING.md#start-contributing-pull-requests),
Pull Request section?
Was this discussed/approved via a GitHub issue or the [forum](https://discuss.huggingface.co/)? Please add a link
to it if that's the case.
Did you make sure to update the documentation with your changes? Here are the
[documentation guidelines](https://github.com/huggingface/transformers/tree/main/docs), and
[here are tips on formatting docstrings](https://github.com/huggingface/transformers/tree/main/docs#writing-source-documentation).
Did you write any new necessary tests?

Who can review?

@regisss @Narsil

regisss

LGTM!

feat: add energy consumption for each request

cc51aab

regisss previously approved these changes Aug 28, 2025

fix: update doc

2b6d074

JulienDelavande dismissed regisss’s stale review via 2b6d074 August 28, 2025 14:18
