magnusjahre

A first attempt at updating the Sspesa description to explain how it can be used to generate event-based and time-based performance profiles.

Collaborator

@bcstrongx bcstrongx left a comment

This looks good! A few comments to consider.

src/body.adoc Outdated
NOTE: _In contrast to Sspesa, Smpdis/Sspdis (see <<_precise_decoded_instruction_sampling_smpdissspdis>>) create profiles in which an instruction is represented proportionally to its execution count._

Performance profiles obtained through Sspesa are approximate for two reasons. The first is systematic error (or bias), which reduces profile accuracy by attributing a counter overflow to a different instruction than the one that caused it. In event-based profiles this is commonly referred to as skid, e.g., when internal delays cause profiling hardware to misattribute counter overflows. In a time-based profile, skid and biased attribution policies cause samples to be attributed to different instruction(s) than the one(s) whose execution time the core is currently exposing. Systematic errors can be eliminated by adopting appropriate attribution policies, but this may not always be desirable (e.g., due to implementation overheads). For this reason, Sspesa does not require implementations to support precise attribution.

Collaborator

Without Sspesa or Ssplcofi, skid can (and likely will) mess up attribution. With Sspesa, attribution to a PC should be precise, as the resolved sample PC is unimpacted by the skid. Any other sample state collected (e.g., call stack, GPRs) may be imprecise as a result of skid, unless Ssplcofi is also implemented. So I would strike the parts about skid contributing to attribution imprecision. In fact, the last line above conflicts with the very name of Sspesa (PESA = precise event sample attribution).

Author

Good points! I partitioned this paragraph into two to deal with event-based profiles (which must be free from bias) and time-based profiles (which should minimize bias) separately. I also took out the part about skid as it, as you say, does not really apply here.

Performance profiles obtained through Sspesa are approximate for two reasons. The first is systematic error (or bias), which reduces profile accuracy by attributing a counter overflow to a different instruction than the one that caused it. In event-based profiles this is commonly referred to as skid, e.g., when internal delays cause profiling hardware to misattribute counter overflows. In a time-based profile, skid and biased attribution policies cause samples to be attributed to different instruction(s) than the one(s) whose execution time the core is currently exposing. Systematic errors can be eliminated by adopting appropriate attribution policies, but this may not always be desirable (e.g., due to implementation overheads). For this reason, Sspesa does not require implementations to support precise attribution.

WARNING: I don't fully understand why we need a separate skid-less extension. I guess this is mainly for event-based profiles because developers would want to know which events are skid-less and which are not. Another option would be to provide this information as metadata for each sample, in which case we might not need Ssplcofi? Depending on what we decide, we may need to modify the text in the above paragraph. --Magnus

Collaborator

You're right in your assessment. We could indicate skidless on a per-sample basis, but because I expect all samples from a given implementation to either skid or not, I figured just having an extension would be easier than burning a bit. There's no actual state in the extension, just a way to communicate to the tools that the sample state collected will be precise.

Author

Agree. Thanks for clarifying.

@magnusjahre
Author

I did another pass in response to the discussion in last week's TG meeting.

One thing I did not change was this note:

NOTE: For events that do not support precise attribution, implementations are expected to make "best effort" to ensure that the derived sample PC is the best option for event attribution. For most cases, the PC of an instruction retiring in the cycle of overflow or, if no instructions retire in that cycle, the PC of the instruction that is next to retire is recommended.

It overlaps with the new discussion of time-based profiles, so I'm not sure it is still needed. If we leave it in, we need to revise it as implementations may not always attribute to the next retiring instruction.
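
For reference, the "best effort" policy recommended in the note above could be sketched roughly as follows. This is only an illustration, not spec text: the function name, the `retiring_pcs` list, and the choice of the oldest retiring instruction when several retire in the same cycle are all assumptions made for the example.

```python
from typing import Optional, Sequence


def best_effort_sample_pc(
    retiring_pcs: Sequence[int],
    next_to_retire_pc: Optional[int],
) -> Optional[int]:
    """Pick a sample PC for an event that lacks precise attribution.

    retiring_pcs: PCs of instructions retiring in the overflow cycle,
        oldest first (hypothetical input, not defined by the spec).
    next_to_retire_pc: PC of the instruction that is next to retire,
        or None if the pipeline holds no such instruction.
    """
    if retiring_pcs:
        # The note only says "an instruction retiring in the cycle of
        # overflow"; taking the oldest one is an assumption made here.
        return retiring_pcs[0]
    # No instruction retired in the overflow cycle: fall back to the
    # instruction that is next to retire.
    return next_to_retire_pc
```

An implementation that attributes differently in the fallback case (the concern raised above) would replace only the final `return`.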

@bcstrongx bcstrongx deleted the branch riscv:instr-sampling September 30, 2025 23:13
@bcstrongx bcstrongx closed this Sep 30, 2025
@bcstrongx bcstrongx reopened this Sep 30, 2025
@bcstrongx
Collaborator

I have a few small suggestions, but since your PR comes from a fork, I think it'll be easiest for me to merge this and then submit a new PR.

Signed-off-by: Beeman Strong <97133824+bcstrongx@users.noreply.github.com>
@bcstrongx bcstrongx merged commit 32524fb into riscv:instr-sampling Oct 18, 2025
2 checks passed