195 changes: 195 additions & 0 deletions spec/std/isa/ext/V.yaml
@@ -15,6 +15,201 @@ versions:
description: |
  General support for data-parallel execution.
params:
  ELEN:
    description: |
      The maximum size in bits of a vector element that any operation can produce or consume, _ELEN_ {ge} 8, which
      must be a power of 2.
    schema:
      type: integer
      enum: [8, 16, 32, 64, 128, 256, 512, 1024, 2048, 4096, 8192, 16384, 32768, 65536]
  VLEN:
    description: |
      The number of bits in a single vector register, _VLEN_ {ge} ELEN, which must be a power of 2, and must be no greater than 2^16^.
    schema:
      type: integer
      enum: [8, 16, 32, 64, 128, 256, 512, 1024, 2048, 4096, 8192, 16384, 32768, 65536]
    extra_validation: |
      assert VLEN >= ELEN
  SEW_MIN:
    description: |
      Implementations must provide fractional LMUL settings that allow the
      narrowest supported type to occupy a fraction of a vector register
      corresponding to the ratio of the narrowest supported type's width to
      that of the largest supported type's width. In general, the
      requirement is to support LMUL {ge} SEW~MIN~/ELEN, where SEW~MIN~ is
      the narrowest supported SEW value and ELEN is the widest supported SEW
      value.
    schema:
      type: integer
      enum: [8, 16, 32, 64]
      default: 8
    extra_validation: |
      assert SEW_MIN <= ELEN
  SUPPORT_FRACTIONAL_LMUL_BEYOND_REQUIRED:
    description:
      For a given supported fractional LMUL setting, implementations must support
      SEW settings between SEW~MIN~ and LMUL * ELEN, inclusive.
Comment on lines +50 to +51
Collaborator

This would seem to be a normative rule, rather than a parameter. (But don't trust my interpretation! 😉 )

Author

The issue below discusses this. The normative rule above only states what must be implemented. For example, it doesn't state that SEW=64 with LMUL=1/2 cannot be supported, so as I see it the designer is free to support that configuration or not. Supporting versus not supporting these cases would produce a different result when executing the same program, which is why I think it should be a parameter.

Collaborator

If SEW=64, then ELEN=64, so LMUL=1/2 must be supported. Per the spec:

For standard vector extensions with ELEN=64, fractional LMULs of 1/2, 1/4, and 1/8 must be supported.

I don't see a needed parameterization here, but maybe I'm missing something.

Author

I believe this line refers to the fact that ELEN=64 and a standard extension imply a SEW_MIN of 8, meaning LMUL 1/2, 1/4, and 1/8 have to be supported. The line I quoted from the spec states that (if ELEN=64 and SEW_MIN=8):

SEW=64: no fLMUL has to be supported
SEW=32: fLMUL 1/2 has to be supported
SEW=16: fLMUL 1/2 and 1/4 have to be supported
SEW=8: fLMUL 1/2, 1/4, and 1/8 have to be supported

My impression is that this leaves room for fLMUL 1/2 to be supported for SEW=64, and so on, if an ELEN of 128 is added. The spec does not state that one cannot implement fLMUL 1/2, 1/4, and 1/8 for all SEW, but it does not state that one has to. This causes different behavior on devices that choose to implement these combinations versus those that do not.
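
The per-SEW list in this thread can be reproduced from the two rules quoted in the diff (LMUL {ge} SEW~MIN~/ELEN, and SEW between SEW~MIN~ and LMUL * ELEN for each supported fractional LMUL). The following is only a minimal sketch of that arithmetic; the function name `required_fractional_lmuls` is hypothetical and not part of the repository, and it assumes the only fractional LMUL encodings are 1/2, 1/4, and 1/8.

```python
from fractions import Fraction


def required_fractional_lmuls(elen: int, sew_min: int) -> dict:
    """Map each SEW to the fractional LMULs that are required to support it.

    Hypothetical helper for illustration; assumes the fractional LMUL
    encodings are exactly 1/2, 1/4, and 1/8.
    """
    candidates = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 8)]
    # Rule 1: fractional LMUL settings with LMUL >= SEW_MIN / ELEN must exist.
    provided = [lmul for lmul in candidates if lmul >= Fraction(sew_min, elen)]
    result = {}
    sew = elen
    while sew >= sew_min:
        # Rule 2: a provided fractional LMUL must support every SEW with
        # SEW_MIN <= SEW <= LMUL * ELEN.
        result[sew] = [lmul for lmul in provided if sew <= lmul * elen]
        sew //= 2
    return result


if __name__ == "__main__":
    for sew, lmuls in required_fractional_lmuls(elen=64, sew_min=8).items():
        print(f"SEW={sew}: {[str(l) for l in lmuls] or 'none required'}")
```

Any (SEW, LMUL) pair absent from this mapping is one the rules do not require, which is the set of configurations the proposed parameter is meant to describe.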

    schema:
      type: string
      enum: ["no_unrequired_supported", "custom"]
  VILL_SET_ON_RESERVED_VTYPE:
    description:
      The use of `vtype` encodings with LMUL < SEW~MIN~/ELEN is
      __reserved__, but implementations can set `vill` if they do not
      support these configurations.
    schema:
      type: boolean
  RESERVED_VSET_X0X0_VILL_SET:
    description: |
      When _rs1_=`x0` and _rd_=`x0`, the instructions operate as if the current vector length in `vl` is used as the AVL...
      Use of the [vset] instructions is also reserved if `vill` was 1 beforehand. Implementations may set `vill` in either case.
    schema:
      type: string
      enum: ["never", "always", "custom"]
  RESERVED_VSET_X0X0_VLMAX_CHANGE:
    description: |
      When _rs1_=`x0` and _rd_=`x0`, the instructions operate as if the current vector length in `vl` is used as the AVL...
      Use of the [vset] instructions with a new SEW/LMUL ratio that would result in a change of VLMAX is reserved...
      Implementations may set `vill` in either case.
    schema:
      type: string
      enum: ["never", "always", "custom"]
  VECTOR_LS_INDEX_MAX_EEW:
    description:
      A profile may place an upper limit on the maximum supported index
      EEW (e.g., only up to XLEN) smaller than ELEN.
    schema:
      type: string
      enum: ["8", "16", "32", "64", "XLEN"]
    extra_validation: |
      assert 8 <= ELEN if VECTOR_LS_INDEX_MAX_EEW == "8"
      assert 16 <= ELEN if VECTOR_LS_INDEX_MAX_EEW == "16"
      assert 32 <= ELEN if VECTOR_LS_INDEX_MAX_EEW == "32"
      assert 64 <= ELEN if VECTOR_LS_INDEX_MAX_EEW == "64"
      assert XLEN <= ELEN if VECTOR_LS_INDEX_MAX_EEW == "XLEN"
  VECTOR_FF_UPDATE_PAST_TRIM:
    description:
      Similarly, fault-only-first load instructions may update active destination elements past the element that causes trimming
      of the vector length (but not past the original vector length). The values of these spurious updates do not have to correspond
      to the values in memory at the addressed memory locations. Non-idempotent memory locations can only be accessed when it is known
      the corresponding element load operation will not be restarted due to a trap or vector-length trimming.
    schema:
      type: string
      enum: ["update_none", "custom"]
  VECTOR_FF_SEG_EXCEPTION_PARTIAL_LOAD:
    description: |
      For fault-only-first segment loads, if an exception is detected partway
      through accessing the zeroth segment, the trap is taken.
      If an exception is detected partway through accessing a subsequent segment,
      `vl` is reduced to the index of that segment.
      In both cases, it is implementation-defined whether a subset of the segment is
      loaded.
    schema:
      type: string
      enum: ["no_subsegment_loaded", "custom"]
  VECTOR_LS_MISSALIGNED_EXCEPTION:
    description: |
      If an element accessed by a vector memory instruction is not naturally aligned to the size of the element,
      either the element is transferred successfully or an address misaligned exception is raised on that element.

      Support for misaligned vector memory accesses is independent of an implementation’s support for misaligned scalar memory accesses.
    schema:
      type: boolean
  VECTOR_LOAD_PAST_TRAP:
    description: |
      Load instructions may overwrite active destination vector register
      group elements past the element index at which the trap is reported.
    schema:
      type: boolean
  VECTOR_LOAD_SEG_FF_OVERWRITE_ELEMENTS_AFTER_FAULT:
    description: |
      These instructions may overwrite destination vector register group
      elements past the point at which a trap is reported or past the point
      at which vector length is trimmed.
    schema:
      type: string
      enum: ["no_overwrite", "custom"]
  VECTOR_LS_SEG_PARTIAL_ACCESS:
    description: |
      If a trap occurs during
      access to a segment, it is implementation-defined whether a subset
      of the faulting segment's accesses are performed before the trap is taken.
    schema:
      type: boolean
  LEGAL_VSTART:
    description: |
      Implementations are permitted to raise illegal-instruction exceptions when
      attempting to execute a vector instruction with a value of `vstart` that the
      implementation can never produce when executing that same instruction with
      the same `vtype` setting.
    schema:
      type: string
      enum: ["1_stride", "2_stride", "4_stride", "custom"]
Comment on lines +139 to +147
Collaborator

I think this needs to comport more closely to the description. Maybe "EXCEPTION_ON_UNSUPPORTED_VSTART" or something like that? Then, it would just need to be a boolean.

Author

This is a really good point that I did not think of. I think there should actually be two parameters in this case: one stating whether exceptions occur on unsupported VSTART values, and another stating which VSTART values are supported. With the current parameter definition I was trying to cover which VSTART values the implementation supports.

Collaborator

Valid VSTART values are not configurable.

The vstart CSR is defined to have only enough writable bits to hold the largest element index (one less than the maximum VLMAX).

VSTART must be able to represent up to the maximum supported element number.

The maximum vector length is obtained with the largest LMUL setting (8) and the
smallest SEW setting (8), so VLMAX_max = 8*VLEN/8 = VLEN. For example, for
VLEN=256, vstart would have 8 bits to represent indices from 0 through 255.

Author

I phrased my statement poorly. The CSR can always be set with any VSTART value; whether a given vector instruction runs or traps on the VSTART value in the CSR is implementation-dependent.
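
The `vstart` width arithmetic quoted earlier in this thread (VLMAX_max = 8*VLEN/8 = VLEN, so VLEN=256 needs 8 bits) can be checked with a short sketch. The function name `vstart_writable_bits` is hypothetical, and the sketch assumes, as the quoted text does, that the largest LMUL is 8 and the smallest SEW is 8.

```python
def vstart_writable_bits(vlen: int) -> int:
    """Number of writable vstart bits for a given VLEN.

    Hypothetical helper for illustration; assumes largest LMUL = 8 and
    smallest SEW = 8, as in the quoted spec text.
    """
    # The largest VLMAX uses LMUL=8 and SEW=8: VLMAX_max = 8 * VLEN / 8 = VLEN.
    vlmax_max = 8 * vlen // 8
    # vstart must hold the largest element index, VLMAX_max - 1.
    return (vlmax_max - 1).bit_length()


if __name__ == "__main__":
    for vlen in (128, 256, 65536):
        print(f"VLEN={vlen}: vstart has {vstart_writable_bits(vlen)} writable bits")
```

This only bounds which values the CSR can hold; it says nothing about which of those values a given instruction accepts, which is the point under discussion.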

  VECTOR_LS_WHOLEREG_MISSALIGNED_EXCEPTION:
    description: |
      Implementations are allowed to raise a misaligned address exception on
      whole register loads and stores if the base address is not naturally
      aligned to the larger of the size of the encoded EEW in bytes (EEW/8)
      or the implementation's smallest supported SEW size in bytes
      (SEW~MIN~/8).
    schema:
      type: boolean
  VSSTATUS_VS_EXISTS:
    description: |
      For implementations with a writable `misa.V` field,
      the `vsstatus.VS` field may exist even if `misa.V` is clear.
    schema:
      type: boolean
  VECTOR_FF_NO_EXCEPTION_TRIM:
    description: |
      Even when an exception is not raised, implementations are permitted to process
      fewer than `vl` elements and reduce `vl` accordingly, but if `vstart`=0 and
      `vl`>0, then at least one element must be processed.
    schema:
      type: boolean
  VFREDUSUM_NAN:
    description: |
      The reduction tree structure must be deterministic for a given value in vtype and vl.
      As a consequence of this definition, implementations need not propagate
      NaN payloads through the reduction tree when no elements are active. In
      particular, if no elements are active and the scalar input is NaN,
      implementations are permitted to canonicalize the NaN and, if the NaN is
      signaling, set the invalid exception flag. Implementations are alternatively
      permitted to pass through the original NaN and set no exception flags, as with
      `vfredosum`.
    schema:
      type: string
      enum: ["no_change", "custom"]
Comment on lines +181 to +182
Collaborator

Could use more eyes on this one:

  • Is the parameter just about vfredusum and "NaN" or should "no active elements" be part of the parameter name as well? In other words, is it specific to the case of "no active elements"?
  • Does it need to be a parameter at all, given the description is in an "informative note" and not in the regular text?

  VFREDUSUM_NODE_ROUNDING_BEHAVIOR:
    description: |
      Each operator first computes an exact sum as a RISC-V scalar floating-point
      addition with infinite exponent range and precision, then converts this exact
      sum to a floating-point format with range and precision each at least as great as
      the element floating-point format indicated by SEW, rounding using the currently
      active floating-point dynamic rounding mode and raising exception flags as necessary.
      A different floating-point range and precision may be chosen for the result of each operator.
    schema:
      type: string
      enum: ["SEW_precision", "custom"]
  VFREDUSUM_INACTIVE_NODE_ELEMENT_BEHAVIOR:
    description: |
      A node where one input is derived only from elements masked-off or beyond the active
      vector length may either treat that input as the additive identity of the appropriate
      EEW or simply copy the other input to its output.
    schema:
      type: string
      enum: ["additive_identity", "copy"]
  VFREDUSUM_FINAL_NODE_ELEMENT_BEHAVIOR:
    description: |
      An implementation is allowed to add an additional additive identity to the final result.
    schema:
      type: string
      enum: ["additive_identity", "copy"]
  IMPRECISE_VECTOR_TRAP_SETTABLE:
    description: |
      Some profiles may choose to provide a privileged mode bit to select between precise and imprecise vector traps.
    schema:
      type: boolean
  MUTABLE_MISA_V:
    description: |
      Indicates whether or not the `V` extension can be disabled with the `misa.V` bit.
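
The cross-parameter constraints expressed by the `extra_validation` blocks in this diff (VLEN >= ELEN, SEW_MIN <= ELEN, and the index-EEW bound against ELEN) can be mirrored in Python to sanity-check a candidate configuration by hand. This is only a sketch under stated assumptions: `check_v_params` is a hypothetical helper, not the repository's validator, and `xlen` is passed in explicitly because XLEN is defined outside this file.

```python
# Hypothetical helper for illustration; not part of the repository's tooling.
ALLOWED_LEN = [2 ** k for k in range(3, 17)]  # 8, 16, ..., 65536 (the enum in the diff)


def check_v_params(elen, vlen, sew_min, index_max_eew, xlen):
    """Return the list of violated constraints (empty if the values are consistent)."""
    errors = []
    if elen not in ALLOWED_LEN:
        errors.append("ELEN not in enum")
    if vlen not in ALLOWED_LEN:
        errors.append("VLEN not in enum")
    if vlen < elen:
        errors.append("VLEN >= ELEN violated")
    if sew_min > elen:
        errors.append("SEW_MIN <= ELEN violated")
    # VECTOR_LS_INDEX_MAX_EEW is a string enum: "8", "16", "32", "64", or "XLEN".
    limit = xlen if index_max_eew == "XLEN" else int(index_max_eew)
    if limit > elen:
        errors.append("VECTOR_LS_INDEX_MAX_EEW <= ELEN violated")
    return errors


if __name__ == "__main__":
    print(check_v_params(elen=64, vlen=128, sew_min=8, index_max_eew="XLEN", xlen=64))
    print(check_v_params(elen=64, vlen=32, sew_min=8, index_max_eew="32", xlen=64))
```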