First draft vector parameters #1245
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -15,6 +15,201 @@ versions: | |
| description: | | ||
| General support for data-parallel execution. | ||
| params: | ||
| ELEN: | ||
| description: | | ||
| The maximum size in bits of a vector element that any operation can produce or consume, _ELEN_ {ge} 8, which | ||
| must be a power of 2. | ||
| schema: | ||
| type: integer | ||
| enum: [8, 16, 32, 64, 128, 256, 512, 1024, 2048, 4096, 8192, 16384, 32768, 65536] | ||
| VLEN: | ||
| description: | | ||
| The number of bits in a single vector register, _VLEN_ {ge} ELEN, which must be a power of 2, and must be no greater than 2^16^. | ||
| schema: | ||
| type: integer | ||
| enum: [8, 16, 32, 64, 128, 256, 512, 1024, 2048, 4096, 8192, 16384, 32768, 65536] | ||
| extra_validation: | | ||
| assert VLEN >= ELEN | ||
| SEW_MIN: | ||
| description: | | ||
| Implementations must provide fractional LMUL settings that allow the | ||
| narrowest supported type to occupy a fraction of a vector register | ||
| corresponding to the ratio of the narrowest supported type's width to | ||
| that of the largest supported type's width. In general, the | ||
| requirement is to support LMUL {ge} SEW~MIN~/ELEN, where SEW~MIN~ is | ||
| the narrowest supported SEW value and ELEN is the widest supported SEW | ||
| value. | ||
| schema: | ||
| type: integer | ||
| enum: [8, 16, 32, 64] | ||
| default: 8 | ||
| extra_validation: | | ||
| assert SEW_MIN <= ELEN | ||
| SUPPORT_FRACTIONAL_LMUL_BEYOND_REQUIRED: | ||
| description: | ||
| For a given supported fractional LMUL setting, implementations must support | ||
| SEW settings between SEW~MIN~ and LMUL * ELEN, inclusive. | ||
|
Comment on lines +50 to +51
Collaborator
This would seem to be a normative rule, rather than a parameter. (But don't trust my interpretation! 😉 )
Author
The issue below discusses this. The normative rule above simply states what is required to be implemented. For example, it doesn't state that SEW=64 with LMUL=1/2 cannot be supported. So, as I see it, it is up to the designer whether to allow this configuration. Supporting versus not supporting these cases would produce a different result when executing the same program, which is why I think it should be a parameter.
Collaborator
If SEW=64, then ELEN=64, so LMUL=1/2 must be supported. Per the spec:
I don't see a needed parameterization here, but maybe I'm missing something.
Author
I believe this line refers to the fact that ELEN=64 with a standard extension implies an SEW_MIN of 8, meaning fractional LMUL (fLMUL) settings of 1/2, 1/4, and 1/8 have to be supported. The line I quoted from the spec states that, if ELEN=64 and SEW_MIN=8, fLMUL does not have to be supported for SEW=64. My impression is that this leaves room for an fLMUL of 1/2 to be supported for SEW=64, for example if an ELEN of 128 is added. The spec does not say that one cannot implement fLMUL of 1/2, 1/4, and 1/8 for all SEW, but it does not say they must be implemented either. This causes different behavior on devices that choose to implement these cases versus those that do not. |
||
| schema: | ||
| type: string | ||
| enum: ["no_unrequired_supported", "custom"] | ||
| VILL_SET_ON_RESERVED_VTYPE: | ||
| description: | ||
| The use of `vtype` encodings with LMUL < SEW~MIN~/ELEN is | ||
| __reserved__, but implementations can set `vill` if they do not | ||
| support these configurations. | ||
| schema: | ||
| type: boolean | ||
| RESERVED_VSET_X0X0_VILL_SET: | ||
| description: | | ||
| When _rs1_=`x0` and _rd_=`x0`, the instructions operate as if the current vector length in `vl` is used as the AVL... | ||
| Use of the [vset] instructions is also reserved if `vill` was 1 beforehand. Implementations may set `vill` in either case. | ||
| schema: | ||
| type: string | ||
| enum: ["never", "always", "custom"] | ||
| RESERVED_VSET_X0X0_VLMAX_CHANGE: | ||
| description: | | ||
| When _rs1_=`x0` and _rd_=`x0`, the instructions operate as if the current vector length in `vl` is used as the AVL... | ||
| Use of the [vset] instructions with a new SEW/LMUL ratio that would result in a change of VLMAX is reserved... | ||
| Implementations may set `vill` in either case. | ||
| schema: | ||
| type: string | ||
| enum: ["never", "always", "custom"] | ||
| VECTOR_LS_INDEX_MAX_EEW: | ||
| description: | ||
| A profile may place an upper limit on the maximum supported index | ||
| EEW (e.g., only up to XLEN) smaller than ELEN. | ||
| schema: | ||
| type: string | ||
| enum: ["8", "16", "32", "64", "XLEN"] | ||
| extra_validation: | | ||
| assert 8 <= ELEN if VECTOR_LS_INDEX_MAX_EEW == "8" | ||
| assert 16 <= ELEN if VECTOR_LS_INDEX_MAX_EEW == "16" | ||
| assert 32 <= ELEN if VECTOR_LS_INDEX_MAX_EEW == "32" | ||
| assert 64 <= ELEN if VECTOR_LS_INDEX_MAX_EEW == "64" | ||
| assert XLEN <= ELEN if VECTOR_LS_INDEX_MAX_EEW == "XLEN" | ||
| VECTOR_FF_UPDATE_PAST_TRIM: | ||
| description: | ||
| Similarly, fault-only-first load instructions may update active destination elements past the element that causes trimming | ||
| of the vector length (but not past the original vector length). The values of these spurious updates do not have to correspond | ||
| to the values in memory at the addressed memory locations. Non-idempotent memory locations can only be accessed when it is known | ||
| the corresponding element load operation will not be restarted due to a trap or vector-length trimming. | ||
| schema: | ||
| type: string | ||
| enum: ["update_none", "custom"] | ||
| VECTOR_FF_SEG_EXCEPTION_PARTIAL_LOAD: | ||
| description: | | ||
| For fault-only-first segment loads, if an exception is detected partway | ||
| through accessing the zeroth segment, the trap is taken. | ||
| If an exception is detected partway through accessing a subsequent segment, | ||
| `vl` is reduced to the index of that segment. | ||
| In both cases, it is implementation-defined whether a subset of the segment is | ||
| loaded. | ||
| schema: | ||
| type: string | ||
| enum: ["no_subsegment_loaded", "custom"] | ||
| VECTOR_LS_MISSALIGNED_EXCEPTION: | ||
jacassidy marked this conversation as resolved.
|
||
| description: | | ||
| If an element accessed by a vector memory instruction is not naturally aligned to the size of the element, | ||
| either the element is transferred successfully or an address misaligned exception is raised on that element. | ||
|
|
||
| Support for misaligned vector memory accesses is independent of an implementation’s support for misaligned scalar memory accesses. | ||
| schema: | ||
| type: boolean | ||
| VECTOR_LOAD_PAST_TRAP: | ||
| description: | | ||
| Load instructions may overwrite active destination vector register | ||
| group elements past the element index at which the trap is reported. | ||
| schema: | ||
| type: boolean | ||
| VECTOR_LOAD_SEG_FF_OVERWRITE_ELEMENTS_AFTER_FAULT: | ||
| description: | | ||
| These instructions may overwrite destination vector register group | ||
| elements past the point at which a trap is reported or past the point | ||
| at which vector length is trimmed. | ||
| schema: | ||
| type: string | ||
| enum: ["no_overwrite", "custom"] | ||
| VECTOR_LS_SEG_PARTIAL_ACCESS: | ||
| description: | | ||
| If a trap occurs during | ||
| access to a segment, it is implementation-defined whether a subset | ||
| of the faulting segment's accesses are performed before the trap is taken. | ||
| schema: | ||
| type: boolean | ||
| LEGAL_VSTART: | ||
| description: | | ||
| Implementations are permitted to raise illegal-instruction exceptions when | ||
| attempting to execute a vector instruction with a value of `vstart` that the | ||
| implementation can never produce when executing that same instruction with | ||
| the same `vtype` setting. | ||
| schema: | ||
| type: string | ||
| enum: ["1_stride", "2_stride", "4_stride", "custom"] | ||
|
Comment on lines +139 to +147
Collaborator
I think this needs to comport more closely with the description. Maybe "EXCEPTION_ON_UNSUPPORTED_VSTART" or something like that? Then it would just need to be a boolean.
Author
This is a really good point that I had not thought of. I think in this case there should actually be two parameters: one to state whether exceptions occur on unsupported VSTART values, and another to state which VSTART values are supported. With the current parameter definition I was trying to cover which VSTART values are supported by the implementation.
Collaborator
Valid VSTART values are not configurable.
VSTART must be able to represent up to the maximum supported element number.
Author
I phrased my statement poorly. The CSR can always be set to any VSTART value; whether a given vector instruction runs or traps on a given VSTART value in the CSR is implementation-dependent. |
||
| VECTOR_LS_WHOLEREG_MISSALIGNED_EXCEPTION: | ||
| description: | | ||
| Implementations are allowed to raise a misaligned address exception on | ||
| whole register loads and stores if the base address is not naturally | ||
| aligned to the larger of the size of the encoded EEW in bytes (EEW/8) | ||
| or the implementation's smallest supported SEW size in bytes | ||
| (SEW~MIN~/8). | ||
| schema: | ||
| type: boolean | ||
| VSSTATUS_VS_EXISTS: | ||
| description: | | ||
| For implementations with a writable `misa.V` field, | ||
| the `vsstatus.VS` field may exist even if `misa.V` is clear. | ||
| schema: | ||
| type: boolean | ||
| VECTOR_FF_NO_EXCEPTION_TRIM: | ||
| description: | | ||
| Even when an exception is not raised, implementations are permitted to process | ||
| fewer than `vl` elements and reduce `vl` accordingly, but if `vstart`=0 and | ||
| `vl`>0, then at least one element must be processed. | ||
| schema: | ||
| type: boolean | ||
| VFREDUSUM_NAN: | ||
| description: | | ||
| The reduction tree structure must be deterministic for a given value in vtype and vl. | ||
| As a consequence of this definition, implementations need not propagate | ||
jacassidy marked this conversation as resolved.
|
||
| NaN payloads through the reduction tree when no elements are active. In | ||
| particular, if no elements are active and the scalar input is NaN, | ||
| implementations are permitted to canonicalize the NaN and, if the NaN is | ||
| signaling, set the invalid exception flag. Implementations are alternatively | ||
| permitted to pass through the original NaN and set no exception flags, as with | ||
| `vfredosum`. | ||
| schema: | ||
| type: string | ||
| enum: ["no_change", "custom"] | ||
|
Comment on lines +181 to +182
Collaborator
Could use more eyes on this one:
Author
|
||
| VFREDUSUM_NODE_ROUNDING_BEHAVIOR: | ||
| description: | | ||
| Each operator first computes an exact sum as a RISC-V scalar floating-point | ||
| addition with infinite exponent range and precision, then converts this exact | ||
| sum to a floating-point format with range and precision each at least as great as | ||
| the element floating-point format indicated by SEW, rounding using the currently | ||
| active floating-point dynamic rounding mode and raising exception flags as necessary. | ||
| A different floating-point range and precision may be chosen for the result of each operator. | ||
| schema: | ||
| type: string | ||
| enum: ["SEW_precision", "custom"] | ||
| VFREDUSUM_INACTIVE_NODE_ELEMENT_BEHAVIOR: | ||
| description: | | ||
| A node where one input is derived only from elements masked-off or beyond the active | ||
| vector length may either treat that input as the additive identity of the appropriate | ||
| EEW or simply copy the other input to its output. | ||
| schema: | ||
| type: string | ||
| enum: ["additive_identity", "copy"] | ||
| VFREDUSUM_FINAL_NODE_ELEMENT_BEHAVIOR: | ||
| description: | | ||
| An implementation is allowed to add an additional additive identity to the final result. | ||
| schema: | ||
| type: string | ||
| enum: ["additive_identity", "copy"] | ||
| IMPRECISE_VECTOR_TRAP_SETTABLE: | ||
| description: | | ||
| Some profiles may choose to provide a privileged mode bit to select between precise and imprecise vector traps. | ||
| schema: | ||
| type: boolean | ||
| MUTABLE_MISA_V: | ||
| description: | | ||
| Indicates whether or not the `V` extension can be disabled with the `misa.V` bit. | ||
|
|
||
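
A minimal sketch of the fractional-LMUL requirement quoted in the `SEW_MIN` and `SUPPORT_FRACTIONAL_LMUL_BEYOND_REQUIRED` descriptions above, and debated in the first review thread. For each fractional LMUL that must be supported (LMUL >= SEW_MIN/ELEN), it lists the SEW values that must be supported with it (SEW_MIN through LMUL * ELEN, inclusive). The function name `required_fractional_settings` is hypothetical and not part of this PR. The 1/8 floor assumes the three fractional LMUL encodings (1/2, 1/4, 1/8) that `vtype` defines.

```python
from fractions import Fraction


def required_fractional_settings(sew_min: int, elen: int) -> dict[Fraction, list[int]]:
    """Map each required fractional LMUL to the SEW values required with it.

    Hypothetical helper for illustration only; it encodes the two rules
    quoted in the parameter descriptions:
      * fractional LMUL must be supported while LMUL >= SEW_MIN / ELEN
      * for a supported fractional LMUL, SEW_MIN <= SEW <= LMUL * ELEN
    """
    if sew_min > elen:
        # Mirrors the `assert SEW_MIN <= ELEN` validation in the diff.
        raise ValueError("SEW_MIN must not exceed ELEN")

    smallest_encodable = Fraction(1, 8)  # assumption: vtype encodes 1/2, 1/4, 1/8
    ratio = Fraction(sew_min, elen)

    required: dict[Fraction, list[int]] = {}
    lmul = Fraction(1, 2)
    while lmul >= ratio and lmul >= smallest_encodable:
        max_sew = lmul * elen
        sews = []
        sew = sew_min
        while sew <= max_sew:
            sews.append(sew)
            sew *= 2  # SEW values are powers of two
        required[lmul] = sews
        lmul /= 2
    return required


if __name__ == "__main__":
    for lmul, sews in required_fractional_settings(sew_min=8, elen=64).items():
        print(f"LMUL={lmul}: required SEW values {sews}")
```

With ELEN=64 and SEW_MIN=8 this yields SEW values 8, 16, and 32 for LMUL=1/2, so SEW=64 at LMUL=1/2 falls outside the required set. Whether an implementation supports that combination anyway is the question the proposed parameter tries to capture.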