Questions about whether it is an autoregressive model

The LLaDA paper clearly states that the model is a diffusion model rather than an autoregressive model. However, I found that your code uses a lower triangular mask, which introduces causal dependencies and turns the model into an autoregressive one. Does this conflict with the core argument of the paper? Additionally, when I tried removing this lower triangular mask from the source code, the loss decreased very slowly, and the test accuracy after 5 epochs was 0.
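To make the distinction in the question concrete, here is a minimal toy sketch of what a lower triangular mask does to self-attention. It is not taken from the LLaDA code or from this repository. The function names `attention` and `affected_positions` are hypothetical, tokens are plain scalars, and queries, keys, and values are all set equal to the input for simplicity.

```python
import math

def attention(x, causal):
    """Toy single-head self-attention over scalar tokens (q = k = v = x)."""
    n = len(x)
    out = []
    for i in range(n):
        # A lower triangular mask lets position i attend only to j <= i.
        visible = range(i + 1) if causal else range(n)
        scores = [x[i] * x[j] for j in visible]
        m = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(s - m) for s in scores]
        total = sum(weights)
        out.append(sum(w * x[j] for w, j in zip(weights, visible)) / total)
    return out

def affected_positions(causal):
    """Return the earlier positions whose output changes when only the last token changes."""
    base = [0.5, -1.0, 2.0, 0.3]
    changed = base[:-1] + [5.0]  # alter only the final token
    a = attention(base, causal)
    b = attention(changed, causal)
    return [i for i in range(len(base) - 1) if abs(a[i] - b[i]) > 1e-9]

print("causal:", affected_positions(True))
print("bidirectional:", affected_positions(False))
```

With the mask, changing the last token leaves every earlier position's output unchanged, which is the causal dependency structure the question describes. Without it, every earlier position is affected, so each token can use context from both directions.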

Questions about whether it is an autoregressive model #2
