
Doubts about the direction of the project's evolution #30

@xiaoniaoyouhuajiang


I've been giving more thought to improving constensor's performance for LLMs. We previously discussed optimizing the CPU backend, but to achieve the goal you mentioned of "run an LLM at very competitive speeds on any device" (which sounds a lot like what TVM aims for), I think we may need a more sophisticated compilation architecture, perhaps something akin to TVM's multi-level IR: a high-level graph IR paired with a low-level operator IR. That split would enable more powerful graph optimizations and operator fusion, and it would make it easier to extend to more backends later.
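
To make that concrete, here is a minimal sketch of the split I have in mind (all names here are hypothetical illustrations, not constensor's actual API): a graph IR over whole tensors that a fusion pass rewrites, and a loop-level IR that the fused node lowers into.

```rust
// Hypothetical two-level IR sketch; none of these types exist in
// constensor today -- they only illustrate the TVM-style split.

/// High-level graph IR: nodes operate on whole tensors, referenced by
/// node id. Graph passes (fusion, CSE, layout planning) live here.
#[derive(Debug, Clone, Copy, PartialEq)]
enum GraphOp {
    Input,                                   // graph input placeholder
    MatMul(usize, usize),                    // operand node ids
    Add(usize, usize),
    Relu(usize),
    FusedMatMulAddRelu(usize, usize, usize), // produced by the fusion pass
}

/// Low-level operator IR: explicit loop nests that a backend (CPU,
/// CUDA, ...) would turn into code or interpret directly.
#[derive(Debug)]
enum LoopIr {
    For { var: &'static str, extent: usize, body: Vec<LoopIr> },
    Stmt(&'static str), // stand-in for scalar compute statements
}

/// Toy graph-level fusion pass: collapse MatMul -> Add -> Relu chains
/// into a single node so lowering can emit one fused kernel.
fn fuse(graph: &[GraphOp]) -> Vec<GraphOp> {
    let mut out = graph.to_vec();
    for i in 0..out.len() {
        if let GraphOp::Relu(r) = out[i] {
            if let GraphOp::Add(m, c) = out[r] {
                if let GraphOp::MatMul(a, b) = out[m] {
                    // A real pass would also dead-code-eliminate the
                    // now-unused Add and MatMul nodes.
                    out[i] = GraphOp::FusedMatMulAddRelu(a, b, c);
                }
            }
        }
    }
    out
}

/// Toy lowering of the fused node into the loop-level IR.
fn lower_fused(m: usize, n: usize) -> LoopIr {
    LoopIr::For {
        var: "i",
        extent: m,
        body: vec![LoopIr::For {
            var: "j",
            extent: n,
            body: vec![LoopIr::Stmt(
                "out[i][j] = relu(dot(a.row(i), b.col(j)) + c[j])",
            )],
        }],
    }
}

fn main() {
    // relu(matmul(a, b) + c) as a flat node list
    let graph = vec![
        GraphOp::Input,        // 0: a
        GraphOp::Input,        // 1: b
        GraphOp::Input,        // 2: c
        GraphOp::MatMul(0, 1), // 3
        GraphOp::Add(3, 2),    // 4
        GraphOp::Relu(4),      // 5
    ];
    let fused = fuse(&graph);
    assert_eq!(fused[5], GraphOp::FusedMatMulAddRelu(0, 1, 2));
    println!("{:#?}", lower_fused(128, 128));
}
```

The point of the split is that fusion decisions happen once at the graph level, while each backend only has to consume the loop-level IR.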

However, I also notice that constensor's current design seems to lean more towards a runtime that executes directly, with graph optimizations triggered implicitly rather than through an explicit multi-level IR. Introducing such an architecture would be a significant undertaking.
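
For contrast, the eager style I'm describing might look roughly like the sketch below (again hypothetical, not constensor's actual code): each op executes immediately on concrete buffers, so by the time `relu` is called, `add` has already run, and the two loops can no longer be fused ahead of time.

```rust
// Hypothetical eager-execution sketch (not constensor's actual API):
// optimization can only happen per-op, or via a small peephole window
// inside the runtime, because no whole-program graph is ever built.

#[derive(Debug, Clone)]
struct Tensor(Vec<f32>);

fn add(a: &Tensor, b: &Tensor) -> Tensor {
    Tensor(a.0.iter().zip(&b.0).map(|(x, y)| x + y).collect())
}

fn relu(a: &Tensor) -> Tensor {
    Tensor(a.0.iter().map(|x| x.max(0.0)).collect())
}

fn main() {
    let a = Tensor(vec![1.0, -2.0, 3.0]);
    let b = Tensor(vec![0.5, 0.5, -4.0]);
    // Two full passes over memory; a graph compiler could fuse these
    // into one loop, but the eager runtime has already executed `add`
    // by the time it sees `relu`.
    let out = relu(&add(&a, &b));
    println!("{:?}", out);
}
```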

I'd be really interested to hear your thoughts on constensor's long-term positioning. Do you envision it evolving into a general-purpose compiler framework like TVM (perhaps differentiated by its Rust implementation or a focus on JIT compilation), or do you see it more as a lightweight, intelligent runtime optimized for specific scenarios, such as efficient LLM inference?
