Questions regarding token calculation and FLOPs measurement #22

Thank you for your excellent work on VisionZip. I have two questions:

1. FastV prunes R% of tokens from layer K onwards. Does the table below use the average number of tokens across all decoder layers, or is it calculated differently?

<img width="785" alt="Image" src="https://github.com/user-attachments/assets/cb288115-e6b3-4c5c-91b9-20aa4464eddb" />

2. In addition to the latency improvements, did you measure the theoretical FLOPs reduction for the different pruning approaches? If so, could you share how the FLOPs were calculated?
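To make both questions concrete, here is a minimal sketch of the calculation I have in mind. It rests on my own assumptions, not on anything confirmed by the paper: all layers up to K see every visual token, all later layers see only the retained (1 - R) fraction, and one decoder layer costs roughly 4nd² + 2n²d + 2ndm FLOPs (n tokens, hidden size d, FFN intermediate size m). The function names and the example values (576 tokens, 32 layers, d = 4096, m = 11008) are hypothetical and chosen only for illustration.

```python
def average_visual_tokens(n_tokens, num_layers, k, r):
    """Average token count per decoder layer, assuming the first k layers
    see all n_tokens and the remaining layers see n_tokens * (1 - r)."""
    kept = n_tokens * (1.0 - r)
    return (k * n_tokens + (num_layers - k) * kept) / num_layers


def layer_flops(n, d, m):
    """Rough FLOPs of one decoder layer (an assumed estimate):
    4nd^2 for the attention projections, 2n^2d for the attention
    scores and weighted sum, and 2ndm for the FFN."""
    return 4 * n * d**2 + 2 * n**2 * d + 2 * n * d * m


def flops_ratio(n_tokens, num_layers, k, r, d, m):
    """FLOPs with pruning after layer k, divided by FLOPs without pruning."""
    kept = n_tokens * (1.0 - r)
    full = num_layers * layer_flops(n_tokens, d, m)
    pruned = (k * layer_flops(n_tokens, d, m)
              + (num_layers - k) * layer_flops(kept, d, m))
    return pruned / full


if __name__ == "__main__":
    # Hypothetical setting: 576 visual tokens, 32 layers, K=2, R=50%.
    print(average_visual_tokens(576, 32, k=2, r=0.5))
    print(flops_ratio(576, 32, k=2, r=0.5, d=4096, m=11008))
```

Under these assumptions, K=2 and R=50% with 576 tokens gives an average of 306 tokens per layer. If the table reports the retained count after pruning (288 in this example) rather than the layer-wise average, the comparison between methods would shift slightly, which is why I am asking.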

Thank you for your time and for considering these questions.
