When using the EyeCLIP model, I found that the text encoder in `eyeclip_visual.pt` has issues:

- The computed similarity between any two text embeddings is always 1, i.e. every input text maps to the same embedding.
- It looks as if the text-encoder weights in the checkpoint are incomplete or corrupted.
Load the model with the official EyeCLIP code:

```python
import eyeclip
import torch

device = "cuda"
eyeclip_model, eyeclip_preprocess = eyeclip.load("ViT-B/32", device=device, jit=False)

# Load the released weights
weights_path = "./eyeclip_visual.pt"
eyeclip_model.load_state_dict(torch.load(weights_path, map_location=device))
eyeclip_model.eval()
```
Test text similarity:

```python
# Tokenize first (mirroring clip.tokenize); encode_text does not accept raw strings
tokens = eyeclip.tokenize(["hello", "world"]).to(device)
with torch.no_grad():
    text_features = eyeclip_model.encode_text(tokens)
# Normalize so the matrix below is a cosine similarity
text_features = text_features / text_features.norm(dim=-1, keepdim=True)
similarity = text_features @ text_features.T
print(similarity)
```
The output is always:

```
tensor([[1., 1.],
        [1., 1.]])
```
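
To check whether the checkpoint actually contains trained text-encoder weights, it may help to inspect the state dict directly. A minimal sketch, assuming the file holds a plain state dict (as the `load_state_dict` call above implies) with the standard OpenAI CLIP key layout:

```python
import torch

# Inspect which submodules the checkpoint covers; the key prefixes below
# follow the stock OpenAI CLIP state dict layout (an assumption for EyeCLIP).
sd = torch.load("./eyeclip_visual.pt", map_location="cpu")

visual_keys = [k for k in sd if k.startswith("visual.")]
text_keys = [k for k in sd if k.startswith(
    ("transformer.", "token_embedding.", "positional_embedding",
     "ln_final.", "text_projection"))]
print(f"visual keys: {len(visual_keys)}, text-encoder keys: {len(text_keys)}")

# Missing, all-zero, or identical text-encoder tensors would explain why
# every text collapses to the same embedding.
```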
**Expected behavior**
The text encoder should produce distinguishable embeddings for different texts.
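
For comparison, the stock OpenAI CLIP weights give clearly distinct embeddings for these two inputs. A quick baseline sketch using the upstream `clip` package (the exact similarity values will vary):

```python
import clip
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
model, _ = clip.load("ViT-B/32", device=device, jit=False)

tokens = clip.tokenize(["hello", "world"]).to(device)
with torch.no_grad():
    feats = model.encode_text(tokens)
feats = feats / feats.norm(dim=-1, keepdim=True)

# Diagonal is 1 by construction; off-diagonal entries should be well below 1.
print(feats @ feats.T)
```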
Ideally, please provide the full weights (`./CLIP_ft_all_key_06-30-1427`) to replace the current `eyeclip_visual.pt`.