Some problems about the Paper. #21

Open
6Roy opened this issue Mar 15, 2024 · 1 comment

Comments

@6Roy

6Roy commented Mar 15, 2024

Hello, I recently read your paper and have a few questions about it:

  1. Dataset construction: I am curious about the manual replacement of 250 examples for MS-COCO. Did you come up with all of the segment_descriptions yourselves? Also, how were the layout mask PNGs constructed?
  2. Regarding the improvements to the self-attention and cross-attention layers for image generation: I have read a lot of papers recently, and the improvement seems relatively small to me. Are there any other areas that could be improved?
@YunjiKim
Collaborator

Hi, @6Roy,

Regarding the first question, we segmented the original texts rather than writing all the object-wise labels from scratch. The same goes for the layout images, since the MS-COCO dataset provides instance-wise layout labels.
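
As a rough illustration of how instance-wise labels can be turned into a single layout image, here is a minimal sketch. It is not the pipeline used for the paper, which this thread does not describe. The function name `compose_layout`, the background value, and the largest-first painting order are all assumptions made for the example. The input is assumed to be one binary mask per instance (with pycocotools, `COCO.annToMask` produces such masks from MS-COCO annotations).

```python
import numpy as np

def compose_layout(instances, height, width, background=0):
    """Paint per-instance binary masks into one label map.

    `instances` is a list of (category_id, mask) pairs, where each mask is
    a (height, width) array with nonzero values inside the instance.
    Hypothetical helper for illustration only.
    """
    layout = np.full((height, width), background, dtype=np.uint8)
    # Assumption: paint larger instances first so smaller ones stay on top.
    ordered = sorted(instances, key=lambda item: -int(item[1].sum()))
    for category_id, mask in ordered:
        layout[mask.astype(bool)] = category_id
    return layout

# Toy example: a large region (category 1) with a small one (category 18) inside.
h, w = 4, 6
big = np.zeros((h, w), dtype=np.uint8)
big[:, :4] = 1
small = np.zeros((h, w), dtype=np.uint8)
small[1:3, 1:3] = 1
layout = compose_layout([(18, small), (1, big)], h, w)
```

The resulting array holds one category id per pixel and could then be written out as a PNG with any image library.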

For the second question, I would first ask you to consider how difficult it is to evaluate generative models, especially when the target is as specific as ours. As you mentioned, our method shows small improvements on some metrics, but we would like to emphasize that its strongest contribution is improving the fidelity of an existing t2i model to layout conditions without requiring any fine-tuning.
