This repository contains the source code for the paper TextLap: Customizing Language Models for Text-to-Layout Planning.

conda create -n TextLap python=3.9
conda activate TextLap
pip install -r requirements.txt
pip install flash-attn --no-build-isolation
pip install -e ".[model_worker,webui]"
TextLap is trained with FastChat. A slightly modified version of FastChat is included in this repository; when training without added special tokens, its behavior matches the official code. A demo training script is provided below.
#!/bin/bash
output="/path/to/model/save/dir"
model_name="saved_model_name"
# Launch fine-tuning on 8 GPUs (adjust --nproc_per_node to match your hardware)
torchrun --nproc_per_node=8 --master_port=20003 fastchat/train/train.py \
--model_name_or_path lmsys/vicuna-7b-v1.5 \
--data_path finetune_data/train_data.json \
--bf16 True \
--output_dir $output \
--num_train_epochs 4 \
--per_device_train_batch_size 4 \
--per_device_eval_batch_size 4 \
--gradient_accumulation_steps 1 \
--evaluation_strategy "no" \
--save_strategy "steps" \
--save_steps 1000 \
--save_total_limit 1 \
--learning_rate 2e-5 \
--weight_decay 0. \
--warmup_ratio 0.03 \
--lr_scheduler_type "cosine" \
--logging_steps 2 \
--fsdp "full_shard auto_wrap" \
--fsdp_transformer_layer_cls_to_wrap 'LlamaDecoderLayer' \
--tf32 True \
--model_max_length 2048 \
--gradient_checkpointing True \
--lazy_preprocess True \
--special_token False
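The training script expects FastChat's conversation-style JSON. Below is a minimal sketch of one record; the instruction and layout strings are placeholders, not samples from the released data.
import json

# One FastChat-style conversation record; the contents are placeholders.
record = {
    "id": "example-0",
    "conversations": [
        {"from": "human", "value": "Place a title and a logo on a 512x512 canvas."},
        {"from": "gpt", "value": "title {left: 32px; top: 24px; width: 448px; height: 64px}"},
    ],
}

# train_data.json is a list of such records.
with open("finetune_data/train_data.json", "w") as f:
    json.dump([record], f, indent=2)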
The graphic design model is available on Hugging Face at puar-playground/TextLap-Graphic.
The graphic design model, trained on the Crello dataset, relies on a multimodal language model to generate a caption for each visual element. Run the inference script as follows:
python inference_crello.py
The script displays a side-by-side comparison of the ground-truth layout and the generated layout.
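To load the checkpoint directly instead of running the script, a minimal sketch using the standard transformers API is shown below. This assumes the checkpoint loads as a plain causal LM; the prompt is a placeholder, since real prompts include the per-element captions.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "puar-playground/TextLap-Graphic"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Placeholder instruction; see inference_crello.py for the actual prompt construction.
prompt = "Arrange the following elements on the canvas: ..."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))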

The image layout planning model is trained on the COCO layout dataset and is available on Hugging Face at puar-playground/TextLap-COCO. The released and tested checkpoint was trained with the CSS layout format:
python inference_coco.py
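The exact CSS layout syntax is defined by the repo's data pipeline; purely as an illustration, the sketch below extracts bounding boxes from a layout string of the assumed form label {left: ..px; top: ..px; width: ..px; height: ..px}.
import re

# Assumed CSS-style layout syntax; check inference_coco.py for the real format.
pattern = re.compile(
    r"(\w+)\s*\{left:\s*(\d+)px;\s*top:\s*(\d+)px;\s*width:\s*(\d+)px;\s*height:\s*(\d+)px\}"
)

def parse_layout(text):
    """Return (label, left, top, width, height) tuples from a generated layout string."""
    return [(m[0], *map(int, m[1:])) for m in pattern.findall(text)]

print(parse_layout("person {left: 12px; top: 40px; width: 100px; height: 180px}"))
# [('person', 12, 40, 100, 180)]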
To chat with the model through the command line:
python3 -m fastchat.serve.cli --model-path "puar-playground/TextLap-COCO"
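Because the install step above pulls in FastChat's model_worker and webui extras, the model can also be served through FastChat's standard web UI. These are stock FastChat entry points, not TextLap-specific scripts; run each in a separate terminal:
python3 -m fastchat.serve.controller
python3 -m fastchat.serve.model_worker --model-path "puar-playground/TextLap-COCO"
python3 -m fastchat.serve.gradio_web_server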
The fine-tuning data is available as JSON files in the finetune_data folder.
The graphic design subset of the InstLap dataset has been released separately on Hugging Face at puar-playground/crello-cap to support future research. The crello-cap dataset includes both the training and testing splits of Crello, with element captions generated by Phi-3-Vision. Small elements are filtered out and merged into the background, labeled background_merge.
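The dataset can be loaded with the Hugging Face datasets library; the sketch below only inspects the splits, since the exact feature names are best checked on the dataset card.
from datasets import load_dataset

# Downloads the crello-cap dataset from the Hugging Face Hub.
ds = load_dataset("puar-playground/crello-cap")
print(ds)  # shows the available splits and their features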

@article{chen2024textlap,
  title={TextLap: Customizing Language Models for Text-to-Layout Planning},
  author={Chen, Jian and Zhang, Ruiyi and Zhou, Yufan and Healey, Jennifer and Gu, Jiuxiang and Xu, Zhiqiang and Chen, Changyou},
  journal={arXiv preprint arXiv:2410.12844},
  year={2024}
}