
FPN logit #3

Open
Zzang-yeah opened this issue Aug 23, 2024 · 8 comments
Comments

@Zzang-yeah

When and where are the FPN logits stored?
Whether I run online or offline, tools/demo.py just runs and generates an image, but no .npy file is created, so I can't proceed with training the student.

@Zzang-yeah
Author

I fixed the above problem.
In yolo_head.py, on line 316, you need to append -f and the exp file path to yolox_command.
But I have another problem: the processes don't seem to run in order with multi-GPU.
The .npy file has not been saved yet, and I keep getting a file-not-found error when loading it.
I think I need to modify the code to synchronize the processes.
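For what it's worth, a minimal sketch of one way to synchronize over a shared filesystem (the helper name, path, and timeout below are hypothetical, not from the repo; in a PyTorch DDP setup, torch.distributed.barrier() after the writing rank would be the more idiomatic fix):

```python
import os
import time

def wait_for_file(path, timeout=60.0, poll=0.5):
    """Block until `path` exists (written by another process) or timeout.

    Returns True if the file appeared, False on timeout. A coarse
    stand-in for a proper barrier on a shared filesystem.
    """
    deadline = time.time() + timeout
    while time.time() < deadline:
        if os.path.exists(path):
            return True
        time.sleep(poll)
    return False

# The writing process would np.save("fpn_logits.npy", logits);
# the reading processes would first do:
# if not wait_for_file("fpn_logits.npy"):
#     raise FileNotFoundError("teacher logits were never written")
```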

@martinaub
Collaborator

Hi @Zzang-yeah, thank you for your interest in our repo.
To save images and FPN logits, make sure the parameters in both the student and teacher files are correct.
For instance, to save the teacher's FPN logits, set self.KD to True in the teacher file. For online KD, additionally set both self.KD and self.KD_online to True in the student file. With online KD, the FPN logits and images are saved at every epoch and deleted once the KD loss for the previous epoch has been computed, so training does not require too much disk space.
As for multi-GPU training, I can't say, since I am only using a single GPU.
Let me know if it works with the proper parameters :)
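To summarize the flag settings described above as code (self.KD and self.KD_online are the names from the comment; the plain class shapes here are illustrative stand-ins, not the repo's actual exp files):

```python
class TeacherExp:
    """Teacher file: enable saving of the teacher's FPN logits."""
    def __init__(self):
        self.KD = True  # save FPN teacher logits

class StudentExpOnlineKD:
    """Student file for online KD: both flags must be True."""
    def __init__(self):
        self.KD = True
        self.KD_online = True  # logits saved every epoch, deleted after use

class StudentExpOfflineKD:
    """Student file for offline KD: logits come from a one-shot
    Teacher_Inference.py run instead of per-epoch saving."""
    def __init__(self):
        self.KD = True
        self.KD_online = False
```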

@martinaub martinaub reopened this Aug 28, 2024
@Zzang-yeah
Author

When training with multi-GPU, online learning didn't seem to work well because the order between processes got mixed up:
a process tried to load the .npy file before it had been created, causing a file-not-found error that stopped training.
I have now switched to offline training, and it is working fine.

One question: as I understand it, in online learning the teacher model runs on the augmented data and saves its logits at every iteration to KD-train the student, while in offline learning the student is KD-trained after running Teacher_Inference.py with the teacher model, so augmentation and logit saving happen only once.
Doesn't this create a difference between online and offline learning?
I ask because my guess is that online learning, with KD at every iteration, would perform better, but I don't remember this being mentioned in the paper.

@martinaub
Collaborator

Thank you for raising this concern; I thought it would have been obvious, but maybe not.
Because of computational power limitations, we introduced offline KD, which drastically reduces the time of KD training. Offline KD means the model does not rely on the online data augmentation provided by the original YOLOX model; instead, it relies only on the pre-defined dataset. To highlight the difference between the two training regimes (with and without online data augmentation), our paper reports the no-Aug models, i.e., the metrics obtained without data augmentation. Comparing the models (e.g., YOLOX-L with YOLOX-L-noAug), you will see a big difference in object detection metrics: the L model is far better than L-noAug. This result highlights the utility of online data augmentation during training and suggests that online KD would perform better than offline KD.
However, online data augmentation is applied randomly while training the model. Thus, when launching the KD method, we do not know in advance what the augmented dataset will look like, which means teacher inference must be launched at every iteration for each augmented batch.
In addition, you can still train the teacher itself with online data augmentation, so the teacher model can transfer better knowledge to the student model.
Thus, offline KD does not perform as well as online KD; however, as the result metrics in the paper show, the model is still improved.
If you have access to multiple GPUs, running the online KD should be faster for you.
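A schematic contrast of the two regimes as described above (a sketch with stand-in functions; augment, teacher_logits, and the data are all illustrative, not the repo's code):

```python
import random

def augment(sample, seed):
    # stand-in for YOLOX's random online data augmentation
    rng = random.Random(seed)
    return sample + rng.random()

def teacher_logits(x):
    # stand-in for teacher inference producing FPN logits
    return x * 2.0

def online_kd_targets(dataset, epochs):
    """Online KD: augmentation is random per iteration, so the teacher
    must be re-run on each freshly augmented sample."""
    targets = []
    for epoch in range(epochs):
        for i, sample in enumerate(dataset):
            x = augment(sample, seed=epoch * len(dataset) + i)
            targets.append(teacher_logits(x))  # teacher runs every iteration
    return targets

def offline_kd_targets(dataset):
    """Offline KD: no online augmentation; the teacher runs once over the
    fixed dataset (as Teacher_Inference.py does) and the logits are reused."""
    return [teacher_logits(s) for s in dataset]
```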

@Zzang-yeah
Author

When comparing nano models trained with KD to nano models trained without it, we found the performance was not significantly better. Does the claim that the model with KD performed better mean that the FP was improved? In my experiments, the AP was higher for the nano model without KD than for the model with KD.

@zxccsssd

zxccsssd commented Oct 9, 2024

@Zzang-yeah

I fixed the above problem In yolo_head.py, on line 316, you need to add -f to yolox_command and expfile to yolox_command But I have another problem, the processes don't seem to run in order when multi-gpu. The npy file is not saved and I keep getting file not found after loading it. I think I need to modify the code to synchronize the processes.

I encountered the same issue when using a single GPU. I tried the method you provided, but I still get an error saying the npy file was not found. May I take a look at your modified code?

@xiaohongzaizhe

@Zzang-yeah

I fixed the above problem In yolo_head.py, on line 316, you need to add -f to yolox_command and expfile to yolox_command But I have another problem, the processes don't seem to run in order when multi-gpu. The npy file is not saved and I keep getting file not found after loading it. I think I need to modify the code to synchronize the processes.

I encountered the same issue when using a single GPU. I tried the method you provided, but I still get an error saying the npy file was not found. May I take a look at your modified code?

I've solved it.

@zxccsssd

@Zzang-yeah

I fixed the above problem In yolo_head.py, on line 316, you need to add -f to yolox_command and expfile to yolox_command But I have another problem, the processes don't seem to run in order when multi-gpu. The npy file is not saved and I keep getting file not found after loading it. I think I need to modify the code to synchronize the processes.

I encountered the same issue when using a single GPU. I tried the method you provided, but I still get an error saying the npy file was not found. May I take a look at your modified code?

I've solved it.

How did you solve it?

4 participants