HCQ_MSRVTT_1kA_L3.txt
Experiment directory: /apdcephfs/share_47076/gimwang/HCQ/exps/HCQ_MSRVTT_1kA_L3
Preparing the dataloaders ...
Loading dataset MSRVTT_jsfusion_trainval in ram ...
Finish loading dataset MSRVTT_jsfusion_trainval in ram, taking 1399.3037362098694 s.
Loading dataset MSRVTT_jsfusion_test in ram ...
Finish loading dataset MSRVTT_jsfusion_test in ram, taking 83.50736212730408 s.
Loading dataset MSRVTT_jsfusion_test in ram ...
Finish loading dataset MSRVTT_jsfusion_test in ram, taking 61.18068528175354 s.
Training ...
Saving checkpoint: /apdcephfs/share_47076/gimwang/HCQ/exps/HCQ_MSRVTT_1kA_L3/checkpoint-epoch0.pth ...
Done in 5.122s
Updating 'best' checkpoint: /apdcephfs/share_47076/gimwang/HCQ/exps/HCQ_MSRVTT_1kA_L3/checkpoint-epoch0.pth ...
Done in 6.955s
epoch : 0
loss : 0
learning_rate : 5e-05
n_samples : 0
n_steps : 0
MSRVTT_jsfusion_test/t2v_metrics/R1: 0.0
MSRVTT_jsfusion_test/t2v_metrics/R5: 0.1
MSRVTT_jsfusion_test/t2v_metrics/R10: 0.6
MSRVTT_jsfusion_test/t2v_metrics/R50: 4.8
MSRVTT_jsfusion_test/t2v_metrics/MedR: 510.0
MSRVTT_jsfusion_test/t2v_metrics/MeanR: 502.005
MSRVTT_jsfusion_test/t2v_metrics/geometric_mean_R1-R5-R10: 0.0
MSRVTT_jsfusion_test/v2t_metrics/R1: 0.1
MSRVTT_jsfusion_test/v2t_metrics/R5: 0.7
MSRVTT_jsfusion_test/v2t_metrics/R10: 1.2
MSRVTT_jsfusion_test/v2t_metrics/R50: 5.7
MSRVTT_jsfusion_test/v2t_metrics/MedR: 510.0
MSRVTT_jsfusion_test/v2t_metrics/MeanR: 504.663
MSRVTT_jsfusion_test/v2t_metrics/geometric_mean_R1-R5-R10: 0.43795191398878897
mnt_best : 0.0
not_improved_count: 0
Train Epoch: 1 [1/250 128/32000 (0%)] Loss: 32.83125 (QuantReg: 22.82731) QuantErr: 22.82731 batch_time=43.85550
Train Epoch: 1 [12/250 1536/32000 (5%)] Loss: 30.60303 (QuantReg: 22.83835) QuantErr: 22.83835 batch_time=0.47750
Train Epoch: 1 [23/250 2944/32000 (9%)] Loss: 27.57227 (QuantReg: 22.68104) QuantErr: 22.68104 batch_time=0.48669
Train Epoch: 1 [34/250 4352/32000 (14%)] Loss: 22.35240 (QuantReg: 22.67958) QuantErr: 22.67958 batch_time=0.50392
Train Epoch: 1 [45/250 5760/32000 (18%)] Loss: 22.80320 (QuantReg: 22.66862) QuantErr: 22.66862 batch_time=0.44018
Train Epoch: 1 [56/250 7168/32000 (22%)] Loss: 20.06005 (QuantReg: 22.71804) QuantErr: 22.71804 batch_time=0.44240
Train Epoch: 1 [67/250 8576/32000 (27%)] Loss: 20.04149 (QuantReg: 22.64183) QuantErr: 22.64183 batch_time=0.44190
Train Epoch: 1 [78/250 9984/32000 (31%)] Loss: 19.45669 (QuantReg: 22.61533) QuantErr: 22.61533 batch_time=0.42179
Train Epoch: 1 [89/250 11392/32000 (36%)] Loss: 18.20087 (QuantReg: 22.63413) QuantErr: 22.63413 batch_time=0.44230
Train Epoch: 1 [100/250 12800/32000 (40%)] Loss: 18.20848 (QuantReg: 22.70575) QuantErr: 22.70575 batch_time=0.46585
Train Epoch: 1 [111/250 14208/32000 (44%)] Loss: 16.83438 (QuantReg: 22.61252) QuantErr: 22.61252 batch_time=0.45029
Train Epoch: 1 [122/250 15616/32000 (49%)] Loss: 18.02362 (QuantReg: 22.60206) QuantErr: 22.60206 batch_time=0.45388
Train Epoch: 1 [133/250 17024/32000 (53%)] Loss: 17.47155 (QuantReg: 22.62625) QuantErr: 22.62625 batch_time=0.49979
Train Epoch: 1 [144/250 18432/32000 (58%)] Loss: 16.99991 (QuantReg: 22.64014) QuantErr: 22.64014 batch_time=0.51420
Train Epoch: 1 [155/250 19840/32000 (62%)] Loss: 14.56367 (QuantReg: 22.62826) QuantErr: 22.62826 batch_time=0.46800
Train Epoch: 1 [166/250 21248/32000 (66%)] Loss: 14.92841 (QuantReg: 22.66840) QuantErr: 22.66840 batch_time=0.47441
Train Epoch: 1 [177/250 22656/32000 (71%)] Loss: 16.36340 (QuantReg: 22.62175) QuantErr: 22.62175 batch_time=0.47866
Train Epoch: 1 [188/250 24064/32000 (75%)] Loss: 14.96557 (QuantReg: 22.61112) QuantErr: 22.61112 batch_time=0.44001
Train Epoch: 1 [199/250 25472/32000 (80%)] Loss: 16.08274 (QuantReg: 22.69087) QuantErr: 22.69087 batch_time=0.48742
Train Epoch: 1 [210/250 26880/32000 (84%)] Loss: 14.83075 (QuantReg: 22.65704) QuantErr: 22.65704 batch_time=0.46384
Train Epoch: 1 [221/250 28288/32000 (88%)] Loss: 13.70708 (QuantReg: 22.60753) QuantErr: 22.60753 batch_time=0.52184
Train Epoch: 1 [232/250 29696/32000 (93%)] Loss: 16.01554 (QuantReg: 22.61163) QuantErr: 22.61163 batch_time=0.44290
Train Epoch: 1 [243/250 31104/32000 (97%)] Loss: 13.38301 (QuantReg: 22.62234) QuantErr: 22.62234 batch_time=0.45087
Train Epoch: 1 codebook_update_time=0.96127
Saving checkpoint: /apdcephfs/share_47076/gimwang/HCQ/exps/HCQ_MSRVTT_1kA_L3/checkpoint-epoch1.pth ...
Done in 3.938s
Updating 'best' checkpoint: /apdcephfs/share_47076/gimwang/HCQ/exps/HCQ_MSRVTT_1kA_L3/checkpoint-epoch1.pth ...
Done in 7.965s
epoch : 1
loss : 18.566399478912352
quant_reg : 22.65883716583252
quant_err : 22.65883716583252
learning_rate : 5e-05
n_samples : 32000
n_steps : 250
MSRVTT_jsfusion_test/t2v_metrics/R1: 9.1
MSRVTT_jsfusion_test/t2v_metrics/R5: 29.8
MSRVTT_jsfusion_test/t2v_metrics/R10: 43.9
MSRVTT_jsfusion_test/t2v_metrics/R50: 79.8
MSRVTT_jsfusion_test/t2v_metrics/MedR: 14.0
MSRVTT_jsfusion_test/t2v_metrics/MeanR: 42.874
MSRVTT_jsfusion_test/t2v_metrics/geometric_mean_R1-R5-R10: 22.833582654259274
MSRVTT_jsfusion_test/v2t_metrics/R1: 8.3
MSRVTT_jsfusion_test/v2t_metrics/R5: 32.5
MSRVTT_jsfusion_test/v2t_metrics/R10: 46.1
MSRVTT_jsfusion_test/v2t_metrics/R50: 79.0
MSRVTT_jsfusion_test/v2t_metrics/MedR: 12.0
MSRVTT_jsfusion_test/v2t_metrics/MeanR: 41.999
MSRVTT_jsfusion_test/v2t_metrics/geometric_mean_R1-R5-R10: 23.167942122785814
mnt_best : 22.833582654259274
not_improved_count: 0
Train Epoch: 2 [1/250 128/32000 (0%)] Loss: 13.09873 (QuantReg: 11.60903) QuantErr: 11.60903 batch_time=43.81042
Train Epoch: 2 [12/250 1536/32000 (5%)] Loss: 15.51215 (QuantReg: 11.35355) QuantErr: 11.35355 batch_time=0.47912
Train Epoch: 2 [23/250 2944/32000 (9%)] Loss: 12.63166 (QuantReg: 11.58446) QuantErr: 11.58446 batch_time=0.50988
Train Epoch: 2 [34/250 4352/32000 (14%)] Loss: 14.50120 (QuantReg: 12.04102) QuantErr: 12.04102 batch_time=0.44435
Train Epoch: 2 [45/250 5760/32000 (18%)] Loss: 14.14351 (QuantReg: 11.91783) QuantErr: 11.91783 batch_time=0.44045
Train Epoch: 2 [56/250 7168/32000 (22%)] Loss: 12.55157 (QuantReg: 12.04705) QuantErr: 12.04705 batch_time=0.45983
Train Epoch: 2 [67/250 8576/32000 (27%)] Loss: 13.43394 (QuantReg: 12.15230) QuantErr: 12.15230 batch_time=0.44205
Train Epoch: 2 [78/250 9984/32000 (31%)] Loss: 14.32668 (QuantReg: 12.52283) QuantErr: 12.52283 batch_time=0.47107
Train Epoch: 2 [89/250 11392/32000 (36%)] Loss: 14.74108 (QuantReg: 12.83707) QuantErr: 12.83707 batch_time=0.46597
Train Epoch: 2 [100/250 12800/32000 (40%)] Loss: 12.58172 (QuantReg: 12.45203) QuantErr: 12.45203 batch_time=0.44083
Train Epoch: 2 [111/250 14208/32000 (44%)] Loss: 12.58557 (QuantReg: 12.70559) QuantErr: 12.70559 batch_time=0.43757
Train Epoch: 2 [122/250 15616/32000 (49%)] Loss: 11.79485 (QuantReg: 12.94580) QuantErr: 12.94580 batch_time=0.44329
Train Epoch: 2 [133/250 17024/32000 (53%)] Loss: 13.72666 (QuantReg: 13.27091) QuantErr: 13.27091 batch_time=0.46659
Train Epoch: 2 [144/250 18432/32000 (58%)] Loss: 12.90316 (QuantReg: 13.59238) QuantErr: 13.59238 batch_time=0.43576
Train Epoch: 2 [155/250 19840/32000 (62%)] Loss: 12.32447 (QuantReg: 13.69081) QuantErr: 13.69081 batch_time=0.46618
Train Epoch: 2 [166/250 21248/32000 (66%)] Loss: 12.22063 (QuantReg: 13.48847) QuantErr: 13.48847 batch_time=0.43952
Train Epoch: 2 [177/250 22656/32000 (71%)] Loss: 11.55470 (QuantReg: 13.97384) QuantErr: 13.97384 batch_time=0.43978
Train Epoch: 2 [188/250 24064/32000 (75%)] Loss: 12.13988 (QuantReg: 13.74501) QuantErr: 13.74501 batch_time=0.43637
Train Epoch: 2 [199/250 25472/32000 (80%)] Loss: 10.37850 (QuantReg: 13.91778) QuantErr: 13.91778 batch_time=0.44506
Train Epoch: 2 [210/250 26880/32000 (84%)] Loss: 11.04460 (QuantReg: 14.27949) QuantErr: 14.27949 batch_time=0.43723
Train Epoch: 2 [221/250 28288/32000 (88%)] Loss: 10.95867 (QuantReg: 14.31577) QuantErr: 14.31577 batch_time=0.47403
Train Epoch: 2 [232/250 29696/32000 (93%)] Loss: 11.72800 (QuantReg: 13.98548) QuantErr: 13.98548 batch_time=0.43485
Train Epoch: 2 [243/250 31104/32000 (97%)] Loss: 11.20017 (QuantReg: 14.23133) QuantErr: 14.23133 batch_time=0.49009
Train Epoch: 2 codebook_update_time=0.90039
Saving checkpoint: /apdcephfs/share_47076/gimwang/HCQ/exps/HCQ_MSRVTT_1kA_L3/checkpoint-epoch2.pth ...
Done in 5.520s
Updating 'best' checkpoint: /apdcephfs/share_47076/gimwang/HCQ/exps/HCQ_MSRVTT_1kA_L3/checkpoint-epoch2.pth ...
Done in 9.492s
removing stale ckpt [epoch 1] [took 0.00s]
removing stale ckpt [epoch 0] [took 0.01s]
epoch : 2
loss : 12.817329372406006
quant_reg : 13.027298622131347
quant_err : 13.027298622131347
learning_rate : 4.75e-05
n_samples : 64000
n_steps : 500
MSRVTT_jsfusion_test/t2v_metrics/R1: 12.1
MSRVTT_jsfusion_test/t2v_metrics/R5: 38.5
MSRVTT_jsfusion_test/t2v_metrics/R10: 52.9
MSRVTT_jsfusion_test/t2v_metrics/R50: 83.5
MSRVTT_jsfusion_test/t2v_metrics/MedR: 9.0
MSRVTT_jsfusion_test/t2v_metrics/MeanR: 34.301
MSRVTT_jsfusion_test/t2v_metrics/geometric_mean_R1-R5-R10: 29.100509353672667
MSRVTT_jsfusion_test/v2t_metrics/R1: 14.3
MSRVTT_jsfusion_test/v2t_metrics/R5: 41.3
MSRVTT_jsfusion_test/v2t_metrics/R10: 55.1
MSRVTT_jsfusion_test/v2t_metrics/R50: 84.5
MSRVTT_jsfusion_test/v2t_metrics/MedR: 9.0
MSRVTT_jsfusion_test/v2t_metrics/MeanR: 32.465
MSRVTT_jsfusion_test/v2t_metrics/geometric_mean_R1-R5-R10: 31.92610193844254
mnt_best : 29.100509353672667
not_improved_count: 0
Train Epoch: 3 [1/250 128/32000 (0%)] Loss: 12.06062 (QuantReg: 11.44259) QuantErr: 11.44259 batch_time=36.14632
Train Epoch: 3 [12/250 1536/32000 (5%)] Loss: 11.35840 (QuantReg: 11.87926) QuantErr: 11.87926 batch_time=0.43156
Train Epoch: 3 [23/250 2944/32000 (9%)] Loss: 11.20785 (QuantReg: 11.40909) QuantErr: 11.40909 batch_time=0.44268
Train Epoch: 3 [34/250 4352/32000 (14%)] Loss: 10.70342 (QuantReg: 11.83622) QuantErr: 11.83622 batch_time=0.44180
Train Epoch: 3 [45/250 5760/32000 (18%)] Loss: 9.90324 (QuantReg: 11.61013) QuantErr: 11.61013 batch_time=0.44952
Train Epoch: 3 [56/250 7168/32000 (22%)] Loss: 11.39174 (QuantReg: 11.54603) QuantErr: 11.54603 batch_time=0.43477
Train Epoch: 3 [67/250 8576/32000 (27%)] Loss: 11.61854 (QuantReg: 11.97667) QuantErr: 11.97667 batch_time=0.43393
Train Epoch: 3 [78/250 9984/32000 (31%)] Loss: 11.77129 (QuantReg: 11.93829) QuantErr: 11.93829 batch_time=0.47715
Train Epoch: 3 [89/250 11392/32000 (36%)] Loss: 11.75396 (QuantReg: 11.72993) QuantErr: 11.72993 batch_time=0.71545
Train Epoch: 3 [100/250 12800/32000 (40%)] Loss: 12.87441 (QuantReg: 11.95269) QuantErr: 11.95269 batch_time=0.50232
Train Epoch: 3 [111/250 14208/32000 (44%)] Loss: 11.58214 (QuantReg: 12.23561) QuantErr: 12.23561 batch_time=0.42897
Train Epoch: 3 [122/250 15616/32000 (49%)] Loss: 10.79363 (QuantReg: 12.13221) QuantErr: 12.13221 batch_time=0.45346
Train Epoch: 3 [133/250 17024/32000 (53%)] Loss: 9.88701 (QuantReg: 12.15233) QuantErr: 12.15233 batch_time=0.47769
Train Epoch: 3 [144/250 18432/32000 (58%)] Loss: 10.74149 (QuantReg: 12.20969) QuantErr: 12.20969 batch_time=0.42566
Train Epoch: 3 [155/250 19840/32000 (62%)] Loss: 10.05471 (QuantReg: 12.21230) QuantErr: 12.21230 batch_time=0.48662
Train Epoch: 3 [166/250 21248/32000 (66%)] Loss: 10.49986 (QuantReg: 12.38680) QuantErr: 12.38680 batch_time=0.45322
Train Epoch: 3 [177/250 22656/32000 (71%)] Loss: 10.00033 (QuantReg: 12.56349) QuantErr: 12.56349 batch_time=0.45063
Train Epoch: 3 [188/250 24064/32000 (75%)] Loss: 10.01628 (QuantReg: 12.51929) QuantErr: 12.51929 batch_time=0.45115
Train Epoch: 3 [199/250 25472/32000 (80%)] Loss: 11.65317 (QuantReg: 12.62611) QuantErr: 12.62611 batch_time=0.44521
Train Epoch: 3 [210/250 26880/32000 (84%)] Loss: 10.26978 (QuantReg: 12.68830) QuantErr: 12.68830 batch_time=0.43390
Train Epoch: 3 [221/250 28288/32000 (88%)] Loss: 10.70022 (QuantReg: 12.95265) QuantErr: 12.95265 batch_time=0.43313
Train Epoch: 3 [232/250 29696/32000 (93%)] Loss: 10.61561 (QuantReg: 12.90516) QuantErr: 12.90516 batch_time=0.44076
Train Epoch: 3 [243/250 31104/32000 (97%)] Loss: 9.50581 (QuantReg: 13.35578) QuantErr: 13.35578 batch_time=0.44107
Train Epoch: 3 codebook_update_time=1.00870
Saving checkpoint: /apdcephfs/share_47076/gimwang/HCQ/exps/HCQ_MSRVTT_1kA_L3/checkpoint-epoch3.pth ...
Done in 4.172s
Updating 'best' checkpoint: /apdcephfs/share_47076/gimwang/HCQ/exps/HCQ_MSRVTT_1kA_L3/checkpoint-epoch3.pth ...
Done in 8.340s
removing stale ckpt [epoch 2] [took 0.02s]
epoch : 3
loss : 10.952469360351563
quant_reg : 12.190602237701416
quant_err : 12.190602237701416
learning_rate : 4.5125e-05
n_samples : 96000
n_steps : 750
MSRVTT_jsfusion_test/t2v_metrics/R1: 14.6
MSRVTT_jsfusion_test/t2v_metrics/R5: 40.3
MSRVTT_jsfusion_test/t2v_metrics/R10: 55.6
MSRVTT_jsfusion_test/t2v_metrics/R50: 85.6
MSRVTT_jsfusion_test/t2v_metrics/MedR: 8.0
MSRVTT_jsfusion_test/t2v_metrics/MeanR: 32.209
MSRVTT_jsfusion_test/t2v_metrics/geometric_mean_R1-R5-R10: 31.98238874689571
MSRVTT_jsfusion_test/v2t_metrics/R1: 15.1
MSRVTT_jsfusion_test/v2t_metrics/R5: 40.6
MSRVTT_jsfusion_test/v2t_metrics/R10: 55.6
MSRVTT_jsfusion_test/v2t_metrics/R50: 85.5
MSRVTT_jsfusion_test/v2t_metrics/MedR: 8.0
MSRVTT_jsfusion_test/v2t_metrics/MeanR: 31.149
MSRVTT_jsfusion_test/v2t_metrics/geometric_mean_R1-R5-R10: 32.42345251140843
mnt_best : 31.98238874689571
not_improved_count: 0
Train Epoch: 4 [1/250 128/32000 (0%)] Loss: 11.44930 (QuantReg: 11.38882) QuantErr: 11.38882 batch_time=36.61996
Train Epoch: 4 [12/250 1536/32000 (5%)] Loss: 9.13384 (QuantReg: 11.57896) QuantErr: 11.57896 batch_time=0.59030
Train Epoch: 4 [23/250 2944/32000 (9%)] Loss: 8.73876 (QuantReg: 11.67418) QuantErr: 11.67418 batch_time=0.50335
Train Epoch: 4 [34/250 4352/32000 (14%)] Loss: 10.53733 (QuantReg: 11.62023) QuantErr: 11.62023 batch_time=0.43096
Train Epoch: 4 [45/250 5760/32000 (18%)] Loss: 9.55077 (QuantReg: 12.09265) QuantErr: 12.09265 batch_time=0.46845
Train Epoch: 4 [56/250 7168/32000 (22%)] Loss: 10.64332 (QuantReg: 11.79316) QuantErr: 11.79316 batch_time=0.47998
Train Epoch: 4 [67/250 8576/32000 (27%)] Loss: 10.39478 (QuantReg: 11.83850) QuantErr: 11.83850 batch_time=1.70828
Train Epoch: 4 [78/250 9984/32000 (31%)] Loss: 9.44104 (QuantReg: 12.43796) QuantErr: 12.43796 batch_time=1.58867
Train Epoch: 4 [89/250 11392/32000 (36%)] Loss: 9.09528 (QuantReg: 11.98465) QuantErr: 11.98465 batch_time=0.43600
Train Epoch: 4 [100/250 12800/32000 (40%)] Loss: 10.31756 (QuantReg: 12.25072) QuantErr: 12.25072 batch_time=0.43433
Train Epoch: 4 [111/250 14208/32000 (44%)] Loss: 9.69282 (QuantReg: 12.42768) QuantErr: 12.42768 batch_time=0.42162
Train Epoch: 4 [122/250 15616/32000 (49%)] Loss: 10.72278 (QuantReg: 12.06815) QuantErr: 12.06815 batch_time=0.42961
Train Epoch: 4 [133/250 17024/32000 (53%)] Loss: 9.45379 (QuantReg: 12.05396) QuantErr: 12.05396 batch_time=0.42276
Train Epoch: 4 [144/250 18432/32000 (58%)] Loss: 9.40115 (QuantReg: 12.42507) QuantErr: 12.42507 batch_time=0.47592
Train Epoch: 4 [155/250 19840/32000 (62%)] Loss: 9.01040 (QuantReg: 12.34481) QuantErr: 12.34481 batch_time=0.44149
Train Epoch: 4 [166/250 21248/32000 (66%)] Loss: 10.54159 (QuantReg: 12.57438) QuantErr: 12.57438 batch_time=0.50513
Train Epoch: 4 [177/250 22656/32000 (71%)] Loss: 10.11916 (QuantReg: 12.55103) QuantErr: 12.55103 batch_time=0.45036
Train Epoch: 4 [188/250 24064/32000 (75%)] Loss: 9.39581 (QuantReg: 12.58825) QuantErr: 12.58825 batch_time=0.46829
Train Epoch: 4 [199/250 25472/32000 (80%)] Loss: 9.58508 (QuantReg: 12.25649) QuantErr: 12.25649 batch_time=0.45379
Train Epoch: 4 [210/250 26880/32000 (84%)] Loss: 8.32013 (QuantReg: 12.66826) QuantErr: 12.66826 batch_time=0.46081
Train Epoch: 4 [221/250 28288/32000 (88%)] Loss: 9.18411 (QuantReg: 12.66631) QuantErr: 12.66631 batch_time=0.42764
Train Epoch: 4 [232/250 29696/32000 (93%)] Loss: 9.69078 (QuantReg: 12.48672) QuantErr: 12.48672 batch_time=0.45342
Train Epoch: 4 [243/250 31104/32000 (97%)] Loss: 8.83247 (QuantReg: 12.45794) QuantErr: 12.45794 batch_time=0.44147
Train Epoch: 4 codebook_update_time=0.82298
Saving checkpoint: /apdcephfs/share_47076/gimwang/HCQ/exps/HCQ_MSRVTT_1kA_L3/checkpoint-epoch4.pth ...
Done in 3.946s
Updating 'best' checkpoint: /apdcephfs/share_47076/gimwang/HCQ/exps/HCQ_MSRVTT_1kA_L3/checkpoint-epoch4.pth ...
Done in 7.875s
removing stale ckpt [epoch 3] [took 0.01s]
epoch : 4
loss : 9.788853378295899
quant_reg : 12.217399677276612
quant_err : 12.217399677276612
learning_rate : 4.2868749999999995e-05
n_samples : 128000
n_steps : 1000
MSRVTT_jsfusion_test/t2v_metrics/R1: 17.6
MSRVTT_jsfusion_test/t2v_metrics/R5: 43.5
MSRVTT_jsfusion_test/t2v_metrics/R10: 57.0
MSRVTT_jsfusion_test/t2v_metrics/R50: 86.7
MSRVTT_jsfusion_test/t2v_metrics/MedR: 7.0
MSRVTT_jsfusion_test/t2v_metrics/MeanR: 30.718
MSRVTT_jsfusion_test/t2v_metrics/geometric_mean_R1-R5-R10: 35.20672220101715
MSRVTT_jsfusion_test/v2t_metrics/R1: 16.6
MSRVTT_jsfusion_test/v2t_metrics/R5: 45.0
MSRVTT_jsfusion_test/v2t_metrics/R10: 58.5
MSRVTT_jsfusion_test/v2t_metrics/R50: 86.4
MSRVTT_jsfusion_test/v2t_metrics/MedR: 7.0
MSRVTT_jsfusion_test/v2t_metrics/MeanR: 30.19
MSRVTT_jsfusion_test/v2t_metrics/geometric_mean_R1-R5-R10: 35.22293078008761
mnt_best : 35.20672220101715
not_improved_count: 0
Train Epoch: 5 [1/250 128/32000 (0%)] Loss: 9.95337 (QuantReg: 11.86706) QuantErr: 11.86706 batch_time=40.16910
Train Epoch: 5 [12/250 1536/32000 (5%)] Loss: 8.18843 (QuantReg: 12.05030) QuantErr: 12.05030 batch_time=0.45412
Train Epoch: 5 [23/250 2944/32000 (9%)] Loss: 9.04331 (QuantReg: 12.05496) QuantErr: 12.05496 batch_time=0.42985
Train Epoch: 5 [34/250 4352/32000 (14%)] Loss: 8.23559 (QuantReg: 12.04685) QuantErr: 12.04685 batch_time=0.45999
Train Epoch: 5 [45/250 5760/32000 (18%)] Loss: 8.97485 (QuantReg: 11.71347) QuantErr: 11.71347 batch_time=0.93398
Train Epoch: 5 [56/250 7168/32000 (22%)] Loss: 10.30818 (QuantReg: 12.25002) QuantErr: 12.25002 batch_time=0.45193
Train Epoch: 5 [67/250 8576/32000 (27%)] Loss: 9.74421 (QuantReg: 12.19296) QuantErr: 12.19296 batch_time=0.44050
Train Epoch: 5 [78/250 9984/32000 (31%)] Loss: 9.53052 (QuantReg: 12.16427) QuantErr: 12.16427 batch_time=0.43035
Train Epoch: 5 [89/250 11392/32000 (36%)] Loss: 8.54571 (QuantReg: 12.35797) QuantErr: 12.35797 batch_time=0.43896
Train Epoch: 5 [100/250 12800/32000 (40%)] Loss: 9.46775 (QuantReg: 12.15966) QuantErr: 12.15966 batch_time=0.44260
Train Epoch: 5 [111/250 14208/32000 (44%)] Loss: 9.68130 (QuantReg: 12.23354) QuantErr: 12.23354 batch_time=0.51181
Train Epoch: 5 [122/250 15616/32000 (49%)] Loss: 7.87275 (QuantReg: 12.26372) QuantErr: 12.26372 batch_time=0.46536
Train Epoch: 5 [133/250 17024/32000 (53%)] Loss: 9.09422 (QuantReg: 12.15833) QuantErr: 12.15833 batch_time=0.45199
Train Epoch: 5 [144/250 18432/32000 (58%)] Loss: 8.68091 (QuantReg: 12.05550) QuantErr: 12.05550 batch_time=0.43949
Train Epoch: 5 [155/250 19840/32000 (62%)] Loss: 9.68669 (QuantReg: 12.46503) QuantErr: 12.46503 batch_time=0.43068
Train Epoch: 5 [166/250 21248/32000 (66%)] Loss: 7.52363 (QuantReg: 12.60557) QuantErr: 12.60557 batch_time=0.45826
Train Epoch: 5 [177/250 22656/32000 (71%)] Loss: 8.79836 (QuantReg: 12.54465) QuantErr: 12.54465 batch_time=0.47795
Train Epoch: 5 [188/250 24064/32000 (75%)] Loss: 9.29198 (QuantReg: 12.43630) QuantErr: 12.43630 batch_time=0.69366
Train Epoch: 5 [199/250 25472/32000 (80%)] Loss: 6.91105 (QuantReg: 12.69849) QuantErr: 12.69849 batch_time=0.44000
Train Epoch: 5 [210/250 26880/32000 (84%)] Loss: 7.88353 (QuantReg: 12.92059) QuantErr: 12.92059 batch_time=0.44932
Train Epoch: 5 [221/250 28288/32000 (88%)] Loss: 9.58816 (QuantReg: 12.76789) QuantErr: 12.76789 batch_time=0.46837
Train Epoch: 5 [232/250 29696/32000 (93%)] Loss: 8.46835 (QuantReg: 12.67284) QuantErr: 12.67284 batch_time=0.46884
Train Epoch: 5 [243/250 31104/32000 (97%)] Loss: 7.85264 (QuantReg: 12.66855) QuantErr: 12.66855 batch_time=0.46251
Train Epoch: 5 codebook_update_time=0.86855
Saving checkpoint: /apdcephfs/share_47076/gimwang/HCQ/exps/HCQ_MSRVTT_1kA_L3/checkpoint-epoch5.pth ...
Done in 10.093s
Updating 'best' checkpoint: /apdcephfs/share_47076/gimwang/HCQ/exps/HCQ_MSRVTT_1kA_L3/checkpoint-epoch5.pth ...
Done in 13.860s
removing stale ckpt [epoch 4] [took 0.02s]
epoch : 5
loss : 9.039031126022339
quant_reg : 12.347215183258056
quant_err : 12.347215183258056
learning_rate : 4.072531249999999e-05
n_samples : 160000
n_steps : 1250
MSRVTT_jsfusion_test/t2v_metrics/R1: 18.8
MSRVTT_jsfusion_test/t2v_metrics/R5: 45.9
MSRVTT_jsfusion_test/t2v_metrics/R10: 59.2
MSRVTT_jsfusion_test/t2v_metrics/R50: 87.3
MSRVTT_jsfusion_test/t2v_metrics/MedR: 7.0
MSRVTT_jsfusion_test/t2v_metrics/MeanR: 29.106
MSRVTT_jsfusion_test/t2v_metrics/geometric_mean_R1-R5-R10: 37.1048557178105
MSRVTT_jsfusion_test/v2t_metrics/R1: 19.5
MSRVTT_jsfusion_test/v2t_metrics/R5: 45.5
MSRVTT_jsfusion_test/v2t_metrics/R10: 60.3
MSRVTT_jsfusion_test/v2t_metrics/R50: 88.0
MSRVTT_jsfusion_test/v2t_metrics/MedR: 7.0
MSRVTT_jsfusion_test/v2t_metrics/MeanR: 29.233
MSRVTT_jsfusion_test/v2t_metrics/geometric_mean_R1-R5-R10: 37.680886069876955
mnt_best : 37.1048557178105
not_improved_count: 0
Train Epoch: 6 [1/250 128/32000 (0%)] Loss: 8.83435 (QuantReg: 12.51050) QuantErr: 12.51050 batch_time=35.33500
Train Epoch: 6 [12/250 1536/32000 (5%)] Loss: 9.53157 (QuantReg: 12.14843) QuantErr: 12.14843 batch_time=0.43711
Train Epoch: 6 [23/250 2944/32000 (9%)] Loss: 8.24592 (QuantReg: 12.12432) QuantErr: 12.12432 batch_time=0.44806
Train Epoch: 6 [34/250 4352/32000 (14%)] Loss: 9.15562 (QuantReg: 12.56200) QuantErr: 12.56200 batch_time=0.44051
Train Epoch: 6 [45/250 5760/32000 (18%)] Loss: 9.85272 (QuantReg: 12.28611) QuantErr: 12.28611 batch_time=0.44343
Train Epoch: 6 [56/250 7168/32000 (22%)] Loss: 8.07504 (QuantReg: 12.07554) QuantErr: 12.07554 batch_time=0.44370
Train Epoch: 6 [67/250 8576/32000 (27%)] Loss: 8.52167 (QuantReg: 12.11064) QuantErr: 12.11064 batch_time=0.88821
Train Epoch: 6 [78/250 9984/32000 (31%)] Loss: 8.06989 (QuantReg: 12.21470) QuantErr: 12.21470 batch_time=0.42587
Train Epoch: 6 [89/250 11392/32000 (36%)] Loss: 8.12128 (QuantReg: 12.52693) QuantErr: 12.52693 batch_time=0.43289
Train Epoch: 6 [100/250 12800/32000 (40%)] Loss: 9.09133 (QuantReg: 12.24650) QuantErr: 12.24650 batch_time=0.47888
Train Epoch: 6 [111/250 14208/32000 (44%)] Loss: 10.11119 (QuantReg: 12.23168) QuantErr: 12.23168 batch_time=0.42686
Train Epoch: 6 [122/250 15616/32000 (49%)] Loss: 8.40886 (QuantReg: 12.32557) QuantErr: 12.32557 batch_time=0.43042
Train Epoch: 6 [133/250 17024/32000 (53%)] Loss: 8.54484 (QuantReg: 12.52634) QuantErr: 12.52634 batch_time=0.52372
Train Epoch: 6 [144/250 18432/32000 (58%)] Loss: 8.03995 (QuantReg: 12.49967) QuantErr: 12.49967 batch_time=0.77183
Train Epoch: 6 [155/250 19840/32000 (62%)] Loss: 9.15015 (QuantReg: 12.77551) QuantErr: 12.77551 batch_time=0.45468
Train Epoch: 6 [166/250 21248/32000 (66%)] Loss: 7.97784 (QuantReg: 12.41195) QuantErr: 12.41195 batch_time=0.50605
Train Epoch: 6 [177/250 22656/32000 (71%)] Loss: 7.97878 (QuantReg: 12.42679) QuantErr: 12.42679 batch_time=0.45989
Train Epoch: 6 [188/250 24064/32000 (75%)] Loss: 7.75268 (QuantReg: 12.79352) QuantErr: 12.79352 batch_time=0.43626
Train Epoch: 6 [199/250 25472/32000 (80%)] Loss: 7.63631 (QuantReg: 12.48400) QuantErr: 12.48400 batch_time=4.28945
Train Epoch: 6 [210/250 26880/32000 (84%)] Loss: 6.20903 (QuantReg: 12.61933) QuantErr: 12.61933 batch_time=0.43162
Train Epoch: 6 [221/250 28288/32000 (88%)] Loss: 8.97249 (QuantReg: 12.38035) QuantErr: 12.38035 batch_time=0.49475
Train Epoch: 6 [232/250 29696/32000 (93%)] Loss: 8.70732 (QuantReg: 12.82257) QuantErr: 12.82257 batch_time=0.45461
Train Epoch: 6 [243/250 31104/32000 (97%)] Loss: 7.44357 (QuantReg: 12.77844) QuantErr: 12.77844 batch_time=0.43223
Train Epoch: 6 codebook_update_time=0.97663
Saving checkpoint: /apdcephfs/share_47076/gimwang/HCQ/exps/HCQ_MSRVTT_1kA_L3/checkpoint-epoch6.pth ...
Done in 4.140s
removing stale ckpt [epoch 5] [took 0.01s]
epoch : 6
loss : 8.340163831710816
quant_reg : 12.416304889678955
quant_err : 12.416304889678955
learning_rate : 3.868904687499999e-05
n_samples : 192000
n_steps : 1500
MSRVTT_jsfusion_test/t2v_metrics/R1: 17.0
MSRVTT_jsfusion_test/t2v_metrics/R5: 46.2
MSRVTT_jsfusion_test/t2v_metrics/R10: 59.2
MSRVTT_jsfusion_test/t2v_metrics/R50: 87.8
MSRVTT_jsfusion_test/t2v_metrics/MedR: 6.0
MSRVTT_jsfusion_test/t2v_metrics/MeanR: 29.385
MSRVTT_jsfusion_test/t2v_metrics/geometric_mean_R1-R5-R10: 35.95871811146236
MSRVTT_jsfusion_test/v2t_metrics/R1: 19.3
MSRVTT_jsfusion_test/v2t_metrics/R5: 47.3
MSRVTT_jsfusion_test/v2t_metrics/R10: 62.2
MSRVTT_jsfusion_test/v2t_metrics/R50: 87.3
MSRVTT_jsfusion_test/v2t_metrics/MedR: 6.0
MSRVTT_jsfusion_test/v2t_metrics/MeanR: 27.549
MSRVTT_jsfusion_test/v2t_metrics/geometric_mean_R1-R5-R10: 38.435831265161866
mnt_best : 37.1048557178105
not_improved_count: 1
Train Epoch: 7 [1/250 128/32000 (0%)] Loss: 8.78065 (QuantReg: 12.00459) QuantErr: 12.00459 batch_time=34.03501
Train Epoch: 7 [12/250 1536/32000 (5%)] Loss: 7.42361 (QuantReg: 12.40001) QuantErr: 12.40001 batch_time=0.44406
Train Epoch: 7 [23/250 2944/32000 (9%)] Loss: 8.84435 (QuantReg: 12.15129) QuantErr: 12.15129 batch_time=0.42647
Train Epoch: 7 [34/250 4352/32000 (14%)] Loss: 7.13295 (QuantReg: 11.80920) QuantErr: 11.80920 batch_time=0.44960
Train Epoch: 7 [45/250 5760/32000 (18%)] Loss: 7.43465 (QuantReg: 12.24323) QuantErr: 12.24323 batch_time=0.43916
Train Epoch: 7 [56/250 7168/32000 (22%)] Loss: 7.02870 (QuantReg: 12.04548) QuantErr: 12.04548 batch_time=0.44780
Train Epoch: 7 [67/250 8576/32000 (27%)] Loss: 6.44055 (QuantReg: 12.38646) QuantErr: 12.38646 batch_time=4.95446
Train Epoch: 7 [78/250 9984/32000 (31%)] Loss: 8.59543 (QuantReg: 12.65621) QuantErr: 12.65621 batch_time=0.47060
Train Epoch: 7 [89/250 11392/32000 (36%)] Loss: 8.49204 (QuantReg: 12.41478) QuantErr: 12.41478 batch_time=0.45441
Train Epoch: 7 [100/250 12800/32000 (40%)] Loss: 7.85544 (QuantReg: 12.51474) QuantErr: 12.51474 batch_time=0.42850
Train Epoch: 7 [111/250 14208/32000 (44%)] Loss: 6.46204 (QuantReg: 12.48075) QuantErr: 12.48075 batch_time=0.43950
Train Epoch: 7 [122/250 15616/32000 (49%)] Loss: 6.15616 (QuantReg: 12.51491) QuantErr: 12.51491 batch_time=0.49897
Train Epoch: 7 [133/250 17024/32000 (53%)] Loss: 8.33362 (QuantReg: 12.52547) QuantErr: 12.52547 batch_time=0.89148
Train Epoch: 7 [144/250 18432/32000 (58%)] Loss: 7.42147 (QuantReg: 12.14680) QuantErr: 12.14680 batch_time=0.42249
Train Epoch: 7 [155/250 19840/32000 (62%)] Loss: 7.42013 (QuantReg: 12.36652) QuantErr: 12.36652 batch_time=0.42910
Train Epoch: 7 [166/250 21248/32000 (66%)] Loss: 7.94611 (QuantReg: 12.57547) QuantErr: 12.57547 batch_time=0.47176
Train Epoch: 7 [177/250 22656/32000 (71%)] Loss: 8.55206 (QuantReg: 12.53772) QuantErr: 12.53772 batch_time=0.43132
Train Epoch: 7 [188/250 24064/32000 (75%)] Loss: 8.60866 (QuantReg: 12.42687) QuantErr: 12.42687 batch_time=0.44642
Train Epoch: 7 [199/250 25472/32000 (80%)] Loss: 6.21436 (QuantReg: 12.50636) QuantErr: 12.50636 batch_time=0.44353
Train Epoch: 7 [210/250 26880/32000 (84%)] Loss: 7.78000 (QuantReg: 12.87903) QuantErr: 12.87903 batch_time=0.76620
Train Epoch: 7 [221/250 28288/32000 (88%)] Loss: 8.00854 (QuantReg: 12.48201) QuantErr: 12.48201 batch_time=0.97545
Train Epoch: 7 [232/250 29696/32000 (93%)] Loss: 8.21336 (QuantReg: 12.33379) QuantErr: 12.33379 batch_time=0.46321
Train Epoch: 7 [243/250 31104/32000 (97%)] Loss: 7.03958 (QuantReg: 12.65308) QuantErr: 12.65308 batch_time=0.43023
Train Epoch: 7 codebook_update_time=0.81736
Saving checkpoint: /apdcephfs/share_47076/gimwang/HCQ/exps/HCQ_MSRVTT_1kA_L3/checkpoint-epoch7.pth ...
Done in 3.872s
Updating 'best' checkpoint: /apdcephfs/share_47076/gimwang/HCQ/exps/HCQ_MSRVTT_1kA_L3/checkpoint-epoch7.pth ...
Done in 7.986s
removing stale ckpt [epoch 6] [took 0.01s]
epoch : 7
loss : 7.736541116714478
quant_reg : 12.501497966766358
quant_err : 12.501497966766358
learning_rate : 3.675459453124999e-05
n_samples : 224000
n_steps : 1750
MSRVTT_jsfusion_test/t2v_metrics/R1: 18.7
MSRVTT_jsfusion_test/t2v_metrics/R5: 46.9
MSRVTT_jsfusion_test/t2v_metrics/R10: 61.2
MSRVTT_jsfusion_test/t2v_metrics/R50: 88.2
MSRVTT_jsfusion_test/t2v_metrics/MedR: 6.0
MSRVTT_jsfusion_test/t2v_metrics/MeanR: 28.6
MSRVTT_jsfusion_test/t2v_metrics/geometric_mean_R1-R5-R10: 37.72147130402437
MSRVTT_jsfusion_test/v2t_metrics/R1: 19.5
MSRVTT_jsfusion_test/v2t_metrics/R5: 47.1
MSRVTT_jsfusion_test/v2t_metrics/R10: 63.4
MSRVTT_jsfusion_test/v2t_metrics/R50: 88.1
MSRVTT_jsfusion_test/v2t_metrics/MedR: 6.0
MSRVTT_jsfusion_test/v2t_metrics/MeanR: 26.9985
MSRVTT_jsfusion_test/v2t_metrics/geometric_mean_R1-R5-R10: 38.759805773707456
mnt_best : 37.72147130402437
not_improved_count: 0
Train Epoch: 8 [1/250 128/32000 (0%)] Loss: 7.27719 (QuantReg: 12.35727) QuantErr: 12.35727 batch_time=34.96891
Train Epoch: 8 [12/250 1536/32000 (5%)] Loss: 6.74069 (QuantReg: 12.26834) QuantErr: 12.26834 batch_time=0.44041
Train Epoch: 8 [23/250 2944/32000 (9%)] Loss: 7.34260 (QuantReg: 12.56522) QuantErr: 12.56522 batch_time=0.44347
Train Epoch: 8 [34/250 4352/32000 (14%)] Loss: 8.55385 (QuantReg: 12.47588) QuantErr: 12.47588 batch_time=0.45491
Train Epoch: 8 [45/250 5760/32000 (18%)] Loss: 6.86548 (QuantReg: 12.83172) QuantErr: 12.83172 batch_time=0.45916
Train Epoch: 8 [56/250 7168/32000 (22%)] Loss: 6.68177 (QuantReg: 12.61390) QuantErr: 12.61390 batch_time=0.46361
Train Epoch: 8 [67/250 8576/32000 (27%)] Loss: 7.42016 (QuantReg: 11.77381) QuantErr: 11.77381 batch_time=0.45538
Train Epoch: 8 [78/250 9984/32000 (31%)] Loss: 8.04611 (QuantReg: 12.68692) QuantErr: 12.68692 batch_time=0.43336
Train Epoch: 8 [89/250 11392/32000 (36%)] Loss: 7.68259 (QuantReg: 12.69226) QuantErr: 12.69226 batch_time=0.44769
Train Epoch: 8 [100/250 12800/32000 (40%)] Loss: 7.29241 (QuantReg: 12.87856) QuantErr: 12.87856 batch_time=0.52180
Train Epoch: 8 [111/250 14208/32000 (44%)] Loss: 7.66222 (QuantReg: 12.32039) QuantErr: 12.32039 batch_time=0.44877
Train Epoch: 8 [122/250 15616/32000 (49%)] Loss: 6.03377 (QuantReg: 12.54185) QuantErr: 12.54185 batch_time=0.43507
Train Epoch: 8 [133/250 17024/32000 (53%)] Loss: 8.40032 (QuantReg: 12.60897) QuantErr: 12.60897 batch_time=0.67655
Train Epoch: 8 [144/250 18432/32000 (58%)] Loss: 6.50713 (QuantReg: 12.78468) QuantErr: 12.78468 batch_time=0.44049
Train Epoch: 8 [155/250 19840/32000 (62%)] Loss: 7.83658 (QuantReg: 12.71741) QuantErr: 12.71741 batch_time=0.42564
Train Epoch: 8 [166/250 21248/32000 (66%)] Loss: 6.73093 (QuantReg: 13.00007) QuantErr: 13.00007 batch_time=0.43121
Train Epoch: 8 [177/250 22656/32000 (71%)] Loss: 7.12723 (QuantReg: 12.64676) QuantErr: 12.64676 batch_time=0.43871
Train Epoch: 8 [188/250 24064/32000 (75%)] Loss: 7.94445 (QuantReg: 12.19280) QuantErr: 12.19280 batch_time=0.46410
Train Epoch: 8 [199/250 25472/32000 (80%)] Loss: 6.88029 (QuantReg: 12.54877) QuantErr: 12.54877 batch_time=0.62138
Train Epoch: 8 [210/250 26880/32000 (84%)] Loss: 7.35685 (QuantReg: 12.59092) QuantErr: 12.59092 batch_time=0.48055
Train Epoch: 8 [221/250 28288/32000 (88%)] Loss: 7.47722 (QuantReg: 12.92703) QuantErr: 12.92703 batch_time=0.43439
Train Epoch: 8 [232/250 29696/32000 (93%)] Loss: 7.34276 (QuantReg: 12.58528) QuantErr: 12.58528 batch_time=0.44517
Train Epoch: 8 [243/250 31104/32000 (97%)] Loss: 5.78367 (QuantReg: 12.94667) QuantErr: 12.94667 batch_time=0.44323
Train Epoch: 8 codebook_update_time=0.84333
Saving checkpoint: /apdcephfs/share_47076/gimwang/HCQ/exps/HCQ_MSRVTT_1kA_L3/checkpoint-epoch8.pth ...
Done in 4.268s
Updating 'best' checkpoint: /apdcephfs/share_47076/gimwang/HCQ/exps/HCQ_MSRVTT_1kA_L3/checkpoint-epoch8.pth ...
Done in 8.575s
removing stale ckpt [epoch 7] [took 0.01s]
epoch : 8
loss : 7.26406202507019
quant_reg : 12.601601432800292
quant_err : 12.601601432800292
learning_rate : 3.4916864804687486e-05
n_samples : 256000
n_steps : 2000
MSRVTT_jsfusion_test/t2v_metrics/R1: 19.5
MSRVTT_jsfusion_test/t2v_metrics/R5: 49.2
MSRVTT_jsfusion_test/t2v_metrics/R10: 62.6
MSRVTT_jsfusion_test/t2v_metrics/R50: 88.5
MSRVTT_jsfusion_test/t2v_metrics/MedR: 6.0
MSRVTT_jsfusion_test/t2v_metrics/MeanR: 27.785
MSRVTT_jsfusion_test/t2v_metrics/geometric_mean_R1-R5-R10: 39.16138255759405
MSRVTT_jsfusion_test/v2t_metrics/R1: 19.1
MSRVTT_jsfusion_test/v2t_metrics/R5: 48.5
MSRVTT_jsfusion_test/v2t_metrics/R10: 62.8
MSRVTT_jsfusion_test/v2t_metrics/R50: 88.4
MSRVTT_jsfusion_test/v2t_metrics/MedR: 6.0
MSRVTT_jsfusion_test/v2t_metrics/MeanR: 25.7225
MSRVTT_jsfusion_test/v2t_metrics/geometric_mean_R1-R5-R10: 38.74760970340723
mnt_best : 39.16138255759405
not_improved_count: 0
Train Epoch: 9 [1/250 128/32000 (0%)] Loss: 7.35309 (QuantReg: 12.82009) QuantErr: 12.82009 batch_time=31.51401
Train Epoch: 9 [12/250 1536/32000 (5%)] Loss: 7.22241 (QuantReg: 12.18414) QuantErr: 12.18414 batch_time=0.44373
Train Epoch: 9 [23/250 2944/32000 (9%)] Loss: 7.09137 (QuantReg: 12.37309) QuantErr: 12.37309 batch_time=0.54900
Train Epoch: 9 [34/250 4352/32000 (14%)] Loss: 7.59582 (QuantReg: 12.14407) QuantErr: 12.14407 batch_time=0.46690
Train Epoch: 9 [45/250 5760/32000 (18%)] Loss: 6.31563 (QuantReg: 12.45616) QuantErr: 12.45616 batch_time=0.42971
Train Epoch: 9 [56/250 7168/32000 (22%)] Loss: 7.62519 (QuantReg: 12.14870) QuantErr: 12.14870 batch_time=0.49013
Train Epoch: 9 [67/250 8576/32000 (27%)] Loss: 6.97316 (QuantReg: 12.89897) QuantErr: 12.89897 batch_time=0.75648
Train Epoch: 9 [78/250 9984/32000 (31%)] Loss: 6.29674 (QuantReg: 12.93623) QuantErr: 12.93623 batch_time=0.45657
Train Epoch: 9 [89/250 11392/32000 (36%)] Loss: 6.80941 (QuantReg: 12.38357) QuantErr: 12.38357 batch_time=0.42320
Train Epoch: 9 [100/250 12800/32000 (40%)] Loss: 7.95538 (QuantReg: 12.44637) QuantErr: 12.44637 batch_time=0.48336
Train Epoch: 9 [111/250 14208/32000 (44%)] Loss: 6.56419 (QuantReg: 12.79359) QuantErr: 12.79359 batch_time=0.44776
Train Epoch: 9 [122/250 15616/32000 (49%)] Loss: 9.70020 (QuantReg: 12.75809) QuantErr: 12.75809 batch_time=0.45183
Train Epoch: 9 [133/250 17024/32000 (53%)] Loss: 6.57877 (QuantReg: 12.69367) QuantErr: 12.69367 batch_time=0.55183
Train Epoch: 9 [144/250 18432/32000 (58%)] Loss: 5.66924 (QuantReg: 12.54642) QuantErr: 12.54642 batch_time=0.45070
Train Epoch: 9 [155/250 19840/32000 (62%)] Loss: 5.19848 (QuantReg: 12.61232) QuantErr: 12.61232 batch_time=0.42965
Train Epoch: 9 [166/250 21248/32000 (66%)] Loss: 7.19742 (QuantReg: 12.58417) QuantErr: 12.58417 batch_time=0.82090
Train Epoch: 9 [177/250 22656/32000 (71%)] Loss: 8.62126 (QuantReg: 12.87934) QuantErr: 12.87934 batch_time=0.43905
Train Epoch: 9 [188/250 24064/32000 (75%)] Loss: 7.73235 (QuantReg: 12.24652) QuantErr: 12.24652 batch_time=0.45029
Train Epoch: 9 [199/250 25472/32000 (80%)] Loss: 6.65107 (QuantReg: 12.34350) QuantErr: 12.34350 batch_time=0.42985
Train Epoch: 9 [210/250 26880/32000 (84%)] Loss: 6.16306 (QuantReg: 12.63750) QuantErr: 12.63750 batch_time=0.47856
Train Epoch: 9 [221/250 28288/32000 (88%)] Loss: 5.95574 (QuantReg: 12.90009) QuantErr: 12.90009 batch_time=1.08564
Train Epoch: 9 [232/250 29696/32000 (93%)] Loss: 7.67670 (QuantReg: 12.89914) QuantErr: 12.89914 batch_time=0.46311
Train Epoch: 9 [243/250 31104/32000 (97%)] Loss: 6.86312 (QuantReg: 12.62319) QuantErr: 12.62319 batch_time=0.95585
Train Epoch: 9 codebook_update_time=0.85190
Saving checkpoint: /apdcephfs/share_47076/gimwang/HCQ/exps/HCQ_MSRVTT_1kA_L3/checkpoint-epoch9.pth ...
Done in 14.142s
Updating 'best' checkpoint: /apdcephfs/share_47076/gimwang/HCQ/exps/HCQ_MSRVTT_1kA_L3/checkpoint-epoch9.pth ...
Done in 19.013s
removing stale ckpt [epoch 8] [took 0.62s]
epoch : 9
loss : 6.949498327255249
quant_reg : 12.572703582763673
quant_err : 12.572703582763673
learning_rate : 3.317102156445311e-05
n_samples : 288000
n_steps : 2250
MSRVTT_jsfusion_test/t2v_metrics/R1: 20.0
MSRVTT_jsfusion_test/t2v_metrics/R5: 48.6
MSRVTT_jsfusion_test/t2v_metrics/R10: 62.6
MSRVTT_jsfusion_test/t2v_metrics/R50: 88.1
MSRVTT_jsfusion_test/t2v_metrics/MedR: 6.0
MSRVTT_jsfusion_test/t2v_metrics/MeanR: 27.654
MSRVTT_jsfusion_test/t2v_metrics/geometric_mean_R1-R5-R10: 39.33207567238599
MSRVTT_jsfusion_test/v2t_metrics/R1: 19.4
MSRVTT_jsfusion_test/v2t_metrics/R5: 48.3
MSRVTT_jsfusion_test/v2t_metrics/R10: 62.4
MSRVTT_jsfusion_test/v2t_metrics/R50: 89.1
MSRVTT_jsfusion_test/v2t_metrics/MedR: 6.0
MSRVTT_jsfusion_test/v2t_metrics/MeanR: 25.346
MSRVTT_jsfusion_test/v2t_metrics/geometric_mean_R1-R5-R10: 38.81305402680257
mnt_best : 39.33207567238599
not_improved_count: 0
Train Epoch: 10 [1/250 128/32000 (0%)] Loss: 9.64240 (QuantReg: 12.45832) QuantErr: 12.45832 batch_time=40.79518
Train Epoch: 10 [12/250 1536/32000 (5%)] Loss: 5.86795 (QuantReg: 12.77669) QuantErr: 12.77669 batch_time=0.45428
Train Epoch: 10 [23/250 2944/32000 (9%)] Loss: 6.82776 (QuantReg: 12.42654) QuantErr: 12.42654 batch_time=0.46269
Train Epoch: 10 [34/250 4352/32000 (14%)] Loss: 5.61373 (QuantReg: 12.32398) QuantErr: 12.32398 batch_time=0.47357
Train Epoch: 10 [45/250 5760/32000 (18%)] Loss: 5.63104 (QuantReg: 12.67479) QuantErr: 12.67479 batch_time=0.43540
Train Epoch: 10 [56/250 7168/32000 (22%)] Loss: 7.52662 (QuantReg: 12.26425) QuantErr: 12.26425 batch_time=0.42927
Train Epoch: 10 [67/250 8576/32000 (27%)] Loss: 6.11760 (QuantReg: 12.97516) QuantErr: 12.97516 batch_time=0.49496
Train Epoch: 10 [78/250 9984/32000 (31%)] Loss: 6.23472 (QuantReg: 12.51823) QuantErr: 12.51823 batch_time=0.47157
Train Epoch: 10 [89/250 11392/32000 (36%)] Loss: 6.62832 (QuantReg: 12.35540) QuantErr: 12.35540 batch_time=0.48280
Train Epoch: 10 [100/250 12800/32000 (40%)] Loss: 7.25573 (QuantReg: 12.71241) QuantErr: 12.71241 batch_time=0.42589
Train Epoch: 10 [111/250 14208/32000 (44%)] Loss: 6.14731 (QuantReg: 12.97672) QuantErr: 12.97672 batch_time=0.47894
Train Epoch: 10 [122/250 15616/32000 (49%)] Loss: 7.16526 (QuantReg: 12.60312) QuantErr: 12.60312 batch_time=0.46370
Train Epoch: 10 [133/250 17024/32000 (53%)] Loss: 6.42739 (QuantReg: 12.52941) QuantErr: 12.52941 batch_time=0.44746
Train Epoch: 10 [144/250 18432/32000 (58%)] Loss: 6.26651 (QuantReg: 13.08371) QuantErr: 13.08371 batch_time=0.45840
Train Epoch: 10 [155/250 19840/32000 (62%)] Loss: 7.50382 (QuantReg: 12.66911) QuantErr: 12.66911 batch_time=0.44662
Train Epoch: 10 [166/250 21248/32000 (66%)] Loss: 7.37232 (QuantReg: 12.72648) QuantErr: 12.72648 batch_time=0.46384
Train Epoch: 10 [177/250 22656/32000 (71%)] Loss: 5.36141 (QuantReg: 12.90325) QuantErr: 12.90325 batch_time=0.46907
Train Epoch: 10 [188/250 24064/32000 (75%)] Loss: 6.08001 (QuantReg: 12.94646) QuantErr: 12.94646 batch_time=0.48335
Train Epoch: 10 [199/250 25472/32000 (80%)] Loss: 6.41303 (QuantReg: 13.01624) QuantErr: 13.01624 batch_time=0.43282
Train Epoch: 10 [210/250 26880/32000 (84%)] Loss: 7.05852 (QuantReg: 12.86827) QuantErr: 12.86827 batch_time=0.47923
Train Epoch: 10 [221/250 28288/32000 (88%)] Loss: 6.10226 (QuantReg: 12.98083) QuantErr: 12.98083 batch_time=0.43714
Train Epoch: 10 [232/250 29696/32000 (93%)] Loss: 7.08199 (QuantReg: 12.85072) QuantErr: 12.85072 batch_time=0.89621
Train Epoch: 10 [243/250 31104/32000 (97%)] Loss: 6.42241 (QuantReg: 12.87077) QuantErr: 12.87077 batch_time=0.44793
Train Epoch: 10 codebook_update_time=0.83816
Saving checkpoint: /apdcephfs/share_47076/gimwang/HCQ/exps/HCQ_MSRVTT_1kA_L3/checkpoint-epoch10.pth ...
Done in 4.219s
Updating 'best' checkpoint: /apdcephfs/share_47076/gimwang/HCQ/exps/HCQ_MSRVTT_1kA_L3/checkpoint-epoch10.pth ...
Done in 9.590s
removing stale ckpt [epoch 9] [took 0.06s]
epoch : 10
loss : 6.623721956253052
quant_reg : 12.723752265930175
quant_err : 12.723752265930175
learning_rate : 3.151247048623045e-05
n_samples : 320000
n_steps : 2500
MSRVTT_jsfusion_test/t2v_metrics/R1: 20.1
MSRVTT_jsfusion_test/t2v_metrics/R5: 50.3
MSRVTT_jsfusion_test/t2v_metrics/R10: 63.8
MSRVTT_jsfusion_test/t2v_metrics/R50: 88.9
MSRVTT_jsfusion_test/t2v_metrics/MedR: 5.0
MSRVTT_jsfusion_test/t2v_metrics/MeanR: 27.245
MSRVTT_jsfusion_test/t2v_metrics/geometric_mean_R1-R5-R10: 40.10466630191795
MSRVTT_jsfusion_test/v2t_metrics/R1: 20.4
MSRVTT_jsfusion_test/v2t_metrics/R5: 49.5
MSRVTT_jsfusion_test/v2t_metrics/R10: 63.0
MSRVTT_jsfusion_test/v2t_metrics/R50: 90.4
MSRVTT_jsfusion_test/v2t_metrics/MedR: 6.0
MSRVTT_jsfusion_test/v2t_metrics/MeanR: 25.142
MSRVTT_jsfusion_test/v2t_metrics/geometric_mean_R1-R5-R10: 39.9201323015737
mnt_best : 40.10466630191795
not_improved_count: 0
Train Epoch: 11 [1/250 128/32000 (0%)] Loss: 6.57911 (QuantReg: 12.24247) QuantErr: 12.24247 batch_time=32.13807
Train Epoch: 11 [12/250 1536/32000 (5%)] Loss: 6.06979 (QuantReg: 12.70928) QuantErr: 12.70928 batch_time=0.48406
Train Epoch: 11 [23/250 2944/32000 (9%)] Loss: 6.46562 (QuantReg: 12.60772) QuantErr: 12.60772 batch_time=0.49342
Train Epoch: 11 [34/250 4352/32000 (14%)] Loss: 6.23592 (QuantReg: 13.03246) QuantErr: 13.03246 batch_time=0.47665
Train Epoch: 11 [45/250 5760/32000 (18%)] Loss: 6.16024 (QuantReg: 12.80082) QuantErr: 12.80082 batch_time=0.43772
Train Epoch: 11 [56/250 7168/32000 (22%)] Loss: 6.57143 (QuantReg: 12.63651) QuantErr: 12.63651 batch_time=0.54089
Train Epoch: 11 [67/250 8576/32000 (27%)] Loss: 6.03609 (QuantReg: 12.75341) QuantErr: 12.75341 batch_time=1.06622
Train Epoch: 11 [78/250 9984/32000 (31%)] Loss: 7.07162 (QuantReg: 12.69960) QuantErr: 12.69960 batch_time=0.44240
Train Epoch: 11 [89/250 11392/32000 (36%)] Loss: 6.67005 (QuantReg: 12.50513) QuantErr: 12.50513 batch_time=0.49718
Train Epoch: 11 [100/250 12800/32000 (40%)] Loss: 7.17235 (QuantReg: 12.41867) QuantErr: 12.41867 batch_time=0.47177
Train Epoch: 11 [111/250 14208/32000 (44%)] Loss: 5.48453 (QuantReg: 12.80367) QuantErr: 12.80367 batch_time=0.42875
Train Epoch: 11 [122/250 15616/32000 (49%)] Loss: 7.82363 (QuantReg: 12.71257) QuantErr: 12.71257 batch_time=0.42467
Train Epoch: 11 [133/250 17024/32000 (53%)] Loss: 6.66270 (QuantReg: 12.71639) QuantErr: 12.71639 batch_time=0.66217
Train Epoch: 11 [144/250 18432/32000 (58%)] Loss: 5.68612 (QuantReg: 13.13256) QuantErr: 13.13256 batch_time=0.42994
Train Epoch: 11 [155/250 19840/32000 (62%)] Loss: 5.88441 (QuantReg: 13.12373) QuantErr: 13.12373 batch_time=0.43264
Train Epoch: 11 [166/250 21248/32000 (66%)] Loss: 6.01885 (QuantReg: 12.42695) QuantErr: 12.42695 batch_time=0.45845
Train Epoch: 11 [177/250 22656/32000 (71%)] Loss: 5.80212 (QuantReg: 12.50165) QuantErr: 12.50165 batch_time=0.44121
Train Epoch: 11 [188/250 24064/32000 (75%)] Loss: 7.16878 (QuantReg: 12.66839) QuantErr: 12.66839 batch_time=0.43346
Train Epoch: 11 [199/250 25472/32000 (80%)] Loss: 5.50427 (QuantReg: 12.82311) QuantErr: 12.82311 batch_time=0.46360
Train Epoch: 11 [210/250 26880/32000 (84%)] Loss: 6.59805 (QuantReg: 12.50465) QuantErr: 12.50465 batch_time=0.54318
Train Epoch: 11 [221/250 28288/32000 (88%)] Loss: 6.11138 (QuantReg: 12.91292) QuantErr: 12.91292 batch_time=0.42795
Train Epoch: 11 [232/250 29696/32000 (93%)] Loss: 6.35429 (QuantReg: 12.88437) QuantErr: 12.88437 batch_time=0.43727
Train Epoch: 11 [243/250 31104/32000 (97%)] Loss: 7.50396 (QuantReg: 13.02508) QuantErr: 13.02508 batch_time=0.43163
Train Epoch: 11 codebook_update_time=0.91173
Saving checkpoint: /apdcephfs/share_47076/gimwang/HCQ/exps/HCQ_MSRVTT_1kA_L3/checkpoint-epoch11.pth ...
Done in 5.202s
Updating 'best' checkpoint: /apdcephfs/share_47076/gimwang/HCQ/exps/HCQ_MSRVTT_1kA_L3/checkpoint-epoch11.pth ...
Done in 10.858s
removing stale ckpt [epoch 10] [took 0.07s]
epoch : 11
loss : 6.323669538497925
quant_reg : 12.716312957763671
quant_err : 12.716312957763671
learning_rate : 2.993684696191893e-05
n_samples : 352000
n_steps : 2750
MSRVTT_jsfusion_test/t2v_metrics/R1: 20.1
MSRVTT_jsfusion_test/t2v_metrics/R5: 50.0
MSRVTT_jsfusion_test/t2v_metrics/R10: 64.3
MSRVTT_jsfusion_test/t2v_metrics/R50: 88.5
MSRVTT_jsfusion_test/t2v_metrics/MedR: 5.5
MSRVTT_jsfusion_test/t2v_metrics/MeanR: 26.385
MSRVTT_jsfusion_test/t2v_metrics/geometric_mean_R1-R5-R10: 40.12906229191178
MSRVTT_jsfusion_test/v2t_metrics/R1: 23.2
MSRVTT_jsfusion_test/v2t_metrics/R5: 51.2
MSRVTT_jsfusion_test/v2t_metrics/R10: 64.1
MSRVTT_jsfusion_test/v2t_metrics/R50: 89.4
MSRVTT_jsfusion_test/v2t_metrics/MedR: 5.0
MSRVTT_jsfusion_test/v2t_metrics/MeanR: 24.525
MSRVTT_jsfusion_test/v2t_metrics/geometric_mean_R1-R5-R10: 42.3843302723228
mnt_best : 40.12906229191178
not_improved_count: 0
Train Epoch: 12 [1/250 128/32000 (0%)] Loss: 5.46731 (QuantReg: 12.44892) QuantErr: 12.44892 batch_time=33.76574
Train Epoch: 12 [12/250 1536/32000 (5%)] Loss: 7.12926 (QuantReg: 12.50900) QuantErr: 12.50900 batch_time=0.43173
Train Epoch: 12 [23/250 2944/32000 (9%)] Loss: 6.74616 (QuantReg: 12.86113) QuantErr: 12.86113 batch_time=0.43100
Train Epoch: 12 [34/250 4352/32000 (14%)] Loss: 6.28451 (QuantReg: 12.54025) QuantErr: 12.54025 batch_time=0.44358
Train Epoch: 12 [45/250 5760/32000 (18%)] Loss: 5.17279 (QuantReg: 12.68000) QuantErr: 12.68000 batch_time=0.43779
Train Epoch: 12 [56/250 7168/32000 (22%)] Loss: 6.65758 (QuantReg: 12.94282) QuantErr: 12.94282 batch_time=0.45222
Train Epoch: 12 [67/250 8576/32000 (27%)] Loss: 6.07324 (QuantReg: 12.58420) QuantErr: 12.58420 batch_time=5.22658
Train Epoch: 12 [78/250 9984/32000 (31%)] Loss: 6.02424 (QuantReg: 12.51514) QuantErr: 12.51514 batch_time=0.47459
Train Epoch: 12 [89/250 11392/32000 (36%)] Loss: 5.36616 (QuantReg: 12.21901) QuantErr: 12.21901 batch_time=0.48710
Train Epoch: 12 [100/250 12800/32000 (40%)] Loss: 8.15154 (QuantReg: 12.55299) QuantErr: 12.55299 batch_time=0.43994
Train Epoch: 12 [111/250 14208/32000 (44%)] Loss: 6.76884 (QuantReg: 12.58890) QuantErr: 12.58890 batch_time=0.47268
Train Epoch: 12 [122/250 15616/32000 (49%)] Loss: 8.15976 (QuantReg: 12.84366) QuantErr: 12.84366 batch_time=0.43775
Train Epoch: 12 [133/250 17024/32000 (53%)] Loss: 6.07221 (QuantReg: 12.35528) QuantErr: 12.35528 batch_time=0.45567
Train Epoch: 12 [144/250 18432/32000 (58%)] Loss: 4.76413 (QuantReg: 12.77607) QuantErr: 12.77607 batch_time=0.48162
Train Epoch: 12 [155/250 19840/32000 (62%)] Loss: 6.95701 (QuantReg: 12.73765) QuantErr: 12.73765 batch_time=0.45349
Train Epoch: 12 [166/250 21248/32000 (66%)] Loss: 6.13578 (QuantReg: 12.85414) QuantErr: 12.85414 batch_time=0.43229
Train Epoch: 12 [177/250 22656/32000 (71%)] Loss: 4.77136 (QuantReg: 12.88798) QuantErr: 12.88798 batch_time=0.44080
Train Epoch: 12 [188/250 24064/32000 (75%)] Loss: 6.37746 (QuantReg: 12.73063) QuantErr: 12.73063 batch_time=0.44810
Train Epoch: 12 [199/250 25472/32000 (80%)] Loss: 6.76908 (QuantReg: 12.95208) QuantErr: 12.95208 batch_time=1.02887
Train Epoch: 12 [210/250 26880/32000 (84%)] Loss: 6.99887 (QuantReg: 12.71913) QuantErr: 12.71913 batch_time=0.43087
Train Epoch: 12 [221/250 28288/32000 (88%)] Loss: 6.11837 (QuantReg: 12.33100) QuantErr: 12.33100 batch_time=0.45965
Train Epoch: 12 [232/250 29696/32000 (93%)] Loss: 6.19339 (QuantReg: 12.86785) QuantErr: 12.86785 batch_time=0.47405
Train Epoch: 12 [243/250 31104/32000 (97%)] Loss: 4.77813 (QuantReg: 13.05545) QuantErr: 13.05545 batch_time=0.44280
Train Epoch: 12 codebook_update_time=0.87827
Saving checkpoint: /apdcephfs/share_47076/gimwang/HCQ/exps/HCQ_MSRVTT_1kA_L3/checkpoint-epoch12.pth ...
Done in 4.745s
Updating 'best' checkpoint: /apdcephfs/share_47076/gimwang/HCQ/exps/HCQ_MSRVTT_1kA_L3/checkpoint-epoch12.pth ...
Done in 9.840s
removing stale ckpt [epoch 11] [took 0.03s]
epoch : 12
loss : 6.035094451904297
quant_reg : 12.759136753082275
quant_err : 12.759136753082275
learning_rate : 2.844000461382298e-05
n_samples : 384000
n_steps : 3000
MSRVTT_jsfusion_test/t2v_metrics/R1: 21.7
MSRVTT_jsfusion_test/t2v_metrics/R5: 51.0
MSRVTT_jsfusion_test/t2v_metrics/R10: 64.0
MSRVTT_jsfusion_test/t2v_metrics/R50: 88.8
MSRVTT_jsfusion_test/t2v_metrics/MedR: 5.0
MSRVTT_jsfusion_test/t2v_metrics/MeanR: 26.839
MSRVTT_jsfusion_test/t2v_metrics/geometric_mean_R1-R5-R10: 41.374868642013524
MSRVTT_jsfusion_test/v2t_metrics/R1: 22.0
MSRVTT_jsfusion_test/v2t_metrics/R5: 51.6
MSRVTT_jsfusion_test/v2t_metrics/R10: 63.6
MSRVTT_jsfusion_test/v2t_metrics/R50: 89.2
MSRVTT_jsfusion_test/v2t_metrics/MedR: 5.0
MSRVTT_jsfusion_test/v2t_metrics/MeanR: 24.86
MSRVTT_jsfusion_test/v2t_metrics/geometric_mean_R1-R5-R10: 41.639914845615344
mnt_best : 41.374868642013524
not_improved_count: 0
Train Epoch: 13 [1/250 128/32000 (0%)] Loss: 6.34729 (QuantReg: 12.49700) QuantErr: 12.49700 batch_time=34.46744
Train Epoch: 13 [12/250 1536/32000 (5%)] Loss: 6.72392 (QuantReg: 12.60175) QuantErr: 12.60175 batch_time=0.45063
Train Epoch: 13 [23/250 2944/32000 (9%)] Loss: 5.26484 (QuantReg: 12.58747) QuantErr: 12.58747 batch_time=0.47170
Train Epoch: 13 [34/250 4352/32000 (14%)] Loss: 4.97597 (QuantReg: 12.74365) QuantErr: 12.74365 batch_time=0.42102
Train Epoch: 13 [45/250 5760/32000 (18%)] Loss: 6.83810 (QuantReg: 12.63359) QuantErr: 12.63359 batch_time=0.45721
Train Epoch: 13 [56/250 7168/32000 (22%)] Loss: 5.67031 (QuantReg: 12.65903) QuantErr: 12.65903 batch_time=0.43272
Train Epoch: 13 [67/250 8576/32000 (27%)] Loss: 6.52979 (QuantReg: 12.45891) QuantErr: 12.45891 batch_time=0.43940
Train Epoch: 13 [78/250 9984/32000 (31%)] Loss: 5.51254 (QuantReg: 12.57109) QuantErr: 12.57109 batch_time=0.45743
Train Epoch: 13 [89/250 11392/32000 (36%)] Loss: 6.97235 (QuantReg: 12.63547) QuantErr: 12.63547 batch_time=0.44997
Train Epoch: 13 [100/250 12800/32000 (40%)] Loss: 5.40520 (QuantReg: 12.98596) QuantErr: 12.98596 batch_time=0.45802
Train Epoch: 13 [111/250 14208/32000 (44%)] Loss: 5.31070 (QuantReg: 12.51354) QuantErr: 12.51354 batch_time=0.45811
Train Epoch: 13 [122/250 15616/32000 (49%)] Loss: 5.70351 (QuantReg: 12.69886) QuantErr: 12.69886 batch_time=0.43597
Train Epoch: 13 [133/250 17024/32000 (53%)] Loss: 5.91428 (QuantReg: 12.93650) QuantErr: 12.93650 batch_time=0.46566
Train Epoch: 13 [144/250 18432/32000 (58%)] Loss: 6.19095 (QuantReg: 12.86820) QuantErr: 12.86820 batch_time=0.48135
Train Epoch: 13 [155/250 19840/32000 (62%)] Loss: 6.16410 (QuantReg: 12.88587) QuantErr: 12.88587 batch_time=0.43621
Train Epoch: 13 [166/250 21248/32000 (66%)] Loss: 4.97724 (QuantReg: 13.55271) QuantErr: 13.55271 batch_time=0.44857
Train Epoch: 13 [177/250 22656/32000 (71%)] Loss: 5.59041 (QuantReg: 13.14562) QuantErr: 13.14562 batch_time=0.44776
Train Epoch: 13 [188/250 24064/32000 (75%)] Loss: 5.86592 (QuantReg: 12.93934) QuantErr: 12.93934 batch_time=0.43039
Train Epoch: 13 [199/250 25472/32000 (80%)] Loss: 6.12566 (QuantReg: 12.52328) QuantErr: 12.52328 batch_time=0.47882
Train Epoch: 13 [210/250 26880/32000 (84%)] Loss: 4.97705 (QuantReg: 12.92218) QuantErr: 12.92218 batch_time=0.48776
Train Epoch: 13 [221/250 28288/32000 (88%)] Loss: 6.14528 (QuantReg: 13.04573) QuantErr: 13.04573 batch_time=0.44641
Train Epoch: 13 [232/250 29696/32000 (93%)] Loss: 5.90673 (QuantReg: 12.77876) QuantErr: 12.77876 batch_time=0.44527
Train Epoch: 13 [243/250 31104/32000 (97%)] Loss: 6.48067 (QuantReg: 12.92029) QuantErr: 12.92029 batch_time=0.43866
Train Epoch: 13 codebook_update_time=0.92310
Saving checkpoint: /apdcephfs/share_47076/gimwang/HCQ/exps/HCQ_MSRVTT_1kA_L3/checkpoint-epoch13.pth ...
Done in 4.903s
removing stale ckpt [epoch 12] [took 0.13s]
epoch : 13
loss : 5.8579316997528075
quant_reg : 12.766504890441894
quant_err : 12.766504890441894
learning_rate : 2.7018004383131832e-05
n_samples : 416000
n_steps : 3250
MSRVTT_jsfusion_test/t2v_metrics/R1: 20.2
MSRVTT_jsfusion_test/t2v_metrics/R5: 50.0
MSRVTT_jsfusion_test/t2v_metrics/R10: 64.9
MSRVTT_jsfusion_test/t2v_metrics/R50: 88.6
MSRVTT_jsfusion_test/t2v_metrics/MedR: 5.5
MSRVTT_jsfusion_test/t2v_metrics/MeanR: 26.444
MSRVTT_jsfusion_test/t2v_metrics/geometric_mean_R1-R5-R10: 40.320139268961306
MSRVTT_jsfusion_test/v2t_metrics/R1: 21.5
MSRVTT_jsfusion_test/v2t_metrics/R5: 52.2
MSRVTT_jsfusion_test/v2t_metrics/R10: 66.2
MSRVTT_jsfusion_test/v2t_metrics/R50: 89.1
MSRVTT_jsfusion_test/v2t_metrics/MedR: 5.0
MSRVTT_jsfusion_test/v2t_metrics/MeanR: 23.423
MSRVTT_jsfusion_test/v2t_metrics/geometric_mean_R1-R5-R10: 42.03931692476018
mnt_best : 41.374868642013524
not_improved_count: 1
Train Epoch: 14 [1/250 128/32000 (0%)] Loss: 4.89523 (QuantReg: 12.51422) QuantErr: 12.51422 batch_time=39.17042
Train Epoch: 14 [12/250 1536/32000 (5%)] Loss: 5.89018 (QuantReg: 12.61719) QuantErr: 12.61719 batch_time=0.43712
Train Epoch: 14 [23/250 2944/32000 (9%)] Loss: 6.43818 (QuantReg: 12.73387) QuantErr: 12.73387 batch_time=0.66439
Train Epoch: 14 [34/250 4352/32000 (14%)] Loss: 4.50565 (QuantReg: 12.64110) QuantErr: 12.64110 batch_time=0.84366
Train Epoch: 14 [45/250 5760/32000 (18%)] Loss: 5.31117 (QuantReg: 12.75432) QuantErr: 12.75432 batch_time=0.65260
Train Epoch: 14 [56/250 7168/32000 (22%)] Loss: 4.67133 (QuantReg: 13.09315) QuantErr: 13.09315 batch_time=0.44368
Train Epoch: 14 [67/250 8576/32000 (27%)] Loss: 6.68525 (QuantReg: 12.71621) QuantErr: 12.71621 batch_time=0.46270
Train Epoch: 14 [78/250 9984/32000 (31%)] Loss: 5.98087 (QuantReg: 12.63349) QuantErr: 12.63349 batch_time=0.47164
Train Epoch: 14 [89/250 11392/32000 (36%)] Loss: 6.89675 (QuantReg: 13.00384) QuantErr: 13.00384 batch_time=0.47863
Train Epoch: 14 [100/250 12800/32000 (40%)] Loss: 6.06289 (QuantReg: 12.77107) QuantErr: 12.77107 batch_time=0.57679
Train Epoch: 14 [111/250 14208/32000 (44%)] Loss: 5.51695 (QuantReg: 13.02040) QuantErr: 13.02040 batch_time=0.45004
Train Epoch: 14 [122/250 15616/32000 (49%)] Loss: 5.34513 (QuantReg: 12.79675) QuantErr: 12.79675 batch_time=0.43075
Train Epoch: 14 [133/250 17024/32000 (53%)] Loss: 5.02327 (QuantReg: 12.75355) QuantErr: 12.75355 batch_time=0.46120
Train Epoch: 14 [144/250 18432/32000 (58%)] Loss: 6.09545 (QuantReg: 12.94664) QuantErr: 12.94664 batch_time=0.47273
Train Epoch: 14 [155/250 19840/32000 (62%)] Loss: 5.66827 (QuantReg: 12.73441) QuantErr: 12.73441 batch_time=0.47037
Train Epoch: 14 [166/250 21248/32000 (66%)] Loss: 6.20992 (QuantReg: 12.48534) QuantErr: 12.48534 batch_time=0.49191
Train Epoch: 14 [177/250 22656/32000 (71%)] Loss: 4.47668 (QuantReg: 13.18070) QuantErr: 13.18070 batch_time=0.44368
Train Epoch: 14 [188/250 24064/32000 (75%)] Loss: 3.62865 (QuantReg: 13.16245) QuantErr: 13.16245 batch_time=0.43663
Train Epoch: 14 [199/250 25472/32000 (80%)] Loss: 5.34764 (QuantReg: 12.76668) QuantErr: 12.76668 batch_time=0.47893
Train Epoch: 14 [210/250 26880/32000 (84%)] Loss: 5.99629 (QuantReg: 12.94887) QuantErr: 12.94887 batch_time=0.43350
Train Epoch: 14 [221/250 28288/32000 (88%)] Loss: 6.26806 (QuantReg: 12.65168) QuantErr: 12.65168 batch_time=0.45829
Train Epoch: 14 [232/250 29696/32000 (93%)] Loss: 5.54734 (QuantReg: 13.12836) QuantErr: 13.12836 batch_time=0.44289
Train Epoch: 14 [243/250 31104/32000 (97%)] Loss: 4.69684 (QuantReg: 12.98477) QuantErr: 12.98477 batch_time=0.46065
Train Epoch: 14 codebook_update_time=0.81725
Saving checkpoint: /apdcephfs/share_47076/gimwang/HCQ/exps/HCQ_MSRVTT_1kA_L3/checkpoint-epoch14.pth ...
Done in 6.992s
Updating 'best' checkpoint: /apdcephfs/share_47076/gimwang/HCQ/exps/HCQ_MSRVTT_1kA_L3/checkpoint-epoch14.pth ...
Done in 12.474s
removing stale ckpt [epoch 13] [took 0.25s]
epoch : 14
loss : 5.6459108829498295
quant_reg : 12.814208446502686
quant_err : 12.814208446502686
learning_rate : 2.566710416397524e-05
n_samples : 448000
n_steps : 3500
MSRVTT_jsfusion_test/t2v_metrics/R1: 21.3
MSRVTT_jsfusion_test/t2v_metrics/R5: 52.2
MSRVTT_jsfusion_test/t2v_metrics/R10: 65.0
MSRVTT_jsfusion_test/t2v_metrics/R50: 88.9
MSRVTT_jsfusion_test/t2v_metrics/MedR: 5.0
MSRVTT_jsfusion_test/t2v_metrics/MeanR: 26.27
MSRVTT_jsfusion_test/t2v_metrics/geometric_mean_R1-R5-R10: 41.65378659667916
MSRVTT_jsfusion_test/v2t_metrics/R1: 21.8
MSRVTT_jsfusion_test/v2t_metrics/R5: 52.6
MSRVTT_jsfusion_test/v2t_metrics/R10: 65.1
MSRVTT_jsfusion_test/v2t_metrics/R50: 89.6
MSRVTT_jsfusion_test/v2t_metrics/MedR: 5.0
MSRVTT_jsfusion_test/v2t_metrics/MeanR: 24.07
MSRVTT_jsfusion_test/v2t_metrics/geometric_mean_R1-R5-R10: 42.105717802437304
mnt_best : 41.65378659667916
not_improved_count: 0
Train Epoch: 15 [1/250 128/32000 (0%)] Loss: 5.51352 (QuantReg: 12.70065) QuantErr: 12.70065 batch_time=33.99276
Train Epoch: 15 [12/250 1536/32000 (5%)] Loss: 6.44742 (QuantReg: 12.73079) QuantErr: 12.73079 batch_time=0.45666
Train Epoch: 15 [23/250 2944/32000 (9%)] Loss: 5.62172 (QuantReg: 12.79024) QuantErr: 12.79024 batch_time=0.43426
Train Epoch: 15 [34/250 4352/32000 (14%)] Loss: 5.77518 (QuantReg: 12.72455) QuantErr: 12.72455 batch_time=0.76185
Train Epoch: 15 [45/250 5760/32000 (18%)] Loss: 5.58634 (QuantReg: 12.74755) QuantErr: 12.74755 batch_time=0.49904
Train Epoch: 15 [56/250 7168/32000 (22%)] Loss: 5.54750 (QuantReg: 12.78760) QuantErr: 12.78760 batch_time=0.45302
Train Epoch: 15 [67/250 8576/32000 (27%)] Loss: 5.86000 (QuantReg: 12.83187) QuantErr: 12.83187 batch_time=0.74811
Train Epoch: 15 [78/250 9984/32000 (31%)] Loss: 4.27829 (QuantReg: 12.88298) QuantErr: 12.88298 batch_time=0.43759
Train Epoch: 15 [89/250 11392/32000 (36%)] Loss: 5.11688 (QuantReg: 12.61309) QuantErr: 12.61309 batch_time=0.43764
Train Epoch: 15 [100/250 12800/32000 (40%)] Loss: 7.63640 (QuantReg: 12.91029) QuantErr: 12.91029 batch_time=0.44143
Train Epoch: 15 [111/250 14208/32000 (44%)] Loss: 5.95085 (QuantReg: 12.80833) QuantErr: 12.80833 batch_time=0.46405
Train Epoch: 15 [122/250 15616/32000 (49%)] Loss: 5.67575 (QuantReg: 12.80137) QuantErr: 12.80137 batch_time=0.44036
Train Epoch: 15 [133/250 17024/32000 (53%)] Loss: 5.44792 (QuantReg: 12.97953) QuantErr: 12.97953 batch_time=0.44820
Train Epoch: 15 [144/250 18432/32000 (58%)] Loss: 6.34223 (QuantReg: 12.98839) QuantErr: 12.98839 batch_time=0.46392
Train Epoch: 15 [155/250 19840/32000 (62%)] Loss: 5.90870 (QuantReg: 12.48590) QuantErr: 12.48590 batch_time=0.43581
Train Epoch: 15 [166/250 21248/32000 (66%)] Loss: 5.27757 (QuantReg: 12.79318) QuantErr: 12.79318 batch_time=0.42250
Train Epoch: 15 [177/250 22656/32000 (71%)] Loss: 7.35503 (QuantReg: 12.80166) QuantErr: 12.80166 batch_time=0.45991
Train Epoch: 15 [188/250 24064/32000 (75%)] Loss: 5.70275 (QuantReg: 12.53505) QuantErr: 12.53505 batch_time=0.47117
Train Epoch: 15 [199/250 25472/32000 (80%)] Loss: 5.58192 (QuantReg: 12.74040) QuantErr: 12.74040 batch_time=0.43351
Train Epoch: 15 [210/250 26880/32000 (84%)] Loss: 4.72705 (QuantReg: 13.18221) QuantErr: 13.18221 batch_time=3.56919
Train Epoch: 15 [221/250 28288/32000 (88%)] Loss: 3.98288 (QuantReg: 13.11997) QuantErr: 13.11997 batch_time=0.43075
Train Epoch: 15 [232/250 29696/32000 (93%)] Loss: 4.84123 (QuantReg: 12.76921) QuantErr: 12.76921 batch_time=0.44210
Train Epoch: 15 [243/250 31104/32000 (97%)] Loss: 5.75659 (QuantReg: 13.24665) QuantErr: 13.24665 batch_time=0.51357
Train Epoch: 15 codebook_update_time=0.89789
Saving checkpoint: /apdcephfs/share_47076/gimwang/HCQ/exps/HCQ_MSRVTT_1kA_L3/checkpoint-epoch15.pth ...
Done in 6.471s
Updating 'best' checkpoint: /apdcephfs/share_47076/gimwang/HCQ/exps/HCQ_MSRVTT_1kA_L3/checkpoint-epoch15.pth ...
Done in 12.333s
removing stale ckpt [epoch 14] [took 0.01s]
epoch : 15
loss : 5.465325452804565
quant_reg : 12.846569816589355
quant_err : 12.846569816589355
learning_rate : 2.4383748955776477e-05
n_samples : 480000
n_steps : 3750
MSRVTT_jsfusion_test/t2v_metrics/R1: 22.2
MSRVTT_jsfusion_test/t2v_metrics/R5: 50.9
MSRVTT_jsfusion_test/t2v_metrics/R10: 64.5
MSRVTT_jsfusion_test/t2v_metrics/R50: 88.6
MSRVTT_jsfusion_test/t2v_metrics/MedR: 5.0
MSRVTT_jsfusion_test/t2v_metrics/MeanR: 26.326
MSRVTT_jsfusion_test/t2v_metrics/geometric_mean_R1-R5-R10: 41.771187687092564
MSRVTT_jsfusion_test/v2t_metrics/R1: 22.7
MSRVTT_jsfusion_test/v2t_metrics/R5: 52.9
MSRVTT_jsfusion_test/v2t_metrics/R10: 65.7
MSRVTT_jsfusion_test/v2t_metrics/R50: 89.7
MSRVTT_jsfusion_test/v2t_metrics/MedR: 5.0
MSRVTT_jsfusion_test/v2t_metrics/MeanR: 23.7185
MSRVTT_jsfusion_test/v2t_metrics/geometric_mean_R1-R5-R10: 42.8893008022798
mnt_best : 41.771187687092564
not_improved_count: 0
Train Epoch: 16 [1/250 128/32000 (0%)] Loss: 5.80335 (QuantReg: 12.43698) QuantErr: 12.43698 batch_time=31.30250
Train Epoch: 16 [12/250 1536/32000 (5%)] Loss: 4.78886 (QuantReg: 12.86228) QuantErr: 12.86228 batch_time=0.44177
Train Epoch: 16 [23/250 2944/32000 (9%)] Loss: 5.81632 (QuantReg: 12.88665) QuantErr: 12.88665 batch_time=0.47491
Train Epoch: 16 [34/250 4352/32000 (14%)] Loss: 4.76781 (QuantReg: 13.01186) QuantErr: 13.01186 batch_time=0.44117
Train Epoch: 16 [45/250 5760/32000 (18%)] Loss: 4.66127 (QuantReg: 12.99827) QuantErr: 12.99827 batch_time=0.43936
Train Epoch: 16 [56/250 7168/32000 (22%)] Loss: 4.49042 (QuantReg: 12.71127) QuantErr: 12.71127 batch_time=0.42506
Train Epoch: 16 [67/250 8576/32000 (27%)] Loss: 5.71592 (QuantReg: 12.64052) QuantErr: 12.64052 batch_time=2.28511
Train Epoch: 16 [78/250 9984/32000 (31%)] Loss: 5.64259 (QuantReg: 12.79655) QuantErr: 12.79655 batch_time=0.49439
Train Epoch: 16 [89/250 11392/32000 (36%)] Loss: 5.09714 (QuantReg: 12.77061) QuantErr: 12.77061 batch_time=0.48133
Train Epoch: 16 [100/250 12800/32000 (40%)] Loss: 5.78072 (QuantReg: 12.75083) QuantErr: 12.75083 batch_time=0.44492
Train Epoch: 16 [111/250 14208/32000 (44%)] Loss: 4.93932 (QuantReg: 12.84992) QuantErr: 12.84992 batch_time=0.43692
Train Epoch: 16 [122/250 15616/32000 (49%)] Loss: 5.32952 (QuantReg: 12.89946) QuantErr: 12.89946 batch_time=0.43543
Train Epoch: 16 [133/250 17024/32000 (53%)] Loss: 6.32180 (QuantReg: 12.77149) QuantErr: 12.77149 batch_time=0.46789
Train Epoch: 16 [144/250 18432/32000 (58%)] Loss: 5.71808 (QuantReg: 13.18291) QuantErr: 13.18291 batch_time=1.22256
Train Epoch: 16 [155/250 19840/32000 (62%)] Loss: 4.19472 (QuantReg: 13.01720) QuantErr: 13.01720 batch_time=0.44755
Train Epoch: 16 [166/250 21248/32000 (66%)] Loss: 6.01964 (QuantReg: 12.77306) QuantErr: 12.77306 batch_time=0.43983
Train Epoch: 16 [177/250 22656/32000 (71%)] Loss: 4.63029 (QuantReg: 12.87913) QuantErr: 12.87913 batch_time=0.46951
Train Epoch: 16 [188/250 24064/32000 (75%)] Loss: 4.85387 (QuantReg: 12.84404) QuantErr: 12.84404 batch_time=0.44447
Train Epoch: 16 [199/250 25472/32000 (80%)] Loss: 4.71203 (QuantReg: 12.65442) QuantErr: 12.65442 batch_time=0.49415
Train Epoch: 16 [210/250 26880/32000 (84%)] Loss: 5.08306 (QuantReg: 12.73127) QuantErr: 12.73127 batch_time=0.43806
Train Epoch: 16 [221/250 28288/32000 (88%)] Loss: 5.30163 (QuantReg: 12.74316) QuantErr: 12.74316 batch_time=0.69437
Train Epoch: 16 [232/250 29696/32000 (93%)] Loss: 5.44623 (QuantReg: 13.07905) QuantErr: 13.07905 batch_time=0.43479
Train Epoch: 16 [243/250 31104/32000 (97%)] Loss: 5.68255 (QuantReg: 12.88440) QuantErr: 12.88440 batch_time=0.44241
Train Epoch: 16 codebook_update_time=0.83313
Saving checkpoint: /apdcephfs/share_47076/gimwang/HCQ/exps/HCQ_MSRVTT_1kA_L3/checkpoint-epoch16.pth ...
Done in 4.719s
removing stale ckpt [epoch 15] [took 0.21s]
epoch : 16
loss : 5.329007749557495
quant_reg : 12.873565444946289
quant_err : 12.873565444946289
learning_rate : 2.3164561507987653e-05
n_samples : 512000
n_steps : 4000
MSRVTT_jsfusion_test/t2v_metrics/R1: 22.3
MSRVTT_jsfusion_test/t2v_metrics/R5: 50.9
MSRVTT_jsfusion_test/t2v_metrics/R10: 63.8
MSRVTT_jsfusion_test/t2v_metrics/R50: 89.3
MSRVTT_jsfusion_test/t2v_metrics/MedR: 5.0
MSRVTT_jsfusion_test/t2v_metrics/MeanR: 26.721
MSRVTT_jsfusion_test/t2v_metrics/geometric_mean_R1-R5-R10: 41.68192566100982
MSRVTT_jsfusion_test/v2t_metrics/R1: 21.7
MSRVTT_jsfusion_test/v2t_metrics/R5: 51.4
MSRVTT_jsfusion_test/v2t_metrics/R10: 64.7
MSRVTT_jsfusion_test/v2t_metrics/R50: 89.5
MSRVTT_jsfusion_test/v2t_metrics/MedR: 5.0
MSRVTT_jsfusion_test/v2t_metrics/MeanR: 24.4125
MSRVTT_jsfusion_test/v2t_metrics/geometric_mean_R1-R5-R10: 41.63344781300215
mnt_best : 41.771187687092564
not_improved_count: 1
Train Epoch: 17 [1/250 128/32000 (0%)] Loss: 6.17467 (QuantReg: 12.84096) QuantErr: 12.84096 batch_time=37.78498
Train Epoch: 17 [12/250 1536/32000 (5%)] Loss: 5.86657 (QuantReg: 12.61172) QuantErr: 12.61172 batch_time=0.43393
Train Epoch: 17 [23/250 2944/32000 (9%)] Loss: 5.47078 (QuantReg: 12.63507) QuantErr: 12.63507 batch_time=0.43600
Train Epoch: 17 [34/250 4352/32000 (14%)] Loss: 4.43911 (QuantReg: 12.92326) QuantErr: 12.92326 batch_time=0.43863
Train Epoch: 17 [45/250 5760/32000 (18%)] Loss: 4.78205 (QuantReg: 12.82391) QuantErr: 12.82391 batch_time=0.47710
Train Epoch: 17 [56/250 7168/32000 (22%)] Loss: 4.25905 (QuantReg: 12.80255) QuantErr: 12.80255 batch_time=0.43561
Train Epoch: 17 [67/250 8576/32000 (27%)] Loss: 4.75178 (QuantReg: 12.88118) QuantErr: 12.88118 batch_time=0.44502
Train Epoch: 17 [78/250 9984/32000 (31%)] Loss: 5.00059 (QuantReg: 12.90600) QuantErr: 12.90600 batch_time=0.45247
Train Epoch: 17 [89/250 11392/32000 (36%)] Loss: 5.55295 (QuantReg: 12.55315) QuantErr: 12.55315 batch_time=0.43974
Train Epoch: 17 [100/250 12800/32000 (40%)] Loss: 5.35525 (QuantReg: 13.20118) QuantErr: 13.20118 batch_time=0.44463
Train Epoch: 17 [111/250 14208/32000 (44%)] Loss: 4.90898 (QuantReg: 12.73168) QuantErr: 12.73168 batch_time=0.47385
Train Epoch: 17 [122/250 15616/32000 (49%)] Loss: 5.21768 (QuantReg: 12.78734) QuantErr: 12.78734 batch_time=0.44932
Train Epoch: 17 [133/250 17024/32000 (53%)] Loss: 4.66258 (QuantReg: 12.99283) QuantErr: 12.99283 batch_time=0.46482
Train Epoch: 17 [144/250 18432/32000 (58%)] Loss: 5.68964 (QuantReg: 12.84130) QuantErr: 12.84130 batch_time=0.44730
Train Epoch: 17 [155/250 19840/32000 (62%)] Loss: 6.19249 (QuantReg: 12.74007) QuantErr: 12.74007 batch_time=0.44848
Train Epoch: 17 [166/250 21248/32000 (66%)] Loss: 4.83957 (QuantReg: 12.82446) QuantErr: 12.82446 batch_time=0.43579
Train Epoch: 17 [177/250 22656/32000 (71%)] Loss: 5.05763 (QuantReg: 12.52489) QuantErr: 12.52489 batch_time=0.51734
Train Epoch: 17 [188/250 24064/32000 (75%)] Loss: 4.15693 (QuantReg: 13.18572) QuantErr: 13.18572 batch_time=0.48451
Train Epoch: 17 [199/250 25472/32000 (80%)] Loss: 4.98089 (QuantReg: 12.92828) QuantErr: 12.92828 batch_time=0.52929
Train Epoch: 17 [210/250 26880/32000 (84%)] Loss: 5.23064 (QuantReg: 12.89619) QuantErr: 12.89619 batch_time=0.43226
Train Epoch: 17 [221/250 28288/32000 (88%)] Loss: 4.67185 (QuantReg: 12.65114) QuantErr: 12.65114 batch_time=0.46415
Train Epoch: 17 [232/250 29696/32000 (93%)] Loss: 4.26191 (QuantReg: 12.86043) QuantErr: 12.86043 batch_time=0.44046
Train Epoch: 17 [243/250 31104/32000 (97%)] Loss: 4.59985 (QuantReg: 13.15664) QuantErr: 13.15664 batch_time=0.48338
Train Epoch: 17 codebook_update_time=0.89893
Saving checkpoint: /apdcephfs/share_47076/gimwang/HCQ/exps/HCQ_MSRVTT_1kA_L3/checkpoint-epoch17.pth ...
Done in 4.882s
Updating 'best' checkpoint: /apdcephfs/share_47076/gimwang/HCQ/exps/HCQ_MSRVTT_1kA_L3/checkpoint-epoch17.pth ...
Done in 9.979s
removing stale ckpt [epoch 16] [took 0.10s]
epoch : 17
loss : 5.172454194068909
quant_reg : 12.864629558563232
quant_err : 12.864629558563232
learning_rate : 2.2006333432588268e-05
n_samples : 544000
n_steps : 4250
MSRVTT_jsfusion_test/t2v_metrics/R1: 21.7
MSRVTT_jsfusion_test/t2v_metrics/R5: 51.3
MSRVTT_jsfusion_test/t2v_metrics/R10: 65.8
MSRVTT_jsfusion_test/t2v_metrics/R50: 88.8
MSRVTT_jsfusion_test/t2v_metrics/MedR: 5.0
MSRVTT_jsfusion_test/t2v_metrics/MeanR: 27.126
MSRVTT_jsfusion_test/t2v_metrics/geometric_mean_R1-R5-R10: 41.840898060391126
MSRVTT_jsfusion_test/v2t_metrics/R1: 23.8
MSRVTT_jsfusion_test/v2t_metrics/R5: 54.1
MSRVTT_jsfusion_test/v2t_metrics/R10: 65.7
MSRVTT_jsfusion_test/v2t_metrics/R50: 89.7
MSRVTT_jsfusion_test/v2t_metrics/MedR: 5.0
MSRVTT_jsfusion_test/v2t_metrics/MeanR: 23.9435
MSRVTT_jsfusion_test/v2t_metrics/geometric_mean_R1-R5-R10: 43.89818157933313
mnt_best : 41.840898060391126
not_improved_count: 0
Train Epoch: 18 [1/250 128/32000 (0%)] Loss: 4.95954 (QuantReg: 12.55362) QuantErr: 12.55362 batch_time=36.57540
Train Epoch: 18 [12/250 1536/32000 (5%)] Loss: 4.80784 (QuantReg: 12.87540) QuantErr: 12.87540 batch_time=0.47005
Train Epoch: 18 [23/250 2944/32000 (9%)] Loss: 5.89837 (QuantReg: 12.84291) QuantErr: 12.84291 batch_time=0.44531
Train Epoch: 18 [34/250 4352/32000 (14%)] Loss: 5.00528 (QuantReg: 13.12797) QuantErr: 13.12797 batch_time=0.45362
Train Epoch: 18 [45/250 5760/32000 (18%)] Loss: 6.25172 (QuantReg: 12.95459) QuantErr: 12.95459 batch_time=0.44669
Train Epoch: 18 [56/250 7168/32000 (22%)] Loss: 6.19726 (QuantReg: 12.70006) QuantErr: 12.70006 batch_time=0.42683
Train Epoch: 18 [67/250 8576/32000 (27%)] Loss: 6.08030 (QuantReg: 12.87752) QuantErr: 12.87752 batch_time=0.53405
Train Epoch: 18 [78/250 9984/32000 (31%)] Loss: 5.87493 (QuantReg: 13.34417) QuantErr: 13.34417 batch_time=0.43731
Train Epoch: 18 [89/250 11392/32000 (36%)] Loss: 4.77198 (QuantReg: 13.22663) QuantErr: 13.22663 batch_time=0.49694
Train Epoch: 18 [100/250 12800/32000 (40%)] Loss: 6.34516 (QuantReg: 12.42311) QuantErr: 12.42311 batch_time=0.67007
Train Epoch: 18 [111/250 14208/32000 (44%)] Loss: 3.77899 (QuantReg: 13.00386) QuantErr: 13.00386 batch_time=0.50274
Train Epoch: 18 [122/250 15616/32000 (49%)] Loss: 5.36833 (QuantReg: 12.99224) QuantErr: 12.99224 batch_time=0.46007
Train Epoch: 18 [133/250 17024/32000 (53%)] Loss: 4.29227 (QuantReg: 13.15481) QuantErr: 13.15481 batch_time=0.46231
Train Epoch: 18 [144/250 18432/32000 (58%)] Loss: 4.22272 (QuantReg: 12.75667) QuantErr: 12.75667 batch_time=0.44981
Train Epoch: 18 [155/250 19840/32000 (62%)] Loss: 5.25625 (QuantReg: 13.10066) QuantErr: 13.10066 batch_time=0.46246
Train Epoch: 18 [166/250 21248/32000 (66%)] Loss: 5.15408 (QuantReg: 13.27261) QuantErr: 13.27261 batch_time=0.44804
Train Epoch: 18 [177/250 22656/32000 (71%)] Loss: 4.47825 (QuantReg: 13.16567) QuantErr: 13.16567 batch_time=0.47126
Train Epoch: 18 [188/250 24064/32000 (75%)] Loss: 6.92678 (QuantReg: 12.91647) QuantErr: 12.91647 batch_time=0.46630
Train Epoch: 18 [199/250 25472/32000 (80%)] Loss: 5.54290 (QuantReg: 13.02566) QuantErr: 13.02566 batch_time=0.44287
Train Epoch: 18 [210/250 26880/32000 (84%)] Loss: 5.86957 (QuantReg: 12.90027) QuantErr: 12.90027 batch_time=0.44541
Train Epoch: 18 [221/250 28288/32000 (88%)] Loss: 5.13737 (QuantReg: 12.94506) QuantErr: 12.94506 batch_time=0.42813
Train Epoch: 18 [232/250 29696/32000 (93%)] Loss: 4.60280 (QuantReg: 12.85895) QuantErr: 12.85895 batch_time=0.45333
Train Epoch: 18 [243/250 31104/32000 (97%)] Loss: 5.69484 (QuantReg: 12.86862) QuantErr: 12.86862 batch_time=0.45155
Train Epoch: 18 codebook_update_time=0.84657
Saving checkpoint: /apdcephfs/share_47076/gimwang/HCQ/exps/HCQ_MSRVTT_1kA_L3/checkpoint-epoch18.pth ...
Done in 4.961s
Updating 'best' checkpoint: /apdcephfs/share_47076/gimwang/HCQ/exps/HCQ_MSRVTT_1kA_L3/checkpoint-epoch18.pth ...
Done in 10.024s
removing stale ckpt [epoch 17] [took 0.02s]
epoch : 18
loss : 5.054880195617676
quant_reg : 12.974631008148194
quant_err : 12.974631008148194
learning_rate : 2.0906016760958855e-05
n_samples : 576000
n_steps : 4500
MSRVTT_jsfusion_test/t2v_metrics/R1: 22.4
MSRVTT_jsfusion_test/t2v_metrics/R5: 50.9
MSRVTT_jsfusion_test/t2v_metrics/R10: 64.8
MSRVTT_jsfusion_test/t2v_metrics/R50: 89.9
MSRVTT_jsfusion_test/t2v_metrics/MedR: 5.0
MSRVTT_jsfusion_test/t2v_metrics/MeanR: 26.9
MSRVTT_jsfusion_test/t2v_metrics/geometric_mean_R1-R5-R10: 41.96110685214228
MSRVTT_jsfusion_test/v2t_metrics/R1: 23.3
MSRVTT_jsfusion_test/v2t_metrics/R5: 50.5
MSRVTT_jsfusion_test/v2t_metrics/R10: 66.5
MSRVTT_jsfusion_test/v2t_metrics/R50: 89.2
MSRVTT_jsfusion_test/v2t_metrics/MedR: 5.0
MSRVTT_jsfusion_test/v2t_metrics/MeanR: 24.48
MSRVTT_jsfusion_test/v2t_metrics/geometric_mean_R1-R5-R10: 42.771680577495054
mnt_best : 41.96110685214228
not_improved_count: 0
Train Epoch: 19 [1/250 128/32000 (0%)] Loss: 5.04807 (QuantReg: 12.72191) QuantErr: 12.72191 batch_time=31.50367
Train Epoch: 19 [12/250 1536/32000 (5%)] Loss: 4.76501 (QuantReg: 13.05006) QuantErr: 13.05006 batch_time=0.42397
Train Epoch: 19 [23/250 2944/32000 (9%)] Loss: 4.18486 (QuantReg: 12.72200) QuantErr: 12.72200 batch_time=0.42882
Train Epoch: 19 [34/250 4352/32000 (14%)] Loss: 3.85611 (QuantReg: 13.22548) QuantErr: 13.22548 batch_time=0.73858
Train Epoch: 19 [45/250 5760/32000 (18%)] Loss: 4.64414 (QuantReg: 12.99193) QuantErr: 12.99193 batch_time=0.43369
Train Epoch: 19 [56/250 7168/32000 (22%)] Loss: 5.20969 (QuantReg: 13.00700) QuantErr: 13.00700 batch_time=0.43055
Train Epoch: 19 [67/250 8576/32000 (27%)] Loss: 4.58974 (QuantReg: 13.11093) QuantErr: 13.11093 batch_time=0.44599
Train Epoch: 19 [78/250 9984/32000 (31%)] Loss: 4.36976 (QuantReg: 12.81047) QuantErr: 12.81047 batch_time=0.43240
Train Epoch: 19 [89/250 11392/32000 (36%)] Loss: 4.57177 (QuantReg: 12.64293) QuantErr: 12.64293 batch_time=0.42566
Train Epoch: 19 [100/250 12800/32000 (40%)] Loss: 4.37840 (QuantReg: 13.01058) QuantErr: 13.01058 batch_time=0.47061
Train Epoch: 19 [111/250 14208/32000 (44%)] Loss: 5.82476 (QuantReg: 12.97648) QuantErr: 12.97648 batch_time=0.49869
Train Epoch: 19 [122/250 15616/32000 (49%)] Loss: 5.68190 (QuantReg: 13.04120) QuantErr: 13.04120 batch_time=0.44320
Train Epoch: 19 [133/250 17024/32000 (53%)] Loss: 5.21025 (QuantReg: 13.00919) QuantErr: 13.00919 batch_time=0.46532
Train Epoch: 19 [144/250 18432/32000 (58%)] Loss: 4.51230 (QuantReg: 12.87225) QuantErr: 12.87225 batch_time=0.43462
Train Epoch: 19 [155/250 19840/32000 (62%)] Loss: 4.31006 (QuantReg: 13.27975) QuantErr: 13.27975 batch_time=0.43758
Train Epoch: 19 [166/250 21248/32000 (66%)] Loss: 5.18265 (QuantReg: 12.95189) QuantErr: 12.95189 batch_time=0.44973
Train Epoch: 19 [177/250 22656/32000 (71%)] Loss: 3.96763 (QuantReg: 13.22038) QuantErr: 13.22038 batch_time=0.45919
Train Epoch: 19 [188/250 24064/32000 (75%)] Loss: 5.02620 (QuantReg: 12.95633) QuantErr: 12.95633 batch_time=1.01409
Train Epoch: 19 [199/250 25472/32000 (80%)] Loss: 4.58592 (QuantReg: 13.03451) QuantErr: 13.03451 batch_time=0.48512
Train Epoch: 19 [210/250 26880/32000 (84%)] Loss: 4.59246 (QuantReg: 12.70414) QuantErr: 12.70414 batch_time=0.46487
Train Epoch: 19 [221/250 28288/32000 (88%)] Loss: 4.60097 (QuantReg: 12.66146) QuantErr: 12.66146 batch_time=0.42892
Train Epoch: 19 [232/250 29696/32000 (93%)] Loss: 5.17558 (QuantReg: 13.29051) QuantErr: 13.29051 batch_time=0.42483
Train Epoch: 19 [243/250 31104/32000 (97%)] Loss: 4.67630 (QuantReg: 12.74276) QuantErr: 12.74276 batch_time=0.44042
Train Epoch: 19 codebook_update_time=0.83823
Saving checkpoint: /apdcephfs/share_47076/gimwang/HCQ/exps/HCQ_MSRVTT_1kA_L3/checkpoint-epoch19.pth ...
Done in 4.863s
Updating 'best' checkpoint: /apdcephfs/share_47076/gimwang/HCQ/exps/HCQ_MSRVTT_1kA_L3/checkpoint-epoch19.pth ...
Done in 9.230s
removing stale ckpt [epoch 18] [took 0.01s]
epoch : 19
loss : 4.87741821193695
quant_reg : 12.965199855804443
quant_err : 12.965199855804443
learning_rate : 1.986071592291091e-05
n_samples : 608000
n_steps : 4750