Thanks for your good project!
I used the same sampling strategy for the audio data and the video frames: I resample every video to 25 fps and feed 24 frames at a time to I3D to extract one feature, while one audio feature represents a 0.96 s audio clip. But I got features of different lengths, e.g., audio with shape (162, 128) and video with shape (165, 1024). The video feature length is correct, but the audio feature length is wrong.
How do I deal with it?
With the information that you provide, it is hard to give recommendations.
Roughly 2% of the features (3 of 165) are missing in one modality. I would just trim both to the shortest sequence (162 in your case).
By the way, does it happen for every video you tried, or only for some? Can you calculate the ratio of videos where the shape mismatch occurs? Is this ratio large enough to worry about?
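A minimal sketch of the trimming suggested above, assuming both feature sequences are NumPy arrays with time along the first axis and that both start at the same timestamp. The function name `trim_to_shortest` is made up for illustration and is not part of this project.

```python
import numpy as np


def trim_to_shortest(audio_feats, video_feats):
    """Cut both sequences to the length of the shorter one.

    Assumes axis 0 is time and that both sequences start at t = 0,
    so only the trailing features of the longer sequence are dropped.
    """
    n = min(len(audio_feats), len(video_feats))
    return audio_feats[:n], video_feats[:n]


# Placeholder arrays with the shapes reported in the question.
audio = np.zeros((162, 128), dtype=np.float32)
video = np.zeros((165, 1024), dtype=np.float32)

audio_t, video_t = trim_to_shortest(audio, video)
print(audio_t.shape, video_t.shape)
```

This only keeps the two modalities aligned if the mismatch sits at the end of the clip; if one track drifts gradually, trimming will not fix the alignment.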
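One way to answer these questions is to count the mismatches over the whole dataset. The sketch below assumes you have already collected the `(n_audio, n_video)` feature counts per video; the helper `mismatch_stats` and the toy numbers are hypothetical.

```python
def mismatch_stats(lengths):
    """Summarise length mismatches.

    lengths: iterable of (n_audio, n_video) pairs, one per video.
    Returns counts of videos whose visual features are longer or
    shorter than the audio features, plus the overall mismatch ratio.
    """
    total = 0
    video_longer = 0
    video_shorter = 0
    for n_audio, n_video in lengths:
        total += 1
        if n_video > n_audio:
            video_longer += 1
        elif n_video < n_audio:
            video_shorter += 1
    mismatched = video_longer + video_shorter
    return {
        "total": total,
        "video_longer": video_longer,
        "video_shorter": video_shorter,
        "mismatch_ratio": mismatched / total if total else 0.0,
    }


# Toy input: two matching videos, one longer and one shorter visual track.
stats = mismatch_stats([(162, 165), (100, 100), (50, 49), (80, 80)])
print(stats)
```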
I extracted features for 3000+ videos. For 6 of them the visual features are longer than the audio features, and for 400+ of them the visual features are shorter than the audio features.
I think the videos whose visual features are 1 shorter than the audio features are reasonable, since one extra frame is needed each time to extract optical flow. But the videos whose visual features are longer than the audio features are abnormal.
If I directly trim to the shortest sequence, I'm afraid the two modalities will not correspond to each other well.
I think one track (audio or visual) is slightly longer than the other. Maybe something is accumulating somewhere; it is hard to tell based on the information you are providing.
Does the difference grow as the video gets longer?
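A rough way to check this, assuming the per-video feature counts are available as two lists: fit a straight line to the length difference as a function of the audio length. A slope clearly above or below zero would hint at something accumulating, while a flat line with a constant offset would point to a boundary effect. The function `length_drift` and the toy numbers are made up for illustration.

```python
import numpy as np


def length_drift(n_audio, n_video):
    """Least-squares line fit of (n_video - n_audio) against n_audio.

    Returns (slope, intercept). A non-zero slope means the mismatch
    grows with the length of the video.
    """
    n_audio = np.asarray(n_audio, dtype=float)
    n_video = np.asarray(n_video, dtype=float)
    diff = n_video - n_audio
    slope, intercept = np.polyfit(n_audio, diff, 1)
    return float(slope), float(intercept)


# Toy data in which the visual track gains one feature per 50 audio features.
slope, intercept = length_drift([50, 100, 150, 200], [51, 102, 153, 204])
print(f"slope={slope:.4f}, intercept={intercept:.4f}")
```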