-
Notifications
You must be signed in to change notification settings - Fork 96
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add ten/five crop augmentation #110
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for your pull request. However, it should be significantly revised before merging.
I'd be more positive about it if:
- one pull request wouldn't implement two features: one for the ten crop, the other for the changes in disk writing
- I don't like that each crop is treated as a separate video (now each video creates 10/5 times more files. can't we implement crop as a batch dimension?
- the augs are applied only to RGB stream, not both.
- the augs are implemented for i3d but not other models.
- i am not happy that the classes/functions that are common for many models are being changed without reflecting on how logic for other features that depend on them changes
i appreciate the efforts but i am not convinced that it is enough
self.i3d_transforms = { | ||
'rgb': torchvision.transforms.Compose([ | ||
TensorCenterCrop(self.central_crop_size), | ||
aug_transform, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
any reason why we can't do it for the flow?
|
||
|
||
if self.aug_type is not None: | ||
feats_dict = {stream: [[] for _ in range(self.num_crop)] for stream in self.streams} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why treat each crop as a separate tensor instead of a batch dimension: B, Crops, D --> B*Crops, D
?
@@ -145,13 +145,17 @@ def __call__(self, tensor: torch.FloatTensor) -> torch.FloatTensor: | |||
|
|||
class ScaleTo1_1(object): | |||
|
|||
def __call__(self, tensor: torch.FloatTensor) -> torch.FloatTensor: | |||
def __call__(self, tensor): | |||
if isinstance(tensor, tuple): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lost typing
return (2 * tensor / 255) - 1 | ||
|
||
|
||
class PermuteAndUnsqueeze(object): | ||
|
||
def __call__(self, tensor: torch.FloatTensor) -> torch.FloatTensor: | ||
def __call__(self, tensor): | ||
if isinstance(tensor, tuple): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lost typing
@@ -50,9 +50,18 @@ def show_predictions_on_dataset(logits: torch.FloatTensor, dataset: Union[str, L | |||
print(f'{logit:8.3f} | {smax:.3f} | {cls}') | |||
print() | |||
|
|||
def make_path(output_root, video_path, output_key, ext): | |||
def make_path(output_root, video_path, output_key, ext, idx=None): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we shouldn't resort to this. it became incredibly redundant. we need to save all features in one file
@@ -76,6 +78,11 @@ def action_on_extraction( | |||
return | |||
|
|||
for key, value in feats_dict.items(): | |||
if self.save_option == 'rgb_only': |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what's wrong with the streams
argument in i3d?
Add ten/five crop augmentation when extract the I3D features. Solve issue #92 #72 . And add save_option to the i3d.yaml file to save only reg features. Because I think that fps and timestamp features are really redundant. The shape of the rgb features imply timestamp.
Thank you for your great work. 🚀🚀