Open
Description
Inside a model definition, torch.nn.Module objects stored in a plain Python list do not get their parameters registered. Hence such parameters are not trained by the optimizer, even though they are in the call graph formed by forward(). This should be flagged by torchfix; currently no warning is given for this issue.
Example:
class FeedForward(torch.nn.Module):
    def __init__(self, n_features, n_classes, n_hidden, width):
        super().__init__()
        # Ideally, torchfix should issue a warning on below code
        # The parameters of the hidden layers do not get registered if they are in a list, and are not optimized!
        self.hidden_layers = [torch.nn.Linear(n_features if i == 0 else width, width, bias=True) for i in range(n_hidden)]
        # Correct version of the above code -- use ModuleList([]) instead of python list []
        self.hidden_layers = torch.nn.ModuleList([torch.nn.Linear(n_features if i == 0 else width, width, bias=True) for i in range(n_hidden)])
        # Dummy call to torch.solve() to throw a torchfix warning (to demonstrate that torchfix is working correctly)
        torch.solve()
Torchfix output:
$ torchfix --select=ALL ./supervised/nn/feed_forward_nn.py
supervised/nn/feed_forward_nn.py:20:9: TOR001 Use of removed function torch.solve: https://github.com/pytorch-labs/torchfix#torchsolve
Finished checking 1 files.
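To make the mechanism behind the example concrete, here is a toy sketch in plain Python. It is not PyTorch's implementation: `Param`, `ToyModule`, `ToyLinear` and `ToyModuleList` are hypothetical stand-ins for `torch.nn.Parameter`, `torch.nn.Module`, `torch.nn.Linear` and `torch.nn.ModuleList`. It only mimics the relevant behaviour, namely that attribute assignment registers a value when it is itself a parameter or a module, so a plain list of modules is stored as an ordinary attribute and skipped by `parameters()`.

```python
class Param:
    """Hypothetical stand-in for torch.nn.Parameter."""

    def __init__(self, name):
        self.name = name


class ToyModule:
    """Simplified stand-in for torch.nn.Module (not the real implementation)."""

    def __init__(self):
        # Create the registries without going through our own __setattr__.
        object.__setattr__(self, "_params", {})
        object.__setattr__(self, "_children", {})

    def __setattr__(self, name, value):
        # Only values that are themselves a Param or a ToyModule get registered.
        if isinstance(value, Param):
            self._params[name] = value
        elif isinstance(value, ToyModule):
            self._children[name] = value
        # Anything else (including a plain list of modules) is just an attribute.
        object.__setattr__(self, name, value)

    def parameters(self):
        # Walk registered parameters, then recurse into registered children.
        yield from self._params.values()
        for child in self._children.values():
            yield from child.parameters()


class ToyLinear(ToyModule):
    def __init__(self):
        super().__init__()
        self.weight = Param("weight")
        self.bias = Param("bias")


class ToyModuleList(ToyModule):
    """Stand-in for torch.nn.ModuleList: registers every element as a child."""

    def __init__(self, modules):
        super().__init__()
        for index, module in enumerate(modules):
            setattr(self, str(index), module)


class WithList(ToyModule):
    def __init__(self):
        super().__init__()
        self.hidden_layers = [ToyLinear() for _ in range(3)]


class WithModuleList(ToyModule):
    def __init__(self):
        super().__init__()
        self.hidden_layers = ToyModuleList([ToyLinear() for _ in range(3)])


list_count = len(list(WithList().parameters()))
module_list_count = len(list(WithModuleList().parameters()))
print(list_count, module_list_count)  # prints: 0 6
```

In the toy model the list variant exposes no parameters at all, while the ModuleList variant exposes all six (a weight and a bias for each of the three layers). An optimizer built from `parameters()` would therefore never see the layers held in the plain list.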
kit1980 commented on Mar 13, 2024
Sounds like a good idea.
I personally helped people who had this issue.
kit1980 commented on Mar 13, 2024
This seems a bit tricky to implement.
Currently TorchFix doesn't know the types of the objects, so it's hard to find lists of torch.nn.Module objects. pyre and TypeInferenceProvider (https://libcst.readthedocs.io/en/latest/metadata.html#libcst.metadata.TypeInferenceProvider) can probably help here, but it's a separate feature to implement.
ssgosh commented on Mar 13, 2024
Yikes! Perhaps it can be done on a best-effort basis for some commonly used class types, such as Linear, Conv2d and other subclasses of torch.nn.Module as found here: https://pytorch.org/docs/stable/nn.html ? Maybe it can be done only for list comprehensions? I would imagine that it's a common idiom that many people use.
sbrugman commented on Sep 3, 2024
I'll contribute this rule. Got it working locally, just waiting for the open PRs to be reviewed/merged.
There is a real-world example in transformers (impact mitigated by the subsequent add_module calls). Other than that, the violation of this rule is fairly rare in larger projects, but moderately common in smaller repos (10+ examples).
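For readers following the thread, the best-effort idea discussed above (match a fixed set of well-known layer names in list literals and list comprehensions assigned to `self`) can be sketched roughly as follows. This is not the rule contributed to TorchFix and not TorchFix's API: it uses the standard-library `ast` module instead of libcst, and the `KNOWN_LAYERS` set and the `find_unregistered_lists` helper are hypothetical names chosen for illustration.

```python
import ast

# Hypothetical allow-list; a real rule would need a much fuller set of layer names.
KNOWN_LAYERS = {"Linear", "Conv1d", "Conv2d", "LSTM", "Embedding"}


def _call_name(node):
    """Return the last name component of a call, e.g. 'Linear' for torch.nn.Linear(...)."""
    if isinstance(node, ast.Call):
        func = node.func
        if isinstance(func, ast.Attribute):
            return func.attr
        if isinstance(func, ast.Name):
            return func.id
    return None


def _builds_known_layers(value):
    """True if value is a list literal or list comprehension containing a known layer call."""
    if isinstance(value, ast.ListComp):
        elements = [value.elt]
    elif isinstance(value, ast.List):
        elements = value.elts
    else:
        return False
    return any(_call_name(element) in KNOWN_LAYERS for element in elements)


def find_unregistered_lists(source):
    """Return (line, attribute) pairs for `self.<attribute> = [<known layer calls>]`."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Assign):
            continue
        for target in node.targets:
            is_self_attribute = (
                isinstance(target, ast.Attribute)
                and isinstance(target.value, ast.Name)
                and target.value.id == "self"
            )
            if is_self_attribute and _builds_known_layers(node.value):
                findings.append((node.lineno, target.attr))
    return findings


sample = '''
class Net(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.bad = [torch.nn.Linear(4, 4) for _ in range(3)]
        self.good = torch.nn.ModuleList([torch.nn.Linear(4, 4) for _ in range(3)])
'''

findings = find_unregistered_lists(sample)
print(findings)  # prints: [(5, 'bad')]
```

Because it matches on names rather than types, this sketch misses user-defined Module subclasses and lists built with append(). It also does not check that the enclosing class is a torch.nn.Module, nor whether the elements are registered later (for example through add_module, as in the transformers case), so it can produce false positives.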