849 quantum vision transformer #967
base: main
Conversation
Check out this pull request on ReviewNB: see visual diffs and provide feedback on Jupyter Notebooks. Powered by ReviewNB
@@ -0,0 +1,528 @@
{
Show convergence results, etc.
If running is too slow, you can train the model once and load the pretrained weights in the notebook.
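The train-once-then-load suggestion could be sketched in PyTorch as follows. This is a minimal sketch, not code from the notebook: `clf`, `train_fn`, and the checkpoint filename are hypothetical stand-ins.

```python
import os
import torch

CKPT = "qvt_pretrained.pt"  # hypothetical checkpoint filename

def train_or_load(clf, train_fn, ckpt=CKPT):
    """Run the (slow) training only when no checkpoint exists;
    otherwise restore the saved weights into the model."""
    if os.path.exists(ckpt):
        clf.load_state_dict(torch.load(ckpt))
    else:
        train_fn()                          # e.g. the notebook's slow train() loop
        torch.save(clf.state_dict(), ckpt)  # cache weights for future runs
    return clf
```

With this pattern, re-running the notebook skips the expensive API-bound training loop whenever a checkpoint is already present.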
@TomerGoldfriend, the training procedure is very slow because of the API call: it takes more than 1 minute to get the result of one batch iteration.
Also, I get this error when I try to train the Quantum Vision Transformer:
ClassiqAPIError: Call to API failed with code 400: Apologies for the inconvenience. We're currently experiencing an overwhelming surge in user activity, causing our system to be temporarily overloaded. Our team is actively addressing this and working on resolving the issue to provide you with a smoother experience. Please bear with us as we work to accommodate the high interest and requests. Thank you for your patience.
@neogyk OK, I see. Could you please clarify the size of your quantum layer? How many input parameters and weights does it have, and what batch size are you using?
Is it possible to reduce the problem by treating a smaller use case?
@TomerGoldfriend, the quantum layer has 4 qubits. I am using a batch size of 1. The weights have dimensionality 4*4.
@TomerGoldfriend, I can try using a smaller dataset; the problem appears during backpropagation.
- Yes, it's correct
- I will send you the benchmark asap.
@neogyk Any news? Did reducing the problem size make it better?
It takes around '06:26<6437:15:41, 386.24s/it' for a 1-element batch iteration. I never got through a full training epoch.
```
Line #  Hits         Time  Per Hit  % Time  Line Contents
=========================================================
     1                                      def train():
     2     1    1699000.0    2e+06     0.0      clf.train()  # set model in training mode (needed because of dropout)
     3
     4                                          # dataset API gives us pythonic batching
     5     2  117870000.0    6e+07     0.0      for data, label in tqdm.tqdm(train_loader):
     6     2    1276000.0 638000.0     0.0          opt.zero_grad()
     7     2        2e+11    1e+11    30.7          preds = clf(data)
     8     2    3454000.0    2e+06     0.0          loss = torch.nn.functional.nll_loss(preds, label)
     9     2        5e+11    3e+11    69.3          loss.backward()
    10     1      10000.0  10000.0     0.0          loss_history.append(loss)
    11     1    3981000.0    4e+06     0.0          opt.step()
    12                                      return loss_history
```
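The profile shows the forward pass (`preds = clf(data)`) and the backward pass (`loss.backward()`) consuming essentially all of the wall-clock time (30.7% and 69.3%). A hedged sketch for confirming this split without the `line_profiler` overhead, timing each phase of one training step directly; `clf`, `opt`, `data`, and `label` are the thread's (hypothetical, here stand-in) objects:

```python
import time
import torch

def timed_step(clf, opt, data, label):
    """Run one training step and return (loss, forward_seconds, backward_seconds),
    so the cost of the circuit-executing forward pass can be compared with the
    gradient-computing backward pass."""
    opt.zero_grad()
    t0 = time.perf_counter()
    preds = clf(data)                                    # forward pass
    t1 = time.perf_counter()
    loss = torch.nn.functional.nll_loss(preds, label)
    loss.backward()                                      # backward pass
    t2 = time.perf_counter()
    opt.step()
    return loss.item(), t1 - t0, t2 - t1
```

Logging these two numbers per batch would show directly whether reducing the quantum layer's weight count shrinks the dominant backward-pass term.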
@TomerGoldfriend, I am currently looking for ways to optimize the execution.
@neogyk Sorry for my late reply. I am not sure I fully understand the data: how much time, in seconds, does one batch of size 1 take?
Did you try reducing the number of weights (from 16 to 4 or 8)?
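If the gradients are computed via the parameter-shift rule (an assumption; the thread does not state the differentiation method), each backward pass costs roughly two circuit executions per trainable weight, so cutting the weight count from 16 to 4 reduces the dominant cost term directly. A rough, illustrative cost model (the 2-executions-per-weight factor is the assumption here):

```python
def estimated_step_cost(n_weights: int, t_exec: float) -> float:
    """Rough wall-clock time of one training step, assuming parameter-shift
    gradients: 1 circuit execution for the forward pass plus 2 executions
    per weight for the backward pass, each taking t_exec seconds."""
    forward = t_exec
    backward = 2 * n_weights * t_exec
    return forward + backward

# Under this model: 16 weights -> 33 executions per step; 4 weights -> 9.
```

This would explain why the backward pass dominates the profile above and why shrinking the weight count is the most promising lever.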
Quantum Vision Transformer: Paper Implementation
The purpose of this PR is to bring an implementation of the Quantum Vision Transformer to the Classiq community.
The related issue is here