
Refactor/decompose primitives #25

Merged
merged 7 commits into develop on Aug 10, 2023

Conversation

PabloAndresCQ
Collaborator

@PabloAndresCQ PabloAndresCQ commented Aug 1, 2023

Refactor of MPS algorithms so that every call to tensor_svd and tensor_qr is replaced with the higher-level decompose function from cuTensorNet. This means that the following pieces of code are no longer needed (and have hence been removed):

  • No longer need to set a memory handler in the CuTensorNetHandle.
  • No longer need to create/destroy tensor descriptors nor svd_config or svd_info objects.
  • SVD configuration is much simpler and more elegant.
  • Bond IDs are now managed locally, i.e. we do not keep the bond IDs of the network around. This has some advantages:
    • Many lines of code are removed, since we need to keep track of one less thing.
    • Code readability of MPSxGate is greatly improved, since now contract uses the "subscript" notation, which is more human-readable.
    • The MPSxMPO algorithm does keep track of global bond IDs for the MPOs, since I found that's useful here. I don't think the code readability has improved that much here, but it's not worse than before.
  • The Tensor class has been removed since we no longer need to keep track of bond IDs, nor generate tensor descriptors (thanks to the higher-level API from cuTensorNet).
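To make the "subscript" style of decomposition concrete, here is a minimal NumPy sketch of the kind of operation the higher-level decompose primitive performs; this is not the cuTensorNet API itself, and the helper name and subscript convention are hypothetical illustrations:

```python
import numpy as np

def decompose_qr(tensor):
    """QR-based split of a rank-3 tensor T[i, j, k] into factors
    A[i, x, k] and B[x, j] sharing a new bond x, mimicking in plain
    NumPy what a subscript-driven call such as
    decompose("ijk->ixk,xj", T) would compute. Hypothetical sketch."""
    i, j, k = tensor.shape
    # Group the bonds kept by the first factor (i, k) into matrix rows
    mat = tensor.transpose(0, 2, 1).reshape(i * k, j)
    q, r = np.linalg.qr(mat)                   # q: (i*k, x), r: (x, j)
    x = q.shape[1]
    a = q.reshape(i, k, x).transpose(0, 2, 1)  # A[i, x, k]
    b = r                                      # B[x, j]
    return a, b

T = np.random.rand(2, 3, 4)
A, B = decompose_qr(T)
# Contracting back over the shared bond x recovers T exactly
T_rec = np.einsum("ixk,xj->ijk", A, B)
assert np.allclose(T_rec, T)
```

Because the bond labels live entirely in the subscript string, there is no need to keep global bond IDs or tensor descriptors around: each call names its local bonds on the spot.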

Additionally, I have taken the liberty of sneaking in a few related changes:

  • Whenever an SVD is not followed by a truncation, it is replaced with a QR decomposition, which is much cheaper.
  • Added an absolute cutoff to SVD operations, so that singular values that are essentially zero (up to float precision) are automatically removed.
  • Moved the .use() method of cuda.Device from MPS to CuTensorNetHandle, where it more naturally belongs.
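The absolute-cutoff behaviour can be sketched in plain NumPy as follows; the helper name and cutoff value are hypothetical, but the idea matches the bullet above: singular values that are zero up to float precision never make it into the truncated factors:

```python
import numpy as np

def svd_with_abs_cutoff(mat, abs_cutoff=1e-12):
    """SVD that drops singular values below an absolute cutoff,
    analogous in spirit to adding an absolute cutoff to the SVD
    configuration. Hypothetical NumPy sketch."""
    u, s, vh = np.linalg.svd(mat, full_matrices=False)
    keep = s > abs_cutoff
    return u[:, keep], s[keep], vh[keep, :]

# A rank-2 matrix embedded in a 4x4 array: two of its singular
# values are zero up to float precision and are removed automatically.
rng = np.random.default_rng(0)
low_rank = rng.random((4, 2)) @ rng.random((2, 4))
u, s, vh = svd_with_abs_cutoff(low_rank)
assert len(s) == 2
assert np.allclose((u * s) @ vh, low_rank)
```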

Finally, I have also had a look at contract_decompose and split_gate from cuTensorNet.

  • My conclusion is that split_gate is never going to be useful for MPS, since the initial QR decompositions it performs would not decrease the rank of the tensors being SVD'd. The same goes for TreeTN, so it will only become useful when working with more general TN states.
  • On the other hand, contract_decompose is potentially useful since it doesn't just contract and then decompose: it keeps track of the maximum possible dimension of the shared bond and imposes it in the decomposition. However, in the places where I apply QR and SVD I had already taken this into account, so I don't expect any performance improvement. It'd be non-trivial (but not too difficult either) to refactor the code to use contract_decompose and, since it's still in the experimental module, I'm not using it for now, but I'll keep an eye on it.
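The "maximum possible dimension of the shared bond" point can be illustrated with a NumPy sketch of a contract-then-decompose step on two MPS tensors; this is a hypothetical stand-in for contract_decompose, not its actual API:

```python
import numpy as np

def contract_then_decompose(a, b):
    """Contract A[l, p, x] with B[x, q, r] over their shared bond x,
    then QR-split back into A'[l, p, y] and B'[y, q, r]. The new
    bond y is automatically capped at min(l*p, q*r), the maximum
    useful dimension, regardless of how large x was. Hypothetical
    NumPy sketch of a contract-then-decompose primitive."""
    l, p, _ = a.shape
    _, q, r = b.shape
    theta = np.einsum("lpx,xqr->lpqr", a, b).reshape(l * p, q * r)
    qm, rm = np.linalg.qr(theta)      # reduced QR: bond y = min(l*p, q*r)
    a_new = qm.reshape(l, p, -1)
    b_new = rm.reshape(-1, q, r)
    return a_new, b_new

A = np.random.rand(2, 2, 5)  # shared bond x = 5 is larger than needed
B = np.random.rand(5, 2, 2)
A2, B2 = contract_then_decompose(A, B)
assert A2.shape[2] == 4      # capped at min(2*2, 2*2) = 4, not 5
# The contraction of the new factors matches the original one
assert np.allclose(np.einsum("lpy,yqr->lpqr", A2, B2),
                   np.einsum("lpx,xqr->lpqr", A, B))
```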

@PabloAndresCQ PabloAndresCQ requested review from yapolyak and removed request for yapolyak August 1, 2023 10:38
@PabloAndresCQ PabloAndresCQ marked this pull request as draft August 1, 2023 16:25
@PabloAndresCQ
Collaborator Author

While working on the sampling feature for MPS I have realised that I'd prefer not to keep track of all of the bond IDs in the network. This is just more data to keep around and it's actually not that useful, since in the end I'm only ever applying contract and decompose to local parts of the network, and I can assign bond IDs to them on the spot.

I have spent some time today thinking about the implications of this going forward. I think it'd make the implementation of TTN a bit easier and it should work for the belief propagation algorithms as well, so I've decided I'm going for it.

Since this PR already had to deal with Tensor's bond IDs, and since the new way of using contract I'm envisioning was inspired by the way decompose works, I'll just go ahead and make these changes here directly.

@PabloAndresCQ PabloAndresCQ marked this pull request as ready for review August 7, 2023 13:55
Contributor

@yapolyak yapolyak left a comment


Love the simplifications! I wasn't too attentive, especially in MPSxMPO, but it looks good!


memhandle = (malloc, free, "memory_handler")
cutn.set_device_mem_handler(self.handle, memhandle)
dev = cp.cuda.Device()
Contributor

@yapolyak yapolyak Aug 8, 2023


Why do you not need to pass device_id anymore to cp.cuda.Device()?
P.S. I love the simplifications and removal of the boilerplate! Do you think this class is still usable for other purposes, such as intrinsic MPI support for the backend?

Collaborator Author


In line 70, cp.cuda.Device(device_id).use() sets cupy to use device_id, so that any call to cp functionality will use it by default. In particular, cp.cuda.Device() does as well, so you might think that dev == device_id every time. However, the sneaky bit is that device_id may be None. In that case, cp.cuda.Device(device_id).use() is simply saying "use the default device", and dev = cp.cuda.Device() is returning the id of said default device.

Do you think this class is still usable for other purposes such as intrinsic MPI support for the backend?

Yeah, as usable as it was before. We'd just need to keep multiple instances of CuTensorNetHandle, each on a different device. The non-trivial part would still be to refactor the MPS class itself so that we can keep track of which tensors are on which device, and hence know which of these CuTensorNetHandles should be used when applying operations on them. And, of course, decide what to do with message passing when acting between tensors on different devices (possibly just sending one of the tensors to the other device, updating, and then sending it back).

Contributor


Yeah, I am thinking about this more and more often - sounds like we really need it. Yes, I think we can use a common stencil-like approach as the first attempt. I will hope to start looking into this soonish.

@PabloAndresCQ PabloAndresCQ merged commit a3ee4f0 into develop Aug 10, 2023
6 checks passed
@PabloAndresCQ PabloAndresCQ deleted the refactor/decompose_primitives branch August 10, 2023 10:43