Description
It might be worth looking into how much time is spent transferring data to and from the GPU. In general, sending one larger chunk to the GPU is preferable to sending 10-100 smaller chunks, so it is worth verifying that the latter isn't happening here.
This line could be problematic as it copies data to GPU memory from CPU memory. I'm not sure exactly how long each call takes, but for a 256x256 spatial-position dataset there are 256 x 256 x 2 = 131072 transfers back and forth from the GPU. What matters is whether each of these transfers takes 1 second, 0.1 seconds, or 0.01 seconds. I'd hope it is more like the last one, but it is good to check.
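To make the scale concrete, here is a rough back-of-the-envelope budget for those transfers. The per-transfer latencies are the illustrative values from above, not measurements:

```python
# Rough budget for per-pixel host<->device transfers on a 256x256 scan.
# The latencies below are the illustrative guesses from the text above,
# not measured values.
n_positions = 256 * 256          # spatial positions in the dataset
transfers = n_positions * 2      # one copy to the GPU and one back per position

for latency_s in (1.0, 0.1, 0.01):
    total_hours = transfers * latency_s / 3600
    print(f"{latency_s:>5} s/transfer -> {total_hours:,.2f} hours total")
```

Even at 0.01 s per transfer this is on the order of twenty minutes of pure copying, which is why it is worth measuring.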
An easy way to measure this is to time predict_sequence for an image that is already on the GPU, and again for one where the data has to be transferred to/from the GPU, and compare the difference.
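A minimal timing harness for that comparison might look like the following. The two functions here are stand-ins (no real GPU is used): in practice `on_gpu` would call predict_sequence on an array already resident in device memory, and `with_transfer` would include the host-to-device copy as well.

```python
import time

def time_call(fn, *args, repeats=5):
    """Return the best wall-clock time over several runs of fn(*args)."""
    best = float("inf")
    for _ in range(repeats):
        t0 = time.perf_counter()
        fn(*args)
        best = min(best, time.perf_counter() - t0)
    return best

# Hypothetical stand-ins for the real calls; replace with predict_sequence
# on device-resident vs host-resident data.
def on_gpu(x):
    return sum(x)

def with_transfer(x):
    y = list(x)        # stand-in for the host->device copy
    return sum(y)

data = list(range(100_000))
t_compute = time_call(on_gpu, data)
t_total = time_call(with_transfer, data)
print(f"transfer overhead ~ {t_total - t_compute:.4f} s")
```

Taking the best of several repeats reduces noise from other processes; the difference between the two timings is an estimate of the transfer cost per call.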
I know we had this problem in pyxem when using the GPU: things were slowing down from transfer times. The solution is to use something like dask and the map_blocks function to transfer multiple images to the GPU at one time. Maybe you are already doing this and I just didn't see it.
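The idea behind map_blocks-style chunking can be sketched without dask or a GPU. Here `to_device` is a hypothetical stand-in that only counts how many host-to-device copies are issued; chunking many images per copy slashes that count:

```python
# Sketch of the chunking idea behind dask's map_blocks: move many images
# per host->device copy instead of one. `to_device` is a hypothetical
# stand-in that just counts transfers; no real GPU is involved.
transfer_count = 0

def to_device(batch):
    global transfer_count
    transfer_count += 1        # one host->device copy per call
    return batch

def process(images, chunk_size):
    results = []
    for i in range(0, len(images), chunk_size):
        chunk = to_device(images[i:i + chunk_size])
        results.extend(x * 2 for x in chunk)   # stand-in for the model
    return results

images = list(range(1024))
process(images, chunk_size=1)     # one image per transfer
per_image = transfer_count
transfer_count = 0
process(images, chunk_size=256)   # 256 images per transfer
print(per_image, transfer_count)  # 1024 transfers vs 4
```

With real arrays, dask's map_blocks applies a function to each chunk, so choosing larger chunks amortizes the copy overhead the same way.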
This might be relevant:
The other thing you could look into is that applying your model in a for-loop to a small part of your data probably has a large impact on your processing time. If you think of a GPU as a bunch of small CPUs, then you are leaving most of your GPU sitting idle while you process the small patch.
You should look into this. Basically it uses numba and JIT compilation to accelerate batches of images, so you should be able to apply the model to multiple patches much more quickly than with a for loop.
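The per-patch-loop vs batched-call difference can be sketched as follows. `model` here is a hypothetical stand-in that records how many times it is launched; on a real GPU each launch has fixed overhead and a tiny batch leaves most of the device idle:

```python
# Sketch of batching patches: one call over a stacked batch keeps the
# device busy, while a per-patch loop pays the launch overhead per patch.
# `model` is a hypothetical stand-in that counts its invocations.
calls = 0

def model(batch):
    global calls
    calls += 1                       # one "kernel launch" per call
    return [sum(patch) for patch in batch]

patches = [[i, i + 1, i + 2] for i in range(64)]

# Per-patch loop: one launch per patch.
loop_out = [model([p])[0] for p in patches]
loop_calls = calls

# Batched: a single launch over all patches.
calls = 0
batch_out = model(patches)
print(loop_calls, calls)          # 64 launches vs 1
```

The outputs are identical; only the number of launches changes, which is where the speedup comes from.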
We can talk about this more in person if you would like. The second suggestion is probably much easier to implement and potentially offers much larger gains, so I would start there.