Task04 Нина Чекалина ITMO #194

Closed
ninachekalina wants to merge 3 commits


@ninachekalina ninachekalina commented Oct 17, 2024

Local output

Matrix transpose:
Using device #1: GPU. NVIDIA GeForce GTX 1050. Total memory: 4096 Mb
Data generated for M=4096, K=4096
[matrix_transpose_naive]
    GPU: 0.00573333+-0.000442217 s
    GPU: 2926.26 millions/s
[matrix_transpose_local_bad_banks]
    GPU: 0.002+-0 s
    GPU: 8388.61 millions/s
[matrix_transpose_local_good_banks]
    GPU: 0.002+-0 s
    GPU: 8388.61 millions/s
Matrix multiplication:
Using device #1: GPU. NVIDIA GeForce GTX 1050. Total memory: 4096 Mb
Data generated for M=1024, K=1024, N=1024
CPU: 4.443+-0 s
CPU: 0.450146 GFlops
[naive, ts=4]
    GPU: 0.0665+-0.00453689 s
    GPU: 30.0752 GFlops
    Average difference: 0.000196008%
[naive, ts=8]
    GPU: 0.0291667+-0.000372678 s
    GPU: 68.5714 GFlops
    Average difference: 0.000196008%
[naive, ts=16]
    GPU: 0.0195+-0.0005 s
    GPU: 102.564 GFlops
    Average difference: 0.000196008%
[local, ts=4]
    GPU: 0.0518333+-0.000687184 s
    GPU: 38.5852 GFlops
    Average difference: 0.000196008%
[local, ts=8]
    GPU: 0.0156667+-0.000471405 s
    GPU: 127.66 GFlops
    Average difference: 0.000196008%
[local, ts=16]
    GPU: 0.008+-0 s
    GPU: 250 GFlops
    Average difference: 0.000196008%
[local wpt, ts=4, wpt=2]
    GPU: 0.07+-0 s
    GPU: 28.5714 GFlops
    Average difference: 0.000196008%
[local wpt, ts=4, wpt=4]
    GPU: 0.0723333+-0.000471405 s
    GPU: 27.6498 GFlops
    Average difference: 0.000196008%
[local wpt, ts=8, wpt=2]
    GPU: 0.016+-0 s
    GPU: 125 GFlops
    Average difference: 0.000196008%
[local wpt, ts=8, wpt=4]
    GPU: 0.0175+-0.0005 s
    GPU: 114.286 GFlops
    Average difference: 0.000196008%
[local wpt, ts=8, wpt=8]
    GPU: 0.02+-0 s
    GPU: 100 GFlops
    Average difference: 0.000196008%
[local wpt, ts=16, wpt=2]
    GPU: 0.00583333+-0.000372678 s
    GPU: 342.857 GFlops
    Average difference: 0.000196008%
[local wpt, ts=16, wpt=4]
    GPU: 0.00483333+-0.000372678 s
    GPU: 413.793 GFlops
    Average difference: 0.000196008%
[local wpt, ts=16, wpt=8]
    GPU: 0.00416667+-0.000372678 s
    GPU: 480 GFlops
    Average difference: 0.000196008%
[local wpt, ts=16, wpt=16]
    GPU: 0.007+-0 s
    GPU: 285.714 GFlops
    Average difference: 0.000196008%

Git CI output
Matrix transpose:
Using device #0: CPU. AMD EPYC 7763 64-Core Processor                . Intel(R) Corporation. Total memory: 15[9]
Data generated for M=4096, K=4096
[matrix_transpose_naive]
    GPU: 0.0152501+-0.0001[10]
    GPU: 1100.13 millions/s
[matrix_transpose_local_bad_banks]
    GPU: 0.0151669+-2.53016e-05 s
    GPU: 1106.17 millions/s
[matrix_transpose_local_good_banks]
    GPU: 0.0154942+-4.75584e-05 s
    GPU: 1082.81 millions/s
Matrix multiplication:
Using device #0: CPU. AMD EPYC 7763 64-Core Processor                . Intel(R) Corporation. Total memory: 91 Mb
Data generated for M=1024, K=1024, N=1024
CPU: 6.22955+-0 s
CPU: 0.32105 GFlops
[naive, ts=4]
    GPU: 0.303985+-0.0188553 s
    GPU: 6.57928 GFlops
    Average difference: 0.000149043%
[naive, ts=8]
    GPU: 0.25422+-0.00269473 s
    GPU: 7.86721 GFlops
    Average difference: 0.000149043%
[naive, ts=16]
    GPU: 0.2623+-0.00694251 s
    GPU: 7.63029 GFlops
    Average difference: 0.000149043%
[local, ts=4]
    GPU: 0.594601+-0.000920175 s
    GPU: 3.3636 GFlops
    Average difference: 0.000149043%
[local, ts=8]
    GPU: 0.151897+-0.000347045 s
    GPU: 13.1668 GFlops
    Average difference: 0.000149043%
[local, ts=16]
    GPU: 0.0932103+-0.000214138 s
    GPU: 21.4568 GFlops
    Average difference: 0.000149043%
[local wpt, ts=4, wpt=2]
    GPU: 0.514815+-0.000434953 s
    GPU: 3.88489 GFlops
    Average difference: 0.000149043%
[local wpt, ts=4, wpt=4]
    GPU: 0.424239+-0.000809278 s
    GPU: 4.71432 GFlops
    Average difference: 0.000149043%
[local wpt, ts=8, wpt=2]
    GPU: 0.126965+-0.000457022 s
    GPU: 15.7523 GFlops
    Average difference: 0.000149043%
[local wpt, ts=8, wpt=4]
    GPU: 0.255246+-0.000552609 s
    GPU: 7.83557 GFlops
    Average difference: 0.000149043%
[local wpt, ts=8, wpt=8]
    GPU: 0.256738+-0.00107633 s
    GPU: 7.79004 GFlops
    Average difference: 0.000149043%
[local wpt, ts=16, wpt=2]
    GPU: 0.0766763+-8.68863e-05 s
    GPU: 26.0837 GFlops
    Average difference: 0.000149043%
[local wpt, ts=16, wpt=4]
    GPU: 0.07314+-0.000450453 s
    GPU: 27.3443 GFlops
    Average difference: 0.000149043%
[local wpt, ts=16, wpt=8]
    GPU: 0.0811698+-0.0011556 s
    GPU: 24.6397 GFlops
    Average difference: 0.000149043%
[local wpt, ts=16, wpt=16]
    GPU: 0.328982+-0.001054 s
    GPU: 6.07936 GFlops
    Average difference: 0.000149043%
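
The derived figures in both logs can be cross-checked from the reported times. The sketch below is an illustration, not the benchmark's own code: the function names are hypothetical, "millions/s" is assumed to mean M*K elements per second divided by 1e6, and the GFlops figure is assumed to be 2 / time, which is inferred from the printed numbers (for example 2 / 0.008 s = 250) rather than taken from the source.

```python
# Cross-check of the derived metrics printed in the logs.
# Assumption: "millions/s" = (M * K elements) / seconds / 1e6.
# Assumption: GFlops = 2 / seconds. This is inferred from the logged
# numbers only; the benchmark source is not shown in this thread.


def transpose_millions_per_s(m, k, seconds):
    """Throughput of an M x K transpose in millions of elements per second."""
    return m * k / seconds / 1e6


def matmul_gflops(seconds, gflop=2.0):
    """GFlops for one multiplication, assuming `gflop` billion operations."""
    return gflop / seconds


if __name__ == "__main__":
    # Local run, matrix_transpose_local_good_banks: 0.002 s
    print(round(transpose_millions_per_s(4096, 4096, 0.002), 2))
    # Local run, [local, ts=16]: 0.008 s
    print(round(matmul_gflops(0.008), 3))
    # Local run, [naive, ts=4]: 0.0665 s
    print(round(matmul_gflops(0.0665), 4))
```

Under these assumptions the recomputed values agree with the logged ones to the printed precision, so a line whose time and throughput disagree is more likely a copy or extraction error than a measurement effect.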

@simiyutin
Collaborator

Everything looks good, the task is accepted, 8/10 points.

@simiyutin simiyutin closed this Jan 7, 2025