<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<meta name="description"
content="This research explores innovative techniques inspired by quantum computing to compress deep learning models, particularly convolutional neural networks (CNNs).
By reducing model complexity and computational demands, we aim to make AI more efficient and accessible, especially in resource-constrained environments.
Our work, presented at NeurIPS 2024, has the potential to revolutionize AI deployment in industries ranging from healthcare to agriculture.">
<meta name="keywords" content="QIANets, Quantum Pruning, Quantum Tensor Decomposition">
<meta name="viewport" content="width=device-width, initial-scale=1">
<title>QIANets: Quantum-Integrated Adaptive Networks for Reduced Latency and Improved Inference Times in CNN Models</title>
<!-- Global site tag (gtag.js) - Google Analytics -->
<script async src="https://www.googletagmanager.com/gtag/js?id=G-PYVRSFMDRL"></script>
<script>
window.dataLayer = window.dataLayer || [];
function gtag() {
dataLayer.push(arguments);
}
gtag('js', new Date());
gtag('config', 'G-PYVRSFMDRL');
</script>
<link href="https://fonts.googleapis.com/css?family=Google+Sans|Noto+Sans|Castoro"
rel="stylesheet">
<link rel="stylesheet"
href="https://cdn.jsdelivr.net/gh/jpswalsh/academicons@1/css/academicons.min.css">
<link href="bulma.min.css" rel="stylesheet" type="text/css">
<link href="bulma-carousel.min.css" rel="stylesheet" type="text/css">
<link href="bulma-slider.min.css" rel="stylesheet" type="text/css">
<link href="index.css" rel="stylesheet" type="text/css">
<link href="fontawesome.all.min.css" rel="stylesheet" type="text/css">
<script type="text/javascript" src="bulma-slider.min.js"></script>
<script type="text/javascript" src="index.js"></script>
<script type="text/javascript" src="bulma-carousel.min.js"></script>
<script type="text/javascript" src="fontawesome.all.min.js"></script>
<script src="https://ajax.googleapis.com/ajax/libs/jquery/3.5.1/jquery.min.js"></script>
</head>
<body>
<section class="hero">
<div class="hero-body">
<div class="container is-max-desktop">
<div class="columns is-centered">
<div class="column has-text-centered">
<h1 class="title is-1 publication-title">QIANets: Quantum-Integrated Adaptive Networks for Reduced Latency and Improved Inference Times in CNN Models</h1>
<div class="is-size-5 publication-authors">
<span class="author-block">
<a href="https://edwardmagongo.esoko.tz/">Edward Magongo</a><sup>1</sup>,
</span>
<span class="author-block">
<a href="https://www.linkedin.com/in/olivia-f-holmberg">Olivia Holmberg</a><sup>2</sup>,
</span>
<span class="author-block">
<a href="https://www.linkedin.com/in/vanessa-m-a2935930b/">Vanessa Matvei</a><sup>3</sup>,
</span>
<span class="author-block">
<a href="https://www.linkedin.com/in/zhumazhan-balapanov-679a23218">Zhumazhan Balapanov</a><sup>4</sup>
</span>
</div>
<div class="is-size-5 publication-affiliations">
<span class="author-block"><sup>1</sup>Saint Andrews Turi Molo,</span>
<span class="author-block"><sup>2</sup>American School in London,</span>
<span class="author-block"><sup>3</sup>Orizont Theoretical Lyceum,</span>
<span class="author-block"><sup>4</sup>Munich International School</span>
</div>
<div class="column has-text-centered">
<div class="publication-links">
<!-- Paper Link. -->
<span class="link-block">
<a href="https://arxiv.org/pdf/2410.10318v2" class="external-link button is-normal is-rounded is-dark">
<span class="icon">
<i class="fas fa-file-pdf"></i>
</span>
<span>Paper</span>
</a>
</span>
<!-- Quantum Insider Link. -->
<span class="link-block">
<a href="https://thequantuminsider.com/2024/10/30/quantum-inspired-techniques-cut-latency-in-computer-vision-without-sacrificing-accuracy/" class="external-link button is-normal is-rounded is-dark">
<span class="icon">
<i class="fas fa-atom"></i>
</span>
<span>The Quantum Insider</span>
</a>
</span>
<!-- arXiv Link. -->
<span class="link-block">
<a href="https://doi.org/10.48550/arXiv.2410.10318" class="external-link button is-normal is-rounded is-dark">
<span class="icon">
<i class="ai ai-arxiv"></i>
</span>
<span>arXiv</span>
</a>
</span>
<!-- Code Link. -->
<span class="link-block">
<a href="https://github.com/edwardmagongo/Quantum-Inspired-Model-Compression" class="external-link button is-normal is-rounded is-dark">
<span class="icon">
<i class="fab fa-github"></i>
</span>
<span>Code</span>
</a>
</span>
</div>
</div>
</div>
</div>
</div>
</div>
</section>
<section class="section">
<div class="container is-max-desktop">
<!-- Abstract. -->
<div class="columns is-centered has-text-centered">
<div class="column is-four-fifths">
<h2 class="title is-3">Abstract</h2>
<div class="content has-text-justified">
<p>
Convolutional neural networks (CNNs) have made significant advances in computer vision tasks, yet their high inference times and latency often limit real-world applicability. While model compression techniques have gained popularity as solutions,
they often overlook <em>the critical balance between low latency and uncompromised accuracy</em>. By harnessing three quantum-inspired concepts – <strong>quantum-inspired pruning, tensor decomposition, and annealing-based matrix factorization</strong> – we introduce QIANets: a novel approach that redesigns the traditional GoogLeNet, DenseNet, and ResNet-18 model architectures to process more parameters and computations whilst maintaining low inference times. Despite experimental limitations, the method was tested and evaluated, demonstrating reductions in inference times along with effective accuracy preservation.
</p>
</div>
</div>
</div>
<!--/ Abstract. -->
</div>
</section>
<section class="section">
<div class="container is-max-desktop">
<!-- Introduction -->
<div class="columns is-centered has-text-centered">
<div class="column is-four-fifths">
<h2 class="title is-3">Introduction</h2>
<div class="content has-text-justified">
<p>
The field of computer vision (CV) has recently experienced a substantial rise in interest (Su & Crandall, 2021). This surge has driven transformative advancements in deep learning models, particularly those based on the convolutional architecture, such as DenseNet (Huang et al., 2017), GoogLeNet (Szegedy et al., 2015), and ResNet-18 (He et al., 2016). These
methods have significantly optimized neural networks for image processing tasks, achieving state-of-the-art performance across multiple benchmarks (Anumol et al., 2023). However, the increasing computational complexity, memory consumption, and model size – often comprising millions to
billions of parameters – pose substantial challenges for deployment, especially in time-sensitive and
computationally-limited scenarios. The demand for <em>low-latency processing</em> in real-time applications,
such as image processing and automated CV systems, is critical; compact models are needed for
faster responses (Honegger et al., 2014).
</p>
<p>
To address these issues, researchers have explored various optimization techniques to reduce inference
times and latency while maintaining high accuracy. Model compression techniques such as pruning,
quantization, and knowledge distillation have shown promise in enhancing model efficiency (Li
et al., 2023). Yet, these methods often come with trade-offs that can impact model performance,
necessitating a careful balance between energy efficiency and accuracy.
</p>
<p>
In recent years, the principles of quantum computing have emerged as an avenue for accelerating
inference in machine learning (Divya & Dinesh Peter, 2021). Quantum-inspired methods, which
leverage phenomena such as quantum optimization algorithms, strive to maintain model performance
by reducing computational requirements, thereby offering significant speedups for certain tasks
(Pandey et al., 2023). Meanwhile, traditional model compression techniques reduce the size of neural
networks by removing less important weights, <em>sacrificing accuracy for lower latency</em> (Francy &
Singh, 2024). By integrating concepts from quantum mechanics into convolutional neural network
(CNN) models, our approach seeks to address these limitations. We explore the potential of designing
CNNs to balance improved inference times with minimal accuracy loss, creating a novel solution.
</p>
<p>
Within this context, we employ three key quantum-inspired principles: 1. quantum-inspired pruning:
reducing model size by removing unnecessary parameters, guided by quantum approximation algorithms;
2. tensor decomposition: breaking down high-dimensional tensors into smaller components
to reduce computational complexity; and 3. annealing-based matrix factorization: optimizing matrix
factorization by using annealing techniques to find efficient representations of the data.
</p>
<p>
Our work addresses the following research question: How can principles from quantum computing
be used to design and optimize CNNs to reduce latency and improve inference times, while still
maintaining stable accuracies across various models?
</p>
<p>
In this paper, we propose Quantum-Integrated Adaptive Networks (QIANets) – a comprehensive
framework that integrates these quantum computing techniques into the DenseNet, GoogLeNet,
and ResNet-18 architectures. To the best of our knowledge, this is the first attempt to: 1) apply quantum computing-inspired algorithms to these models’ architectures to reduce computational requirements and achieve efficient performance improvements, and 2) specifically target these models.
</p>
<p>
The contributions of this work include:
</p>
<p>
• QIANets: a comprehensive framework that integrates QAOA-inspired pruning, tensor
decomposition and quantum annealing-inspired matrix factorization into three CNNs.
</p>
<p>
• An exploration of the trade-offs between latency, inference time, and accuracy, highlighting
the effects of applying quantum principles to CNN models for real-time optimization.
</p>
</div>
</div>
</div>
</div>
</section>
<section class="section">
<div class="container is-max-desktop"
<!-- Related Works. -->
<div class="columns is-centered has-text-centered">
<div class="column is-four-fifths">
<h2 class="title is-3">Related Works</h2>
<div class="content has-text-justified">
<p>
Our proposed method builds upon the ideas of model compression &
quantum-inspired techniques to improve the inference times of CNNs.
</p>
<!-- Model Compression Technique. -->
<h3 class="title is-4" style="text-align: center;">2.1 Model Compression Techniques</h3>
<div class="content has-text-justified">
<p>
<em>Pruning</em> is one of the most effective ways to accelerate CNNs. Cheng et al. (2018) provided a comprehensive
review of model compression techniques for deep neural networks (DNNs), specifically
focusing on parameter pruning methods: the removal of individual weights based on importance.
This significantly reduces model size, while generally preserving the model’s performance.
</p>
<p>
Despite these advancements, conventional pruning techniques have two main limitations: 1) applying them during the training phase can be costly, and 2) they risk prematurely removing important weights. Hou et al. (2022) proposed a methodology called
CHEX for <em>training-based channel pruning</em> and regrowing of channels throughout the training process.
By employing a column subset selection (CSS) formulation, CHEX allocates and reassigns channels
across layers, allowing for significant model compression without requiring a fully pre-trained model.
</p>
</div>
</div>
<!--/ Model Compression Technique. -->
<!-- Quantum-Inspired Techniques for CNNs -->
<h3 class="title is-4">2.2 Quantum-Inspired Techniques for CNNs</h3>
<div class="content has-text-justified">
<p>
Quantum computing is currently recognized as a potential game-changer for various fields, including
NLP, due to its ability to process complex data more efficiently than classical computers. Shi et al.
(2022) proposed a quantum-inspired architecture for convolutional neural networks (QICNNs), using complex-valued weights to enhance the representational capacity of traditional CNNs. The authors show that their QICNNs achieve higher classification accuracy and faster convergence on benchmark datasets
compared to standard CNNs. In contrast, our methodology prioritizes structural optimization for
greater computational efficiency. We focus on reducing latency and improving inference times by
employing various quantum techniques to decrease computational overhead in the tested CNNs.
</p>
<p>
Hu et al. (2022) set a high standard in the field by addressing the unique challenges of compressing
quantum neural networks (QNNs). Their CompVQC framework leverages an alternating direction
method of multipliers (ADMM) approach, achieving a remarkable reduction in circuit depth by
over 2.5 times with less than 1% accuracy degradation. While their results in QNN compression
are impressive, our research introduces a novel first-attempt technique that applies QAOA-inspired
pruning, tensor decomposition and quantum annealing-inspired matrix factorization to classical
CNNs. Our work can complement their approach, underscoring the potential of integrating quantum concepts into classical networks, and may lead to further improvements in model efficiency.
</p>
</div>
<div class="content has-text-centered"> </div>
<!--/ Quantum-Inspired Techniques for CNNs -->
</div>
</div>
</div>
<!--/ Related Works. -->
</section>
<section class="section">
<div class="container is-max-desktop"
<!-- Methodology -->
<div class="columns is-centered has-text-centered">
<div class="column is-four-fifths">
<h2 class="title is-3">Methodology</h2>
<!-- Quantum-inspired pruning -->
<h3 class="title is-4">3.1 Quantum-inspired pruning</h3>
<div class="content has-text-justified">
<p>
<em>Pruning</em> is a widely applied technique for reducing CNN complexity; early studies demonstrated its effectiveness in optimizing neural networks (LeCun et al., 1989; Hanson and Pratt, 1988; Hassibi et al., 1993). Our method implements a new approach to this optimization through the <em>Quantum Approximate Optimization Algorithm</em> (QAOA). QAOA (Farhi et al., 2022) is designed to address problems in combinatorial optimization, which involve finding the best solution from a finite set of candidates.
</p>
<p>
Similarly, we adapt these principles by framing pruning as a probabilistic optimization problem.
The goal is to identify the most important weights to retain while allowing others to approach zero.
For a neural network layer whose weights are represented as a tensor:
<em>W ∈ R<sup>C<sub>out</sub>×C<sub>in</sub>×H×W</sup></em>, we define the importance of each weight using its absolute value:
<em>I<sub>i,j</sub> = |W<sub>i,j</sub>|</em>.
</p>
<p>
To facilitate decision-making regarding weight retention, we normalize these importance scores with the
softmax function to derive probabilities:
</p>
<div style="text-align: center;">
<p style="font-size: 20px;">
P<sub>i,j</sub> = e<sup>I<sub>i,j</sub></sup> / ∑<sub>k,l</sub> e<sup>I<sub>k,l</sub></sup>
</p><br>
</div>
<p>
These probabilities are then utilized in a quantum-inspired decision-making process,
where weights are selected for pruning based on a threshold λ, influenced by a hyperparameter known as
layer sparsity α:
</p>
<div style="text-align: center;">
<p style="font-size: 20px;">
R<sub>i,j</sub> =
<span style="display: inline-block; text-align: left; font-size: 20px; margin-top: 10px;">
<span style=" font-size: 16px;">1</span>
<span style="font-size: 16px;"> if P<sub>i,j</sub> ≥ λ</span><br>
<span style="font-size: 16px;;">0</span>
<span style="font-size: 16px;"> otherwise</span>
</span>
</p>
</div><br>
<p>
Here, R<sub>i,j</sub> serves as a binary retain mask indicating whether a weight is pruned (set to 0) or retained.
The threshold λ is calibrated to ensure that approximately 100α% of the weights are pruned.
</p>
<p>
When implemented, we adopt an iterative approach for pruning the model across multiple stages.
Each iteration recalculates the retain mask based on updated probabilities derived from the current
weights. To enhance this process, we introduce a neighboring entanglement mechanism: when a
weight is pruned, its adjacent weights in the tensor may also be pruned with a specified probability
P<sub>entangle</sub>. This mechanism simulates quantum entanglement, reflecting the idea that nearby weights
may exhibit correlated behavior and can be pruned collectively. For convolutional layers specifically,
this sequential pruning strategy is executed over several iterations, progressively reducing the number
of parameters in the model while maintaining performance integrity.
</p>
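<p>
To make the procedure concrete, the sketch below implements one pruning iteration in PyTorch under the definitions above. It is a minimal illustration, not our released implementation; the name <code>prune_layer</code> and the entanglement handling (pruning only the immediately adjacent weight in the flattened tensor) are simplifying assumptions.
</p>
<pre><code>import torch

def prune_layer(weight: torch.Tensor, alpha: float, p_entangle: float = 0.1) -> torch.Tensor:
    """One quantum-inspired pruning iteration (sketch of Section 3.1)."""
    importance = weight.abs().flatten()               # I_ij = |W_ij|
    probs = torch.softmax(importance, dim=0)          # P_ij = e^{I_ij} / sum_kl e^{I_kl}
    # Calibrate the threshold lambda so that ~100*alpha % of weights are pruned.
    k = int(alpha * probs.numel())
    lam = torch.kthvalue(probs, k).values if k > 0 else probs.min() - 1
    retain = (probs > lam).float()                    # binary retain mask R_ij
    # Neighboring "entanglement": a pruned weight may also prune its
    # adjacent weight with probability p_entangle (an illustrative choice).
    pruned = retain == 0
    entangled = torch.roll(pruned, shifts=1, dims=0) & (torch.rand_like(probs) &lt; p_entangle)
    retain[entangled] = 0.0
    return (weight.flatten() * retain).view_as(weight)
</code></pre>
<p>
In practice this step would run layer by layer over several iterations, recomputing the probabilities from the updated weights each time.
</p>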
</div>
<!-- Tensor Decomposition -->
<h3 class="title is-4">3.2 Tensor Decomposition</h3>
<div class="content has-text-justified">
<p>
<em>Tensor decomposition</em> further reduces the dimensionality of the weight tensor while preserving essential information for accurate predictions.
This technique is inspired by Quantum Circuit Learning (QCL), where high-dimensional tensors are decomposed into lower-dimensional
forms for efficient training of quantum circuits (Mitarai et al., 2018).
</p>
<p>
For a weight tensor <em>W ∈ R<sup>C<sub>out</sub>×C<sub>in</sub>×H×W</sup></em>, we use
Singular Value Decomposition (SVD) (Wang et al., 2021) on its flattened matrix representation
<em>W<sub>f</sub> ∈ R<sup>C<sub>out</sub>×(C<sub>in</sub>·H·W)</sup></em>, decomposing it as:
</p>
<p style="text-align: center;">
<em>W<sub>f</sub> = UΣV<sup>T</sup></em>
</p>
<img src="Diagram.png" alt=""/>
<div class="content has-text-centered"> </div>
<p> Figure 1: An illustrative diagram showcasing the framework used for Quantum-Inspired Pruning,
Tensor Decomposition, and Quantum Annealing-Inspired Matrix Factorization. </p>
<p>
Here, <em>U ∈ R<sup>C<sub>out</sub>×r</sup></em> and <em>V ∈ R<sup>(C<sub>in</sub>·H·W)×r</sup></em>
are orthogonal matrices, and <em>Σ ∈ R<sup>r×r</sup></em> is a diagonal matrix of singular values.
The rank r, chosen as a hyperparameter, controls the compression level by retaining only the top r singular values,
leading to a reduced rank approximation:
</p>
<p style="text-align: center;">
<em>W<sub>f</sub> ≈ U<sub>r</sub> Σ<sub>r</sub> V<sub>r</sub><sup>T</sup></em>.
</p>
<p>
This reduces the number of parameters and computational costs during inference.
</p>
<p>
After tensor decomposition, the original weight tensor is reconstructed using the truncated matrices,
reshaping the compressed weights back into their original form. This process significantly decreases
the number of parameters without greatly affecting model performance. We apply this method to each convolutional layer,
creating lower-rank approximations to control the model’s capacity.
</p>
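<p>
A minimal PyTorch sketch of this step is shown below, assuming a standard convolutional weight tensor; <code>compress_conv_weight</code> is an illustrative name, not part of our released code.
</p>
<pre><code>import torch

def compress_conv_weight(weight: torch.Tensor, rank: int) -> torch.Tensor:
    """Rank-r SVD approximation of a conv weight (sketch of Section 3.2)."""
    c_out = weight.shape[0]
    w_f = weight.reshape(c_out, -1)                       # W_f in R^{C_out x (C_in*H*W)}
    u, s, vh = torch.linalg.svd(w_f, full_matrices=False)
    r = min(rank, s.numel())                              # keep only the top-r singular values
    w_approx = u[:, :r] @ torch.diag(s[:r]) @ vh[:r, :]   # U_r Σ_r V_r^T
    return w_approx.view_as(weight)                       # reshape back to the original form
</code></pre>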
<!--/ Tensor Decomposition -->
<!-- Quantum annealing-inspired matrix factorization -->
<h3 class="title is-4">3.3 Quantum annealing inspired matrix factorisation</h3>
<div class="content has-text-justified">
<p>
Quantum annealing solves optimization problems by evolving a system toward its lowest energy state (Gherardi & Leporati, 2024).
We apply this concept to factorize weight tensors, treating the factorization as an optimization problem aimed at minimizing
the difference between the original weights and their factorized representation.
</p>
<p>
Given a weight matrix W ∈ R<sup>m×n</sup>, we seek to factor it into two lower-dimensional matrices,
W<sub>1</sub> ∈ R<sup>m×r</sup> and W<sub>2</sub> ∈ R<sup>r×n</sup>, where r is a hyperparameter
that controls the rank:
</p>
<p style="text-align: center;">
W ≈ W<sub>1</sub>W<sub>2</sub>
</p>
<p>
The objective is to minimize the reconstruction error:
</p>
<p style="text-align: center;">
L(W<sub>1</sub>, W<sub>2</sub>) = ∥W − W<sub>1</sub>W<sub>2</sub>∥<sup>2</sup><sub>F</sub>
</p>
<p>
Here, ∥ · ∥<sub>F</sub> denotes the Frobenius norm, measuring the difference between the original and factorized matrices.
We use an iterative optimization procedure inspired by quantum annealing to minimize the loss.
</p>
<p>
The factorization employs gradient-based optimization, initializing W<sub>1</sub> and W<sub>2</sub> randomly. We iteratively
minimize the loss function using an optimizer like LBFGS, suitable for small parameter sets and non-convex landscapes.
This optimization simulates quantum annealing by gradually reducing the step size, ensuring convergence to a local minimum.
Once complete, the compressed weight matrix is defined as:
</p>
<p style="text-align: center;">
W<sub>c</sub> = W<sub>1</sub>W<sub>2</sub>.
</p>
<p>
This compressed matrix replaces the original matrix, reducing model complexity while maintaining performance.
</p>
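<p>
The sketch below illustrates this procedure in PyTorch: LBFGS minimizes the Frobenius reconstruction loss while an outer loop shrinks the learning rate, loosely mimicking an annealing schedule. The schedule itself (20 outer steps, a 0.8 cooling factor) is an illustrative assumption, not our exact configuration.
</p>
<pre><code>import torch

def anneal_factorize(w: torch.Tensor, rank: int, steps: int = 20) -> torch.Tensor:
    """Annealing-style factorization W ~ W1 @ W2 (sketch of Section 3.3)."""
    m, n = w.shape
    w1 = torch.randn(m, rank, requires_grad=True)    # random initialization of W1
    w2 = torch.randn(rank, n, requires_grad=True)    # random initialization of W2
    lr = 1.0
    for _ in range(steps):
        opt = torch.optim.LBFGS([w1, w2], lr=lr, max_iter=10)

        def closure():
            opt.zero_grad()
            loss = torch.norm(w - w1 @ w2, p="fro") ** 2   # ||W - W1 W2||_F^2
            loss.backward()
            return loss

        opt.step(closure)
        lr *= 0.8                                    # "cooling": gradually shrink the step size
    return (w1 @ w2).detach()                        # compressed weight W_c
</code></pre>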
</div>
</div>
</div>
</div>
<!--/ Methodology. -->
</section>
<section class="section">
<div class="container is-max-desktop"
<!-- Experiments and results -->
<div class="columns is-centered has-text-centered">
<div class="column is-four-fifths">
<h2 class="title is-3">Experiments and Results</h2>
<div class="content has-text-justified">
<p>
We applied our method to compress three CNNs: DenseNet, GoogLeNet, and ResNet-18, all on the CIFAR-10 dataset.
These networks were selected for their different design structures and computational demands, such as parameter count,
depth, and layer types, providing a comprehensive assessment of our method’s effectiveness across different models.
We evaluated the models for image classification performance, focusing on metrics such as inference time, speedup ratio, and accuracy.
</p>
<p>
Each experiment involved 1) applying the QIANets framework to the respective model architecture and
2) evaluating the models and comparing them to their baseline counterparts.
<em>The results, including the networks’ changes before and after compression, are shown in Table 1.</em>
</p>
<p style="text-align: center;">
Table 1: Model performance comparison before and after compression (rounded figures)
</p>
<table style="width:100%; text-align:center; border-collapse: collapse;">
<thead>
<tr>
<th style="border: 1px solid black; padding: 8px;">Model</th>
<th style="border: 1px solid black; padding: 8px;">Base Accuracy</th>
<th style="border: 1px solid black; padding: 8px;">Base Latency (T/I)</th>
<th style="border: 1px solid black; padding: 8px;">New Accuracy</th>
<th style="border: 1px solid black; padding: 8px;">New Latency (T/I)</th>
<th style="border: 1px solid black; padding: 8px;">Compression Ratio</th>
</tr>
</thead>
<tbody>
<tr>
<td style="border: 1px solid black; padding: 8px;">GoogLeNet</td>
<td style="border: 1px solid black; padding: 8px;">94%</td>
<td style="border: 1px solid black; padding: 8px;">0.00096</td>
<td style="border: 1px solid black; padding: 8px;">86%</td>
<td style="border: 1px solid black; padding: 8px;">0.00083</td>
<td style="border: 1px solid black; padding: 8px;">1:0.52</td>
</tr>
<tr>
<td style="border: 1px solid black; padding: 8px;">ResNet-18</td>
<td style="border: 1px solid black; padding: 8px;">93%</td>
<td style="border: 1px solid black; padding: 8px;">0.000011</td>
<td style="border: 1px solid black; padding: 8px;">87%</td>
<td style="border: 1px solid black; padding: 8px;">0.00007</td>
<td style="border: 1px solid black; padding: 8px;">1:0.61</td>
</tr>
<tr>
<td style="border: 1px solid black; padding: 8px;">DenseNet</td>
<td style="border: 1px solid black; padding: 8px;">94%</td>
<td style="border: 1px solid black; padding: 8px;">0.000050</td>
<td style="border: 1px solid black; padding: 8px;">88%</td>
<td style="border: 1px solid black; padding: 8px;">0.00042</td>
<td style="border: 1px solid black; padding: 8px;">1:0.56</td>
</tr>
</tbody>
</table>
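<p>
For reference, per-image latency figures of this kind can be obtained by timing forward passes and dividing by the number of images processed. The sketch below is an assumed measurement procedure with explicit CUDA synchronization, not our exact benchmarking code.
</p>
<pre><code>import time
import torch

def avg_inference_time(model: torch.nn.Module, loader, device: str = "cuda") -> float:
    """Average forward-pass time per image (illustrative latency metric)."""
    model.to(device).eval()
    total_time, total_images = 0.0, 0
    with torch.no_grad():
        for images, _ in loader:
            images = images.to(device)
            torch.cuda.synchronize()                 # exclude pending transfer work
            start = time.perf_counter()
            model(images)
            torch.cuda.synchronize()                 # wait for all kernels to finish
            total_time += time.perf_counter() - start
            total_images += images.size(0)
    return total_time / total_images
</code></pre>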
<!-- Experimental setup -->
<h3 class="title is-4" style="text-align: center;">4.1 Experimential setup</h3>
<div class="content has-text-justified">
<p>
The experiments were conducted in PyTorch on cloud infrastructure, using the CIFAR-10 dataset (Krizhevsky & Hinton, 2009).
The CIFAR-10 dataset, which consists of 60,000 32x32 RGB color images in 10 classes, with 6,000 images per class,
was split into training and validation sets with an 80/20 ratio. To meet the input size requirements of the models, images were
resized to 224x224 pixels. Data preprocessing included normalizing pixel values to the range [-1, 1] using the mean and standard
deviation of the dataset. Moreover, data augmentation strategies (random horizontal flipping and random cropping) were applied to
increase data variability and improve the models’ performance on unseen data.
</p>
<p>
All computations were accelerated using CUDA on an NVIDIA A40 GPU via Runpod. Each model underwent training for 50 epochs utilizing
the Adam optimizer, with an initial learning rate of 0.001 and weight decay of 1e-4. Our experiments consumed approximately 1,615,680 TFLOPS-seconds of compute. To ensure consistency across models, batch sizes of 128 were used for training,
while batch sizes of 256 were employed for both evaluation and testing within the dataset.
</p>
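<p>
A sketch of this setup using torchvision is given below. The exact normalization statistics and augmentation parameters are assumptions chosen to match the description above (mean and standard deviation values of 0.5 map pixels from [0, 1] into [-1, 1]).
</p>
<pre><code>import torch
from torchvision import datasets, transforms

# Assumed preprocessing: augment, resize to the models' 224x224 input,
# and normalize into [-1, 1].
train_tf = transforms.Compose([
    transforms.RandomCrop(32, padding=4),
    transforms.RandomHorizontalFlip(),
    transforms.Resize(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=(0.5, 0.5, 0.5), std=(0.5, 0.5, 0.5)),
])

full_train = datasets.CIFAR10("data", train=True, download=True, transform=train_tf)
# 80/20 train/validation split, as described above.
n_train = int(0.8 * len(full_train))
train_set, val_set = torch.utils.data.random_split(
    full_train, [n_train, len(full_train) - n_train])
train_loader = torch.utils.data.DataLoader(train_set, batch_size=128, shuffle=True)
val_loader = torch.utils.data.DataLoader(val_set, batch_size=256)
</code></pre>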
</div>
</div>
<!--/ Experimental setup -->
<!-- Hyperparameter Tuning -->
<h3 class="title is-4">4.2 Hyperparameter tuning</h3>
<div class="content has-text-justified">
<p>
Hyperparameter tuning is performed using Optuna, a hyperparameter optimization framework. Optuna samples combinations of hyperparameters – including batch size, learning rate, and ECA kernel size – and adjusts them from trial to trial. After each trial, the validation accuracy is computed to assess the current configuration, and this feedback steers the search toward progressively better parameter configurations for model performance.
</p>
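<p>
A minimal sketch of this tuning loop is shown below; the search ranges and the <code>train_and_validate</code> helper are hypothetical placeholders for the actual training pipeline.
</p>
<pre><code>import optuna

def train_and_validate(batch_size: int, lr: float, eca_kernel: int) -> float:
    """Hypothetical stand-in: train the compressed model with these
    hyperparameters and return its validation accuracy."""
    return 0.0  # replace with the real training and evaluation loop

def objective(trial: optuna.Trial) -> float:
    # Search space mirroring Section 4.2; the exact ranges are assumptions.
    batch_size = trial.suggest_categorical("batch_size", [64, 128, 256])
    lr = trial.suggest_float("lr", 1e-4, 1e-2, log=True)
    eca_kernel = trial.suggest_int("eca_kernel_size", 3, 9, step=2)
    return train_and_validate(batch_size, lr, eca_kernel)

# Each trial's validation accuracy feeds back into Optuna's sampler,
# steering subsequent trials toward better configurations.
study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=10)
</code></pre>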
</div>
<!--/ Hyperparameter Tuning -->
<!-- Model specific analysis -->
<h3 class="title is-4">4.3 Model specific analysis</h3>
<div class="content has-text-centered">
<p>
To effectively accommodate the unique architectures of each model, we made targeted adjustments to the QIANets method.
These modifications were carefully designed to be minimal, ensuring that all models were trained and evaluated under consistent
and fair conditions throughout the experiments.
</p>
</div>
<!--/ Model specific analysis -->
<!-- GoogleNet -->
<h3 class="title is-4">4.3.1 GoogleNet</h3>
<div class="content has-text-justified">
<p>
GoogLeNet is a convolutional network with nine multi-scale processing Inception modules. Within our experiments,
the QIANets framework targets these modules, pruning their convolutional weights with a layer sparsity of 0.1417 (i.e., 14.17% of weights were pruned), while employing a rank of 4 to efficiently decompose and factorize these weights.
ECA and Multi-Scale Fusion are applied to the <em>outputs</em> of the modules, integrating multi-scale attention into parallel
branches.
</p>
<p>
<strong>Performance Progression Across Epochs:</strong> The training process involved 10 trials of 10 epochs each,
followed by a final trial of 50 epochs. During Trial 0 of hyperparameter optimization, the model began with a validation accuracy of
21.08% at Epoch 1 and quickly progressed to 70.18% by Epoch 10, indicating <em>rapid learning during the first stages of training</em>.
The highest validation accuracy, 80.19%, was reached in Trial 5. Ultimately, the final quantum-inspired GoogLeNet’s test accuracy was 86.65% after fine-tuning, a notable improvement over the earlier 80.19% and close to the baseline accuracy of 94.29%.
</p>
<p>
<strong>Loss Reduction:</strong> Over the 50 epochs, the model’s loss decreased steadily, from 2.5205 in Epoch 1 of Trial 0 to 0.8256 by Epoch 50, effectively minimizing error throughout training. A key outcome of this experiment was the final <strong>13.65%</strong> reduction in inference time, down to 0.000835 seconds per image, which underscores the efficiency of our approach compared to the baseline GoogLeNet’s 0.000967 seconds. <em>See Table 2.</em>
</p>
<p style="text-align: center;">
Table 2: Training and validation metrics for GoogLeNet
</p>
<table style="width:100%; text-align:center; border-collapse: collapse;">
<thead>
<tr>
<th style="border: 1px solid black; padding: 8px;">Metric</th>
<th style="border: 1px solid black; padding: 8px;">Quantum-Inspired GoogleNet</th>
<th style="border: 1px solid black; padding: 8px;">Baseline GoogleNet</th>
</tr>
</thead>
<tbody>
<tr>
<td style="border: 1px solid black; padding: 8px;">Training Loss</td>
<td style="border: 1px solid black; padding: 8px;">0.7786</td>
<td style="border: 1px solid black; padding: 8px;">1.7066 (Epoch 1)</td>
</tr>
<tr>
<td style="border: 1px solid black; padding: 8px;">Validation Accuracy</td>
<td style="border: 1px solid black; padding: 8px;">80.19%</td>
<td style="border: 1px solid black; padding: 8px;">38.52% (Epoch 1)</td>
<tr>
<td style="border: 1px solid black; padding: 8px;">Test Loss</td>
<td style="border: 1px solid black; padding: 8px;">0.5732</td>
<td style="border: 1px solid black; padding: 8px;">0.2557</td>
</tr>
<tr>
<td style="border: 1px solid black; padding: 8px;">Test Accuracy</td>
<td style="border: 1px solid black; padding: 8px;">86.65%</td>
<td style="border: 1px solid black; padding: 8px;">94.29%</td>
</tr>
<tr>
<td style="border: 1px solid black; padding: 8px;">Average Inference Time/Image</td>
<td style="border: 1px solid black; padding: 8px;">0.000835 seconds (13.65% Faster)</td>
<td style="border: 1px solid black; padding: 8px;">0.000967 seconds</td>
</tr>
</tbody>
</table>
</div>
<!--/ GoogleNet -->
<!-- DenseNet -->
<h3 class="title is-4">4.3.2 DenseNet</h3>
<div class="content has-text-justified">
<p>
We experimented on DenseNet – a CNN structured with 12 dense blocks connected layer by layer.
This intricate connectivity requires careful application of QAOA-inspired pruning to ensure that weight removal does not disrupt
the model’s dense connectivity and overall flow of information within the network. To enhance channel-wise interactions, we apply
ECA and Multi-Scale Fusion <em>after</em> the dense blocks, allowing the model to leverage both local and global
feature representations effectively.
</p>
<p>
<strong>Performance Progression Across Epochs:</strong> DenseNet was trained over 10 trials of 10 epochs each,
followed by a final trial of 50 epochs to fine-tune performance. During Trial 0, the model began with a validation accuracy of 9.66%
at Epoch 1 and exhibited minimal improvement by Epoch 10, reaching 10.34%. In contrast, Trial 1 demonstrated significant learning
progression, starting at 27.53% and achieving a remarkable 81.33% by Epoch 10. The highest validation accuracy across all trials
peaked at 86.65% in Trial 1, where the model achieved a layer sparsity of approximately 0.3779 (roughly 38% of the weights were pruned while maintaining performance). After extensive fine-tuning, the final quantum-inspired DenseNet achieved a test accuracy
of 88.52%, a significant improvement from the earlier 86.65%, and approaching the baseline accuracy of 94.05%.
</p>
<p>
<strong>Loss Reduction:</strong> The model demonstrated steady loss reduction throughout the training process, beginning at 2.3028
during Epoch 1 in Trial 0 and decreasing to 0.5606 by Epoch 10 in Trial 1, indicating effective error minimization.
This consistent decline reflects the model’s ability to optimize its parameters and improve performance across trials.
One of the standout results of this experiment was the final reduction in inference time by <strong>15.20%</strong>, dropping to 0.000042 seconds/image, a considerable improvement compared to the baseline DenseNet’s 0.000050 seconds. See Table 3.
</p>
<p style="text-align: center;">
Table 3: Training and validation metrics for DenseNet
</p>
<table style="width:100%; text-align:center; border-collapse: collapse;">
<thead>
<tr>
<th style="border: 1px solid black; padding: 8px;">Metric</th>
<th style="border: 1px solid black; padding: 8px;">Quantum-Inspired DenseNet</th>
<th style="border: 1px solid black; padding: 8px;">Baseline DenseNet</th>
</tr>
</thead>
<tbody>
<tr>
<td style="border: 1px solid black; padding: 8px;">Training Loss</td>
<td style="border: 1px solid black; padding: 8px;">0.5351</td>
<td style="border: 1px solid black; padding: 8px;">2.3027</td>
</tr>
<tr>
<td style="border: 1px solid black; padding: 8px;">Validation Accuracy</td>
<td style="border: 1px solid black; padding: 8px;">81.33%</td>
<td style="border: 1px solid black; padding: 8px;">10.34%</td>
<tr>
<td style="border: 1px solid black; padding: 8px;">Test Loss</td>
<td style="border: 1px solid black; padding: 8px;">0.4712</td>
<td style="border: 1px solid black; padding: 8px;">0.2462</td>
</tr>
<tr>
<td style="border: 1px solid black; padding: 8px;">Test Accuracy</td>
<td style="border: 1px solid black; padding: 8px;">88.52%</td>
<td style="border: 1px solid black; padding: 8px;">94.05%</td>
</tr>
<tr>
<td style="border: 1px solid black; padding: 8px;">Average Inference Time/Image</td>
<td style="border: 1px solid black; padding: 8px;">0.000042 seconds (15.20% faster)</td>
<td style="border: 1px solid black; padding: 8px;">0.000050 seconds</td>
</tr>
</tbody>
</table>
</div>
<!--/ DenseNet -->
<!-- ResNet -->
<h3 class="title is-4">4.3.3 ResNet-18</h3>
<div class="content has-text-justified">
<p>
Lastly, ResNet-18 is a CNN characterized by its unique residual learning framework and shortcut connections that facilitate
training in deep networks. Within our experiments, the QIANets framework targets the residual blocks in the model, reducing less
significant weights and channels, detected by ECA’s straightforward 1D convolution. Feature maps are then combined by
Multi-Scale Fusion and refined, highlighting essential features across different scales.
</p>
<p>
<strong>Performance Progression:</strong> Throughout the trials, there were notable fluctuations in performance.
The highest validation accuracy across all trials peaked at 91.42% in Trial 4, where the model achieved a layer sparsity
of approximately 0.3779 (meaning roughly 38% of the weights were pruned while maintaining performance).
After extensive fine-tuning, the final quantum-inspired ResNet-18 reached a test accuracy of 87.11%, a significant improvement
from the earlier 84.56% (the highest accuracy before the final fine-tuning) and approaching the baseline accuracy of 93.11%.
</p>
<p>
<strong>Loss Reduction:</strong> The loss reduction across trials also followed a clear downward trend. In Trial 3, with a layer
sparsity of 0.4805 and a rank of 10, the validation loss dropped from 1.9847 in the first epoch to 0.6321 by the tenth epoch,
indicating better model convergence. One of the standout results of this experiment was the final reduction in inference time
by <strong>36.4%</strong>, dropping to <strong>0.00007</strong> seconds/image, marking an improvement compared to the baseline ResNet-18’s 0.00011 seconds/image. <em>See Table 4.</em>
</p>
<p style="text-align: center;">
Table 4: Training and validation metrics for ResNet-18
</p>
<table style="width:100%; text-align:center; border-collapse: collapse;">
<thead>
<tr>
<th style="border: 1px solid black; padding: 8px;">Metric</th>
<th style="border: 1px solid black; padding: 8px;">Quantum-Inspired ResNet</th>
<th style="border: 1px solid black; padding: 8px;">Baseline ResNet</th>
</tr>
</thead>
<tbody>
<tr>
<td style="border: 1px solid black; padding: 8px;">Training Loss</td>
<td style="border: 1px solid black; padding: 8px;">0.4078</td>
<td style="border: 1px solid black; padding: 8px;">0.6501</td>
</tr>
<tr>
<td style="border: 1px solid black; padding: 8px;">Validation Accuracy</td>
<td style="border: 1px solid black; padding: 8px;">90.25%</td>
<td style="border: 1px solid black; padding: 8px;">91.30%</td>
<tr>
<td style="border: 1px solid black; padding: 8px;">Test Loss</td>
<td style="border: 1px solid black; padding: 8px;">0.6447</td>
<td style="border: 1px solid black; padding: 8px;">0.3195</td>
</tr>
<tr>
<td style="border: 1px solid black; padding: 8px;">Test Accuracy</td>
<td style="border: 1px solid black; padding: 8px;">87.11%</td>
<td style="border: 1px solid black; padding: 8px;">93.11%</td>
</tr>
<tr>
<td style="border: 1px solid black; padding: 8px;">Average Inference Time/Image</td>
<td style="border: 1px solid black; padding: 8px;">0.00007 seconds (36.4% faster)</td>
<td style="border: 1px solid black; padding: 8px;">0.00011 seconds</td>
</tr>
</tbody>
</table>
</div>
<!--/ ResNet -->
</div>
</div>
</div>
<!--/ Experiments and Results. -->
</section>
<section class="section">
<div class="container is-max-desktop">
<!-- Analysis of QIANets Framework -->
<div class="columns is-centered has-text-centered">
<div class="column is-four-fifths">
<h2 class="title is-3">Analysis of QIANets Framework</h2>
<div class="content has-text-justified">
<p>
The QIANets framework exhibits effective latency reductions across models, achieving compression ratios of ×1.6 for ResNet, ×1.8 for DenseNet, and ×1.9 for GoogLeNet (a ratio of 1:0.52 in Table 1, for example, means the compressed model retains 52% of the original parameters, i.e., roughly ×1.9 compression). These results fall below some CNN compression methods, which have attained ratios as high as ×10 with minimal reduction in accuracy. However, each model experienced steady loss reduction across trials, neared its baseline accuracy after fine-tuning, and demonstrated quicker inference times per image compared to the baseline.
</p>
</div>
</div>
</div>
<!--/ Analysis of QIANets Framework -->
</div>
</section>
<section class="section">
<div class="container is-max-desktop">
<!-- Conclusion. -->
<div class="columns is-centered has-text-centered">
<div class="column is-four-fifths">
<h2 class="title is-3">Conclusion</h2>
<div class="content has-text-justified">
<p>
In this paper, we introduced the QIANets framework and applied it to three prominent convolutional neural networks—DenseNet,
GoogLeNet, and ResNet—with the objective of reducing latency and improving inference times while maintaining minimal accuracy
loss. Our experimental results showed varying degrees of success when applying quantum-inspired techniques, revealing important
insights into the trade-offs between latency and accuracy across different architectures.
</p>
</div>
</div>
</div>
<!--/ Conclusion. -->
</div>
</section>
<section class="section">
<div class="container is-max-desktop">
<!-- Limitations. -->
<div class="columns is-centered has-text-centered">
<div class="column is-four-fifths">
<h2 class="title is-3">Limitations</h2>
<div class="content has-text-justified">
<p>
While our results demonstrate the potential of QIANets and quantum-inspired principles in model compression, they also
reveal several factors that constrain the performance of our approach:
</p>
<ol>
<li>
<strong>Data Constraints:</strong> The evaluation was restricted to the relatively simple CIFAR-10 dataset,
which may not represent the full diversity, complexity, or scalability challenges that larger,
real-world datasets exhibit. Additionally, due to the computational
expense of each run, the approach was only tested on a limited number of trials.
</li>
<li>
<strong>Model Adaptation:</strong> The lack of adaptation across different model architectures may hinder
the QIANets framework’s ability to optimize the balance between latency and accuracy. Strong performance on one architecture does not guarantee similar results on others without model-specific adjustments, which can complicate future adaptations.
</li>
<li>
<strong>Hardware Limitations:</strong> This study does not consider hardware-specific limitations. Our
techniques have not yet been optimized for specialized hardware, such as custom FPGAs or GPUs,
which could potentially reduce latency and improve data throughput.
</li>
<li>
<strong>Scope of Focus:</strong> The narrow focus on CNNs excludes newer, more advanced architectures,
restricting the scope of our framework’s potential. Integrating our concept with transformers and
other attention-based models could significantly expand QIANets’ applicability.
</li>
</ol>
<p>
Future work should address these limitations by conducting more in-depth experiments that assess
both the scalability and practical relevance of the quantum-inspired techniques.
</p>
</div>
</div>
</div>
<!--/ Limitations. -->
</div>
</section>
<section class="section">
<!-- Acknowledgments and Disclosure of Funding. -->
<div class="container is-max-desktop">
<div class="columns is-centered has-text-centered">
<div class="column is-four-fifths">
<h2 class="title is-3">Acknowledgments and Disclosure of Funding</h2>
<div class="content has-text-justified">
<p>
This work was completed through the Algoverse program, and the authors gratefully acknowledge the valuable knowledge and resources provided by the team. We also thank Sean O’Brien and the anonymous reviewers for their insightful feedback.
</p>
</div>
</div>
</div>
<!--/ Acknowledgments and Disclosure of Funding. -->
</div>
</section>
<section class="section">
<div class="container is-max-desktop">
<!-- References -->
<div class="columns is-centered has-text-centered">
<div class="column is-four-fifths">
<h2 class="title is-3">Referances</h2>
<div class="content has-text-justified">
<ol>
<li>Su, N. M., & Crandall, D. J. (2021). The affective growth of computer vision. In <em>Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition</em>, pp. 9291–9300. <a href="https://doi.org/10.1109/CVPR46437.2021.00917">https://doi.org/10.1109/CVPR46437.2021.00917</a></li>
<li>Huang, G., Liu, Z., Van Der Maaten, L., & Weinberger, K. Q. (2017). Densely connected convolutional networks. In <em>Proceedings of the IEEE conference on computer vision and pattern recognition</em>, pp. 4700–4708. <a href="https://doi.org/10.1109/CVPR.2017.243">https://doi.org/10.1109/CVPR.2017.243</a></li>
<li>Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., ... & Rabinovich, A. (2015). Going deeper with convolutions. In <em>Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition</em>, pp. 1–9. <a href="https://doi.org/10.1109/CVPR.2015.7298594">https://doi.org/10.1109/CVPR.2015.7298594</a></li>
<li>He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep residual learning for image recognition. In <em>Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition</em>, pp. 770–778. <a href="https://doi.org/10.1109/CVPR.2016.90">https://doi.org/10.1109/CVPR.2016.90</a></li>
<li>Anumol, C. S. (2023, November). Advancements in CNN Architectures for Computer Vision: A Comprehensive Review. In 2023 <em>Annual International Conference on Emerging Research Areas: International Conference on Intelligent Systems</em> (AICERA/ICIS), pp. 1–7. IEEE. <a href="https://doi.org/10.1109/AICERA/ICIS59538.2023.10420413">https://doi.org/10.1109/AICERA/ICIS59538.2023.10420413</a></li>
<li>Honegger, D., Oleynikova, H., & Pollefeys, M. (2014, September). Real-time and low latency embedded computer vision hardware based on a combination of FPGA and mobile CPU. In <em>2014 IEEE/RSJ International Conference on Intelligent Robots and Systems</em>, pp. 4930–4935. IEEE. <a href="https://doi.org/10.1109/IROS.2014.6943263">https://doi.org/10.1109/IROS.2014.6943263</a></li>
<li>Li, Z., Li, H., & Meng, L. (2023). Model compression for deep neural networks: A survey. <em>Computers</em>, <strong>12</strong>(3), 60. <a href="https://doi.org/10.3390/computers12030060">https://doi.org/10.3390/computers12030060</a></li>
<li>Divya, R., & Peter, J. D. (2021, November). Quantum machine learning: A comprehensive review on optimization of machine learning algorithms. In 2021 <em>Fourth International Conference on Microelectronics, Signals & Systems</em> (ICMSS), pp. 1–6. IEEE. <a href="https://doi.org/10.1109/ICMSS53060.2021.9673630">https://doi.org/10.1109/ICMSS53060.2021.9673630</a></li>
<li>Pandey, S., Basisth, N. J., Sachan, T., Kumari, N., & Pakray, P. (2023). Quantum machine learning for natural language processing application. <em>Physica A: Statistical Mechanics and its Applications</em>, <strong>627</strong>, 129123. <a href="https://doi.org/10.1016/j.physa.2023.129123">https://doi.org/10.1016/j.physa.2023.129123</a></li>
<li>Francy, S., & Singh, R. (2024). Edge AI: Evaluation of Model Compression Techniques for Convolutional Neural Networks. <em>arXiv preprint</em> arXiv:2409.02134. <a href="https://doi.org/10.48550/arXiv.2409.02134">https://doi.org/10.48550/arXiv.2409.02134</a></li>
<li>Cheng, Y., Wang, D., Zhou, P., & Zhang, T. (2018). A survey of model compression and acceleration for deep neural networks. <em>IEEE Signal Processing Magazine</em>, <strong>35</strong>(1), 126–136. <a href="https://doi.org/10.1109/MSP.2017.2765695">https://doi.org/10.1109/MSP.2017.2765695</a></li>
<li>Hou, Z., Qin, M., Sun, F., Ma, X., Yuan, K., Xu, Y., ... & Kung, S. Y. (2022). CHEX: Channel exploration for CNN model compression. In <em>Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition</em>, pp. 12287–12298. <a href="https://doi.org/10.1109/CVPR52688.2022.01197">https://doi.org/10.1109/CVPR52688.2022.01197</a></li>
<li>Han, S., Pool, J., Tran, J., & Dally, W. (2015). Learning both weights and connections for efficient neural network. <em>Advances in Neural Information Processing Systems</em>, <strong>28</strong>. <a href="https://papers.nips.cc/paper/2015/hash/3e15cc52c93b93997f4c62e87d3d0ad7-Abstract.html">NIPS Link</a>, <a href="https://proceedings.neurips.cc/paper_files/paper/2015/file/ae0eb3eed39d2bcef4622b2499a05fe6-Paper.pdf">NeurIPS Paper Files PDF</a></li>
<li>Shi, S., Wang, Z., Cui, G., Wang, S., Shang, R., Li, W., ... & Gu, Y. (2022). Quantum-inspired complex convolutional neural networks. Applied Intelligence, <strong>52</strong>(15), 17912–17921. <a href="https://doi.org/10.1007/s10489-022-03525-0">https://doi.org/10.1007/s10489-022-03525-0</a></li>
<li>Hu, Z., Dong, P., Wang, Z., Lin, Y., Wang, Y., & Jiang, W. (2022, October). Quantum neural network compression. In <em>Proceedings of the 41st IEEE/ACM International Conference on Computer-Aided Design</em>, pp. 1–9. <a href="https://doi.org/10.1145/3508352.3549382">https://doi.org/10.1145/3508352.3549382</a></li>
<li>Tomut, A., Jahromi, S. S., Singh, S., Ishtiaq, F., Muñoz, C., Bajaj, P. S., ... & Orus, R. (2024). CompactifAI: Extreme compression of large language models using quantum-inspired tensor networks. <em>arXiv preprint arXiv:2401.14109</em>. <a href="https://doi.org/10.48550/arXiv.2401.14109">https://doi.org/10.48550/arXiv.2401.14109</a></li>
<li>LeCun, Y., Boser, B., Denker, J. S., Henderson, D., Howard, R. E., Hubbard, W., & Jackel, L. D. (1989). Backpropagation applied to handwritten zip code recognition. <em>Neural Computation</em>, <strong>1</strong>(4), 541–551. <a href="https://doi.org/10.1162/neco.1989.1.4.541">https://doi.org/10.1162/neco.1989.1.4.541</a></li>
<li>Hanson, S., & Pratt, L. (1988). Comparing biases for minimal network construction with backpropagation. <em>Advances in Neural Information Processing Systems</em>, <strong>1</strong>. <a href="https://proceedings.neurips.cc/paper/1988/hash/1c9ac0159c94d8d0cbedc973445af2da-Abstract.html">NIPS</a></li>
<li>Hassibi, B., Stork, D. G., & Wolff, G. J. (1993, March). Optimal brain surgeon and general network pruning. In <em>IEEE International Conference on Neural Networks</em>, pp. 293–299. IEEE. <a href="https://doi.org/10.1109/ICNN.1993.298572">https://doi.org/10.1109/ICNN.1993.298572</a></li>
<li>Farhi, E., Goldstone, J., Gutmann, S., & Zhou, L. (2022). The quantum approximate optimization algorithm and the Sherrington-Kirkpatrick model at infinite size. <em>Quantum</em>, <strong>6</strong>, 759. <a href="https://doi.org/10.22331/q-2022-07-07-759">https://doi.org/10.22331/q-2022-07-07-759</a></li>
<li>Mitarai, K., Negoro, M., Kitagawa, M., & Fujii, K. (2018). Quantum circuit learning. <em>Physical Review A, 98</em>(3), 032309. <a href="https://doi.org/10.1103/PhysRevA.98.032309">https://doi.org/10.1103/PhysRevA.98.032309</a></li>
<li>Wang, X., Song, Z., & Wang, Y. (2021). Variational quantum singular value decomposition. <em>Quantum, 5</em>, 483. <a href="https://quantum-journal.org/papers/q-2021-06-29-483/">https://quantum-journal.org/papers/q-2021-06-29-483/</a></li>
<li>Gherardi, A., & Leporati, A. (2024). An Analysis of Quantum Annealing Algorithms for Solving the Maximum Clique Problem. <em>arXiv preprint arXiv:2406.07587</em>. <a href="https://doi.org/10.48550/arXiv.2406.07587">https://doi.org/10.48550/arXiv.2406.07587</a></li>
<li>Krizhevsky, A., & Hinton, G. (2009). Learning multiple layers of features from tiny images. <a href="https://www.cs.toronto.edu/~kriz/learning-features-2009-TR.pdf">https://www.cs.toronto.edu/~kriz/learning-features-2009-TR.pdf</a></li>
<li>Lu, H., & Zhang, L. (2020). Compressing and Regularizing Deep Neural Networks. O’Reilly Media. <a href="https://www.oreilly.com/content/compressing-and-regularizing-deep-neural-networks/">O’Reilly.</a></li>
</ol>
</div>
</div>
</div>
</div>
<!--/ References. -->
</section>
<footer class="footer">
<div class="container">
<div class="content has-text-centered">
<div class="columns is-centered">
<div class="column is-8">
<div class="content">
<a href="https://github.com/edwardmagongo/QIANets-Website" target="_blank" style="display: inline-flex; align-items: center; margin: 0 10px;">
<img src="Github Logo.jpg" alt="GitHub Logo" style="width: 42px; height: 42px; margin-right: 5px;">
<span>Website Code</span>
</a>
</div>
</div>
</div>
</div>
</div>
</footer>
<div style="text-align: center;">
<p>© 2024 QIANets. All Rights Reserved</p>
</div>
</body>
</html>