# Visualization of Qualitative Data
[book](pdf/book02.pdf){target="_blank"}
[eStat YouTube Channel](https://www.youtube.com/channel/UCw2Rzl9A4rXMcT8ue8GH3IA){target="_blank"}
**CHAPTER OBJECTIVES**
In Section 2.1, we introduce graphs used to visualize qualitative data,
such as a bar graph, a pie chart, a band graph, and a line graph.
In Section 2.2, we discuss visualization of summary data of a single
categorical variable using 『eStat』. Visualization of summary data of a
categorical variable by a group is also discussed.
In Section 2.3, we discuss visualization of raw data of a categorical
variable using 『eStat』. Visualization of raw data of a categorical
variable by a group is also discussed.
:::
## Visualization of Qualitative Data
::: presentation-video-link
[presentation](pdf/0201.pdf){.presentation-link target="_blank"}
[video](https://youtu.be/2GpZCkUi0lM){.video-link target="_blank"}
:::
::: mainTable
Data on the gender of students in a classroom, each of which is either
male or female, are referred to as qualitative data. Data on the marital
status of employees in a company, each of which is either single or
married, are also qualitative data. A bar graph, a pie chart, a band
graph, and a line graph are used to visualize qualitative data. These
graphs are frequently used for exploratory data analysis of qualitative
data.
![](Icon/eStat_icon09_bar.png){.imgIcon}\
A bar chart (or bar graph) is a graph that presents qualitative data
with rectangular bars whose heights (or lengths) are proportional to the
frequencies of their categories. The frequencies of all categories in a
categorical variable can therefore be compared easily by looking at the
heights (or lengths) of the bars. We usually put some space between the
bars to emphasize that they represent distinct categories of a
variable.
The rectangular bars of the bar chart can be plotted either vertically
or horizontally. One axis of the chart shows all categories of a
variable, and the other axis represents the frequency of each category.
If the frequency of each category is represented by the height of a bar
drawn vertically, the graph is called a vertical bar graph. If each bar
is drawn horizontally with a length proportional to the frequency of its
category, the graph is called a horizontal bar graph.
![](Icon/eStat_icon35_VbarSeparated.png){.imgIcon}
![](Icon/eStat_icon36_VbarStacked.png){.imgIcon}
![](Icon/eStat_icon37_VbarRatio.png){.imgIcon}
![](Icon/eStat_icon38_VbarSide.png){.imgIcon}
![](Icon/eStat_icon39_VbarBilateral.png){.imgIcon}\
A bar graph can be drawn after counting the frequencies of all
categories of a variable. If there is another categorical variable, the
frequencies of all categories of the first categorical variable can be
counted for each category of the second categorical variable. For
example, we can count the number of single and married employees
separately for the male and the female categories. We can then draw two
bar graphs of the marital status, one for the male category and one for
the female category, using the same Y-axis scale so that the frequencies
of the two categories can be compared easily. This graph is called a
separated bar graph of the marital status by the gender variable. In
this case, the gender variable is called a group variable and the
marital status is called an analysis variable.
If a variable is analysed using a group variable, there are many
variants of bar graphs that make it easy to compare the categories of
the group variable visually. A stacked bar graph divides each bar, which
represents the frequency of a category of the analysis variable, into
pieces with different colors whose sizes are proportional to the
frequency of each category of the group variable. A ratio bar graph
draws the bars of all categories of the analysis variable with the same
height and divides each bar into pieces with different colors whose
sizes are proportional to the frequencies of the categories of the group
variable. A side-by-side bar graph draws, within each category of the
analysis variable, the bars of all categories of the group variable next
to each other on the same scale for comparison. If there are only two
categories of the group variable, a two-sided bar graph (or bi-lateral
bar graph) can be used, which draws the bars of one category of the
group variable on one side and the bars of the other category in the
opposite direction. The two directions can be either left and right of
the Y-axis or above and below the X-axis.
![](Icon/eStat_icon10_pie.png){.imgIcon}
![](Icon/eStat_icon45_Doughnut.png){.imgIcon}\
A pie chart is a graph that shows the frequencies of all categories of
the analysis variable by dividing a pie (circle) into pieces with
different colors, where the angle of each piece is proportional to the
frequency of its category. We usually draw the pieces clockwise from the
12 o'clock position, starting with the largest category, so that the
ratios can be compared easily. A doughnut chart, which removes a circle
from the center of the pie chart, can also be used.
![](Icon/eStat_icon11_band.png){.imgIcon}\
A band graph is similar to the ratio bar graph: it shows the frequencies
of all categories of the analysis variable by dividing a rectangle into
rectangular pieces with different colors whose sizes are proportional to
the frequencies of the categories. It is also similar to the pie chart.
The pieces can be sorted in descending order of frequency, but 『eStat』
draws the pieces in the order of the category values of the categorical
variable.
![](Icon/eStat_icon12_line.png){.imgIcon}\
A line graph shows the frequencies (or values) of all categories of an
analysis variable in a two-dimensional graph. The X-axis shows the names
of the categories and the Y-axis represents the scale of the frequencies
(or values). Each pair of values, the category name and its frequency,
is marked as a point in a two-dimensional coordinate plane, and adjacent
points are connected with a line. The line graph resembles a vertical
bar graph in which only the top centers of the bars are connected. The
line graph is usually used to visualize time-dependent data in order to
observe its trend over time. For example, the yearly exports of a
country can be visualized using a line graph.
:::
::: mainTableYellow
**Graphs for Qualitative Data**
**Bar chart** (or bar graph) is a graph that shows qualitative data with
rectangular bars whose heights or lengths are proportional to the
frequencies of their categories.
**Pie chart** is a graph that shows the frequencies of all categories of
an analysis variable by dividing a pie (circle) into pieces with
different colors, where the angle of each piece is proportional to the
frequency of its category.
**Band graph** is similar to the ratio bar graph: it shows the
frequencies of all categories of an analysis variable by dividing a
rectangle into rectangular pieces with different colors whose sizes are
proportional to the frequencies of the categories.
**Line graph** shows frequencies (or values) of all categories of an
analysis variable in a two-dimensional graph.
:::
::: mainTable
This chapter discusses how qualitative data are visualized using
『eStat』, treating summary data (Section 2.2) and raw data (Section
2.3) separately and, within each, the case of a single analysis variable
and the case of an analysis variable with a group variable.
:::
:::
:::
:::
## Visualization of Summary Data
::: presentation-video-link
[presentation](pdf/0202.pdf){.presentation-link target="_blank"}
[video](https://youtu.be/1wNOF7ewKdw){.video-link target="_blank"}
:::
In this section visualization of summary data without a group variable
and visualization of summary data with a group variable are discussed.
### Summary Data of Categorical Variable
::: mainTable
Suppose you investigated the gender of students in a class and reported
the result as follows.
'male', 'female', 'male', 'female', 'male', 'male', 'male',
'female', 'female', 'male'\...
This data is called the raw data of the gender variable, which is a
categorical variable.
Suppose you then counted the number of 'male' students and 'female'
students in the above raw data and reported the result as shown in
[Table 2.2.1]{.table-ref}.
Table 2.2.1 Summary data of the gender in a class

  Gender   Students
  -------- ----------
  Male     6
  Female   4
This data is called the summary data of the gender variable.
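
The counting step that turns raw data into summary data can be sketched in a few lines of Python. This is only an illustration outside 『eStat』: the names `raw_gender` and `summarize` are hypothetical, and the list holds the ten values of the raw data shown above.

```python
from collections import Counter

# Raw data of the gender variable, copied from the list in the text.
raw_gender = [
    "male", "female", "male", "female", "male",
    "male", "male", "female", "female", "male",
]


def summarize(values):
    """Count how many times each category occurs in the raw data."""
    return dict(Counter(values))


summary = summarize(raw_gender)

# Print the summary data in a layout similar to Table 2.2.1.
print(f"{'Gender':<8} {'Students':>8}")
for category, count in summary.items():
    print(f"{category.capitalize():<8} {count:>8}")
```

The resulting counts are what a statistical package produces internally when it organizes raw data into summary data.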
As the number of data increases, counting the number of cases in each
category from the raw data of a categorical variable in order to make
the summary data is not an easy task. One of the important functions of
a statistical package is to organize the raw data into the summary data
by counting the number of cases in each category. Because generating the
summary data from the raw data is difficult, governmental institutions
usually provide census statistics to the public in the form of summary
data, such as the population by gender or the population by region.
These summary data can be downloaded from governmental home pages as
Excel files.
An Excel file can be saved as a text file in CSV (comma separated value)
format (refer to \<Figure A.2.6\> in Appendix A), which can be loaded by
『eStat』 for data processing and analysis (refer to Appendix A).
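
To make the CSV format concrete, the following minimal sketch writes the summary data of Table 2.2.1 as comma separated text using Python's standard `csv` module. The header names 'Gender' and 'Students' are taken from the table, the helper name `to_csv_text` is hypothetical, and the text is kept in memory rather than saved to a file.

```python
import csv
import io

# Summary data of Table 2.2.1, with the header as the first row.
rows = [("Gender", "Students"), ("Male", 6), ("Female", 4)]


def to_csv_text(table):
    """Return the table as comma separated text, one row per line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerows(table)
    return buffer.getvalue()


csv_text = to_csv_text(rows)
print(csv_text)

# Reading the text back recovers the same rows (all values as strings).
parsed = list(csv.reader(io.StringIO(csv_text)))
```

Each row of the table becomes one line of text, and the values within a row are separated by commas.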
This section discusses visualization of the summary data of a
categorical variable, which can also be found in elementary, middle, and
high school textbooks and in governmental publications.
:::
::: mainTableGrey
**Example 2.2.1 (Gender Summary Data)**
Enter the summary data of [Table 2.2.1]{.table-ref} into the sheet of 『eStat』 and
save it as a file in CSV format. Using this data, draw a bar graph, a
pie chart, and a band graph with 『eStat』. Analyze the graphs and
prepare a report using MS Word (or any word processor you prefer).
**Answer**
Enter the data of [Table 2.2.1]{.table-ref} into the sheet of 『eStat』 as in
[Figure 2.2.1]{.figure-ref} and set the variable name of V1 to 'Gender' and of V2
to 'Number' using the [Edit Var]{.button-ref} button located above the sheet
(refer to Appendix A.2).
![](Figure/Fig020201.png){.imgFig300200}
::: figText
[Figure 2.2.1]{.figure-ref} Data input in 『eStat』
:::
Click the first variable name 'Gender' and then the second variable
name 'Number'. The selected variables will appear in the 'Selected Var'
box located above the sheet. Alternatively, you can select the variable
'1: Gender' using the 'Analysis Var' combo box and the variable
'2: Number' using the 'By Group' combo box located above the sheet, as
shown in [Figure 2.2.1]{.figure-ref}.
When the variables are selected, a vertical bar graph, which is the
default graph of 『eStat』, is drawn as in [Figure 2.2.2]{.figure-ref}. The height
of each bar (rectangle) is proportional to the frequency of the
corresponding category of the gender variable, so the frequencies of the
male and female categories can be compared easily by looking at the
heights of the bars. The bar graph shows that the number of male
students is larger than the number of female students.
A vertical bar graph, which draws bars up and down as in
[Figure 2.2.2]{.figure-ref}, is widely used, but a horizontal bar graph, which
draws bars from left to right, is often used if there are many
categories. By clicking on the horizontal bar graph icon located above
the Graph Area, a horizontal bar graph as in [Figure 2.2.3]{.figure-ref} will
appear in the Graph Area. By checking the 'Frequency' box located below
the graph, the frequency of each bar will be displayed.
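
The idea that bar length is proportional to frequency can be illustrated with a small text-only sketch in Python. It is not how 『eStat』 draws its graphs; the function name `horizontal_bars` and the `#` symbol are hypothetical choices, and the data are the counts of Table 2.2.1.

```python
# Summary data of Table 2.2.1.
frequencies = {"Male": 6, "Female": 4}


def horizontal_bars(freqs, symbol="#"):
    """Return one text line per category; bar length equals the frequency."""
    width = max(len(name) for name in freqs)
    return [
        f"{name:<{width}} | {symbol * count} {count}"
        for name, count in freqs.items()
    ]


bars = horizontal_bars(frequencies)
for line in bars:
    print(line)
```

The number printed at the end of each bar plays the role of the 'Frequency' option described above.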
<div>
<div>
<input class="qrBtn" onclick="window.open(addrStr[7])" src="QR/EX020201.svg" type="image"/>
</div>
<div>
![](Figure/Fig020202.svg)
::: figText
[Figure 2.2.2]{.figure-ref} Vertical bar graph of the number of male and female
students.
:::
</div>
</div>
![](Figure/Fig020203.svg){.imgFig600540}
::: figText
[Figure 2.2.3]{.figure-ref} Horizontal bar graph of the number of male and female
students.
:::
![](Icon/eStat_icon47_graphSave.png){.imgIcon}\
By clicking the 'Graph Save' icon located above the Graph Area, the
current graph in the Graph Area is saved with the file name
'eStatGraph.png', which is shown at the bottom left corner of the main
screen as in [Figure 2.2.4]{.figure-ref} (refer to Appendix A.4).
![](Figure/Fig020204.png){.imgFig600540}
::: figText
[Figure 2.2.4]{.figure-ref} Graph is saved by clicking the 'Graph Save' icon
:::
The saved graph file is located in the download folder specified in your
computer system. If you save another graph, eStatGraph(1).png will be
created in the download folder. The number in parentheses in the file
name increases each time you save a new graph.
![](Icon/Word.jpg){.imgIcon}\
You can copy this graph file from the download folder and paste it into
MS Word as in [Figure 2.2.5]{.figure-ref}. You can also write comments about the
graph if necessary.
![](Figure/Fig020205.png){.imgFig600540}
::: figText
[Figure 2.2.5]{.figure-ref} Graph file of 『eStat』 copied to MS Word
:::
Click on the pie chart icon
![](Icon/eStat_icon10_pie.png){.imgIcon} to display a pie chart as
in [Figure 2.2.6]{.figure-ref}, and click on the doughnut graph icon
![](Icon/eStat_icon45_Doughnut.png){.imgIcon} to display a
doughnut graph as in [Figure 2.2.7]{.figure-ref}, which is a pie chart with a
small circle cut out of its center. The pie chart shows the frequencies
of male and female students by dividing a pie (circle) into two pieces
with different colors whose angles are proportional to the frequencies
of the categories.
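
The angle of each piece follows directly from the proportion of its category: angle = 360 × frequency / total. A minimal Python sketch for the counts of Table 2.2.1 is given here; the function name `pie_angles` is hypothetical and the calculation is independent of 『eStat』.

```python
# Summary data of Table 2.2.1.
frequencies = {"Male": 6, "Female": 4}


def pie_angles(freqs):
    """Return the central angle in degrees of each category's piece."""
    total = sum(freqs.values())
    return {name: 360 * count / total for name, count in freqs.items()}


angles = pie_angles(frequencies)
for name, angle in angles.items():
    print(f"{name}: {angle:.1f} degrees")
```

With 6 male and 4 female students, the male piece spans 216 degrees and the female piece 144 degrees, which together make up the full circle.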
![](Figure/Fig020206.svg){.imgFig600540}
::: figText
[Figure 2.2.6]{.figure-ref} Pie chart of the number of male and female students.
:::
![](Figure/Fig020207.svg){.imgFig600540}
::: figText
[Figure 2.2.7]{.figure-ref} Doughnut chart of the number of male and female
students.
:::
Click on the band graph icon
![](Icon/eStat_icon11_band.png){.imgIcon} to display a band graph
as in [Figure 2.2.8]{.figure-ref}. A band graph is a variant of the pie chart that
divides a rectangle into rectangular pieces whose sizes are proportional
to the frequencies of the categories. It is named after its shape: a
rectangle made up of multiple pieces looks like a band.
![](Figure/Fig020208.svg){.imgFig600540}
::: figText
[Figure 2.2.8]{.figure-ref} Band graph of the number of male and female students.
:::
:::
::: mainTable
International institutions such as the UN, the OECD, and the EU release
their statistics to the public in the form of summary data, and these
data can be downloaded as an Excel file or a text file in CSV format.
The following example shows how to download a file from the OECD and how
to draw graphs using this file.
:::
::: mainTableGrey
**Example 2.2.2 (Life Expectancy at Birth: Source OECD)**
From the home page of the OECD, https://www.oecd.org, download a data
file of the life expectancy at birth. Copy the country name column and
the 2017 data in the last column into 『eStat』 and save the data as a
file in CSV format. Using this data, draw a vertical bar graph and a
horizontal bar graph in descending order of the life expectancy. Analyze
the graphs.
**Answer**
As of December 2020, the main screen of the OECD website,
https://www.oecd.org, looks like [Figure 2.2.9]{.figure-ref}.
![](Figure/Fig020209.png){.imgFig600400}
::: figText
[Figure 2.2.9]{.figure-ref} OECD home page
:::
Select the menu Topics \> Health; the screen in
[Figure 2.2.10]{.figure-ref} will then appear.
![](Figure/Fig020210.png){.imgFig600400}
::: figText
[Figure 2.2.10]{.figure-ref} OECD 'Topic' \> 'Health' menu
:::
If you click on 'Explore all our data on health', the screen in
[Figure 2.2.11]{.figure-ref} will appear.
![](Figure/Fig020211.png){.imgFig600300}
::: figText
[Figure 2.2.11]{.figure-ref} OECD Statistics for life expectancy at birth
:::
If you click on '\> OECD Health Statistics 2020: Frequently Requested
Data', the Excel file
OECD-Health-Statistics-2020-Frequently-Requested-Data.xls is downloaded.
If you open the Excel file, the menu in [Figure 2.2.12]{.figure-ref} appears.
![](Figure/Fig020212.png){.imgFig600400}
::: figText
[Figure 2.2.12]{.figure-ref} OECD Statistics for life expectancy at birth
:::
If you click on 'Life expectancy at birth, total population' in Health
status (Mortality), an Excel sheet as in [Figure 2.2.13]{.figure-ref} will appear
on the screen.
![](Figure/Fig020213.png){.imgFig600400}
::: figText
[Figure 2.2.13]{.figure-ref} OECD Statistics for life expectancy at birth
:::
The easiest way to make a file in CSV format is to copy the country
names to the first column of the 『eStat』 sheet and the 2017 data,
located in the last column of this Excel file, to the second column of
the 『eStat』 sheet as in [Figure 2.2.14]{.figure-ref}. After you provide the
variable names 'Country' and 'Years' by using [Edit Var]{.button-ref} of
『eStat』, save the data as a file in CSV format, for example,
'EX020202_OECD_LifeExpectancy.csv'.
![](Figure/Fig020214.png){.imgFig300400}
::: figText
[Figure 2.2.14]{.figure-ref} OECD life expectancy at birth in 2017
:::
Another way is to edit the Excel file in [Figure 2.2.13]{.figure-ref} so that it
has only two columns, the country name and the 2017 data, similar to
[Figure 2.2.14]{.figure-ref}, and save it as a file in CSV format. In this case,
the first row should have variable names such as 'Country' and 'Number'
(refer to Appendix A.2). In order to save this file in CSV format,
select the Excel menu 'File' \> 'Save As'; a dialogue box as in
[Figure 2.2.15]{.figure-ref} will then appear. Select the option 'CSV UTF-8', and
the file will be saved in CSV format in the download folder of your
computer. Note that, if you are using a European version of Excel, you
have to change the delimiter from a semicolon ';' to a comma ',' before
you save the file (refer to the Excel options).
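
If a file has already been saved with semicolons, it can also be converted afterwards. The following is a minimal sketch using Python's standard `csv` module, not a feature of 『eStat』 or Excel; the function name `semicolon_to_comma` is hypothetical, and the sample text reuses the values of Table 2.2.1 only for illustration. It changes the delimiter only and does not convert decimal commas inside numbers.

```python
import csv
import io


def semicolon_to_comma(text):
    """Rewrite semicolon-delimited CSV text as comma-delimited CSV text."""
    reader = csv.reader(io.StringIO(text), delimiter=";")
    output = io.StringIO()
    # The writer quotes any field that itself contains a comma.
    writer = csv.writer(output, delimiter=",", lineterminator="\n")
    writer.writerows(reader)
    return output.getvalue()


# Illustrative semicolon-delimited text built from Table 2.2.1.
sample = "Gender;Students\nMale;6\nFemale;4\n"
converted = semicolon_to_comma(sample)
print(converted)
```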
![](Figure/Fig020215.png){.imgFig300200}
::: figText
[Figure 2.2.15]{.figure-ref} OECD
:::
Click the variable names 'Country' and 'Number' on the sheet of
『eStat』; a vertical bar graph of the life expectancy will then appear
as in [Figure 2.2.16]{.figure-ref}. If the characters of the country names are too
small to see, you can enlarge the screen by holding the \[Ctrl\] key and
rolling the mouse wheel up. You can click on the horizontal bar graph
icon located above the Graph Area to draw a horizontal bar graph as in
[Figure 2.2.17]{.figure-ref}.
It is sometimes convenient to compare data using a horizontal bar graph
after sorting. If you check the sorting option 'Descending' located
below the graph, a horizontal bar graph sorted in descending order of
the life expectancy at birth will appear as in [Figure 2.2.17]{.figure-ref}. It is
easy to see that Japan has the longest life expectancy, Switzerland the
second longest, and Latvia the shortest.
![](Figure/Fig020216.svg){.imgFig600540}
::: figText
[Figure 2.2.16]{.figure-ref} Vertical bar graph of OECD life expectancy at birth in
2017
:::
![](Figure/Fig020217.svg){.imgFig600540}
::: figText
[Figure 2.2.17]{.figure-ref} Horizontal bar graph of OECD life expectancy at birth,
2017
:::
:::
::: mainTablePink
<div>
<div>
<input class="qrBtn" onclick="window.open(addrStr[40])" src="QR/PR020201.svg" type="image"/>
</div>
<div>
**Practice 2.2.1** **(Alcohol Expenditure: OECD)**
Draw a bar graph using the following data in 『eStat』 system and
analyze the graph.
Ex ⇨ eBook ⇨ PR020201_OECD_AlcoholExpenditure_2013.csv
</div>
</div>
:::
::: mainTablePink
<div>
<div>
<input class="qrBtn" onclick="window.open(addrStr[41])" src="QR/PR020202.svg" type="image"/>
</div>
<div>
**Practice 2.2.2** **(Obesity Ratio: World)**
Draw a bar graph using the following data in 『eStat』 system and
analyze the graph.
Ex ⇨ eBook ⇨ PR020202_WORLD_ObesityRatio_Age15over\_\_2017
</div>
</div>
:::
### Summary Data of Categorical Variable with Group
::: mainTable
The summary data in [Table 2.2.1]{.table-ref} can be easily extended if you survey
the gender of students in two classes of a school, as in
[Table 2.2.2]{.table-ref}. It is the summary data of the gender variable for two
classes (groups), 5-1 and 5-2. In this case, we usually want to compare
the summary data between the two classes (groups) using graphs, as in
the following example.
Table 2.2.2 Summary data of two classes

  Gender   5-1   5-2
  -------- ----- -----
  Male     16    12
  Female   14    18
:::
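
Before drawing graphs, the two classes of Table 2.2.2 can also be compared numerically by converting the counts into proportions within each class. The following Python sketch shows one way to do this; the names `counts_by_class` and `within_group_shares` are hypothetical, and the sketch does not assert how 『eStat』 arranges the bars in its graphs.

```python
# Summary data of Table 2.2.2, organized by class.
counts_by_class = {
    "5-1": {"Male": 16, "Female": 14},
    "5-2": {"Male": 12, "Female": 18},
}


def within_group_shares(counts_by_group):
    """Return, for each group, the proportion of each category."""
    shares = {}
    for group, counts in counts_by_group.items():
        total = sum(counts.values())
        shares[group] = {cat: n / total for cat, n in counts.items()}
    return shares


shares = within_group_shares(counts_by_class)
for group, group_shares in shares.items():
    parts = ", ".join(f"{cat} {p:.1%}" for cat, p in group_shares.items())
    print(f"Class {group}: {parts}")
```

Both classes have 30 students, so in this data the proportions and the raw counts lead to the same comparison; with classes of different sizes the proportions would be the fairer basis.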
::: mainTableGrey
**Example 2.2.3** **(Gender Summary Data of Two Classes)**
A file of the summary data in [Table 2.2.2]{.table-ref} is saved at the following
location of 『eStat』 system.
Ex ⇨ eBook ⇨ EX020203_Summary_StudentByGender
Using this data, draw a bar graph, a pie chart, and a band graph with
『eStat』.
**Answer**
If you load the data file in 『eStat』, it looks like
[Figure 2.2.18]{.figure-ref}.
<div>
<div>
<input class="qrBtn" onclick="window.open(addrStr[9])" src="QR/EX020203.svg" type="image"/>
</div>
<div>
![](Figure/Fig020218.png){.imgFig300200}
::: figText
[Figure 2.2.18]{.figure-ref} Load file of summary data
:::
</div>
</div>
Click the variable names 'Gender', '5-1', and '5-2' sequentially; the
selected variables will then appear in the 'Selected Var' box located
above the sheet. Alternatively, you can select the variable '1: Gender'
using the 'Analysis Var' combo box and the variables '2: 5-1' and
'3: 5-2' using the 'By Group' combo box located above the sheet.
When the variables are selected, a vertical bar graph
![](Icon/eStat_icon35_VbarSeparated.png){.imgIcon}, which is the
default graph of 『eStat』, is drawn using the number of male and female
students in both classes as in [Figure 2.2.19]{.figure-ref}. A bar graph is drawn
for each class, and the heights of the bars are the frequencies of male
and female students. The two bar graphs have the same Y-axis scale, so
the frequencies of the two classes can be compared easily. This bar
graph is called a separated vertical bar graph for each class. By
clicking the horizontal bar graph icon
![](Icon/eStat_icon40_HbarSeparated.png){.imgIcon}, a separated
horizontal bar graph can be drawn as in [Figure 2.2.20]{.figure-ref}.
![](Figure/Fig020219.svg){.imgFig600540}
::: figText
[Figure 2.2.19]{.figure-ref} Separated vertical bar graph of the gender
distribution by class.
:::
![](Figure/Fig020220.svg){.imgFig600540}
::: figText
[Figure 2.2.20]{.figure-ref} Separated horizontal bar graph of the gender
distribution by class.
:::
For the summary data of two groups, there are many variants of bar
graphs that make it easy to compare the two groups visually. If you
click on the stacked bar icon, either vertical
![](Icon/eStat_icon36_VbarStacked.png){.imgIcon} or horizontal
![](Icon/eStat_icon41_HbarStacked.png){.imgIcon}, a stacked bar
graph is drawn that divides a single bar into pieces with different
colors whose sizes are proportional to the frequencies of male and
female students ([Figure 2.2.21]{.figure-ref} and [Figure 2.2.22]{.figure-ref}).
![](Figure/Fig020221.svg){.imgFig600540}
::: figText
[Figure 2.2.21]{.figure-ref} Stacked vertical bar graph of the gender by class
:::
![](Figure/Fig020222.svg){.imgFig600540}
::: figText
[Figure 2.2.22]{.figure-ref} Stacked horizontal bar graph of the gender by class
:::
If you click on the ratio bar graph icon, either vertical
![](Icon/eStat_icon37_VbarRatio.png){.imgIcon} or horizontal
![](Icon/eStat_icon42_HbarRatio.png){.imgIcon}, a ratio bar graph
is drawn in which bars of the same height are divided into pieces with
different colors whose sizes are proportional to the frequencies of male
and female students ([Figure 2.2.23]{.figure-ref} and [Figure 2.2.24]{.figure-ref}).
![](Figure/Fig020223.svg){.imgFig600540}
::: figText
[Figure 2.2.23]{.figure-ref} Ratio vertical bar graph of the gender by class.
:::
![](Figure/Fig020224.svg){.imgFig600540}
::: figText
[Figure 2.2.24]{.figure-ref} Ratio horizontal bar graph of the gender by class.
:::
If you click on the side-by-side icon, either vertical
![](Icon/eStat_icon38_VbarSide.png){.imgIcon} or horizontal
![](Icon/eStat_icon43_HbarSide.png){.imgIcon}, a side-by-side bar
graph is drawn, which places the bars of the group categories next to
each other for comparison ([Figure 2.2.25]{.figure-ref} and [Figure 2.2.26]{.figure-ref}).
![](Figure/Fig020225.svg){.imgFig600540}
::: figText
[Figure 2.2.25]{.figure-ref} Side-by-side vertical bar graph of the gender by
class.
:::
![](Figure/Fig020226.svg){.imgFig600540}
::: figText
[Figure 2.2.26]{.figure-ref} Side-by-side horizontal bar graph of the gender by
class.
:::
If there are only two categories of the group variable, as in this
example, then by clicking on the bi-lateral bar icon, either vertical
![](Icon/eStat_icon39_VbarBilateral.png){.imgIcon} or horizontal
![](Icon/eStat_icon44_HbarBilateral.png){.imgIcon}, a two-sided
(or bi-lateral) bar graph is drawn, which draws the bars in opposite
directions, either above and below the X-axis ([Figure 2.2.27]{.figure-ref}) or
to the left and right of the Y-axis ([Figure 2.2.28]{.figure-ref}).
![](Figure/Fig020227.svg){.imgFig600540}
::: figText
[Figure 2.2.27]{.figure-ref} Two-sided vertical bar graph of the gender by class.
:::
![](Figure/Fig020228.svg){.imgFig600540}
::: figText
[Figure 2.2.28]{.figure-ref} Bi-lateral horizontal bar graph of the gender by
class.
:::
By clicking on the pie chart icon
![](Icon/eStat_icon10_pie.png){.imgIcon}, pie charts are drawn as
in [Figure 2.2.29]{.figure-ref}, one for each of the classes '5-1' and '5-2'. Each
pie chart shows the frequencies of male and female students by dividing
a pie (circle) into two pieces with different colors whose angles are
proportional to the frequencies of the categories.
By clicking on the band graph icon
![](Icon/eStat_icon11_band.png){.imgIcon}, band graphs are drawn
as in [Figure 2.2.30]{.figure-ref}, one for each of the classes '5-1' and '5-2'.
Each band graph shows the frequencies of male and female students by
dividing a rectangle into two pieces with different colors whose sizes
are proportional to the frequencies of the categories.
![](Figure/Fig020229.svg){.imgFig600540}
::: figText
[Figure 2.2.29]{.figure-ref} Pie charts for gender distribution in two classes.
:::
![](Figure/Fig020230.svg){.imgFig600540}
::: figText
[Figure 2.2.30]{.figure-ref} Band graphs for gender distribution in two classes.
:::
:::
::: mainTableGrey
**Example 2.2.4** **(Male and Female Population by Age Groups)**
The male and female populations of Korea by age group in 2015 are shown
in [Table 2.2.3]{.table-ref}. Using this data, draw a vertical bar graph by age
group and then find appropriate graphs that make the characteristics of
this data easy to analyze.
Table 2.2.3 Male and female populations by age groups in Korea\
(KOSTAT Census 2015, unit 10,000 persons)

  Age Interval   2015 Male   2015 Female
  -------------- ----------- -------------
  00 - 04        115         109
  05 - 09        116         109
  10 - 14        126         116
  15 - 19        166         151
  20 - 24        181         158
  25 - 29        158         145
  30 - 34        158         176
  35 - 39        193         186
  40 - 44        214         207
  45 - 49        215         212
  50 - 54        209         205
  55 - 59        192         194
  60 - 64        134         141
  65 - 69        102         110
  70 - 74        79          97
  75 - 79        55          80
  80 - 84        28          54
  over 85        13          39
**Answer**
<div>
<div>
<input class="qrBtn" onclick="window.open(addrStr[10])" src="QR/EX020204.svg" type="image"/>
</div>
<div>
The data of [Table 2.2.3]{.table-ref} can be loaded from 『eStat』 using the following
address.
::: textLeft
Ex ⇨ eBook ⇨ EX020204_Summary_PopulationByGender.csv
:::
</div>
</div>
Click on the name of the first variable, 'AgeInterval', followed by the
second variable '2015_Male' and the third variable '2015_Female'. As
shown in [Figure 2.2.31]{.figure-ref}, you may instead select the 'AgeInterval'
variable from the 'Analysis Var' box and the '2015_Male' and
'2015_Female' variables sequentially from the 'By Group' box. When
these variables are selected, a separated vertical bar graph
![](Icon/eStat_icon35_VbarSeparated.png){.imgIcon} as shown in
[Figure 2.2.32]{.figure-ref}, which separates the male and female populations with
the same Y-axis scale, will appear in the Graph Area.
![](Figure/Fig020231.png){.imgFig300100}
::: figText
[Figure 2.2.31]{.figure-ref} Variable selection for analysis
:::
![](Figure/Fig020232.svg){.imgFig600540}
::: figText
[Figure 2.2.32]{.figure-ref} Separated vertical bar graph of population by age
group and by gender
:::
Among ten possible bar graphs, a side-by-side bar graph
![](Icon/eStat_icon38_VbarSide.png){.imgIcon} as in [Figure 2.2.33]{.figure-ref}
would be useful, because it compares the male and female populations
within each age interval. A ratio bar graph
![](Icon/eStat_icon42_HbarRatio.png){.imgIcon} as in
[Figure 2.2.34]{.figure-ref}, which directly shows the proportions of male and female
populations in each age interval, can also be useful. In each of the
graphs, you can easily see that the female population exceeds the male
population from the 55 - 59 interval onward, and that the gap widens
with age.
![](Figure/Fig020233.svg){.imgFig600540}
::: figText
[Figure 2.2.33]{.figure-ref} Side-by-side vertical bar graph of population by age
and by gender
:::
![](Figure/Fig020234.svg){.imgFig600540}
::: figText
[Figure 2.2.34]{.figure-ref} Proportional horizontal bar graph of population by age
and by gender
:::
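
The proportions that the ratio bar graph displays can also be checked numerically. The following is a minimal Python sketch, not part of 『eStat』, that recomputes the male and female shares of each age interval from [Table 2.2.3]{.table-ref}. The names `POPULATION_2015` and `female_share` are introduced here only for illustration.

```python
# Values copied from Table 2.2.3 (unit: 10,000 persons).
# Each entry is (age interval, male, female).
POPULATION_2015 = [
    ("00 - 04", 115, 109), ("05 - 09", 116, 109), ("10 - 14", 126, 116),
    ("15 - 19", 166, 151), ("20 - 24", 181, 158), ("25 - 29", 158, 145),
    ("30 - 34", 158, 176), ("35 - 39", 193, 186), ("40 - 44", 214, 207),
    ("45 - 49", 215, 212), ("50 - 54", 209, 205), ("55 - 59", 192, 194),
    ("60 - 64", 134, 141), ("65 - 69", 102, 110), ("70 - 74", 79, 97),
    ("75 - 79", 55, 80), ("80 - 84", 28, 54), ("over 85", 13, 39),
]


def female_share(male, female):
    """Return the female share (in percent) of one age interval."""
    return 100.0 * female / (male + female)


# Female share of every age interval, in table order.
shares = [(age, female_share(m, f)) for age, m, f in POPULATION_2015]

# Age intervals in which women outnumber men.
female_majority = [age for age, share in shares if share > 50.0]

for age, share in shares:
    print(f"{age:8s} male {100 - share:5.1f}%  female {share:5.1f}%")
print("Female majority:", ", ".join(female_majority))
```

With the figures as given in the table, the list of female-majority intervals contains every interval from 55 - 59 upward and, in addition, the 30 - 34 interval.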
A line graph ![](Icon/eStat_icon12_line.png){.imgIcon} as in
[Figure 2.2.35]{.figure-ref} can also be used to see this kind of pattern.
![](Figure/Fig020235.svg){.imgFig600540}
::: figText
[Figure 2.2.35]{.figure-ref} Line graph of population by age and by gender
:::
The overall distribution of the male and female populations by age group
can be observed by using a two-sided (bi-lateral) horizontal bar graph
![](Icon/eStat_icon44_HbarBilateral.png){.imgIcon} as in
[Figure 2.2.36]{.figure-ref}, which is usually called a population pyramid. Currently, Korea
has an age-specific population structure that looks like a jar. In
other words, the population in the age intervals of the 40s and 50s is
larger than the population in the age intervals below 30, which shrinks
gradually toward the younger ages. This structure could cause many
problems for society in the future, such as a decreasing population and
an increasing medical care budget.
![](Figure/Fig020236.svg){.imgFig600540}
::: figText
[Figure 2.2.36]{.figure-ref} Bi-lateral horizontal bar graph of population by age
and by gender
:::
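
To see how a bi-lateral bar graph is laid out, the following minimal Python sketch prints a text version of the population pyramid from [Table 2.2.3]{.table-ref}. It is an illustration only and does not reproduce how 『eStat』 draws the figure. The function name `pyramid_rows` and the bar width of 30 characters are assumptions made for this sketch.

```python
# Values copied from Table 2.2.3 (unit: 10,000 persons).
# Each entry is (age interval, male, female).
POPULATION_2015 = [
    ("00 - 04", 115, 109), ("05 - 09", 116, 109), ("10 - 14", 126, 116),
    ("15 - 19", 166, 151), ("20 - 24", 181, 158), ("25 - 29", 158, 145),
    ("30 - 34", 158, 176), ("35 - 39", 193, 186), ("40 - 44", 214, 207),
    ("45 - 49", 215, 212), ("50 - 54", 209, 205), ("55 - 59", 192, 194),
    ("60 - 64", 134, 141), ("65 - 69", 102, 110), ("70 - 74", 79, 97),
    ("75 - 79", 55, 80), ("80 - 84", 28, 54), ("over 85", 13, 39),
]


def pyramid_rows(data, width=30):
    """Build the text rows of a bi-lateral horizontal bar graph.

    Male bars grow to the left and female bars grow to the right of a
    shared age axis. Both sides use the same scale, so bar lengths are
    directly comparable.
    """
    largest = max(max(male, female) for _, male, female in data)
    rows = []
    # The oldest age interval is placed at the top, as in a pyramid.
    for age, male, female in reversed(data):
        male_bar = "#" * round(width * male / largest)
        female_bar = "#" * round(width * female / largest)
        rows.append(f"{male_bar:>{width}} | {age:^7} | {female_bar}")
    return rows


rows = pyramid_rows(POPULATION_2015)
print(f"{'Male':>30} | {'Age':^7} | Female")
for row in rows:
    print(row)
```

In the printed output the longest bars fall in the middle rows and the bars shorten toward both the top and the bottom, which is the jar shape described above.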
:::
::: mainTablePink
<div>
<div>
<input class="qrBtn" onclick="window.open(addrStr[42])" src="QR/PR020203.svg" type="image"/>
</div>
<div>
**Practice 2.2.3** **(Death rates in Virginia)**
For each of five age groups (50--54, 55--59, 60--64, 65--69, 70--74),
death rates are measured per 1000 population per year in Virginia. They
are cross-classified by population group such as Rural/Male,
Rural/Female, Urban/Male and Urban/Female. These data are saved at the
following location of the 『eStat』 system.
Ex ⇨ eBook ⇨ PR020203_Rdatasets_VADeaths.csv
Draw appropriate graphs to analyze characteristics of the data.
</div>
</div>
:::
::: mainTable
In general, if there are many groups (columns) in the summary data, you
can compare the groups within each category of the analysis variable
using different kinds of graphs. In that case, it is recommended that
you draw several kinds of graphs, because each graph can reveal
different characteristics of the data.

If data are observed over time, they are called a time series, and a line
graph is usually used to observe the trend over time. The X-axis shows
the values of the time variable, equally spaced, and the Y-axis shows
a scale that covers all of the time series values. Each (time, value)
pair is marked as a point on a two-dimensional coordinate plane, and
adjacent points are connected with a line.
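
The construction of a line graph described above can be sketched in a few lines of Python. This is only an illustration of the idea. The function name `line_graph_segments` is hypothetical, and the series values below are invented for the example and are not real data.

```python
def line_graph_segments(times, values):
    """Pair each time with its value and connect adjacent points.

    Returns the list of (time, value) points and the list of line
    segments, where each segment joins a point to the next one in time.
    """
    if len(times) != len(values):
        raise ValueError("times and values must have the same length")
    points = list(zip(times, values))
    segments = list(zip(points, points[1:]))
    return points, segments


# Hypothetical series, invented only for illustration (not real data).
years = [2000, 2005, 2010, 2015]
values = [10.0, 12.5, 11.0, 15.0]

points, segments = line_graph_segments(years, values)
for start, end in segments:
    print(f"line from {start} to {end}")
```

A series of n points always produces n - 1 segments, so the four points here are joined by three lines.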
:::
::: mainTableGrey
**Example 2.2.5** **(OECD Export -- Import by Country)**
The 2017 import and export data of OECD countries are stored at the
following location of the 『eStat』 system.
Ex ⇨ eBook ⇨ EX020205_OECD_ExportImport_2017.csv.
Draw a line graph to find out characteristics of export and import by
country.
**Answer**
<div>
<div>
<input class="qrBtn" onclick="window.open(addrStr[11])" src="QR/EX020205.svg" type="image"/>
</div>
<div>
Retrieve the file from 『eStat』, which will show the data as in
[Figure 2.2.37]{.figure-ref}.
![](Figure/Fig020237.png){.imgFig300400}
::: figText
[Figure 2.2.37]{.figure-ref} Export-Import data of OECD countries
:::
</div>
</div>
Click the line graph icon
![](Icon/eStat_icon12_line.png){.imgIcon}, then click the variable
names 'Country', 'Export' and 'Import' in turn to draw a line graph as in
[Figure 2.2.38]{.figure-ref}.
The graph shows that China and Germany have large trade surpluses,
while the USA has a large trade deficit.
![](Figure/Fig020238.svg){.imgFig600540}
::: figText
[Figure 2.2.38]{.figure-ref} Line graph of Export-Import of OECD countries
:::
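
The surplus or deficit that the line graph reveals is the gap between the export and import lines, that is, export minus import for each country. The following minimal Python sketch computes this balance from CSV text. It assumes a header row with columns named 'Country', 'Export' and 'Import', matching the variable names above. The actual layout of the 『eStat』 file may differ, and the sample rows are placeholders, not values from that file.

```python
import csv
import io


def trade_balances(csv_text):
    """Return (country, export - import) pairs, largest surplus first.

    Assumes columns named 'Country', 'Export' and 'Import'.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    balances = [
        (row["Country"], float(row["Export"]) - float(row["Import"]))
        for row in reader
    ]
    return sorted(balances, key=lambda item: item[1], reverse=True)


# Placeholder rows, invented for illustration only.
# They are NOT values from the eStat file.
sample = """Country,Export,Import
A,120,100
B,80,95
C,50,50
"""

result = trade_balances(sample)
for country, balance in result:
    status = "surplus" if balance > 0 else "deficit" if balance < 0 else "balanced"
    print(f"{country}: {balance:+.1f} ({status})")
```

A positive balance corresponds to a country whose export point lies above its import point in the line graph, and a negative balance to the reverse.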
:::
::: mainTablePink
<div>
<div>
<input class="qrBtn" onclick="window.open(addrStr[43])" src="QR/PR020204.svg" type="image"/>
</div>
<div>
**Practice 2.2.4** **(Income of OECD Countries)**
National incomes of OECD countries in 2000, 2005, 2010 and 2015 are
saved at the following location of the 『eStat』 system.
Ex ⇨ OECD ⇨ PR020204_OECD_NationalIncome_2017.csv.
Draw a line graph of the national incomes for each country.
</div>
</div>
:::
::: mainTablePink
<div>