-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathindex.qmd
519 lines (416 loc) · 29 KB
/
index.qmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
---
title: "Alien Encounters 🛸"
subtitle: "Visualizing Unidentified Flying Object-UFO Sightings"
author: "Dhanyapriya"
format: html
editor: visual
---
## Abstract
The present work delves into the fascinating world of UFO sightings, leveraging a comprehensive dataset featuring details on reported encounters with unidentified flying objects. Sourced from the TidyTuesday project, this dataset compiles UFO sighting reports spanning different times, locations, and attributes. The analysis focuses on uncovering intriguing insights into the patterns and characteristics of UFO sightings. The report commences by offering a concise dataset description, emphasizing its provenance and dimensions.
The project explores the temporal aspect of UFO sightings, tracking trends over the years. An in-depth look into diurnal patterns during particular years adds to the comprehensive examination. Ultimately, this report serves as an entry point into the world of UFO sightings, offering data enthusiasts, researchers, and the general public an opportunity to unravel the mysteries of these unexplained phenomena. Its focus on data-driven analysis and visualization provides a foundation for ongoing investigations into the extraordinary and enigmatic realm of UFO encounters.
## Introduction
Unidentified Flying Objects (UFOs) have long captured our collective imagination, inspiring countless stories, debates, and questions about the unknown. In this era of data-driven exploration, we have a unique opportunity to delve into the world of UFO sightings using a comprehensive dataset. The [UFO sightings Dataset](https://github.com/rfordatascience/tidytuesday/tree/master/data/2023/2023-06-20) was taken from the TidyTuesday project. This dataset compiles a treasure trove of reports, encompassing not just descriptions of UFO encounters, but also their temporal, spatial, and qualitative dimensions. Our journey through this dataset begins with an exploration of its provenance and dimensions, setting the stage for a systematic analysis aimed at unraveling the mysteries of UFO sightings. Through this report, we embark on an intellectual adventure to reveal the temporal and spatial patterns, examine correlations, and open the door to a world where the unexplained meets data-driven discovery.
## Question 1: What are the number of UFO encounters globally?
### Introduction
For the first question, we want to visualize the number of UFO sightings that have been reported in various global locations. The unexplained aerial phenomena are encountered frequently across different region of the world. Many scholars and enthusiasts are interested in these encounters in an effort to find any patterns or possible explanations. Here, we want to explore the geographical patterns and hotspots of UFO activity.
For the second part of the question, we have used circular packing to hierarchically visualize the number of sightings in each country across the continents. Here, the outermost circle shows the continent name and inner circles display the number of sightings recorded in each country. We tried to identify fascinating trends and patterns which will help us in understanding the global mystery surrounding UFO sightings.
### Approach
Initially the dataset was loaded from the "TidyTuesday" source using the `read.csv` function in R. The date and location columns were verified to ensure that they are in the correct data types. The unique values and frequency distributions of key variables, particularly `reported_date_time`, `city`, `state`, and `country_code` were studied. Relevant date and location information from the dataset, which includes `reported_date_time`, `city`, `state`, and `country_code` were extracted. To visualize the global distribution of UFO sightings, a geographical plot was drawn. Bubble map was plotted making use of the `longitude` and `latitude` information of the values `city`, `state`, and `country_code` variables.Bubble maps can simultaneously display both the geographical locations of UFO sightings and how their frequency has changed over time. Each bubble could represent a specific location and the number of sightings in that location.
After arranging the distribution of UFO sightings geographically, we thought it will be interesting to know the frequency of mysterious UFO sightings in each country globally. Initially, we decided to plot geom_treemap for hierarchical visualization of the number of sightings in each city across the continents. However, we employed circular packing to make the story more readable and easier to follow. To get to our final result, the assignment of continents to countries based on their country code is performed. Then, to make it easier for the readers to read and comprehend the story, each country's complete name is thoroughly provided. The count of the UFO sightings for each country determines the occurence rate of the incidents. For creating a circular packing for visualization of UFO sighting data, a new column `patching` is created in the dataset which specifies the structure of the circular plot. It is constructed with the varibales `region`, `country_code`, `full_countryname` & `country_ufo_sighting` from `country_count` dataset. The next step converts the dataset to hierarchical data structure with the help of `as.Node()` function. The plot is generated by using `circlepackeR`, with `size` defining the area of the circles. Suitable color range, height and width of the circle has been defined. The title is added to the plot using HTML formatting.
### Analysis
```{r echo=FALSE, warning=FALSE, error=FALSE, message=FALSE, results='hide'}
if(!require(pacman))
install.packages("pacman")
pacman::p_load(tidyverse,
dplyr,
leaflet)
```
```{r echo=FALSE, warning=FALSE, error=FALSE, message=FALSE}
#> Loading the Dataset
#> UFO_Sightings
ufo <- read.csv("https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2023/2023-06-20/ufo_sightings.csv")
#> Places -
places <- read.csv("https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2023/2023-06-20/places.csv")
#View(ufo)
#View(places)
places_sub <- places |> select(state, country, country_code, latitude, longitude)
places_sub <- places_sub %>% filter(!is.na(latitude) & !is.na(longitude))
places_sub <- places_sub %>%group_by(state) %>%
mutate(State_Count = n()) %>%
ungroup()
#View(places_sub)
map <- leaflet(data = places_sub) %>% addTiles()
map <- map %>% addCircleMarkers(lat = ~latitude, lng = ~longitude, color = "#FFAC33", radius = 5, popup = ~paste("Country_code: ", country_code,"<br>Country: ", country, "<br>State: ", state, "<br>count: ", State_Count))
map
```
```{r fig.width=11,fig.height=6, echo=FALSE, warning=FALSE, error=FALSE, message=FALSE, fig.align='center'}
#Installing and loading the required packages
if(!require(pacman))
install.packages("pacman")
pacman::p_load_gh("jeromefroe/circlepackeR")
pacman::p_load(tidyverse,
dplyr,
plotly,
scales,
hrbrthemes,
data.tree,
htmlwidgets,
htmltools)
#Reading the data
places <- read_csv("https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2023/2023-06-20/places.csv")
#Assigning the countries their respective region
places <- places |>
mutate(
region = case_when(
country_code %in% c("US", "MX", "CA", "CR", "GT", "BZ", "BS", "DO", "HT", "JM", "BM", "SX", "BB", "HN", "LC", "VI", "CU", "PR", "SV", "KN", "KY") ~ "North America",
country_code %in% c("AU", "NZ", "PW", "SB", "GU", "FJ", "TV") ~ "Oceania",
country_code %in% c("IN", "JP", "PK", "ID", "AM", "MY", "PH", "AE", "MV", "TH", "LK", "CN", "KZ", "IQ", "KR", "OM", "JO", "AF", "TW", "IL", "VN", "IR", "BD", "SA", "KW", "HK", "NP", "BN", "SG", "QA", "LA", "BH", "RU", "KH", "LB", "SY", "MM", "UZ", "KG", "TL", "TR", "AZ") ~ "Asia",
country_code %in% c("FR", "GB", "DE", "NO", "CH", "BG", "ES", "LT", "HR", "IE", "NL", "IS", "IT", "FI", "CZ", "EE", "HU", "SE", "PT", "PL", "DK", "CY", "BA", "SK", "RS", "RO", "MT", "BE", "LV", "AL", "AT", "MK", "GR", "BY", "GE", "SI", "LU", "FO", "MD", "UA", "GI", "XK", "ME") ~ "Europe",
country_code %in% c("CO", "BR", "VE", "BO", "AR", "TT", "CL", "PE", "EC", "UY", "PA", "PY", "GY", "SR") ~ "South America",
country_code %in% c("ZA", "MA", "ZW", "AO", "MU", "LS", "NG", "EG", "CM", "UG", "KE", "DZ", "TD", "BW", "ZM", "TZ", "ET", "TN", "SN", "LY", "LR", "GH", "CV", "MW", "SZ") ~ "Africa",
TRUE ~ "Antratica"
)
)
#Naming the country name from country_code
places <- places |>
mutate(
full_countryname = case_when(
country_code %in% c("US") ~ "United States",
country_code %in% c("MX") ~ "Mexico",
country_code %in% c("CA") ~ "Canada",
country_code %in% c("MX") ~ "Costa Rica",
country_code %in% c("GT") ~ "Guatemala",
country_code %in% c("BZ") ~ "Belize",
country_code %in% c("BS") ~ "The Bahamas",
country_code %in% c("DO") ~ "Dominican Republic",
country_code %in% c("HT") ~ "Haiti",
country_code %in% c("JM") ~ "Jamaica",
country_code %in% c("BM") ~ "Bermuda",
country_code %in% c("SX") ~ "Sint Maarten",
country_code %in% c("BB") ~ "Barbados",
country_code %in% c("MX") ~ "Honduras",
country_code %in% c("LC") ~ "St Lucia",
country_code %in% c("VI") ~ "U.S Virgin Islands",
country_code %in% c("CU") ~ "Cuba",
country_code %in% c("PR") ~ "Puerto Rico",
country_code %in% c("SV") ~ "El Salvador",
country_code %in% c("KN") ~ "St Kitts & Nevis",
country_code %in% c("KY") ~ "Cayman Islands",
country_code %in% c("AU") ~ "Australia",
country_code %in% c("NZ") ~ "New Zealand",
country_code %in% c("PW") ~ "Palau",
country_code %in% c("SB") ~ "Solomon Islands",
country_code %in% c("GU") ~ "Guam",
country_code %in% c("FJ") ~ "Fiji",
country_code %in% c("IN") ~ "India",
country_code %in% c("JP") ~ "Japan",
country_code %in% c("PK") ~ "Pakistan",
country_code %in% c("ID") ~ "Indonesia",
country_code %in% c("AM") ~ "Armenia",
country_code %in% c("MY") ~ "Malaysia",
country_code %in% c("PH") ~ "Philippines",
country_code %in% c("AE") ~ "United Arab Emirates",
country_code %in% c("MV") ~ "Maldives",
country_code %in% c("TH") ~ "Thailand",
country_code %in% c("LK") ~ "Sri Lanka",
country_code %in% c("CN") ~ "China",
country_code %in% c("KZ") ~ "Kazakhstan",
country_code %in% c("IQ") ~ "Iraq",
country_code %in% c("KR") ~ "South Korea",
country_code %in% c("OM") ~ "Oman",
country_code %in% c("JO") ~ "Jordan",
country_code %in% c("AF") ~ "Afghanistan",
country_code %in% c("TW") ~ "Taiwan",
country_code %in% c("IL") ~ "Israel",
country_code %in% c("VN") ~ "Vietnam",
country_code %in% c("IR") ~ "Iran",
country_code %in% c("BD") ~ "Bangladesh",
country_code %in% c("SA") ~ "Saudi Arabia",
country_code %in% c("KW") ~ "Kuwait",
country_code %in% c("HK") ~ "Hong Kong",
country_code %in% c("NP") ~ "Nepal",
country_code %in% c("BN") ~ "Brunei",
country_code %in% c("SG") ~ "Singapore",
country_code %in% c("QA") ~ "Qatar",
country_code %in% c("LA") ~ "Laos",
country_code %in% c("BH") ~ "Bahrain",
country_code %in% c("RU") ~ "Russia",
country_code %in% c("KH") ~ "Cambodia",
country_code %in% c("LB") ~ "Lebanon",
country_code %in% c("SY") ~ "Syria",
country_code %in% c("MM") ~ "Myanmar",
country_code %in% c("UZ") ~ "Uzbekistan",
country_code %in% c("KG") ~ "Kyrgyzstan",
country_code %in% c("TL") ~ "Timor-Leste",
country_code %in% c("TR") ~ "Turkey",
country_code %in% c("AZ") ~ "Azerbaijan",
country_code %in% c("FR") ~ "France",
country_code %in% c("GB") ~ "United Kingdom",
country_code %in% c("DE") ~ "Germany",
country_code %in% c("NO") ~ "Norway",
country_code %in% c("CH") ~ "Switzerland",
country_code %in% c("BG") ~ "Bulgaria",
country_code %in% c("ES") ~ "Spain",
country_code %in% c("LT") ~ "Lithuania",
country_code %in% c("HR") ~ "Croatia",
country_code %in% c("IE") ~ "Ireland",
country_code %in% c("NL") ~ "Netherlands",
country_code %in% c("IS") ~ "Iceland",
country_code %in% c("IT") ~ "Italy",
country_code %in% c("FI") ~ "Finland",
country_code %in% c("CZ") ~ "Czechia",
country_code %in% c("EE") ~ "Estonia",
country_code %in% c("HU") ~ "Hungary",
country_code %in% c("SE") ~ "Sweden",
country_code %in% c("PT") ~ "Portugal",
country_code %in% c("PL") ~ "Poland",
country_code %in% c("DK") ~ "Denmark",
country_code %in% c("CY") ~ "Cyprus",
country_code %in% c("BA") ~ "Bosnia and Herzegovina",
country_code %in% c("SK") ~ "Slovakia",
country_code %in% c("RS") ~ "Serbia",
country_code %in% c("RO") ~ "Romania",
country_code %in% c("MT") ~ "Malta",
country_code %in% c("BE") ~ "Belgium",
country_code %in% c("LV") ~ "Latvia",
country_code %in% c("AL") ~ "Albania",
country_code %in% c("AT") ~ "Austria",
country_code %in% c("MK") ~ "North Macedonia",
country_code %in% c("GR") ~ "Greece",
country_code %in% c("BY") ~ "Belarus",
country_code %in% c("GE") ~ "Georgia",
country_code %in% c("SI") ~ "Slovenia",
country_code %in% c("LU") ~ "Luxembourg",
country_code %in% c("FO") ~ "Faroe Islands",
country_code %in% c("MD") ~ "Moldova",
country_code %in% c("UA") ~ "Ukraine",
country_code %in% c("GI") ~ "Gibraltar",
country_code %in% c("XK") ~ "Kosovo",
country_code %in% c("ME") ~ "Montenegro",
country_code %in% c("CO") ~ "Colombia",
country_code %in% c("BR") ~ "Brazil",
country_code %in% c("VE") ~ "Venezuela",
country_code %in% c("BO") ~ "Bolivia",
country_code %in% c("AR") ~ "Argentina",
country_code %in% c("TT") ~ "Trinidad and Tobago",
country_code %in% c("CL") ~ "Chile",
country_code %in% c("PE") ~ "Peru",
country_code %in% c("EC") ~ "Ecuado",
country_code %in% c("UY") ~ "Ecuado",
country_code %in% c("PA") ~ "Panama",
country_code %in% c("PY") ~ "Paraguay",
country_code %in% c("GY") ~ "Guyana",
country_code %in% c("SR") ~ "Suriname",
country_code %in% c("ZA") ~ "South Africa",
country_code %in% c("MA") ~ "Morocco",
country_code %in% c("ZW") ~ "Zimbabwe",
country_code %in% c("AO") ~ "Angola",
country_code %in% c("MU") ~ "Mauritius",
country_code %in% c("LS") ~ "Lesotho",
country_code %in% c("NG") ~ "Nigeria",
country_code %in% c("EG") ~ "Egypt",
country_code %in% c("CM") ~ "Cameroon",
country_code %in% c("UG") ~ "Uganda",
country_code %in% c("KE") ~ "Kenya",
country_code %in% c("DZ") ~ "Algeria",
country_code %in% c("TD") ~ "Chad",
country_code %in% c("BW") ~ "Botswana",
country_code %in% c("ZM") ~ "Zambia",
country_code %in% c("TZ") ~ "Tanzania",
country_code %in% c("ET") ~ "Ethiopia",
country_code %in% c("TN") ~ "Tunisia",
country_code %in% c("SN") ~ "Senegal",
country_code %in% c("LY") ~ "Libya",
country_code %in% c("LR") ~ "Liberia",
country_code %in% c("GH") ~ "Ghana",
country_code %in% c("CV") ~ "Cabo Verde",
country_code %in% c("MW") ~ "Malawi",
country_code %in% c("SZ") ~ "Eswatini",
country_code %in% c("TV") ~ "Tuvalu",
TRUE ~ "Unknow Places"
)
)
#Counting the occurrence of UFO sightings in across the cities in each region
country_count <- places |>
group_by(country_code, region, full_countryname) |>
summarise(country_ufo_sighting = n(), .groups = "drop") |>
ungroup()
#Plotting the Circular Packing
country_count$patching <- paste("world", country_count$region,
country_count$country_code,
paste(country_count$full_countryname,
country_count$country_ufo_sighting,
sep = " = "),
sep = "/")
circle_pack <- as.Node(country_count, pathName = "patching")
circular_plot <- circlepackeR(circle_pack, size = "country_ufo_sighting",
color_min = "#CCCCFF", color_max = "#702963",
width = 1000)
title <- "Occurence of UFO Sightings across the Countries in each Continents"
final_style <- paste("color:", "#C0C0C0", ";",
"font-size", "14px", ";",
"text_aling: centre;",
"font-weight: bold;")
final_plot <- prependContent(circular_plot, tags$h3(title, style = final_style, class = "centered-title"))
final_circle <- div(
style = "text-align: center;",
final_plot)
final_circle
```
### Discussion
The bubble map uses the colored bubbles to represent data points on a geographical map. The bubbles speaks for a specific sighting, and its placement on the map. This helps to pin down where these events have happened. On clicking on the bubble, it provides a detailed information on country_code, country name, state_name and the count of the total number of sightings in that specific state. We considered the latitude and longitude values to spot these locations more accurately. For the second part of the question, we have used circular packing to hierarchically visualize the number of sightings in each cities across the continents. Here, the outermost circle shows the continent name and inner circles displays the number of sightings recorded in each countries. We tried to identify fascianting trends and patterns which will help us in understanding the global mystery surrounding UFO sightings.
The circular packing plot shows a hierarchical structure. The outer circle represents the continents. The inner bubbles inside the continents show country-code for individual countries. Upon interaction, it displays the country name with total number of UFO sightings recorded in that country.This enables viewers to access specific data about sightings in each country, which further benefits in quick and intuitive comparison of UFO sighting frequencies across each regions and countries. The region or cities having larger circles indicate a higher volume of sightings, while smaller circles tell us there are fewer occurrences. This gives a profound understanding of the data's geographical pattern with highest and lowest UFO sighting counts. For instance, here the region North America or the country US has notably larger number of sightings than other places, which aligns with the popular association of North America and US with UFO sightings. Finally, it enables viewers to understand the relative frequency of UFO encounters immediately and makes it easier to conduct thorough investigations at both the regional and national levels.
## Question 2: How did the trend of UFO sightings vary over years?
### Introduction
The fascination with unidentified flying objects (UFOs) has attracted people worldwide for decades. Over the years, countless reports of UFO sightings have been documented and to gain a deeper understanding of UFO sightings and their trends, it is important to investigate how the frequency and patterns of sightings have changed over time. Density plots were used to identify the trend in UFO sightings over the years. Further, a bar plot was used to understand relation of the sightings with reference to the parts of a day.
### Approach
We started by making inferences from the previously generated plots in question 1. A new column was created to extract the year from the `reported_date_time` or `reported_date_time_utc` column. This allowed to group sightings by year. The number of sightings for each year was counted by grouping the data by the extracted year.
To show the trend of UFO sightings over the years, density plots were used. They will be helpful in understanding the temporal distribution of UFO sightings.
A bar plot was used to see at which part of the day the sightings happened. This type of visualization can provide insights into the diurnal patterns of UFO sightings and helps in understanding if there are any specific times of day when sightings are more frequent. Additional aesthetics, such as background image to the plot was added to make the plot visually attractive.
Finally, the resulting plots can be used to analyze how UFO sightings have changed over time, showing the pattern visually.
### Analysis
```{r, fig.width=14,fig.height=14,echo=FALSE,warning=FALSE, error=FALSE, message=FALSE}
#loading packages
if(!require(pacman))
install.packages("pacman")
pacman::p_load(tidyverse,
dplyr,
ggplot2,
gridExtra,
grid)
#loading data
ufo <- read_csv("https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2023/2023-06-20/ufo_sightings.csv")
places <- read_csv("https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2023/2023-06-20/places.csv")
#mutating 'continent' to ufo dataset
ufo <- ufo |>
mutate(
continent = case_when(
country_code %in% c("US", "MX", "CA", "CR", "GT", "BZ", "BS", "DO", "HT",
"JM", "BM", "SX", "BB", "HN", "LC", "VI", "CU", "PR",
"SV", "KN", "KY") ~ "North America",
country_code %in% c("AU", "NZ", "PW", "SB", "GU", "FJ", "TV") ~ "Australia and Oceania",
country_code %in% c("IN", "JP", "PK", "ID", "AM", "MY", "PH", "AE", "MV",
"TH", "LK", "CN", "KZ", "IQ", "KR", "OM", "JO", "AF",
"TW", "IL", "VN", "IR", "BD", "SA", "KW", "HK", "NP",
"BN", "SG", "QA", "LA", "BH", "RU", "KH", "LB", "SY",
"MM", "UZ", "KG", "TL","TR","AZ") ~ "Asia",
country_code %in% c("FR", "GB", "DE", "NO", "CH", "BG", "ES", "LT", "HR",
"IE", "NL", "IS", "IT", "FI", "CZ", "EE", "HU", "SE",
"PT", "PL", "DK", "CY", "BA", "SK", "RS", "RO", "MT",
"BE", "LV", "AL", "AT", "MK", "GR", "BY", "GE", "SI",
"LU", "FO", "MD", "UA", "GI", "XK", "ME") ~ "Europe",
country_code %in% c("CO", "BR", "VE", "BO", "AR", "TT", "CL", "PE", "EC",
"UY", "PA", "PY", "GY", "SR") ~ "South America",
country_code %in% c("ZA", "MA", "ZW", "AO", "MU", "LS", "NG", "EG", "CM",
"UG", "KE", "DZ", "TD", "BW", "ZM", "TZ", "ET", "TN",
"SN", "LY", "LR", "GH", "CV", "MW", "SZ") ~ "Africa",
TRUE ~ "Antratica"
)
)
#extracting year from reported_date_time_utc variable
ufo<-ufo|>
mutate(year=year(as.Date.character(reported_date_time_utc)))
#counting by grouping year and country code
ufo<-ufo|>
group_by(year, country_code)|>
mutate(count=n())
#plotting Africa data
a<-ggplot(subset(ufo,continent=="Africa"),mapping=aes(x=year))+
geom_density(color = "darkgreen",size=1)+
labs(x = "Year",y = "Density",title = "Africa")+
theme(axis.title=element_text(size=15),
title=element_text(size=20))
#plotting North America data
b<-ggplot(subset(ufo,continent=="North America"),mapping=aes(x=year))+
geom_density(color = "darkred",size=1)+
labs(x = "Year",y = "Density",title = "North America")+
theme(axis.title=element_text(size=15),
title=element_text(size=20))
#plotting Australia and Oceania data
c<-ggplot(subset(ufo,continent=="Australia and Oceania"),mapping=aes(x=year))+
geom_density(color = "darkorange",size=1)+
labs(x = "Year",y = "Density",title = "Australia and Oceania")+
theme(axis.title=element_text(size=15),
title=element_text(size=20))
#plotting Asia data
d<-ggplot(subset(ufo,continent=="Asia"),mapping=aes(x=year))+
geom_density(color = "darkblue",size=1)+
labs(x = "Year",y = "Density",title = "Asia")+
theme(axis.title=element_text(size=15),
title=element_text(size=20))
#plotting Europe data
e<-ggplot(subset(ufo,continent=="Europe"),mapping=aes(x=year))+
geom_density(colour = "darkgrey",size=1)+
labs(x = "Year",y = "Density",title = "Europe")+
theme(axis.title=element_text(size=15),
title=element_text(size=20))
#plotting South America data
f<-ggplot(subset(ufo,continent=="South America"),mapping=aes(x=year))+
geom_density(color = "darkviolet",size=1)+
labs(x = "Year",y = "Density",title = "South America")+
theme(axis.title=element_text(size=15),
title=element_text(size=20))
#arranging all the plots in one frame
grid.arrange(a,b,c,d,e,f,nrow=4,ncol=2,
top = textGrob("Variation in UFO sightings over the years\n",gp=gpar(fontsize=40,face = "bold")))
```
```{r, fig.width=11,fig.height=6, echo=FALSE, warning=FALSE, error=FALSE, message=FALSE}
# Libraries required
library(ggplot2)
library(hrbrthemes)
library(tidyverse)
library(ggpubr)
library(jpeg)
library(png)
# Loading data
data <- read.csv("https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2023/2023-06-20/ufo_sightings.csv")
# N=Removing NAs from the data
noNAData <- na.omit(data)
# Extracting the "day_partcolumn from the dataset"
dp <- noNAData |>
count(day_part)
# Creating new columns to generate a new dataframe
c1 <- dp[,1]
c2 <- dp[,2]
# Creating a new data-frame with the day-parts and the number of sightings
new <- data.frame(c1,c2)
# Importing the background image
img <- readPNG("images/img7.png")
# Customizing y-axis ticks
y_ticks <- c("Civil dawn", "Nautical dawn", "Astronomical dawn", "Civil dusk","Morning","Nautical dusk", "Astronomical dusk", "Afternoon", "Night")
# Generating the plot
ggplot(new, aes(x=c2,y = reorder(c1,c2), fill = c1)) +
background_image(img)+
geom_bar(stat = "identity",color="white") +
theme_minimal()+
scale_fill_grey() +
scale_y_discrete(labels= y_ticks)+
ggtitle("Sighting UFOs at distinct day-parts\n") +
theme(plot.title = element_text(hjust = 0.5, face = "bold", size = 20))+
labs(x = "\nNo. of sightings",
y = "Parts of the Day\n")+
theme(axis.title.x = element_text(size = 14, face = "bold", hjust = 0.5),
axis.title.y = element_text(size = 14, face = "bold", hjust = 0.5),
axis.text.x = element_text(face="bold", size= 11),
axis.text.y = element_text(face="bold", size= 11),
legend.position="none",
plot.margin = margin(t = 1, # Top margin
r = 4, # Right margin
b = 1, # Bottom margin
l = 1, # Left margin
unit = "lines"))+
geom_text(x = 50000, # Set the position of the text to always be at '50000'
hjust = 0,
size = 4,
label = c2)+
coord_cartesian(clip = "off")
```
### Discussion
The density plots provide a visual representation of how UFO sightings have varied over the years. They allow us to observe patterns and trends of UFO sightings across the globe. Analyzing these trends help in understanding the factors that influence UFO sightings and their reporting across the world. For every continent, there has been an increase in the number of UFO sightings since the beginning of 21st century. Among all the continents, North America accounts for the highest value of density. Whereas, South America has the least value.
Further, the `day_part` column was considered to understand the pattern of UFO sightings accross different parts of the day.\
The following image represents the types of twilight that may be helpful in understanding the plot in a better way.
![Types of twilight - [source: science.howstuffworks.com](https://science.howstuffworks.com/nature/climate-weather/atmospheric/twilight-dusk.htm)](images/twilight.jpg){width="6in"}
The bar chart depicts the number of UFO sightings reported during different parts of the day. The y-axis represents the various segments of the day, categorized into nine distinct periods viz. Night, Afternoon, Astronomical dusk, Nautical dusk, Morning, Civil dusk, Astronomical dawn, Nautical dawn, and Civil dawn. The y-axis represents the number of UFO sightings reported for each of these day-parts.
From the above image, it is clear that the parts of the twilight between night-day and day-night are called dawn and dusk respectively.\
The chart reveals that the highest number of UFO sightings, reaching a staggering 47,372, during the night. This is followed by a significant drop in sightings during the afternoon, with 12,353 reports. Astronomical dusk and nautical dusk witness a further decline in sightings, with 10,182 and 7,523 sightings respectively. Morning records a slightly less number of sightings compared to nautical dusk, with 7,379 reports. Civil dusk, astronomical dawn, and nautical dawn witness a consistent decrease in sightings, with 3,384, 1,688, and 1,281 reports, respectively. The lowest number of UFO sightings, a mere 648 reports, occurs during civil dawn.\
Overall, the chart highlights a clear pattern of UFO sightings being predominantly reported during the night, with a decrease in sightings as the day progresses towards dawn. This suggests that darkness or reduced visibility might play a role in the increased frequency of UFO sightings during nighttime.