
💻✨ Fragrance Analysis: A Scent-sational Data Dive into Perfume Trends! Discover the world of perfumes through data! Using Python, pandas, and a little bit of magic 🧙‍♂️, we analyze fragrance trends, pricing strategies, and brand dominance. Whether you’re a data enthusiast or a perfume aficionado, this repo combines the best of both worlds.

andrewhryn/DA_Fragrance_Analysis

Fragrance Analysis: Insights into the Perfume Industry with Data 🔬

GitHub Python Pandas Matplotlib Seaborn Jupyter

Project Overview 📊

This project dives into the world of fragrance sales, uncovering key insights such as trends, price distributions, brand dominance, and consumer preferences. The analysis covers both men’s and women’s fragrances, using Python-based data science tools to clean, explore, and visualize large datasets. The goal is to provide actionable insights into market trends, product positioning, and consumer behavior within the fragrance industry.


Tools and Technologies 🛠️

  • Python: The core language for data manipulation and analysis.
  • Pandas: A powerful library for data cleaning, structuring, and analysis.
  • Matplotlib/Seaborn: Visualization libraries used to create plots and graphs that reveal trends and patterns.
  • Jupyter Notebooks: A platform for documenting the data exploration process and organizing analysis workflows.
  • GitHub: For version control, collaboration, and code sharing.

Step-by-Step Code Overview 🧑‍💻

This section covers the key Python skills demonstrated in the project, focusing on data cleaning, aggregation, and visualization using Pandas, Matplotlib, and Seaborn. The full code can be found in the GitHub repository.

1. Data Import and Initial Exploration 📥

The first step was to load the fragrance sales dataset and perform an initial exploration to understand its structure. You can view the full code for this part here.

import pandas as pd

# Load dataset
data = pd.read_csv('fragrance_sales.csv')

# Display first few rows of the dataset
data.head()

Explanation:

  • Skills: Using Pandas to load and explore data.
  • Purpose: The .head() method returns the first five rows by default, giving a quick look at the dataset’s structure and helping to identify key columns and the overall data composition.
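
Beyond .head(), a few more calls help size up a dataset before cleaning. The sketch below uses a small, made-up DataFrame in place of fragrance_sales.csv; the rows and values are illustrative assumptions, not data from the project.

```python
import pandas as pd

# Hypothetical sample standing in for fragrance_sales.csv; the real file may differ.
data = pd.DataFrame({
    "brand": ["Avon", "Versace", "Davidoff", None],
    "category": ["Women", "Men", "Men", "Women"],
    "price": ["19.99", "54.50", "n/a", "32.00"],
    "sales": [120, 340, 275, 90],
})

# Dimensions as (rows, columns)
shape = data.shape

# Column data types; 'price' is read as text because of the "n/a" entry
dtypes = data.dtypes

# Number of missing values per column
missing = data.isna().sum()

print(shape)
print(dtypes)
print(missing)
```

Checking dtypes early shows whether a numeric column such as price was read as text, which is what the conversion in the cleaning step addresses.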

2. Data Cleaning: Handling Missing Values and Duplicates 🧹

Next, the dataset was cleaned by removing rows with missing values and eliminating duplicate records to ensure data accuracy. Full code for data cleaning can be accessed here.

# Ensure that 'price' is treated as a numeric data type;
# values that cannot be parsed become NaN
data['price'] = pd.to_numeric(data['price'], errors='coerce')

# Remove rows with missing values in key columns
# (including prices that were coerced to NaN above)
data = data.dropna(subset=['price', 'brand', 'category'])

# Drop duplicate entries
data = data.drop_duplicates()

Explanation:

  • Skills: Cleaning data by handling missing values (dropna()) and removing duplicates (drop_duplicates()).
  • Purpose: These operations improve the quality of the data, ensuring that the analysis is reliable and accurate.
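
To see how much each cleaning step removes, it can help to compare row counts before and after. This is a minimal sketch on made-up rows (the `raw` and `cleaned` names and all values are assumptions for illustration); it coerces prices to numbers first, so that unparseable entries are caught by the missing-value filter.

```python
import pandas as pd

# Hypothetical rows; "n/a" stands in for a non-numeric price entry.
raw = pd.DataFrame({
    "brand": ["Avon", "Avon", "Versace", "Davidoff"],
    "category": ["Women", "Women", "Men", "Men"],
    "price": ["19.99", "19.99", "n/a", "45.00"],
})

cleaned = raw.copy()

# Coerce first, so unparseable prices become NaN
cleaned["price"] = pd.to_numeric(cleaned["price"], errors="coerce")

# Then drop incomplete rows and exact duplicates
cleaned = cleaned.dropna(subset=["price", "brand", "category"]).drop_duplicates()

print(f"rows before: {len(raw)}, rows after: {len(cleaned)}")
```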

3. Data Transformation: Aggregating Sales Data 📊

After cleaning, the data was grouped by relevant categories to calculate total sales, providing insights into the most profitable product categories. The full code for this can be found here.

# Group data by category and calculate total sales
total_sales_by_category = data.groupby('category')['sales'].sum()

# Display the results
total_sales_by_category

Explanation:

  • Skills: Using groupby() and sum() to aggregate data.
  • Purpose: This aggregation helps to summarize sales by category, providing an understanding of which categories perform best.
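
The aggregation is easier to read when sorted and expressed as a share of the total. The sketch below assumes the same 'category' and 'sales' column names as the snippet above; the sample values and the `share` variable are invented for illustration.

```python
import pandas as pd

# Hypothetical sample; values are illustrative only.
data = pd.DataFrame({
    "category": ["Men", "Women", "Men", "Unisex", "Women"],
    "sales": [340, 120, 275, 60, 90],
})

# Total sales per category, largest first
total_sales_by_category = (
    data.groupby("category")["sales"].sum().sort_values(ascending=False)
)

# Each category's share of overall sales, in percent
share = (total_sales_by_category / total_sales_by_category.sum() * 100).round(1)

print(total_sales_by_category)
print(share)
```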

Graphical Insights 📊

This section showcases the key visualizations created during the analysis of the fragrance sales data. Each graph was generated using Python libraries like Matplotlib and Seaborn, and they provide valuable insights into brand dominance, price distribution, consumer preferences, and market trends.

1. Most Popular Notes 🌟

This scatter plot shows the most frequently used fragrance notes in perfumes. Musk, Jasmine, and Amber are among the top fragrance notes, indicating their popularity among perfume brands.

Most Popular Notes

import matplotlib.pyplot as plt
import seaborn as sns

# Scatter plot for most popular fragrance notes
sns.scatterplot(data=notes, x='percent_of_all', y='count',
                hue='percent_of_all', palette='magma', legend=False)
plt.xlabel('Percent of all Perfumes')
plt.ylabel('Total of Perfumes')
plt.title('Most Popular Notes')

# Label each point with its note name
for i in range(len(notes)):
    plt.text(notes['percent_of_all'].iloc[i], 
             notes['count'].iloc[i], 
             notes['notes_str'].iloc[i], 
             fontsize=10, ha='right')

Most Popular Notes (Barplot)

# Build the bar plot

sns.barplot(data=top10_notes, x='count', y='notes_str', palette='magma')
sns.despine()

plt.title('Most Popular Notes')
plt.ylabel('')
plt.xlabel('Total of Perfumes')
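
The plots above rely on a `notes` table with `notes_str`, `count`, and `percent_of_all` columns, whose construction is not shown here. One way such a table could be built is sketched below, assuming a hypothetical `perfumes` DataFrame with one row per perfume and a comma-separated 'notes' column; the actual preparation in the repository may differ.

```python
import pandas as pd

# Hypothetical input: one row per perfume with a comma-separated 'notes' column.
perfumes = pd.DataFrame({
    "perfume": ["A", "B", "C", "D"],
    "notes": ["Musk, Jasmine", "Musk, Amber", "Jasmine, Musk, Vanilla", "Amber"],
})

# One row per (perfume, note) pair
exploded = (
    perfumes.assign(notes_str=perfumes["notes"].str.split(","))
    .explode("notes_str")
)
exploded["notes_str"] = exploded["notes_str"].str.strip()

# Count distinct perfumes per note, most common first
notes = (
    exploded.groupby("notes_str")["perfume"].nunique()
    .rename("count")
    .reset_index()
    .sort_values("count", ascending=False)
)

# Share of all perfumes that contain each note, in percent
notes["percent_of_all"] = (notes["count"] / len(perfumes) * 100).round(1)

# Ten most common notes (fewer if the data has fewer notes)
top10_notes = notes.head(10)
print(top10_notes)
```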

2. Brands that Created the Most Perfume Variations 🏆

This bar chart highlights the brands that have created the highest number of perfume variations. Avon and Demeter Fragrance are among the leaders, reflecting the wide variety of products they offer.

Brands that created the most perfume variations

# Build the bar plot

sns.barplot(data=top10_size, x='perfume', y='brand', palette='magma')
sns.despine()

plt.title('Brands that created the most perfume variations')
plt.ylabel('')
plt.xlabel('Total of Perfumes')
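
The `top10_size` table used above has a 'brand' column and a 'perfume' column holding a count. A minimal sketch of how it might be produced follows, assuming a hypothetical `perfumes` DataFrame with one row per perfume; the brand names are taken from the findings, but the rows are invented.

```python
import pandas as pd

# Hypothetical input: one row per perfume.
perfumes = pd.DataFrame({
    "brand": ["Avon", "Avon", "Avon",
              "Demeter Fragrance", "Demeter Fragrance", "Versace"],
    "perfume": ["P1", "P2", "P3", "P4", "P5", "P6"],
})

# Distinct perfumes per brand, keeping the ten largest
top10_size = (
    perfumes.groupby("brand")["perfume"].nunique()
    .nlargest(10)
    .reset_index()
)
print(top10_size)
```

Using nunique() rather than a plain row count avoids counting the same perfume twice if it is listed more than once.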

3. Price Distribution by Perfume Type 💵

This boxplot shows the price distribution of different types of perfumes. It highlights how Eau de Parfum generally has the highest price range, while Eau de Toilette and Cologne are more affordable.

Price Distribution by Perfume Type

sns.boxplot(data=med_category, y='type_cleaned', x='price', palette='magma')

plt.ylabel('')
plt.title('Price Distribution by Perfume Type')
plt.xlabel('Price (USD)')
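
A numeric summary can back up what the boxplot shows. The sketch below reuses the 'type_cleaned' and 'price' column names from the plot call, but the prices are made up and the `price_summary` name is an assumption.

```python
import pandas as pd

# Hypothetical prices; values are illustrative only.
med_category = pd.DataFrame({
    "type_cleaned": ["Eau de Parfum", "Eau de Parfum",
                     "Eau de Toilette", "Eau de Toilette",
                     "Cologne", "Cologne"],
    "price": [80.0, 120.0, 40.0, 60.0, 20.0, 30.0],
})

# Median, minimum, and maximum price per perfume type, highest median first
price_summary = (
    med_category.groupby("type_cleaned")["price"]
    .agg(["median", "min", "max"])
    .sort_values("median", ascending=False)
)
print(price_summary)
```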

4. Top 25 Men’s Perfumes by Number of Sales 🚹

This bar chart shows the top 25 men’s perfumes ranked by the number of sales. Calvin Klein, Versace, and Davidoff are among the most popular brands in men’s fragrance.

Top 25 Men Perfumes by number of sales

sns.barplot(data=top_brand, x='sold', y='brand', palette='magma')
sns.despine()

plt.title('Top 25 Men Perfumes by number of sales')
plt.ylabel('')
plt.xlabel('Number of Sales')
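
The `top_brand` table plotted above pairs each brand with a 'sold' total. One plausible way to build it is sketched below, assuming a hypothetical `mens` DataFrame of men's listings with 'brand' and 'sold' columns; the figures are invented.

```python
import pandas as pd

# Hypothetical men's listings; sales figures are illustrative only.
mens = pd.DataFrame({
    "brand": ["Calvin Klein", "Versace", "Calvin Klein", "Davidoff", "Versace"],
    "sold": [500, 300, 250, 400, 150],
})

# Units sold per brand, keeping the 25 largest
top_brand = (
    mens.groupby("brand")["sold"].sum()
    .nlargest(25)
    .reset_index()
)
print(top_brand)
```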

5. Trend of Perfumes Over the Last 20 Years 📈

This line chart shows the upward trend in the number of perfumes launched over the past two decades. The data reveals a steady increase in perfume launches since the early 2000s.

Trend of Perfumes Over Last 20 Years

sns.lineplot(data=year, x='launch_year', y='total', color='red')
plt.xticks(rotation=45)
sns.despine()

plt.xlabel('Launch Year')
plt.ylabel('Total of Perfumes')
plt.title('Trend of Perfumes Over Last 20 Years')
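
The `year` table behind the line chart holds one row per launch year with a 'total' count. A minimal sketch of how it could be derived is shown below, assuming a hypothetical `perfumes` DataFrame with a 'launch_year' column and a 20-year window measured back from the latest year in the data.

```python
import pandas as pd

# Hypothetical input: one row per perfume with its launch year.
perfumes = pd.DataFrame({
    "perfume": ["P1", "P2", "P3", "P4", "P5", "P6"],
    "launch_year": [1998, 2010, 2010, 2021, 2021, 2021],
})

# Keep the 20 most recent years, measured from the latest year in the data
latest = perfumes["launch_year"].max()
recent = perfumes[perfumes["launch_year"] > latest - 20]

# Number of perfumes launched per year
year = (
    recent.groupby("launch_year")["perfume"].count()
    .rename("total")
    .reset_index()
)
print(year)
```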

6. Total Sales for Different Perfume Categories 📊

This bar chart provides a breakdown of the total sales for different perfume categories. Eau de Toilette leads the market in terms of sales, followed by Eau de Parfum and Cologne.

Total sales for different perfume categories

sns.barplot(data=top_category, x='type_cleaned', y='sold', palette='magma')
sns.despine()

plt.title('Total sales for different perfume categories')
plt.xlabel('')
plt.ylabel('Number of Sales')

plt.figtext(0.5, -0.2, """Typical fragrance concentrations for each type:
1. Perfume (Parfum): 20% - 40%
2. Eau de Parfum: 15% - 20%
3. Eau de Toilette: 5% - 15%
4. Cologne (Eau de Cologne): 2% - 4%""",
            ha="center", fontsize=10)

Key Findings 🔑:

  • Brand Dominance: Brands like Avon and Demeter Fragrance dominate the market with a wide variety of products.
  • Price Distribution: Eau de Parfum commands higher prices, while Eau de Toilette and Cologne offer more affordable options.
  • Consumer Preferences: Popular fragrance notes such as Musk, Jasmine, and Amber are consistently in demand.
  • Market Growth: The fragrance market has seen significant growth, with a steady increase in new perfume launches over the last 20 years.
  • Top Performers: Calvin Klein, Versace, and Davidoff lead the men’s fragrance market in terms of sales.
