This dataset contains ratings and reviews for more than 1,000 Amazon products, along with the product details listed on Amazon's official website. Please note that I do not own the dataset.
- product_id: Product ID
- product_name: Name of the Product
- category: Category of the Product
- discounted_price: Discounted Price of the Product
- actual_price: Actual Price of the Product
- discount_percentage: Percentage of Discount for the Product
- rating: Rating of the Product
- rating_count: Number of people who voted for the Amazon rating
- about_product: Description of the Product
- user_id: ID of the user who wrote the review for the Product
- user_name: Name of the user who wrote the review for the Product
- review_id: ID of the user review
- review_title: Short review
- review_content: Long review
- img_link: Image Link of the Product
- product_link: Official Website Link of the Product
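As a quick sanity check before any analysis, the column list above can be compared against the header row of the CSV file. The sketch below is only an illustration: the helper `missing_columns` and the two sample header strings are hypothetical, and it assumes the file is a comma-separated CSV whose first row holds the column names.

```python
import csv
import io
from typing import List

# Column names as listed in the dataset description.
EXPECTED_COLUMNS = [
    "product_id", "product_name", "category", "discounted_price",
    "actual_price", "discount_percentage", "rating", "rating_count",
    "about_product", "user_id", "user_name", "review_id",
    "review_title", "review_content", "img_link", "product_link",
]


def missing_columns(csv_text: str) -> List[str]:
    """Return the expected columns that are absent from the CSV header row."""
    header = next(csv.reader(io.StringIO(csv_text)), [])
    present = {name.strip() for name in header}
    return [name for name in EXPECTED_COLUMNS if name not in present]


# Hypothetical headers used only to exercise the helper.
full_header = ",".join(EXPECTED_COLUMNS) + "\n"
partial_header = "product_id,product_name,rating\n"

print(missing_columns(full_header))
print(missing_columns(partial_header))
```

For the real file, the same helper could be given the first line read from disk; the file name is not stated here, so it is left out.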
Amazon is an American multinational technology company whose business interests include e-commerce, where it buys and stores inventory and takes care of everything from shipping and pricing to customer service and returns. This dataset was created so that people can perform analyses and tasks such as:
- Dataset Walkthrough
- Understanding Dataset Hierarchy
- Data Preprocessing
- Exploratory Data Analysis
- Data Visualization
- Building a Recommendation System
- How does the distribution of discounted prices vary across different product categories, and what implications does this have for customer purchasing behavior?
- What correlations exist between product ratings and discount percentages, and how can pricing strategies be optimized to maintain customer satisfaction?
- Which product categories demonstrate high average revenue per unit, and what strategies can be employed to capitalize on these segments?
- How does user engagement contribute to the success of the platform, and what strategies can be implemented to enhance it?
- What is the overall sentiment of customer feedback, and how can negative and neutral sentiments be addressed to improve customer satisfaction?
- Are there any noticeable patterns in user behavior and preferences, particularly in relation to review attributes and product categories?
Note: Excel could not display or interpret the rupee symbol, so text splitting was used to extract the numbers, and the rupee symbol was then added back. Also, a '1' in the rating column was represented with a non-standard symbol.
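Because of this, the price, discount, and rating columns may need cleaning before they can be treated as numbers. The following is a minimal sketch, not the method used to prepare the dataset: the helper `parse_number` and the sample strings are hypothetical, and it assumes that prices carry a currency symbol and thousands separators and that a rating with no digits should be treated as missing.

```python
import re
from typing import Optional


def parse_number(text: str) -> Optional[float]:
    """Extract the first numeric value from a string, or None if there is none."""
    cleaned = text.replace(",", "")  # drop thousands separators
    match = re.search(r"\d+(?:\.\d+)?", cleaned)
    return float(match.group()) if match else None


# Illustrative values only, not rows copied from the dataset.
samples = ["₹1,099", "₹399.50", "64%", "4.2", "|"]
parsed = [parse_number(value) for value in samples]
print(parsed)
```

Returning `None` for unparseable values keeps the decision about how to handle them (drop, impute, or correct by hand) separate from the parsing step.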
- Correlation analysis between numerical features
- Relationship between ratings and other numerical features (e.g., price)
- Compare the average ratings across different product categories
- Explore the relationship between rating count and actual rating
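The analyses listed above can be sketched with the Python standard library. This is an illustration under stated assumptions rather than a prescribed workflow: the helpers `pearson` and `mean_rating_by_category` and the sample rows are hypothetical, the numeric columns are assumed to be already cleaned into floats, and Pearson correlation is only one possible choice of measure.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean
from typing import Dict, List, Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x, mean_y = mean(xs), mean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (spread_x * spread_y)


def mean_rating_by_category(rows: List[dict]) -> Dict[str, float]:
    """Average the rating column within each category."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["category"]].append(row["rating"])
    return {category: mean(values) for category, values in groups.items()}


# Made-up rows for illustration; they are not taken from the dataset.
rows = [
    {"category": "Electronics", "rating": 4.0, "rating_count": 1200.0},
    {"category": "Electronics", "rating": 4.5, "rating_count": 5400.0},
    {"category": "Home", "rating": 3.5, "rating_count": 300.0},
]

ratings = [row["rating"] for row in rows]
counts = [row["rating_count"] for row in rows]
print(pearson(ratings, counts))
print(mean_rating_by_category(rows))
```

The same `pearson` helper can be applied to any pair of numeric columns, such as rating against discounted price or discount percentage.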
Source: Kaggle Dataset