diff --git a/_drafts/2018-12-05-an-exhibit-of-markdown.md b/_drafts/2018-12-05-an-exhibit-of-markdown copy.md similarity index 100% rename from _drafts/2018-12-05-an-exhibit-of-markdown.md rename to _drafts/2018-12-05-an-exhibit-of-markdown copy.md diff --git a/_posts/2024-11-04-basket-analysis-for-data-analyst.md b/_posts/2024-11-04-basket-analysis-for-data-analyst.md new file mode 100644 index 00000000..d072716e --- /dev/null +++ b/_posts/2024-11-04-basket-analysis-for-data-analyst.md @@ -0,0 +1,333 @@ +--- +layout: post +title: Implementing Basket Analysis in Power BI and SQL - Association rule +subtitle: Basket Analysis for Data analyst +categories: data +excerpt_image: "assets/images/posts/heidi-fin-2TLREZi7BUg-unsplash.jpg" +banner: "assets/images/posts/heidi-fin-2TLREZi7BUg-unsplash.jpg" +tags: [basket-analysis, excel, templates] +--- + +Images: Heidi Fin (unsplash.com) + +# Power BI (DAX) Implementation + +## Goal + +This post shares my experience of implementing basket analysis in Power BI and SQL. At the end of this post, I will attach links for my Power BI file. Please note that this Power BI file receives updates as I write new posts. + +You can find the Power BI file here; some screenshots are provided in this post. + +Attachment: [241103_AdventureWorks_na.pbix][1] + + +## Key Idea + +Basket analysis using association rules involves understanding purchase patterns over time to see how products relate to each other within a customer’s buying journey. In this approach, each customer is viewed as a "basket," meaning we analyze product relationships based on all items they’ve bought, across different orders. The purpose is to find patterns in what customers purchase together, over time, rather than in a single transaction. + +Key metrics, like *support*, *confidence*, and *lift*, are used to understand these product relationships: + +- **Support**: The percentage of total customers who bought both items. 
It gives a sense of the prevalence of a combination in the overall population. +- **Confidence**: The likelihood that a customer who bought one product also bought another. +- **Lift**: The strength of the association between two products. A lift greater than 1 means the two products are bought together more often than would be expected if the purchases were independent, which suggests a meaningful association that could predict future purchases. + +For example, if "Computers" and "Cell phones" frequently appear in a customer’s purchase history, the analysis could reveal that customers who buy "Cell phones" have a high likelihood of also purchasing "Computers," helping companies understand customer preferences and potentially guiding personalized marketing strategies. + +### Step 1: Cartesian product of Product with itself + +Create a Cartesian product of Product with itself (And Product) in a calculated table, and calculate the common customers of each combination. + +```jsx +DEFINE MEASURE [# Customers Both (Internal)] = + + VAR CustomersWithAndProducts = + CALCULATETABLE ( + DISTINCT ( Sales[CustomerKey] ), + REMOVEFILTERS ( 'Product' ), + REMOVEFILTERS ( Sales[ProductKey] ), + USERELATIONSHIP ( Sales[ProductKey], 'And Product'[And ProductKey] ) + ) +VAR Result = + CALCULATE ( [# Customers], KEEPFILTERS ( CustomersWithAndProducts ) ) +RETURN + Result +------------------------------------------ +EVALUATE +RawProductsCustomers = +FILTER ( + SUMMARIZECOLUMNS ( + 'Sales'[ProductKey], + 'And Product'[And ProductKey], + "Customers", [# Customers Both (Internal)] + ), + NOT ISBLANK ( [Customers] ) && 'And Product'[And ProductKey] <> 'Sales'[ProductKey] +) +``` + +### Step 2: Customer Support. 
+ +Customer Support has a difficult name, but the idea is simple: “what is the ratio of the number of common customers of each combination to the total number of customers?” + +```jsx +DEFINE MEASURE [# Customers Both] = +VAR ExistingAndProductKey = + CALCULATETABLE ( + DISTINCT ( RawProductsCustomers[And ProductKey] ), + TREATAS ( + DISTINCT ( 'And Product'[And ProductKey] ), + RawProductsCustomers[And ProductKey] + ) + ) +VAR FilterAndProducts = + TREATAS ( + EXCEPT ( + ExistingAndProductKey, + DISTINCT ( 'Product'[ProductKey] ) + ), + Sales[ProductKey] + ) +VAR CustomersWithAndProducts = + CALCULATETABLE ( + SUMMARIZE ( Sales, Sales[CustomerKey] ), + REMOVEFILTERS ( 'Product' ), + FilterAndProducts + ) +VAR Result = + CALCULATE ( + [# Customers], + KEEPFILTERS ( CustomersWithAndProducts ) + ) +RETURN + Result +---------------------------------------------------- +DEFINE MEASURE [# Customers Total] = +CALCULATE ( + [# Customers], + REMOVEFILTERS ( 'Product' ) +) + +---------------------------------------------------- +DEFINE MEASURE [% Customers Support] = +DIVIDE ( [# Customers Both], [# Customers Total] ) + + +``` + +Wait, why is this common-customer measure so messy? Let’s dive into it. The core logic is to first get the number of customers in the first list of Products, which is the standard measure [# Customers]. Then you use the list of customers for And Product as a filter on this measure. This is the table filter technique in DAX, a relatively advanced concept. + +The question then becomes: how do you get this list of customers for And Product? You first use TREATAS to connect the Sales table and the And Product table. Once they are connected, you can summarize the Sales table (with data lineage) by customer. Data lineage is a basic concept, but it is easy to trip over even for advanced DAX users. + +### Step 3: Customer Lift. + +Customer Lift is nothing complex but a comparison of two ratios. 
Don’t get fooled by its sophisticated name. This measure compares confidence (common customers divided by the customers of **Product**) to a second ratio: the customers of **And Product** divided by the total number of customers. + +$$ +\text{Lift} = \frac{\text{Common customers} / \text{Product customers}}{\text{And Product customers} / \text{Total customers}} +$$ + +**This is saying how much more likely customers are to purchase a product (Product) when they have already bought another related product (And Product)** + +```jsx +Customers Lift = +DIVIDE ( + [% Customers Confidence], + DIVIDE ( + [# Customers And], + [# Customers Total] + ) +) +``` + +### **Attribution** + +This pattern is included in the book [**DAX Patterns, Second Edition**](https://www.daxpatterns.com/books/dax-patterns-second-edition/) by [**Alberto Ferrari**](http://www.sqlbi.com/articles/author/alberto-ferrari/) and [**Marco Russo**](http://www.sqlbi.com/articles/author/marco-russo/). I highly recommend their content. They are dedicated DAX teachers with comprehensive DAX knowledge, and their work is systematic and excellent. + +# Comment on the Power BI Implementation + +To be fully honest, DAX is not the best language for implementing this analysis. First, the context transitions that are usually DAX's advantage complicate the code for both the writer and the report reader. Second, the data is highly likely to become so large that the report runs slowly. Power BI's advantage of computing results dynamically as the user consumes the report doesn't really shine in this situation. + +Instead, I find the SQL implementation really easy to understand. + +# SQL Implementation + +This is based on the schema of SAP Business One with SQL version v10.0. +The SalesAnalysis view is a replica of the SAP stock sales analysis report. 
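A small sanity check may help before reading the procedure: the DAX measure above computes lift as confidence divided by the share of customers who bought And Product, while the SQL below computes it as support divided by the product of the two category probabilities. The two forms are algebraically the same. The Python sketch below is not part of the original Power BI or SQL work; the customer IDs, the baskets, and the `association_metrics` helper are made-up assumptions used only to illustrate the arithmetic.

```python
import math

# Hypothetical toy data (an assumption for illustration only):
# each customer is one "basket" holding every category they ever bought.
baskets = {
    "C1": {"Computers", "Cell phones"},
    "C2": {"Computers", "Cell phones", "Audio"},
    "C3": {"Cell phones"},
    "C4": {"Audio"},
    "C5": {"Computers", "Cell phones"},
}


def association_metrics(baskets, a, b):
    """Return support, confidence and lift for the rule a -> b.

    Returns None when either category has no customers, which mirrors
    the guard against zero denominators in the SQL procedure.
    """
    total = len(baskets)
    n_a = sum(1 for items in baskets.values() if a in items)
    n_b = sum(1 for items in baskets.values() if b in items)
    n_both = sum(1 for items in baskets.values() if a in items and b in items)
    if total == 0 or n_a == 0 or n_b == 0:
        return None

    support = n_both / total      # common customers / total customers
    confidence = n_both / n_a     # common customers / customers of a
    # DAX-style lift: confidence / P(b)
    lift_dax = confidence / (n_b / total)
    # SQL-style lift: support / (P(a) * P(b))
    lift_sql = support / ((n_a / total) * (n_b / total))
    return {
        "support": support,
        "confidence": confidence,
        "lift_dax": lift_dax,
        "lift_sql": lift_sql,
    }


metrics = association_metrics(baskets, "Cell phones", "Computers")
print(metrics)
# Both formulations should agree up to floating-point rounding.
print(math.isclose(metrics["lift_dax"], metrics["lift_sql"]))
```

On this toy data, 3 of the 5 customers bought both categories, so support is 0.6, confidence is 0.75, and both lift formulas give 1.25.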
+ +```sql + +CREATE PROCEDURE [dbo].[BasketAnalysis] + +@CardCode NVARCHAR(50) = NULL, +@LastNMonth INT = 12 + +AS +BEGIN + +DECLARE @AssoStartDate DATE = DATEFROMPARTS ( YEAR ( GETDATE()) -2, 1,1) +DECLARE @MinRepeatOrd INT = 2 -- Minimal repeated orders that a customer has to place for this item to be considered a regular item of a customer + + + +; WITH AssoCardCat AS ( + SELECT + T80.ItmsGrpNam, + T80.CardCode, + CAST ( COUNT(T80.CardCode) OVER (PARTITION BY T80.ItmsGrpNam ) AS DECIMAL(10,5) )[CatCustomers] + FROM + ( + SELECT + T99.CardCode, + T99.ItmsGrpNam, + COUNT ( DISTINCT T99.DocNum ) [OrdCount], + COUNT ( DISTINCT T99.itemCode) [ActiveLines] + FROM SalesAnalysis T99 + + WHERE + T99.DocDate >= @AssoStartDate AND + T99.Sales > 0 AND + T99.ItmsGrpCod NOT IN (158, 163, 229) AND + T99.ProjectCod IN ('WCT', 'CRE', 'EXP') + + GROUP BY + T99.CardCode, + T99.ItmsGrpNam + ) T80 + WHERE T80.OrdCount > @MinRepeatOrd +) +, AssoCustomerCat AS +( + SELECT * + FROM + ( + SELECT + T99.*, + Prob_A = T99.CustomersforCatA/ T99.TotalCustomers , + Prob_B = T99.CustomersforCatB / T99.TotalCustomers, + Support = T99.CommonCustomers / T99.TotalCustomers, + + Confidence = T99.CommonCustomers / T99.CustomersforCatA , -- Out of total A customers, how many are into B + + Lift = (T99.CommonCustomers / T99.TotalCustomers ) / + (( T99.CustomersforCatA/ T99.TotalCustomers) * ( T99.CustomersforCatB / T99.TotalCustomers ) ), + LiftRankInCat = ROW_NUMBER() OVER + ( + PARTITION BY T99.CategoryA + ORDER BY (T99.CommonCustomers / T99.TotalCustomers ) / + (( T99.CustomersforCatA/ T99.TotalCustomers) * ( T99.CustomersforCatB / T99.TotalCustomers ) ) DESC + ) + + FROM + ( + SELECT + T0.ItmsGrpNam [CategoryA], + T0.CatCustomers [CustomersforCatA], + T1.ItmsGrpNam [CategoryB], + T1.CatCustomers [CustomersforCatB], + CAST ( COUNT ( T0.CardCode ) AS DECIMAL (10,5)) [CommonCustomers], + TotalCustomers = + ( + SELECT COUNT( DISTINCT T0.CardCode) + FROM SalesAnalysis T0 + WHERE + T0.DocDate >= 
@AssoStartDate AND + T0.Sales > 0 AND + T0.ItmsGrpCod NOT IN (158, 163, 229) AND + T0.ProjectCod IN ('WCT', 'CRE', 'EXP') + ) + + FROM AssoCardCat T0 + JOIN AssoCardCat T1 ON T1.CardCode = T0.CardCode AND T1.ItmsGrpNam < T0.ItmsGrpNam + GROUP BY + T0.ItmsGrpNam, T1.ItmsGrpNam, + T0.CatCustomers, T1.CatCustomers + ) T99 WHERE T99.CustomersforCatA != 0 AND T99.CustomersforCatB != 0 + ) T100 WHERE T100.Lift > 1 +), +CustomerSpecific AS +( + SELECT + T100.* , + SUM(T100.[CategorySpend]) OVER ()[TotalSpend], + T100.[CategorySpend] / SUM(T100.[CategorySpend]) OVER () [SpendPct], + SpendRank = + ROW_NUMBER () OVER (ORDER BY T100.[CategorySpend] DESC ) + FROM + ( + SELECT + SA.CardCode, + SA.CardName, + SA.ItmsGrpCod, + SA.ItmsGrpNam, + SUM(SA.Sales) [CategorySpend] + + FROM SalesAnalysis SA + WHERE SA.CardCode = @CardCode + AND + ( + SA.DocDate BETWEEN EOMONTH ( GETDATE(), -1 * @LastNMonth -1 ) AND GETDATE() + ) + AND SA.ItmsGrpCod NOT IN ( 229, 158, 163 ) + GROUP BY SA.CardCode, + SA.CardName, + SA.ItmsGrpCod, + SA.ItmsGrpNam + ) T100 +) +, FullReport AS +( + SELECT + T0.*, + T1.CustomersforCatA [CustomersforCat], + T1.CategoryB [RecommendCat], + T1.CustomersforCatB [CustommersInRecommCat], + T1.TotalCustomers, + T1.Prob_A, + T1.Prob_B, + T1.Support, + T1.Confidence, + T1.Lift + --RecommLine = + --T1.CategoryB + '(Confi: ' + CONVERT (NVARCHAR(10), ROUND ( T1.Confidence * 100 ,0 ) ) + --+ ' - Lift: ' + CONVERT (NVARCHAR(10), ROUND ( T1.Lift * 100 ,0 ) ) + ')' + FROM CustomerSpecific T0 + LEFT JOIN AssoCustomerCat T1 ON T1.CategoryA = T0.ItmsGrpNam + AND T1.CategoryB NOT IN + ( + SELECT + T99.ItmsGrpNam + FROM CustomerSpecific T99 + ) +), +Overview AS +( + SELECT + T0.CardCode, + T0.CardName, + ROW_NUMBER() OVER (ORDER BY MAX( T0.Lift) DESC) [LiftRank], + ROW_NUMBER() OVER (ORDER BY MAX ( T0.Confidence) DESC ) [ConfiRank], + T0.RecommendCat, + T0.CustommersInRecommCat, + COUNT( T0.ItmsGrpNam ) BaseCount, + -- STRING_AGG ( RecommLine, ',' ) BaseReason, + MAX( T0.Lift) 
[MaxLift], + MAX(T0.Confidence) [MaxConfi] + FROM FullReport T0 + WHERE T0.RecommendCat IS NOT NULL + GROUP BY T0.RecommendCat, + T0.CardCode, + T0.CardName, + T0.CustommersInRecommCat +) +SELECT * FROM Overview +END +``` + +## Version for Sales Managers + +If all this is too complex for you and you are only interested in using this technique to analyze your own data and get insights from it, you can follow another post of mine. + +[Basket Analysis for Sales Managers]({% post_url 2024-11-04-basket-analysis-for-sales-managers %}) + +[1]:{{ site.url }}/assets/pbix/241103_AdventureWorks_na.pbix diff --git a/_posts/2024-11-04-basket-analysis-for-sales-managers.md b/_posts/2024-11-04-basket-analysis-for-sales-managers.md new file mode 100644 index 00000000..04f3f670 --- /dev/null +++ b/_posts/2024-11-04-basket-analysis-for-sales-managers.md @@ -0,0 +1,46 @@ +--- +layout: post +title: Unlocking cross sales opportunities with pre-made Basket Analysis template +subtitle: Basket Analysis for Sales Managers +categories: data +excerpt_image: "/assets/images/posts/jacek-dylag-jo8C9bt3uo8-unsplash.jpg" +banner: "/assets/images/posts/jacek-dylag-jo8C9bt3uo8-unsplash.jpg" +tags: [basket-analysis, excel, templates] +--- + +Images: Jacek Dylag (unsplash.com) + +## Goal + +This post gives sales managers the ability to run simple basket analysis on their own, without having to request support from a data analyst. Our goal is to offer practical, easy-to-implement strategies that help you drive performance and achieve measurable results. + +## Key Idea + +Basket analysis using association rules involves understanding purchase patterns over time to see how products relate to each other within a customer’s buying journey. In this approach, each customer is viewed as a "basket," meaning we analyze product relationships based on all items they’ve bought, across different orders. 
The purpose is to find patterns in what customers purchase together, over time, rather than in a single transaction. + +Key metrics, like *support*, *confidence*, and *lift*, are used to understand these product relationships: + +- **Support**: The percentage of total customers who bought both items. It gives a sense of the prevalence of a combination in the overall population. +- **Confidence**: The likelihood that a customer who bought one product also bought another. +- **Lift**: The strength of the association between two products. A lift greater than 1 implies a meaningful association that could predict future purchases. + +For example, if "Computers" and "Cell phones" frequently appear in a customer’s purchase history, the analysis could reveal that customers who buy "Cell phones" have a high likelihood of also purchasing "Computers," helping companies understand customer preferences and potentially guiding personalized marketing strategies. + +## Additional interests? + + +Because the purpose of this article is to give sales managers a tool that they can plug their own data into to generate reports, I am going to focus on just that. If you are interested in knowing more about this topic, feel free to move on to another article of mine. + + +[Basket Analysis for Data analyst]({% post_url 2024-11-04-basket-analysis-for-data-analyst %}) + +## Template: Plug in your own data + +There are detailed instructions in the file. I’ve tested it with some proprietary data I have access to, and it worked. In case yours throws errors, feel free to send me an email with a screenshot and I will get back to you as soon as I can. 
+ +File Attachment: [Basket analysis - Basic.xlsx][1] + + + + +[1]:{{ site.url }}/assets/xlsx/Basket-analysis-Basic.xlsx diff --git a/_posts/2024-11-04-basket-free-power-bi-template-showcase.md b/_posts/2024-11-04-basket-free-power-bi-template-showcase.md new file mode 100644 index 00000000..ac87f8b0 --- /dev/null +++ b/_posts/2024-11-04-basket-free-power-bi-template-showcase.md @@ -0,0 +1,35 @@ +--- +layout: post +title: Free Power BI template - a showcase of dashboard development +categories: data, showcase +excerpt_image: "assets/images/screenshot/241104_powerbi_Slide3.PNG" +banner: "assets/images/banners/home-original.jpeg" +tags: [basket-analysis, excel, templates] +top: 1 +--- + +# Power BI (DAX) Implementation + +## Goal + +This post shares my experience of developing a Power BI dashboard. I will find another time to discuss the inner workings, the data model, and other aspects of Power BI development with you. In this post, I will just share the file with you. You can do whatever you want with this file. + +I taught myself the skills for this work over the last three years. If anything I did in the posts I shared is less than optimal, I am more than happy to learn from you. Please don't hesitate to point it out to me! I appreciate that kind gesture. + +I couldn't post the work I did for my last employer for privacy reasons. All the information shared in this post is sourced purely from the internet, so you don't need to worry about patent issues. + + +## Screenshots + +Here are some screenshots. 
+ +![Overview](/assets/images/screenshot/241104_powerbi_Slide3.jpeg) +![Disclaimer](/assets/images/screenshot/241104_powerbi_Slide4.jpeg) +![OneReportToRuleThemAll](/assets/images/screenshot/241104_powerbi_Slide5.jpeg) +![BasketAnalysis](/assets/images/screenshot/241104_powerbi_Slide6.jpeg) + +## File Download + +Attachment: [241103_AdventureWorks_na.pbix][1] + +[1]:{{ site.url }}/assets/pbix/241103_AdventureWorks_na.pbix diff --git a/assets/images/posts/heidi-fin-2TLREZi7BUg-unsplash.jpg b/assets/images/posts/heidi-fin-2TLREZi7BUg-unsplash.jpg new file mode 100644 index 00000000..4a925a6d Binary files /dev/null and b/assets/images/posts/heidi-fin-2TLREZi7BUg-unsplash.jpg differ diff --git a/assets/images/posts/jacek-dylag-jo8C9bt3uo8-unsplash.jpg b/assets/images/posts/jacek-dylag-jo8C9bt3uo8-unsplash.jpg new file mode 100644 index 00000000..d8321570 Binary files /dev/null and b/assets/images/posts/jacek-dylag-jo8C9bt3uo8-unsplash.jpg differ diff --git a/assets/images/screenshot/241104_powerbi_Slide3.jpeg b/assets/images/screenshot/241104_powerbi_Slide3.jpeg new file mode 100644 index 00000000..fa157eea Binary files /dev/null and b/assets/images/screenshot/241104_powerbi_Slide3.jpeg differ diff --git a/assets/images/screenshot/241104_powerbi_Slide4.jpeg b/assets/images/screenshot/241104_powerbi_Slide4.jpeg new file mode 100644 index 00000000..310bd0a5 Binary files /dev/null and b/assets/images/screenshot/241104_powerbi_Slide4.jpeg differ diff --git a/assets/images/screenshot/241104_powerbi_Slide5.jpeg b/assets/images/screenshot/241104_powerbi_Slide5.jpeg new file mode 100644 index 00000000..aaae96d0 Binary files /dev/null and b/assets/images/screenshot/241104_powerbi_Slide5.jpeg differ diff --git a/assets/images/screenshot/241104_powerbi_Slide6.jpeg b/assets/images/screenshot/241104_powerbi_Slide6.jpeg new file mode 100644 index 00000000..3b36eea2 Binary files /dev/null and b/assets/images/screenshot/241104_powerbi_Slide6.jpeg differ diff --git 
a/assets/pbix/241103_AdventureWorks_na.pbix b/assets/pbix/241103_AdventureWorks_na.pbix new file mode 100644 index 00000000..80cea840 Binary files /dev/null and b/assets/pbix/241103_AdventureWorks_na.pbix differ diff --git a/assets/xlsx/Basket-analysis-Basic.xlsx b/assets/xlsx/Basket-analysis-Basic.xlsx new file mode 100644 index 00000000..e1308c05 Binary files /dev/null and b/assets/xlsx/Basket-analysis-Basic.xlsx differ diff --git a/resource.html b/resource.html index 48ac9477..2c4b2d60 100644 --- a/resource.html +++ b/resource.html @@ -1,65 +1,10 @@ --- layout: about -title: Resource Collection +title: Resource --- -

What is No Arbitrage

-

- :earth_asia: No arbitrage is a fundamental concept in Modern Portfolio Theory and in pricing models - for the most complex financial products, such as the Black-Scholes option pricing model. - Some works from those academic giants have been recognised with Nobel Prizes. However, the basic idea is simple: -No profit is possible without taking appropriate risks in an efficient market. Or simply: no pain, no gain. - It is a practical philosophy of living. -

No Arbitrage (this site) is my personal blog created to share practical financial knowledge, - ranging from basics to advanced concepts, and from theory to hands-on spreadsheet techniques. - It’s a space for exchanging ideas, making connections, and inspiring both readers and myself. - -

-

My Background

-

- I am Joseph Cai, a financial analyst living in London, originally from China. - With over a decade of formal academic training in finance, my expertise lies primarily in corporate finance. - However, my interests extend well beyond that field. I'm deeply fascinated by derivative products, largely due to their mathematical rigor and the sophisticated modeling they require. -

-

- In recent years, I've also embarked on a journey to expand my technical skill set through programming. - I started with SQL and then moved on to DAX (Power BI), M-Scripting (Power Query), and Python. - Lately, I've been experimenting with automating tasks using TypeScript in Excel. -

- -

Who Is This Blog For?

-

- This blog is for anyone with a curiosity about finance and a desire to grow their knowledge in the field. - Whether you're a finance major, a professional preparing for exams like the CFA, a business manager looking to sharpen your strategic skills, or an entrepreneur seeking funding for your next venture, I hope you'll find value here. -

-

- I'll be sharing a mix of original content and insightful pieces from other sources. My goal is to create a resource that not only educates but also inspires. - Whether you're looking to pick up a new technical skill, unwind on a quiet Sunday afternoon, or find motivation during a late-night work session, I hope my posts will provide what you need. -

- -

Life is about Learning

-

- I've taken many turns in my career, and one thing I've learned is that life's path is rarely a straight line. - A valued mentor and friend once encouraged me that if you are focused on your goal and learn from mistakes, you will reach there sooner or later. - - The most meaningful achievements often take time to materialize. It's crucial to be patient and stay committed to the process because, - in the words of James Clear from Atomic Habits: -

-
-

"The seed of every habit is a single, tiny decision... It is your commitment to the process that will determine your progress."

-
-

You have to trust that you can rise through and above adversity.

-

- ▶ Please note that none of the content on this site should be considered financial advice; - any decision-making should be evaluated based on specific cases. -

- -

Let's Connect

-

- I'm always open to making new connections and exchanging ideas. Feel free to reach out to me on - LinkedIn, if you'd like to connect or have a chat. - Let's learn, grow, and inspire each other! -

+

Yet to come

+

👏 Sorry, check again later!