Expected # of transactions & CLV, etc. #1221
Replies: 9 comments
-
Hey! Have you tried the Pareto/NBD model (see https://www.pymc-marketing.io/en/stable/notebooks/clv/pareto_nbd.html)? This model allows for time-invariant covariates, which might help the model performance.
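For anyone landing here later, a minimal sketch of what fitting that model might look like. The column names, the covariate config keys, and the prediction method name below are assumptions based on the linked notebook and may differ between pymc-marketing versions:

```python
import pandas as pd
from pymc_marketing.clv import ParetoNBDModel

# One row per customer; frequency/recency/T must share one time unit (e.g. weeks).
# "channel_web" is a made-up time-invariant covariate column for illustration.
rfm = pd.read_csv("rfm_summary.csv")  # columns: customer_id, frequency, recency, T, channel_web

model = ParetoNBDModel(
    data=rfm,
    model_config={
        # Assumed config keys for time-invariant covariates; check the linked
        # notebook for the exact names in your installed version.
        "purchase_covariate_cols": ["channel_web"],
        "dropout_covariate_cols": ["channel_web"],
    },
)
model.fit()

# Expected purchases per customer over the next 90 time units
# (method name may differ in older releases).
expected = model.expected_purchases(future_t=90)
```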
-
Hey @sevimcengiz, I presume you're working with this dataset: https://www.kaggle.com/datasets/ankitverma2010/ecommerce-customer-churn-analysis-and-prediction/data
Databricks uses this same dataset in a notebook utilizing the […]. The plots in that notebook suggest a wildly disparate population of customers. It's best to filter out extraneous customers when identified to improve performance.
The choice of time period on which to do the train/test split is also important due to the possibility of data drift over time. The same concept also makes […].
It's also best to evaluate these models in aggregate, because historical purchase frequency is given in integers, but predicted purchases are provided as decimal floats over time. Applying a discount rate with […]
Yes, not only does the […]
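To make the filtering, the date-based calibration/holdout split, and the aggregate comparison concrete, here's a rough pandas sketch (the file name, column names, and cutoff date are hypothetical):

```python
import pandas as pd

# Hypothetical transaction log with customer_id, date, and amount columns
tx = pd.read_csv("transactions.csv", parse_dates=["date"])

# Filter out extraneous customers, e.g. extreme outliers in purchase count or spend
per_customer = tx.groupby("customer_id")["amount"].agg(n_orders="count", total_spend="sum")
keep = per_customer[
    (per_customer["n_orders"] <= per_customer["n_orders"].quantile(0.99))
    & (per_customer["total_spend"] <= per_customer["total_spend"].quantile(0.99))
].index
tx = tx[tx["customer_id"].isin(keep)]

# Split on a calendar date rather than random rows, so the holdout period
# reflects any drift in purchasing behaviour over time
split_date = pd.Timestamp("2023-06-30")  # assumed cutoff; pick one suited to your data
calibration = tx[tx["date"] <= split_date]
holdout = tx[tx["date"] > split_date]

# Aggregate comparison target: observed purchases per customer in the holdout window
actual_holdout_purchases = holdout.groupby("customer_id").size()
```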
-
Thanks @juanitorduz, I'll implement it. @ColtAllen, thanks for the long and detailed answer. The Databricks link helped me a lot. The data I'm working with is real data, similar to the online retail dataset, and I've quickly implemented all parts from the Databricks notebook. I've followed each step but get "NaN" as the CLV value for all customers.
Firstly, why is "purchases in the calibration period" used as the x-axis title in the first figure? I think it should be "purchases in the holdout period", since I'm keeping the holdout period for the validation check. The model calculates and validates reasonably on average. It doesn't fit perfectly, but it's acceptable at first glance; getting "NaN" CLV for all customers, though, makes me wonder which points I'm missing.
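One thing worth ruling out when every CLV comes back as NaN (a guess, not a confirmed diagnosis for this case): the Gamma-Gamma spend model is only meaningful for repeat customers with positive mean spend, so zero-frequency rows, non-positive monetary values, or NaNs in the inputs will propagate into the CLV output. A quick check, assuming a hypothetical customer summary file:

```python
import pandas as pd

# Hypothetical customer-level summary used to fit the models
# (columns assumed: customer_id, frequency, monetary_value)
rfm = pd.read_csv("rfm_summary.csv")

print((rfm["frequency"] == 0).sum(), "customers with no repeat purchases")
print((rfm["monetary_value"] <= 0).sum(), "customers with non-positive mean spend")
print(rfm[["frequency", "monetary_value"]].isna().sum())  # NaN inputs propagate to CLV

# The spend model should only see repeat customers with positive mean spend
repeat = rfm[(rfm["frequency"] > 0) & (rfm["monetary_value"] > 0)]
```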
-
This is a common bug in […]
-
Hi @sevimcengiz […]
-
I ran into NaN values when I used PyMC.
-
@sevimcengiz the API for these models has changed since this issue was created. Do you have any code examples you're able to share?
-
Actually, I haven't checked the code in the last 5 months, but I can check.
-
Hi,
I have real e-commerce data. Firstly, I used the "lifetimes" library for expected transactions, expected average revenue, and CLV prediction. However, the model overfits and fails on the test data; there is a serious mismatch between the actual data and the predicted data.
Then I found PyMC.
I have transaction data and the required information, like in your tutorials, and I've used the PyMC BG/NBD and Gamma-Gamma models.
I split the data into training and testing sets, fit the model on the training data, then predicted on the testing data and compared the predictions with the actual data. The results aren't satisfactory. What can I do to increase the accuracy? Are there any tricks to validate the results?
Which points am I missing?
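On the validation question, here is one sketch of the aggregate holdout comparison suggested earlier in the thread. The function and its inputs are illustrative, not part of any library:

```python
import pandas as pd

def aggregate_holdout_report(freq_cal: pd.Series,
                             predicted: pd.Series,
                             actual: pd.Series) -> pd.DataFrame:
    """Compare predicted vs. actual holdout purchases in aggregate.

    freq_cal  -- purchases per customer in the calibration period
    predicted -- model's expected purchases over the holdout horizon (fractional)
    actual    -- observed purchases per customer in the holdout period (integers)
    """
    print(f"predicted total: {predicted.sum():.1f}   actual total: {actual.sum()}")
    df = pd.DataFrame({"freq_cal": freq_cal, "pred": predicted, "actual": actual})
    # Fractional predictions are best judged against group means or totals,
    # not exact per-customer integer counts.
    return df.groupby("freq_cal")[["pred", "actual"]].mean()
```

If the binned means track each other reasonably well, the model may be performing adequately in aggregate even when individual customer predictions look off.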