Commit 552b179

Merge branch 'development' into feature-pnbddyncov-pmf
2 parents e33e791 + f45f703 commit 552b179

121 files changed: +1410 -1488 lines changed

DESCRIPTION

Lines changed: 3 additions & 3 deletions
@@ -1,7 +1,7 @@
 Package: CLVTools
 Title: Tools for Customer Lifetime Value Estimation
-Version: 0.10.1
-Date: 2023-10-23
+Version: 0.11.2
+Date: 2024-10-13
 Authors@R: c(
     person(given="Patrick", family="Bachmann", email = "pbachma@ethz.ch", role = c("cre","aut")),
     person(given="Niels", family="Kuebler", email = "niels.kuebler@uzh.ch", role = "aut"),
@@ -155,7 +155,7 @@ Collate:
     'pnbd_dyncov_expectation.R'
     'pnbd_dyncov_palive.R'
     'pnbd_dyncov_pmf.R'
-RoxygenNote: 7.3.1
+RoxygenNote: 7.3.2
 VignetteBuilder: knitr
 Config/testthat/parallel: false
 Config/testthat/edition: 3

NAMESPACE

Lines changed: 1 addition & 0 deletions
@@ -46,6 +46,7 @@ export(clvdata)
 export(latentAttrition)
 export(newcustomer)
 export(newcustomer.dynamic)
+export(newcustomer.spending)
 export(newcustomer.static)
 export(spending)
 exportMethods(bgbb)

NEWS.md

Lines changed: 40 additions & 0 deletions
@@ -1,3 +1,43 @@
+# CLVTools 0.11.1
+
+### NEW FEATURES
+* Updated the apparel example data
+* Prediction bootstrapping: Calculate confidence intervals using regular rather than "reversed-quantiles"
+
+### BUG FIXES
+* Prediction bootstrapping: Re-fit model using exact original specification
+* GGomNBD: Set limit in integration method to size of workspace
+
+
+
+# CLVTools 0.11.0
+
+### NEW FEATURES
+* More memory efficient and faster creation of repeat transactions in `clv.data`
+* Use existing repeat transactions when calling `gg` with `remove.first.transaction = TRUE`
+* Simplify the formula interfaces `latentAttrition()` and `spending()`
+* Add `predicted.total.spending` to predictions
+* Harmonize parameter names used in various S3 methods
+* Bootstrapping: Add facilities to estimate parameter uncertainty for all models
+* Ability to predict future transactions of customers with no existing transaction history
+* New start parameters for all latent attrition models
+* Pareto/NBD dyncov: Improved numeric stability of PAlive
+* GGomNBD: Implement erratum by Jost Adler to predict CET correctly
+* GGomNBD: Improve numerical stability and runtime of LL integral
+* GGomNBD: Implement PMF as derived by Jost Adler
+* lrtest(): Likelihood ratio testing for latent attrition models
+* Accept `data.table::IDate` as data inputs to `clvdata`
+* `summary.clv.data`: Much faster by improving the calculation of the mean inter-purchase time
+* Reduced fitting times for all models by using a compressed CBS as input to the LL sum
+* Faster hessian calculation if a model was using correlation
+
+### BUG FIXES
+* Estimating the Pareto/NBD dyncov with correlation was not possible
+* GGomNBD: Free workspace after it is not used anymore to avoid memory-leak
+* `SetDynamicCovariates`: Verify there is no covariate data for nonexistent customers
+
+
+
 # CLVTools 0.10.0
 
 ### NEW FEATURES

R/all_generics.R

Lines changed: 4 additions & 4 deletions
@@ -165,10 +165,10 @@ setGeneric(name="clv.model.process.newdata", def=function(clv.model, clv.fitted,
 setGeneric(name="clv.model.pmf", def=function(clv.model, clv.fitted, x)
   standardGeneric("clv.model.pmf"))
 
-# .. New customer expectation -----------------------------------------------------------------------------------------------
-# predict unconditional expectation until individual t_i for all customers in clv.fitted@clv.data
-setGeneric("clv.model.predict.new.customer.unconditional.expectation", function(clv.model, clv.fitted, clv.newcustomer, t)
-  standardGeneric("clv.model.predict.new.customer.unconditional.expectation"))
+# .. New customer prediction -----------------------------------------------------------------------------------------------
+setGeneric("clv.model.predict.new.customer", function(clv.model, clv.fitted, clv.newcustomer)
+  standardGeneric("clv.model.predict.new.customer"))
+
 

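The hunk above drops the separate `t` argument from the generic; the model methods changed in this commit read the horizon from `clv.newcustomer@num.periods` instead. A minimal sketch of that pattern with plain S4 follows. The class and generic names (`toy.newcustomer`, `toy.model`, `toy.predict.new.customer`) and the `rate` slot are hypothetical and not part of CLVTools.

```r
library(methods)

# Hypothetical stand-in for a new-customer object: the prediction horizon
# is stored in a slot instead of being passed as a separate argument `t`.
setClass("toy.newcustomer", representation(num.periods = "numeric"))
setClass("toy.model", representation(rate = "numeric"))

# Generic without a `t` argument, mirroring the shape of the new signature.
setGeneric("toy.predict.new.customer", function(model, newcustomer)
  standardGeneric("toy.predict.new.customer"))

# Toy method: expected transactions = rate * number of periods.
setMethod("toy.predict.new.customer", signature(model = "toy.model"),
          function(model, newcustomer) {
            model@rate * newcustomer@num.periods
          })

nc <- new("toy.newcustomer", num.periods = 4)
m <- new("toy.model", rate = 0.5)
result <- toy.predict.new.customer(m, nc)
```

Keeping the horizon on the object means a method that has no notion of a time horizon can share the same generic without carrying an unused argument.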

R/class_clv_model_bgnbd.R

Lines changed: 3 additions & 3 deletions
@@ -106,15 +106,15 @@ setMethod("clv.model.expectation", signature(clv.model="clv.model.bgnbd.no.cov")
     fct.expectation = fct.bgnbd.expectation, clv.time = clv.fitted@clv.data@clv.time))
 })
 
-# . clv.model.predict.new.customer.unconditional.expectation --------------------------------------------------------------------------------------------------------
-setMethod("clv.model.predict.new.customer.unconditional.expectation", signature = signature(clv.model="clv.model.bgnbd.no.cov"), definition = function(clv.model, clv.fitted, clv.newcustomer, t){
+# . clv.model.predict.new.customer --------------------------------------------------------------------------------------------------------
+setMethod("clv.model.predict.new.customer", signature = signature(clv.model="clv.model.bgnbd.no.cov"), definition = function(clv.model, clv.fitted, clv.newcustomer){
 
   return(bgnbd_nocov_expectation(
     r = clv.fitted@prediction.params.model[["r"]],
     alpha = clv.fitted@prediction.params.model[["alpha"]],
     a = clv.fitted@prediction.params.model[["a"]],
     b = clv.fitted@prediction.params.model[["b"]],
-    vT_i = t))
+    vT_i = clv.newcustomer@num.periods))
 })
 

R/class_clv_model_bgnbd_staticcov.R

Lines changed: 3 additions & 3 deletions
@@ -130,8 +130,8 @@ setMethod("clv.model.expectation", signature(clv.model="clv.model.bgnbd.static.c
     fct.expectation = fct.bgnbd.expectation, clv.time = clv.fitted@clv.data@clv.time))
 })
 
-# . clv.model.predict.new.customer.unconditional.expectation -----------------------------------------------------------------------------------------------------
-setMethod("clv.model.predict.new.customer.unconditional.expectation", signature = signature(clv.model="clv.model.bgnbd.static.cov"), definition = function(clv.model, clv.fitted, clv.newcustomer, t){
+# . clv.model.predict.new.customer -----------------------------------------------------------------------------------------------------
+setMethod("clv.model.predict.new.customer", signature = signature(clv.model="clv.model.bgnbd.static.cov"), definition = function(clv.model, clv.fitted, clv.newcustomer){
 
   m.cov.trans <- clv.newcustomer.static.get.matrix.cov.trans(clv.newcustomer=clv.newcustomer, clv.fitted=clv.fitted)
   m.cov.life <- clv.newcustomer.static.get.matrix.cov.life(clv.newcustomer=clv.newcustomer, clv.fitted=clv.fitted)
@@ -153,7 +153,7 @@ setMethod("clv.model.predict.new.customer.unconditional.expectation", signature
     vAlpha_i = alpha_i,
     vA_i = a_i,
     vB_i = b_i,
-    vT_i = t
+    vT_i = clv.newcustomer@num.periods
   ))
 })

R/class_clv_model_gg.R

Lines changed: 13 additions & 1 deletion
@@ -60,7 +60,7 @@ setMethod("clv.model.backtransform.estimated.params.model", signature = signatur
 #' @importFrom utils modifyList
 setMethod(f = "clv.model.prepare.optimx.args", signature = signature(clv.model="clv.model.gg"), definition = function(clv.model, clv.fitted, prepared.optimx.args){
 
-  dt.compressed.cbs <- clv.fitted@cbs[, .(n = .N), by=c('x', 'Spending')]
+  dt.compressed.cbs <- clv.fitted@cbs[, list(n = .N), by=c('x', 'Spending')]
 
   optimx.args <- modifyList(prepared.optimx.args,
                             list(LL.function.sum = gg_LL,
@@ -115,6 +115,18 @@ setMethod("clv.model.predict", signature(clv.model="clv.model.gg"), function(clv
 })
 
 
+# .clv.model.predict.newcustomer --------------------------------------------------------------------------------------------------------
+setMethod("clv.model.predict.new.customer", signature(clv.model="clv.model.gg"), function(clv.model, clv.fitted, clv.newcustomer){
+
+  p <- clv.fitted@prediction.params.model[["p"]]
+  q <- clv.fitted@prediction.params.model[["q"]]
+  gamma <- clv.fitted@prediction.params.model[["gamma"]]
+
+  # setting x=0 in the ordinary prediction function
+  return( (gamma) * p/(q - 1) )
+})
+
+
 # .clv.model.vcov.jacobi.diag --------------------------------------------------------------------------------------------------------
 setMethod(f = "clv.model.vcov.jacobi.diag", signature = signature(clv.model="clv.model.gg"), definition = function(clv.model, clv.fitted, prefixed.params){
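The comment "setting x=0 in the ordinary prediction function" in the new Gamma-Gamma method can be checked against the standard conditional expectation of mean spending for that model. The derivation below assumes the package's ordinary prediction uses this textbook form; the symbols $x$ (number of transactions used for spending) and $\bar{z}$ (observed mean spending) are notation chosen here, not names taken from the diff.

```latex
\begin{align*}
E[M \mid p, q, \gamma, \bar{z}, x]
  &= \frac{(\gamma + \bar{z}\,x)\,p}{p\,x + q - 1} \\[4pt]
E[M \mid p, q, \gamma, x = 0]
  &= \frac{\gamma\,p}{q - 1}, \qquad q > 1 .
\end{align*}
```

With $x = 0$ the observed mean spending drops out, which matches the returned expression `(gamma) * p/(q - 1)`. The expression is only a finite positive mean when $q > 1$.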

R/class_clv_model_ggomnbd_nocov.R

Lines changed: 3 additions & 3 deletions
@@ -118,16 +118,16 @@ setMethod("clv.model.expectation", signature(clv.model="clv.model.ggomnbd.no.cov
 })
 
 
-# . clv.model.predict.new.customer.unconditional.expectation --------------------------------------------------------------------------------------------------------
-setMethod("clv.model.predict.new.customer.unconditional.expectation", signature = signature(clv.model="clv.model.ggomnbd.no.cov"), definition = function(clv.model, clv.fitted, clv.newcustomer, t){
+# . clv.model.predict.new.customer --------------------------------------------------------------------------------------------------------
+setMethod("clv.model.predict.new.customer", signature = signature(clv.model="clv.model.ggomnbd.no.cov"), definition = function(clv.model, clv.fitted, clv.newcustomer){
 
   return(ggomnbd_nocov_expectation(
     r = clv.fitted@prediction.params.model[["r"]],
     alpha_0 = clv.fitted@prediction.params.model[["alpha"]],
     beta_0 = clv.fitted@prediction.params.model[["beta"]],
     b = clv.fitted@prediction.params.model[["b"]],
     s = clv.fitted@prediction.params.model[["s"]],
-    vT_i = t))
+    vT_i = clv.newcustomer@num.periods))
 })
 

R/class_clv_model_ggomnbd_staticcov.R

Lines changed: 3 additions & 3 deletions
@@ -100,8 +100,8 @@ setMethod("clv.model.expectation", signature(clv.model="clv.model.ggomnbd.static
     fct.expectation = fct.expectation, clv.time = clv.fitted@clv.data@clv.time))
 })
 
-# . clv.model.predict.new.customer.unconditional.expectation -----------------------------------------------------------------------------------------------------
-setMethod("clv.model.predict.new.customer.unconditional.expectation", signature = signature(clv.model="clv.model.ggomnbd.static.cov"), definition = function(clv.model, clv.fitted, clv.newcustomer, t){
+# . clv.model.predict.new.customer -----------------------------------------------------------------------------------------------------
+setMethod("clv.model.predict.new.customer", signature = signature(clv.model="clv.model.ggomnbd.static.cov"), definition = function(clv.model, clv.fitted, clv.newcustomer){
 
   m.cov.trans <- clv.newcustomer.static.get.matrix.cov.trans(clv.newcustomer=clv.newcustomer, clv.fitted=clv.fitted)
   m.cov.life <- clv.newcustomer.static.get.matrix.cov.life(clv.newcustomer=clv.newcustomer, clv.fitted=clv.fitted)
@@ -122,7 +122,7 @@ setMethod("clv.model.predict.new.customer.unconditional.expectation", signature
     s = clv.fitted@prediction.params.model[["s"]],
     vAlpha_i= alpha_i,
     vBeta_i = beta_i,
-    vT_i = t))
+    vT_i = clv.newcustomer@num.periods))
 })
 

R/class_clv_model_pnbd.R

Lines changed: 3 additions & 3 deletions
@@ -269,15 +269,15 @@ setMethod("clv.model.expectation", signature(clv.model="clv.model.pnbd.no.cov"),
 
 
 
-# . clv.model.predict.new.customer.unconditional.expectation --------------------------------------------------------------------------------------------------------
-setMethod("clv.model.predict.new.customer.unconditional.expectation", signature = signature(clv.model="clv.model.pnbd.no.cov"), definition = function(clv.model, clv.fitted, clv.newcustomer, t){
+# . clv.model.predict.new.customer --------------------------------------------------------------------------------------------------------
+setMethod("clv.model.predict.new.customer", signature = signature(clv.model="clv.model.pnbd.no.cov"), definition = function(clv.model, clv.fitted, clv.newcustomer){
 
   return(pnbd_nocov_expectation(
     r = clv.fitted@prediction.params.model[["r"]],
     s = clv.fitted@prediction.params.model[["s"]],
     alpha_0 = clv.fitted@prediction.params.model[["alpha"]],
     beta_0 = clv.fitted@prediction.params.model[["beta"]],
-    vT_i = t))
+    vT_i = clv.newcustomer@num.periods))
 })
 

R/class_clv_model_pnbd_dynamiccov.R

Lines changed: 3 additions & 3 deletions
@@ -175,11 +175,11 @@ setMethod("clv.model.expectation", signature(clv.model="clv.model.pnbd.dynamic.c
 })
 
 
-# . clv.model.predict.new.customer.unconditional.expectation -----------------------------------------------------------------------------------------------------
-setMethod("clv.model.predict.new.customer.unconditional.expectation", signature = signature(clv.model="clv.model.pnbd.dynamic.cov"), definition = function(clv.model, clv.fitted, clv.newcustomer, t){
+# . clv.model.predict.new.customer -----------------------------------------------------------------------------------------------------
+setMethod("clv.model.predict.new.customer", signature = signature(clv.model="clv.model.pnbd.dynamic.cov"), definition = function(clv.model, clv.fitted, clv.newcustomer){
   return(pnbd_dyncov_newcustomer_expectation(
     clv.fitted=clv.fitted,
-    t=t,
+    t=clv.newcustomer@num.periods,
     tp.first.transaction=clv.newcustomer@first.transaction,
     dt.cov.life=clv.newcustomer@data.cov.life,
     dt.cov.trans=clv.newcustomer@data.cov.trans))

R/class_clv_model_pnbd_staticcov.R

Lines changed: 3 additions & 3 deletions
@@ -211,8 +211,8 @@ setMethod("clv.model.expectation", signature(clv.model="clv.model.pnbd.static.co
 
 
 
-# . clv.model.predict.new.customer.unconditional.expectation -----------------------------------------------------------------------------------------------------
-setMethod("clv.model.predict.new.customer.unconditional.expectation", signature = signature(clv.model="clv.model.pnbd.static.cov"), definition = function(clv.model, clv.fitted, clv.newcustomer, t){
+# . clv.model.predict.new.customer -----------------------------------------------------------------------------------------------------
+setMethod("clv.model.predict.new.customer", signature = signature(clv.model="clv.model.pnbd.static.cov"), definition = function(clv.model, clv.fitted, clv.newcustomer){
 
 
   m.cov.trans <- clv.newcustomer.static.get.matrix.cov.trans(clv.newcustomer=clv.newcustomer, clv.fitted=clv.fitted)
@@ -232,7 +232,7 @@ setMethod("clv.model.predict.new.customer.unconditional.expectation", signature
     s = clv.fitted@prediction.params.model[["s"]],
     vAlpha_i = alpha_i,
     vBeta_i = beta_i,
-    vT_i = t
+    vT_i = clv.newcustomer@num.periods
   ))
 })

R/clv_template_controlflow_predict.R

Lines changed: 14 additions & 21 deletions
@@ -75,40 +75,33 @@ clv.controlflow.predict.add.uncertainty.estimates <- function(clv.fitted, dt.pre
     message("Calculating confidence intervals...")
   }
 
-  # Customers that are sampled multiple times are added to the boostrapping data with suffix "_BOOTSID_<i?"
+  # Customers that are sampled multiple times are added to the bootstrapping data with suffix "_BOOTSTRAP_ID_<i>"
   # Remove this suffix again to get the original Id and calculate the quantiles across a single customers multiple draws
   # regex: "ends with _BOOTSTRAP_ID_<one or more digits>"
   dt.boots[, Id := sub("_BOOTSTRAP_ID_[0-9]+$", "", Id)]
 
-  # quantiles for each predicted quantity
-  # select only the existing ones
+  # quantiles for each predicted quantity: select only the existing ones
   cols.predictions <- c("PAlive", "CET", "DERT", "DECT", "predicted.mean.spending", "predicted.total.spending", "predicted.CLV")
   cols.predictions <- cols.predictions[cols.predictions %in% colnames(dt.boots)]
 
   # Long-format for easier handling of different prediction columns
   dt.boots <- melt(dt.boots, id.vars="Id", measure.vars=cols.predictions, variable.name="variable", value.name="value")
-  dt.predictions.long <- melt(dt.predictions, id.vars="Id", measure.vars=cols.predictions, variable.name="variable", value.name="value")
 
-  # Calculate quantiles for each customer and prediction column
-  #
-  # Reversed quantiles
-  # [theta_star - q_upper(diff), theta_star - q_lower(diff)]
-  # where diff = theta_boot - theta_star
-  # Note that q_upper is used for the lower boundary and q_lower for the upper boundary while subtracting in both cases.
-  # Therefore quantile(probs=) is reversed.
+  ci.levels <- c((1-level)/2, 1-(1-level)/2)
 
-  # Calculate difference between bootstrapped and regular predictions
-  dt.boots[dt.predictions.long, value.star := i.value, on=c("Id", "variable")]
-  dt.boots[, value.diff := value - value.star]
-
-  levels <- c((1-level)/2, 1-(1-level)/2)
-  name.levels <- paste0(".CI.", levels*100) # outside table to avoid doing it for each customer
+  # create names outside table to avoid doing it for each customer
+  # only post-fix which is then appended to the content of col `variable`
+  ci.post.fixes <- paste0(".CI.", ci.levels*100)
 
+  # Calculate quantiles for each customer and prediction column, using
+  # ordinary quantiles
   dt.CI <- dt.boots[, list(
-    ci.name=name.levels,
-    # Have to use value.star[1] because there are >1 row if sampled more than once.
-    # names=FALSE is considerably faster.
-    ci.value = value.star[1] - quantile(value.diff, probs = rev(levels), names = FALSE)),
+    # store the lower and upper CI name directly with the calculated value
+    # this might could be moved to `ci.name := paste0(variable, ci.name)` but to
+    # be sure the
+    ci.name=ci.post.fixes,
+    # names=FALSE is considerably faster
+    ci.value = quantile(value, probs = ci.levels, names = FALSE)),
     keyby=c("Id", "variable")]
 
   # Presentable names
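The change above replaces the "reversed quantiles" interval with ordinary quantiles of the bootstrapped predictions. The sketch below contrasts the two in base R for a single customer and a single prediction column. The values in `boots` and `theta.star` are simulated for illustration only, and the variable names `ci.percentile` and `ci.reversed` are not taken from the package.

```r
set.seed(1)

theta.star <- 2.0                              # hypothetical point prediction
boots <- rnorm(1000, mean = 2.2, sd = 0.5)     # hypothetical bootstrapped predictions
level <- 0.9
ci.levels <- c((1 - level) / 2, 1 - (1 - level) / 2)

# Ordinary quantiles of the bootstrapped values (the approach after this commit).
ci.percentile <- quantile(boots, probs = ci.levels, names = FALSE)

# Reversed quantiles (the approach removed by this commit):
# [theta_star - q_upper(diff), theta_star - q_lower(diff)], diff = boot - theta_star
ci.reversed <- theta.star -
  quantile(boots - theta.star, probs = rev(ci.levels), names = FALSE)

# The reversed interval is the ordinary one mirrored around theta.star.
mirrored <- 2 * theta.star - rev(ci.percentile)
```

Because the reversed interval is a reflection around the point prediction, it can leave the natural range of a quantity (for example a probability such as `PAlive`), whereas ordinary quantiles always stay within the range of the bootstrapped values.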

R/data.R

Lines changed: 30 additions & 9 deletions
@@ -26,10 +26,10 @@
 #'
 #' @description
 #' This is a simulated dataset containing the entire purchase history of customers who made their first purchase at an
-#' apparel retailer on January 3rd 2005. In total the dataset contains 250 customers who made
-#' 3648 transactions between January 2005 and mid July 2006.
+#' apparel retailer on January 2nd 2005. In total the dataset contains 600 customers who made
+#' 3,187 transactions between January 2005 and end of December 2010.
 #'
-#' @format A \code{data.table} with 2353 rows and 3 variables:
+#' @format A \code{data.table} with 3,187 rows and 3 variables:
 #' \describe{
 #' \item{\code{Id}}{Customer Id}
 #' \item{\code{Date}}{Date of purchase}
@@ -45,12 +45,12 @@
 
 #' @name apparelStaticCov
 #' @title Time-invariant Covariates for the Apparel Retailer Dataset
-t
+
 #' @description
-#' This simulated data contains additional demographic information on all 250 customers in the
+#' This simulated data contains additional demographic information on all 600 customers in the
 #' "apparelTrans" dataset. This information can be used as time-invariant covariates.
 #'
-#' @format A \code{data.table} with 250 rows and 3 variables:
+#' @format A \code{data.table} with 600 rows and 3 variables:
 #'
 #' \describe{
 #' \item{Id}{Customer Id}
@@ -68,14 +68,14 @@ t
 #' @title Time-varying Covariates for the Apparel Retailer Dataset
 
 #' @description
-#' This simulated data contains direct marketing information on all 250 customers in the "apparelTrans" dataset.
+#' This simulated data contains seasonal information and additional covariates on all 600 customers in the "apparelTrans" dataset.
 #' This information can be used as time-varying covariates.
 #'
-#' @format A data.table with 20500 rows and 5 variables
+#' @format A data.table with 187,800 rows and 5 variables
 #' \describe{
 #' \item{Id}{Customer Id}
 #' \item{Cov.Date}{Date of contextual factor}
-#' \item{Marketing}{Direct marketing variable: number of times a customer was contacted with direct marketing in this time period}
+#' \item{High.Season}{Seasonal variable: 1 indicating a time-period that is considered "high season".}
 #' \item{Gender}{0=male, 1=female}
 #' \item{Channel}{Acquisition channel: 0=online, 1=offline}
 #' }
@@ -84,3 +84,24 @@ t
 #' @usage data("apparelDynCov")
 #' @docType data
 "apparelDynCov"
+
+#' @name apparelDynCovFuture
+#' @title Future Time-varying Covariates for the Apparel Retailer Dataset
+
+#' @description
+#' This simulated data contains seasonal information and additional covariates on all 600 customers in the "apparelTrans" dataset for the period after the last transaction in the dataset.
+#' This information can be used as time-varying covariates for predicting future customer behavior.
+#'
+#' @format A data.table with 56,400 rows and 5 variables
+#' \describe{
+#' \item{Id}{Customer Id}
+#' \item{Cov.Date}{Date of contextual factor}
+#' \item{High.Season}{Seasonal variable: 1 indicating a time-period that is considered "high season".}
+#' \item{Gender}{0=male, 1=female}
+#' \item{Channel}{Acquisition channel: 0=online, 1=offline}
+#' }
+#'
+#' @keywords datasets
+#' @usage data("apparelDynCovFuture")
+#' @docType data
+"apparelDynCovFuture"
