tweak_initial_estimates: Matrix handling #640

Open
barrettk opened this issue Jan 18, 2024 · 8 comments · May be fixed by #656

@barrettk (Collaborator)

During development of the tweak_initial_estimates PR, we determined that some additional checks and modifications are required for matrix-type records (OMEGA and SIGMA).

For one, we have to check that jittered values still result in a positive-definite matrix. Altering off-diagonal values seems to increase the likelihood of breaking positive-definiteness, but doesn't make the problem any harder to solve.

Proposal:

  • We determine whether the final tweaked matrix is positive-definite. If not, pass it through a variation of Matrix::nearPD(). The exact function may not work as-is, because the matrix passed to nmrec::set_omega() uses NA values to denote 'skipping' that index (i.e., don't overwrite it). An NA may also denote that the value was not specified in the control stream file (the most common scenario).

Other questions:

  • Does this change for correlation matrices? That is, do we need some other check, rather than positive-definiteness, for correlation matrices?
  • We currently 'skip' FIXED values or BLOCK(n) records. Do we want this handling for any other flags commonly found in matrix-type records?

Examples to think about

$omega 1 sd fix 2
$theta 10 11
$omega block(2) fix
3
0.1 cor 4
$omega block(2)
3 fix
0.1 uni 4
@barrettk (Collaborator Author)

FYI: it was determined that we do in fact need additional handling for cor matrices. In this case, the diagonals are variances and the off-diagonal values specify correlations. In these instances, we first have to turn the matrix into a proper correlation or covariance matrix, check for positive-definiteness (making any necessary adjustments), and then convert it back to the original format (variances on the diagonal, correlations off-diagonal).
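A minimal sketch of that round trip, with hypothetical helper names (these are not existing bbr/nmrec functions), assuming the diagonal holds variances and the off-diagonals hold correlations:

```r
# Hypothetical helpers: convert between NONMEM's "variances on the diagonal,
# correlations off the diagonal" layout and a plain covariance matrix.
var_corr_to_cov <- function(mat) {
  sds <- sqrt(diag(mat))        # diagonal stores variances
  cov <- mat * outer(sds, sds)  # scale off-diagonals: r_ij * sd_i * sd_j
  diag(cov) <- diag(mat)        # keep the variances on the diagonal
  cov
}

cov_to_var_corr <- function(cov) {
  sds <- sqrt(diag(cov))
  mat <- cov / outer(sds, sds)  # back to correlations off the diagonal
  diag(mat) <- diag(cov)        # restore variances on the diagonal
  mat
}
```

A matrix converted this way could be handed to Matrix::nearPD() and then mapped back to the control stream layout afterwards.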

@barrettk (Collaborator Author)

We also determined that it would not be possible to check for positive-definiteness using the full matrix in the example below:

> nmrec::extract_omega(ctl)
      [,1]  [,2]  [,3] [,4] [,5] [,6] [,7] [,8]
[1,] 0.500    NA    NA   NA   NA   NA   NA   NA
[2,] 0.001 0.500    NA   NA   NA   NA   NA   NA
[3,] 0.001 0.001 0.500   NA   NA   NA   NA   NA
[4,] 0.001 0.001 0.001  0.5   NA   NA   NA   NA
[5,]    NA    NA    NA   NA 0.01   NA   NA   NA
[6,]    NA    NA    NA   NA 0.00 0.01   NA   NA
[7,]    NA    NA    NA   NA 0.00 0.00 0.01   NA
[8,]    NA    NA    NA   NA 0.00 0.00 0.00 0.01

We would first have to convert NA values to 0 to run Matrix::nearPD, and then change them back to NA after positive-definiteness was achieved. This is necessary because the indices of the NA occurrences are used to tell nmrec to 'skip' those values when setting them in the control stream file (they also denote which values were actually specified in the control stream).

This presents a problem: nearPD could change those 0's to some other value to ensure positive-definiteness, and turning them back to NA would undo that adjustment.

I proposed an alternative: we follow a similar approach for each matrix-type record, rather than the combined matrix. We would then check these two matrices separately:

      [,1]  [,2]  [,3]  [,4]
[1,] 0.500    NA    NA   NA 
[2,] 0.001 0.500    NA   NA   
[3,] 0.001 0.001 0.500   NA 
[4,] 0.001 0.001 0.001  0.5 
     [,1]  [,2] [,3] [,4]
[1,] 0.01   NA   NA   NA
[2,] 0.00 0.01   NA   NA
[3,] 0.00 0.00 0.01   NA
[4,] 0.00 0.00 0.00 0.01

After they are confirmed to be positive-definite, we could then diagonally concatenate them and set the other values to NA. @curtisKJ suggested the simpar package did something pretty similar.

cc @kyleam - this would require extract_omega to return the individual matrices (per record). I really like the full matrix output, so perhaps that could either (A) be an attribute, or (B) we could have a helper for formatting a given record as a matrix. I'm sure we will need to discuss this more and provide more background, but just giving you some preliminary background on this discussion.

@barrettk (Collaborator Author) commented Jan 25, 2024

cc @seth127 @curtisKJ @timwaterhouse
The only issue I'm seeing right now after my investigation is rounding (again). After making the matrix positive-definite, rounding can make it no longer positive-definite. That said, the example I tested included a negative covariance, which I haven't seen in any examples before. I do think rounding could still be an issue, though.

We can't follow the method we used for $THETA records (in terms of addressing rounding issues with bounds), as the loop for checking positive-definiteness would become a bit absurd. I'm starting to think we should just remove rounding as an argument, as it just seems like a lot to juggle while also balancing positive-definiteness.

  • My gut says we do this: attempt to round to the number of digits passed. However, if the rounded matrix is not positive-definite, we abandon the rounding for that matrix-type record and just use the returned matrix. I've done a lot of testing on this front, and I don't think we can reliably round while preserving positive-definiteness (at least with Matrix::nearPD, which seems to be the gold standard in R).

Functions used in this example

Functions
#' Function to check and modify a matrix for positive-definiteness
#' 
#' @param mat a symmetric matrix, or a matrix with only lower triangular values
#'  specified (assumed to be symmetric).
#' @param corr logical indicating if the matrix should be a correlation matrix.
#' @param digits number of digits to round the final matrix to.
#' 
#' @keywords internal
check_and_modify_pd <- function(mat, corr = FALSE, digits = 3) {
  
  # Coerce NA values to 0 for checking positive-definiteness
  # (pass in the original matrix to determine placement of NA values)
  mat_zero_spec <- fmt_mat_zeros(mat)
  mat_zero <- mat_zero_spec$mat
  
  # TODO: check if matrix is correlation or covariance matrix, and pass to nearPD
  if (!is_positive_definite(mat_zero)) {
    mat_zero <- Matrix::nearPD(
      mat_zero, base.matrix = TRUE, 
      corr = corr, ensureSymmetry = TRUE)$mat
  }
  
  # Round and check whether the matrix is still positive-definite
  mat_zero <- round(mat_zero, digits)
  if (!is_positive_definite(mat_zero)) {
    warning("Matrix is no longer positive-definite after rounding")
  }
  
  # Reset original NA values
  mat_zero[mat_zero_spec$na_indices] <- NA
  return(mat_zero)
}

#' Check if matrix is positive-definite
#' 
#' Assumes matrix is a square matrix, is symmetric, and contains only numeric 
#' values
#' 
#' @details 
#' References:
#' https://github.com/TomKellyGenetics/matrixcalc/blob/master/R/is.positive.definite.R
#' https://www.geeksforgeeks.org/fixing-non-positive-definite-correlation-matrices-using-r/
#' 
#' @param mat a square matrix
#' 
#' @keywords internal
is_positive_definite <- function(mat){
  eigenvalues <- eigen(mat, only.values = TRUE)$values
  return(all(eigenvalues > 0))
}


#' Make matrix symmetric and format NA values as zeros
#' 
#' @inheritParams check_and_modify_pd
#' 
#' @returns a list containing the formatted matrix and the original `NA` indices
#' 
#' @keywords internal
fmt_mat_zeros <- function(mat){
  # Store *original* NA indices
  na_indices <- is.na(mat)
  
  # make matrix symmetric
  mat_sym <- make_mat_symmetric(mat)
  
  # Set current NA values to 0 (if any)
  mat_sym[is.na(mat_sym)] <- 0
  
  # Return new matrix and indices of NA values
  return(
    list(
      mat = mat_sym,
      na_indices = na_indices
    )
  )
}


#' Make a lower triangular matrix a symmetric one
#' 
#' @inheritParams check_and_modify_pd
#' 
#' @noRd
make_mat_symmetric <- function(mat){
  # Mirror the lower triangle into the upper triangle. Note that
  # `mat[upper.tri(mat)] <- mat[lower.tri(mat)]` is wrong for n >= 4,
  # since the two triangles are extracted in different (column-major) orders.
  mat[upper.tri(mat)] <- t(mat)[upper.tri(mat)]
  return(mat)
}

#' Check if matrix is symmetric (with no tolerance)
#' 
#' @param mat a square matrix
#' 
#' @noRd
mat_is_symmetric <- function(mat){
  Matrix::isSymmetric(mat, checkDN = FALSE, tol = 0)
}
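As a quick standalone sanity check of the helpers above (re-declared inline with shortened names so the snippet runs on its own), a lower-triangular matrix like those returned by nmrec::extract_omega should mirror correctly and pass the eigenvalue test:

```r
# Standalone versions of the helpers above, for a quick sanity check
make_sym <- function(mat) {
  mat[upper.tri(mat)] <- t(mat)[upper.tri(mat)]  # mirror the lower triangle
  mat
}
is_pd <- function(mat) all(eigen(mat, only.values = TRUE)$values > 0)

# Lower triangle specified, upper triangle NA (as returned by extract_omega)
m <- matrix(c(0.09, 0.01, NA, 0.09), nrow = 2)
m_sym <- make_sym(m)
m_sym[is.na(m_sym)] <- 0  # no-op here; handles any remaining NAs

is_pd(m_sym)  # eigenvalues are 0.10 and 0.08, so TRUE
```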

Example

In this example, I am only checking a specific $OMEGA record, rather than the full matrix.

  • The setup isn't too important for illustrating the rounding issue, but helps to illustrate the full process.
Setup

For 1001.ctl (modified: first record has two diagonal values):

$OMEGA
0.09          ;[P] KOUThealthy
0.07          ; new eta value
$OMEGA BLOCK (3)
    0.09          ;[P] KOUTdmd
    0.01          ;[P] cov21
    0.09          ;[P] KINdmd
    0.01          ;[P] cov31
    0.01          ;[P] cov32
    0.09          ;[P] SCALAR
.mod <- read_model(file.path(MODEL_DIR_X, "1001"))
omega_full <- get_initial_est(.mod)$omegas

# Ideally, this separation would be done on the `nmrec` side
omega_1 <- omega_full[1:2, 1:2]
omega_2 <- omega_full[3:5, 3:5]

> omega_full
     [,1] [,2] [,3] [,4] [,5]
[1,] 0.09   NA   NA   NA   NA
[2,]   NA 0.07   NA   NA   NA
[3,]   NA   NA 0.09   NA   NA
[4,]   NA   NA 0.01 0.09   NA
[5,]   NA   NA 0.01 0.01 0.09
> omega_2
     [,1] [,2] [,3]
[1,] 0.09   NA   NA
[2,] 0.01 0.09   NA
[3,] 0.01 0.01 0.09

# Set a value as negative to ensure matrix is -not- positive-definite
> omega_2[3,3] <- -0.3
> omega_2
     [,1] [,2] [,3]
[1,] 0.09   NA   NA
[2,] 0.01 0.09   NA
[3,] 0.01 0.01 -0.3
Illustration of rounding being an issue
# Original Matrix
> mat <- omega_2
> mat
     [,1] [,2] [,3]
[1,] 0.09   NA   NA
[2,] 0.01 0.09   NA
[3,] 0.01 0.01 -0.3

# Coerce NA values to 0 for checking positive-definiteness
# pass in original matrix to determine placement of NA values
> mat_zero_spec <- fmt_mat_zeros(mat)
> mat_zero <- mat_zero_spec$mat

> mat_zero
     [,1] [,2]  [,3]
[1,] 0.09 0.01  0.01
[2,] 0.01 0.09  0.01
[3,] 0.01 0.01 -0.30

> # Make matrix positive-definite if it isn't already
>   if (!is_positive_definite(mat_zero)) {
+     mat_zero <- Matrix::nearPD(
+       mat_zero, base.matrix = TRUE, 
+       corr = corr, ensureSymmetry = TRUE)$mat
+   }

> # New positive-definite matrix
> mat_zero
            [,1]        [,2]         [,3]
[1,] 0.090187111 0.010187111 0.0025062166
[2,] 0.010187111 0.090187111 0.0025062166
[3,] 0.002506217 0.002506217 0.0001251551

> is_positive_definite(mat_zero)
[1] TRUE

> # Round and ensure matrix is still positive-definite
> mat_zero <- round(mat_zero, digits)
> mat_zero
      [,1]  [,2]  [,3]
[1,] 0.090 0.010 0.003
[2,] 0.010 0.090 0.003
[3,] 0.003 0.003 0.000

# No longer positive-definite after rounding
> is_positive_definite(mat_zero)
[1] FALSE

In the event the matrix had actually been positive-definite after rounding, I would then have done the following:

> # Reset original NA values
> mat_zero[mat_zero_spec$na_indices] <- NA
> mat_zero
      [,1]  [,2] [,3]
[1,] 0.090    NA   NA
[2,] 0.010 0.090   NA
[3,] 0.003 0.003    0

After doing this for each record, I would then diagonally concatenate the two matrices again:

# `check_and_modify_pd()` handles all the steps illustrated above^
omega_1 <- check_and_modify_pd(omega_1)
omega_2 <- check_and_modify_pd(omega_2)

# combine matrices into full matrix 
# Note: I wouldn't use this method, as it doesn't allow NAs, but showing for illustration purposes
omega_full <- as.matrix(Matrix::bdiag(list(omega_1, omega_2)))
  • I would then check for positive-definiteness of the combined matrix (with zeros instead of NAs)
  • I would have a dev_bug() in the event it wasn't, because everything I've read indicates that diagonally concatenated matrices should retain their positive-definiteness
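That expectation can be checked directly: the eigenvalues of a block-diagonal matrix are the union of the blocks' eigenvalues, so combining positive-definite blocks stays positive-definite. A small sketch (illustrative values, not from a real model):

```r
library(Matrix)

is_pd <- function(m) all(eigen(m, only.values = TRUE)$values > 0)

a <- matrix(c(0.09, 0.01, 0.01, 0.09), 2)    # first record (PD)
b <- matrix(c(0.04, 0.005, 0.005, 0.04), 2)  # second record (PD)
full <- as.matrix(bdiag(a, b))               # off-blocks filled with 0

is_pd(full)  # TRUE whenever both blocks are PD
```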

@barrettk (Collaborator Author) commented Jan 25, 2024

FWIW, I tried something like the following in an attempt to use the rounded matrix as the starting point, and that didn't work either:

while (iteration <= 100 && !is_positive_definite(mat_zero)) {
  # Use nearPD to ensure positive-definiteness
  mat_zero <- Matrix::nearPD(
    mat_zero, base.matrix = TRUE,
    corr = corr, ensureSymmetry = TRUE)$mat
  
  # Apply rounding to the positive-definite matrix
  mat_zero <- round(mat_zero, digits)
  
  # Increment iteration counter
  iteration <- iteration + 1
}

In this example it does work with digits = 5, but obviously we can't rely on that for any type of matrix. We could attempt to "shrink" the number of required decimal places using a similar approach (iteratively increasing digits), but is that really worth it if it doesn't match the specified number of digits?
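That "shrink the decimal places" idea could be sketched like this (a hypothetical helper, not something we've settled on): start at the requested digits and relax upward until rounding preserves positive-definiteness, falling back to the unrounded matrix.

```r
# Hypothetical: relax `digits` upward until rounding keeps the matrix PD
round_pd <- function(mat, digits, max_digits = 15) {
  is_pd <- function(m) all(eigen(m, only.values = TRUE)$values > 0)
  for (d in digits:max_digits) {
    rounded <- round(mat, d)
    if (is_pd(rounded)) return(list(mat = rounded, digits = d))
  }
  list(mat = mat, digits = NA)  # give up and keep the unrounded matrix
}
```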

However, it's possible the example I'm working with is flawed for one reason or another. Perhaps we would be able to round in some, or even most, cases. We could attempt to round, but not require it if it doesn't seem possible. Maybe something like this:

  • Note: I changed mat_zero to mat_init at this point
check_and_modify_pd <- function(mat, corr = FALSE, digits = 3) {
  
  # Coerce NA values to 0 for checking positive-definiteness
  # (pass in the original matrix to determine placement of NA values)
  mat_zero_spec <- fmt_mat_zeros(mat)
  mat_init <- mat_zero_spec$mat
  
  # TODO: check if matrix is correlation or covariance matrix, and pass to nearPD
  if (!is_positive_definite(round(mat_init, digits))) {
    mat_init <- Matrix::nearPD(
      mat_init, base.matrix = TRUE,
      corr = corr, ensureSymmetry = TRUE)$mat
    
    # Attempt to round the matrix if it can remain positive-definite
    if (is_positive_definite(round(mat_init, digits))) {
      mat_init <- round(mat_init, digits)
    } else {
      message("A record could not be rounded while ensuring positive-definiteness")
    }
  }
  
  # Reset original NA values
  mat_init[mat_zero_spec$na_indices] <- NA
  return(mat_init)
}

Example

> omega_1
     [,1] [,2]
[1,] 0.09   NA
[2,]   NA 0.07
> omega_2
     [,1] [,2] [,3]
[1,] 0.09   NA   NA
[2,] 0.01 0.09   NA
[3,] 0.01 0.01 -0.3
> check_and_modify_pd(omega_1)
     [,1] [,2]
[1,] 0.09   NA
[2,]   NA 0.07
> check_and_modify_pd(omega_2)
A record could not be rounded while ensuring positive-definiteness
            [,1]        [,2]         [,3]
[1,] 0.090187111          NA           NA
[2,] 0.010187111 0.090187111           NA
[3,] 0.002506217 0.002506217 0.0001251551

@kyleam (Collaborator) commented Jan 26, 2024

Some scattered thoughts...

FYI It was determined that we do in fact need additional handling for cor matrices. In this case, the diagonals are variances, and off-diagonal values specify correlation.

I'm confused by what you're referring to as "cor" matrices.

Given these NONMEM options

  • VARIANCE (default) or STANDARD controls whether diagonals are taken as variance or standard deviation, respectively
  • COVARIANCE (default) or CORRELATION controls whether the off-diagonals are taken as covariance or correlations, respectively

... I guess by "cor" matrix you're referring to a block marked with CORRELATION along with VARIANCE?

Wouldn't you also need to handle the mismatch in the other direction (i.e. STANDARD with COVARIANCE)?

Also, I guess CHOLESKY representations need to be considered.


I'm not spotting any discussion on xN values or VALUES(diag,odiag), at least explicitly. Say you have a record like

$OMEGA BLOCK(6)
0.1
0.01 0.1
(0.01)x2 0.1
(0.01)x3 0.1
(0.01)x4 0.1
(0.01)x5 0.1

where extract_omega would give you

      [,1] [,2] [,3] [,4] [,5] [,6]
[1,] 20.10   NA   NA   NA   NA   NA
[2,] 20.01 20.1   NA   NA   NA   NA
[3,] 20.01   NA 20.1   NA   NA   NA
[4,] 20.01   NA   NA 20.1   NA   NA
[5,] 20.01   NA   NA   NA 20.1   NA
[6,] 20.01   NA   NA   NA   NA 20.1

I take it the idea is to fill in those repeat values before computing the nearest positive-definite matrix. If that's the case, even if those repeated values are filled in (by bbr, by nmrec, or by using the final results of a previous run), set_omega will overwrite only the explicit values. So doesn't that mean the value that actually ends up in the control stream may not line up with the computed value? That seems like a showstopper that would require a major update to the nmrec::set_* helpers to solve.


For the rounding issues, have you checked whether you can just let NONMEM deal with those? From the docs:

With NONMEM 7.3 if the initial estimate of a block is not positive
definite because of rounding errors, a value will be added to the
diagonal elements to make it positive definite. A message in the
NONMEM report file will indicate that this was done.  E.g.,

  DIAGONAL SHIFT OF  1.1000E-03 WAS IMPOSED TO ENSURE POSITIVE DEF-
  INITENESS

@barrettk (Collaborator Author) commented Jan 26, 2024

In terms of cor, I was referring to the example I show in the main issue (shorthand for CORRELATION). @curtisKJ indicated that this means the off-diagonals specify correlations but the diagonals are still variances, meaning the matrix first has to be converted to either a correlation matrix or a covariance matrix before checking for positive-definiteness.

In terms of the other flags you mentioned, I hadn't looked into them too closely, but yeah, we would absolutely have to convert those as well. nearPD can only take a covariance or correlation matrix, so we would definitely have to convert STANDARD diagonals too.

I take it the idea is to fill in those repeat values before computing the nearest positive definite matrix. If that's the case, even if those repeated values are filled in (by bbr or by nmrec or by using final results of previous run), set_omega will overwrite only the explicit values. So, doesn't that mean the value that actually ends up in the control stream may not line up with the computed value? That seems like a showstopper that would require a major update to the nmrec::set_* helpers to solve.

I assume the SAME() flag is similar to the example you provided, though I can definitely see some differences. Seth mentioned the SAME() flag late yesterday, and it wasn't something I had thought about. I'm a little confused by your example though - where is the 20 coming from? (assuming that's a typo).

My original idea was to just overwrite the values not specified in the control stream (as an actual value) with NA. That wouldn't work though: nearPD would likely change each of those repeated values to ensure positive-definiteness, and there isn't a way to ensure they all stay the same (you can only use keepDiag = TRUE). I could envision a way to handle that during the jittering portion, but not during the nearPD() call. I'm honestly not sure how we could address that.

@kyleam (Collaborator) commented Jan 26, 2024

Im a little confused by your example though -where is the 20 coming from? (assuming that's a typo).

Sorry, just a typo (copied the wrong output):

     [,1] [,2] [,3] [,4] [,5] [,6]
[1,] 0.10   NA   NA   NA   NA   NA
[2,] 0.01  0.1   NA   NA   NA   NA
[3,] 0.01   NA  0.1   NA   NA   NA
[4,] 0.01   NA   NA  0.1   NA   NA
[5,] 0.01   NA   NA   NA  0.1   NA
[6,] 0.01   NA   NA   NA   NA  0.1

@kyleam (Collaborator) commented Jan 26, 2024

In terms of cor, I was referring to the example I show in the main issue (shorthand for CORRELATION). @curtisKJ indicated that this meant the off diagonals specified correlations, but the diagonals were still variances [...]

Okay, so in that particular example, given that VARIANCE is the default. But yes, as far as I understand, you need to consider the other flags when making that decision, rather than just saying CORR means X. In other words, the sets of options interact. I think the combinations you need to worry about are VARIANCE + CORRELATION and STANDARD + COVARIANCE, as well as CHOLESKY.
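Assuming those flags have already been parsed from the record, normalizing each combination to a plain covariance matrix before the positive-definiteness check might look like this (a sketch with hypothetical arguments, not nmrec's actual API; the CHOLESKY case assumes a full lower-triangular matrix with zeros, not NAs, above the diagonal):

```r
# Hypothetical: reduce a NONMEM matrix block to a plain covariance matrix.
# `standard`, `correlation`, and `cholesky` are assumed to be parsed flags.
to_covariance <- function(mat, standard = FALSE, correlation = FALSE,
                          cholesky = FALSE) {
  if (cholesky) {
    # CHOLESKY: the record stores a lower-triangular factor L, so cov = L L'
    return(mat %*% t(mat))
  }
  if (standard) {
    diag(mat) <- diag(mat)^2  # STANDARD: diagonal holds standard deviations
  }
  if (correlation) {
    # CORRELATION: scale off-diagonals by sd_i * sd_j to get covariances
    v <- diag(mat)
    mat <- mat * outer(sqrt(v), sqrt(v))
    diag(mat) <- v
  }
  mat
}
```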

barrettk added a commit that referenced this issue Feb 22, 2024
 - as referred to in `nmrec`: 'vpair' subtypes, or when parameter options represent `VALUES(diag,odiag)`
 - 'values' subtypes, or when parameter options contain the form `(0.01)x2 0.1`

This change pulls in a few internal `nmrec` functions (`matrix_is_vpair`, `param_options`, and a modified version of `param_x`). These functions are pretty simple, and I probably would have developed similar ones on my own if I hadn't thought to grab them from `nmrec`. That said, I imagine we will want to pull out some of the functionality `get_matrix_opts` is responsible for and move it to the `nmrec` side.

As of this commit, all concerns mentioned in #640 have been addressed. Additional tests are needed, and the `nmrec`/`bbr` refactor I mentioned above will likely come first.