Implement automatic ridge regression #124

witherscp · 2023-02-03T15:35:20Z

PR Description

Closes #123

Adds auto_reg parameter to vector_auto_regression. This will regularize the input matrix using scikit-learn's RidgeCV which searches for optimal alpha value. The auto_reg parameter is automatically set to True for ill-conditioned matrices.

Merge checklist

Maintainer, please confirm the following before merging:

adam2392 · 2023-02-03T19:22:14Z

mne_connectivity/vector_ar/var.py

+    auto_reg : bool, optional
+        Whether to perform automatic regularization of X matrix using RidgeCV, 
+        by default False. If matrix is not full rank, this will be adjusted to
+        True.


Suggested change

True.

True. If ``l2_reg`` is non-zero, then this must be set to `False`.

adam2392 · 2023-02-03T19:22:34Z

mne_connectivity/vector_ar/var.py

@@ -31,6 +32,10 @@ def vector_auto_regression(
        Autoregressive model order, by default 1.
    l2_reg : float, optional
        Ridge penalty (l2-regularization) parameter, by default 0.0.


Suggested change

Ridge penalty (l2-regularization) parameter, by default 0.0.

Ridge penalty (l2-regularization) parameter, by default 0.0.

If ``auto_reg`` is `True`, then this must be set to 0.0.

adam2392 · 2023-02-03T19:23:05Z

mne_connectivity/vector_ar/var.py

+    elif auto_reg and l2_reg:
+        raise ValueError("If l2_reg is set, then auto_reg must be set to False")


Can you add a unit-test for this case? Lmk if you need help w/ that.

adam2392 · 2023-02-03T19:24:20Z

mne_connectivity/vector_ar/var.py

@@ -31,6 +32,10 @@ def vector_auto_regression(
        Autoregressive model order, by default 1.
    l2_reg : float, optional
        Ridge penalty (l2-regularization) parameter, by default 0.0.
+    auto_reg : bool, optional
+        Whether to perform automatic regularization of X matrix using RidgeCV, 
+        by default False. If matrix is not full rank, this will be adjusted to


Suggested change

by default False. If matrix is not full rank, this will be adjusted to

by default False. If the data matrix has condition number less than 1e6, then this will be adjusted to

adam2392

Taking a quick look, I have another thought to make the API easier and more intuitive perhaps.

Instead of adding an auto_reg parameter, what are your thoughts on making l2_reg = 'auto' by default? So l2_reg can be 'auto', a float, or a list of floats, or None.

'auto': l2_reg will compute the optimal using a pre-defined grid of alphas, like you have
float: l2_reg will just be applied with this alpha value
list of floats: same as auto, but you specify the pre-defined grid of alphas
None, or if l2_reg is 0: no regularization

WDYT? This might make your/users life easier because you just have to worry about 1 parameter, which is documented vs figuring out how to play around with 2 parameters.

witherscp · 2023-02-03T19:33:27Z

I agree that using only the l2_reg argument will make things much simpler. Do you want me to implement this and then commit the changes?

adam2392 · 2023-02-03T19:36:03Z

I agree that using only the l2_reg argument will make things much simpler. Do you want me to implement this and then commit the changes?

That would be great and then I can help review the code!

witherscp · 2023-02-03T21:22:24Z

I opted to keep the underlying functions unchanged w/r/t the l2_reg parameter, but I had to overwrite the l2_reg value in line 185 to avoid problems. This isn't clean for the code but will not affect the user experience. I also thought it was helpful to include the cross-validation alpha values in the displayed model parameters, so that the user can view them if verbose is set to True.

mne_connectivity/vector_ar/var.py

adam2392

Looking good! Just some minor touchups. The next steps would be to:

add a unit-test for the new case
add an example, or augment an existing example demonstrating this usage and when it might be necessary

I think the CI are also showing some unit tests getting errors: https://github.com/mne-tools/mne-connectivity/actions/runs/4087951173/jobs/7049413823

You can follow the CONTRIBUTING guide to install relevant test packages and docs-building packages, so you can run the relevant checks locally since they are faster. You can also check coding style by running make pep. Lmk if you run into any issues.

Co-authored-by: Adam Li <adam2392@gmail.com>

witherscp · 2023-02-06T18:35:52Z

I've never worked with unit tests before. What would be a good unit test to add for the new features and how do I add that? As for the current tests that are failing, I am getting the same problem across 10 tests: numpy.core._exceptions._UFuncNoLoopError: ufunc 'multiply' did not contain a loop with signature matching types (dtype('<U4'), dtype('float64')) -> None. I'm not sure the cause of this error, but it may be related to the new default l2_reg value which is a str instead of float.

adam2392 · 2023-02-06T18:44:23Z

Re adding unit tests:

you can experiment w/ creating a dataset that should tests the functionality you have added. For example,

mne-connectivity/mne_connectivity/vector_ar/tests/test_var.py

Lines 28 to 86 in d71b067

    
           def create_noisy_data( 
        
               add_noise, 
        
               sigma=1e-4, 
        
               m=100, 
        
               random_state=12345, 
        
           ): 
        
               """Create noisy test data. 
        
               Generate a 2x2 linear system, and perturb 
        
               observations of the state variables with Gaussian noise. 
        
               Parameters 
        
               ---------- 
        
               add_noise : bool 
        
                   Whether to add noise or not. 
        
               sigma : float 
        
                   noise standard deviation. 
        
               m : int 
        
                   The number of samples. 
        
               return_A : bool 
        
                   Whether to return the A matrix 
        
               random_state : None | int | instance of ~numpy.random.RandomState 
        
                   If ``random_state`` is an :class:`int`, it will be used as a seed for 
        
                   :class:`~numpy.random.RandomState`. If ``None``, the seed will be 
        
                   obtained from the operating system (see 
        
                   :class:`~numpy.random.RandomState` for details). Default is 
        
                   ``None``. 
        
               Returns 
        
               ------- 
        
               sample_data : ndarray, shape (n_channels, n_samples) 
        
                   Observed sample data. Possibly with noise. 
        
               sample_eigs : np.ndarray 
        
                   The true eigenvalues of the system. 
        
               sample_A : np.ndarray 
        
                   (Optional) if ``return_A`` is True, then returns the 
        
                   true linear system matrix. 
        
               """ 
        
               rng = np.random.RandomState(random_state) 
        
               mu = 0.0 
        
               noise = rng.normal(mu, sigma, m)  # gaussian noise 
        
               A = np.array([[1.0, 1.0], [-1.0, 2.0]]) 
        
               A /= np.sqrt(3) 
        
               # compute true eigenvalues 
        
               true_eigvals = np.linalg.eigvals(A) 
        
               n = 2 
        
               X = np.zeros((n, m)) 
        
               X[:, 0] = np.array([0.5, 1.0]) 
        
               # evolve the system and perturb the data with noise 
        
               for k in range(1, m): 
        
                   X[:, k] = A.dot(X[:, k - 1]) 
        
                   if add_noise: 
        
                       X[:, k - 1] += noise[k - 1] 
        
               return X, true_eigvals, A

creates a noisy dataset, which is used to make sure that certain functions are robust to noise. You can add a function here to generate the dataset. You probably want to create a dataset that is "ill-conditioned". So augment a dataset that is not ill-conditioned with a few sensors/rows that are "almost" copies of a few other rows. Copy the rows, and then add some small noise. This should result in ill-conditioned data since the new rows are almost exact copies.

Then you want to add a unit test itself. The unit tests are named def test_**. Examples are in https://github.com/mne-tools/mne-connectivity/blob/main/mne_connectivity/vector_ar/tests/test_var.py.
We use pytest, which is used heavily in all fields. You might find it helpful/educational to peruse their docs: https://docs.pytest.org/en/7.2.x/. It's a good tool for helping you develop robust scientific software.

Lmk if you have any questions!

adam2392 · 2023-02-06T18:55:35Z

I've never worked with unit tests before. What would be a good unit test to add for the new features and how do I add that? As for the current tests that are failing, I am getting the same problem across 10 tests: numpy.core._exceptions._UFuncNoLoopError: ufunc 'multiply' did not contain a loop with signature matching types (dtype('<U4'), dtype('float64')) -> None. I'm not sure the cause of this error, but it may be related to the new default l2_reg value which is a str instead of float.

Re the errors you see, I think you are right. The issue is that l2_reg = 'auto' should be converted into a number. IIUC there are two cases for l2_reg == 'auto':

The condition number if below our threshold, therefore we will do CV to determine an optimal l2_reg value
The condition number does not fall below our threshold. Looking at the docstring and code, I don't think this case is covered, causing the errors. In this setting, we can either proceed with still running CV to determine the optimal l2_reg value, or we set l2_reg = 0.

So in summary:

l2_reg = None/0: no CV is done regardless of condition number. Maybe a warning is released
l2_reg = 'auto' (default now): CV is done only if condition number if high.
l2_reg = array of floats: CV is done using those array of floats regardless of condition number
l2_reg = number > 0: no CV is done, and the l2_reg is applied to all cases.

Does this seem right to you? If so, perhaps you can walk through the code and see where there might be issues? Once you write the unit-tests, those will also help in debugging the behavior in the different cases.

witherscp · 2023-02-08T18:28:42Z

Thanks for the debugging help in the last comment! I was able to fix the problem where I did not account for the case when l2_reg='auto' and data is not ill-conditioned.

I'm now having some trouble making the unit-tests. I noticed how you have a correct A matrix in your create_noisy_data() function which you can use to test whether the VAR-created A matrix is correct. I can create an ill-conditioned matrix using the below code, but how do I know the actual A matrix for this dataset?

rng = np.random.RandomState(0)
n_epochs, n_signals, n_times = 2, 4, 64
data = rng.randn(n_epochs, n_signals, n_times)
times = np.arange(n_times)
final_row = np.reshape(data[:,-1,:],(2,1,64))
illconditioned = np.concatenate((data, final_row), axis=1)
illconditioned[:,-1,:] += rng.normal(0,1e-6,128).reshape(2,-1)

Right now, my unit test would simply iterate through a set of parameters for l2_reg -- 'auto', None, 0, 0.3, np.array(0.1,1), [0.1,1] -- to make sure that no errors are encountered. But how can I know if the result is "correct"?

adam2392 · 2023-02-08T21:54:53Z

I'm now having some trouble making the unit-tests. I noticed how you have a correct A matrix in your create_noisy_data() function which you can use to test whether the VAR-created A matrix is correct. I can create an ill-conditioned matrix using the below code, but how do I know the actual A matrix for this dataset?
rng = np.random.RandomState(0)
n_epochs, n_signals, n_times = 2, 4, 64
data = rng.randn(n_epochs, n_signals, n_times)
times = np.arange(n_times)
final_row = np.reshape(data[:,-1,:],(2,1,64))
illconditioned = np.concatenate((data, final_row), axis=1)
illconditioned[:,-1,:] += rng.normal(0,1e-6,128).reshape(2,-1)

Hmm good question. Maybe you can try the following:

generate an A matrix that is ill-conditioned by: i) Maybe an easy way is to just create the eigenvalue matrix and set one of the eigenvalues to like 1e-6. Then generate random eigenvector matrices, or ii) create a lower/upper-triangular random matrix with diagonal entries <= 1 (to ensure stability) and then artificially set one of the diagonal entries to like 1e-6. Since they're triangular, the eigenvalues are exactly the diagonal entries.
generate a random x0 initial conditions vector and set 2/3 of the entries to be the same (i.e. 2/3 of the "artificial sensors" have the same starting condition).
generate forward in time
double check the condition number of the output test dataset (the generated time-series)
verify that your procedure does "well" (either compared to the non-regularized method, or just overall close to the ground-truth matrix)

Lmk if steps 1-4 end up giving you at least what you need in terms of the test dataset. / if you have any coding questions.

Right now, my unit test would simply iterate through a set of parameters for l2_reg -- 'auto', None, 0, 0.3, np.array(0.1,1), [0.1,1] -- to make sure that no errors are encountered. But how can I know if the result is "correct"?

We'll want to test this at least works on some simulated data. You also want to create a set of tests to verify that certain behavior is carried out in each case of l2. You can verify at the very minimum here that matrix is "close" to the ground-truth and/or that the model params passed to the connectivity data structure are what you might expect.

Unrelated note: I saw you work with Sara Inati? She was a collaborator when I was in my PhD. Say hi :).

witherscp · 2023-02-13T19:09:43Z

OK, I tried my best to follow your suggestions. I only needed to set one eigenvalue to 1e-6 to obtain an ill-conditioned matrix (as long as there are 12 channels in the dataset), so I did not need to set 2/3 of the starting conditions to the same value as well. Using the ill-conditioned data matrix, I was able to test out the l2_reg parameter with and without regularization.

Now the problem is that RidgeCV does not seem to outperform OLS in this sample dataset. In addition, many of the eigenvalues are dissimilar from the actual sample eigenvalues. What is your suggestion for next steps in debugging?

And yes, I do work with Sara Inati! Glad to see you have collaborated with us in the past. I'm thankful for your help on getting this improvement finished.

adam2392 · 2023-02-13T19:10:31Z

mne_connectivity/vector_ar/var.py

+            with warnings.catch_warnings():
+                warnings.filterwarnings(
+                    action='ignore',
+                    message="Ill-conditioned matrix"
+                )


Why do we need this?

RidgeCV tests out an array of alpha values and some of them do not regularize the matrix enough to avoid an ill-conditioned matrix error. If the user sees many of these messages pop up, they may think that something is going wrong, when in fact the expected behavior of the function is happening. RidgeCV will choose the best alpha value and that will be from an instance when this error was not thrown.

adam2392 · 2023-02-13T19:22:40Z

Now the problem is that RidgeCV does not seem to outperform OLS in this sample dataset. In addition, many of the eigenvalues are dissimilar from the actual sample eigenvalues. What is your suggestion for next steps in debugging?

Hmm I'll list some things to try systematically. Before you do so, it might be helpful to track the performance changes relative to your current attempt. The two metrics I have in mind are i) how closely you approximate the true A (so some type of matrix norm of the difference), and ii) how closely you approximate the set of true eigenvalues. Then you can see if any of these changes are at least moving in the right direction (i.e. RidgeCV better than OLS).

instead of simulating data with A being upper-triangular, perhaps it might help to have A not be upper-triangular. So to generate a random matrix with fixed set of eigenvalues, you just need to generate a random eigenvector matrix. Eigenvector matrices are unitary matrices (i.e. eigenvalues of 1 and some other special properties). Scipy seems to have a function for doing so: https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.unitary_group.html. Also see https://mathematica.stackexchange.com/questions/228560/generate-random-matrix-with-specific-eigenvalues about generating matrices with specific set of eigenvalues.

Intuitively, this will allow the data to mix more. Perhaps rn the problem is so easy that OLS is just the better thing to do.

I would also try adding more ill-conditioning, so a few more very small eigenvalues.
Another thing to try is make some of the initial conditions exactly the same. Intuitively, A is mixing x0 (the initial conditions).
Another thing to try is to change the dimensionality (i.e. number of sensors). Usually regularization helps in higher-dimensions. Rn you have 12 sensors and 100 time points. You can play around w/ increasing n_sensors and n_timepoints. My suspicion is that n_sensors plays a greater role.

Lmk if that helps, or if you are stuck anywhere.

adam2392 · 2023-03-06T20:01:27Z

Hey @witherscp just checking in on this to see if you need any assistance?

If there is trouble figuring out a rigorous scenario where this is tested, we can always disable this by default and instead you can add a short example on some simulated, or real data (see existing examples), where this shows "improvement".

witherscp · 2023-03-07T20:55:20Z

Hey @witherscp just checking in on this to see if you need any assistance?

If there is trouble figuring out a rigorous scenario where this is tested, we can always disable this by default and instead you can add a short example on some simulated, or real data (see existing examples), where this shows "improvement".

Hey, sorry for the delay; I got sidetracked by other tasks that I have been working on. I will try to test out a few scenarios where RidgeCV outperforms OLS before opting to disable the default. If this proves to be too difficult, though, it will be easy to show an example where it is necessary (as in the case of common average reference).

Can you provide a little more help with the example you suggested of using a non-upper triangular matrix? I am not that well-versed in linear algebra, so I am started to get confused by some of these ideas. What lines of code would create a non-upper triangular matrix with multiple small eigenvalues?

Implement automatic ridge regression

91aa8b8

adam2392 self-requested a review February 3, 2023 16:30

adam2392 reviewed Feb 3, 2023

View reviewed changes

Replace auto_reg with revised l2_reg argument

7f8a19d

adam2392 reviewed Feb 3, 2023

View reviewed changes

mne_connectivity/vector_ar/var.py Outdated Show resolved Hide resolved

adam2392 reviewed Feb 3, 2023

View reviewed changes

mne_connectivity/vector_ar/var.py Show resolved Hide resolved

adam2392 reviewed Feb 3, 2023

View reviewed changes

mne_connectivity/vector_ar/var.py Outdated Show resolved Hide resolved

adam2392 reviewed Feb 3, 2023

View reviewed changes

witherscp and others added 4 commits February 6, 2023 12:57

Update mne_connectivity/vector_ar/var.py

f4ade9b

Co-authored-by: Adam Li <adam2392@gmail.com>

Add warn message and document default alphas

acad4d5

Make PEP8 compliant

49c189c

Add case for l2_reg = None

720edbd

Fix failed pytests

5feb309

Price Withers added 2 commits February 13, 2023 14:04

Suppress warning for ill-conditioned matrix

d85e80e

Add automatic l2_reg test

eccb46a

adam2392 reviewed Feb 13, 2023

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Implement automatic ridge regression #124

Implement automatic ridge regression #124

witherscp commented Feb 3, 2023 •

edited by adam2392

Loading

adam2392 Feb 3, 2023

adam2392 Feb 3, 2023

adam2392 Feb 3, 2023

adam2392 Feb 3, 2023

adam2392 left a comment

witherscp commented Feb 3, 2023

adam2392 commented Feb 3, 2023

witherscp commented Feb 3, 2023 •

edited

Loading

adam2392 left a comment •

edited

Loading

witherscp commented Feb 6, 2023

adam2392 commented Feb 6, 2023

adam2392 commented Feb 6, 2023

witherscp commented Feb 8, 2023

adam2392 commented Feb 8, 2023 •

edited

Loading

witherscp commented Feb 13, 2023 •

edited

Loading

adam2392 Feb 13, 2023

witherscp Feb 13, 2023

adam2392 commented Feb 13, 2023

adam2392 commented Mar 6, 2023

witherscp commented Mar 7, 2023

	True.
	True. If ``l2_reg`` is non-zero, then this must be set to `False`.

	Ridge penalty (l2-regularization) parameter, by default 0.0.
	Ridge penalty (l2-regularization) parameter, by default 0.0.
	If ``auto_reg`` is `True`, then this must be set to 0.0.

		elif auto_reg and l2_reg:
		raise ValueError("If l2_reg is set, then auto_reg must be set to False")

	by default False. If matrix is not full rank, this will be adjusted to
	by default False. If the data matrix has condition number less than 1e6, then this will be adjusted to

Implement automatic ridge regression #124

Are you sure you want to change the base?

Implement automatic ridge regression #124

Conversation

witherscp commented Feb 3, 2023 • edited by adam2392 Loading

PR Description

Merge checklist

adam2392 Feb 3, 2023

Choose a reason for hiding this comment

adam2392 Feb 3, 2023

Choose a reason for hiding this comment

adam2392 Feb 3, 2023

Choose a reason for hiding this comment

adam2392 Feb 3, 2023

Choose a reason for hiding this comment

adam2392 left a comment

Choose a reason for hiding this comment

witherscp commented Feb 3, 2023

adam2392 commented Feb 3, 2023

witherscp commented Feb 3, 2023 • edited Loading

adam2392 left a comment • edited Loading

Choose a reason for hiding this comment

witherscp commented Feb 6, 2023

adam2392 commented Feb 6, 2023

adam2392 commented Feb 6, 2023

witherscp commented Feb 8, 2023

adam2392 commented Feb 8, 2023 • edited Loading

witherscp commented Feb 13, 2023 • edited Loading

adam2392 Feb 13, 2023

Choose a reason for hiding this comment

witherscp Feb 13, 2023

Choose a reason for hiding this comment

adam2392 commented Feb 13, 2023

adam2392 commented Mar 6, 2023

witherscp commented Mar 7, 2023

witherscp commented Feb 3, 2023 •

edited by adam2392

Loading

witherscp commented Feb 3, 2023 •

edited

Loading

adam2392 left a comment •

edited

Loading

adam2392 commented Feb 8, 2023 •

edited

Loading

witherscp commented Feb 13, 2023 •

edited

Loading