Any possibility to mask the missing values when calculating EOF? #13

Open
matteodefelice opened this issue Nov 2, 2017 · 3 comments

Comments

@matteodefelice
Contributor

Hi, unfortunately datasets with NaN values are common. Is it possible to mask the NaN values in some way when computing the EOFs? Currently, the routine stops with "Error: There are missing values in the input data array".

@jbedia
Member

jbedia commented Feb 23, 2018

Hi @matteodefelice, sorry for our slow reply. We have looked at alternatives for dealing with missing data in the past, but we did not find a straightforward solution (there are of course ways of handling this...). Because we mainly use PCA for perfect condition predictors (reanalysis), we have not worried much about it so far. When NA/NaN values are present, the routine stops before the variance/covariance matrix is built, to prevent subsequent errors. Of course, suggestions for new functionality are welcome.
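
For readers following the thread, the guard described in this comment can be illustrated with a minimal sketch. It is written in Python using only the standard library and is not the package's actual code. The function names `has_missing` and `covariance_matrix` are hypothetical, and the data layout (a list of time steps, each a list of grid-point values) is an assumption. The check runs first because a single NaN would propagate through the sums and leave NaN entries in the covariance matrix.

```python
import math


def has_missing(rows):
    """Return True if any value in the time-by-gridpoint table is NaN."""
    return any(math.isnan(v) for row in rows for v in row)


def covariance_matrix(rows):
    """Sample covariance between columns (hypothetical helper).

    Stops early when missing values are present, mirroring the behaviour
    described in the comment above rather than reproducing the real routine.
    """
    if has_missing(rows):
        raise ValueError("There are missing values in the input data array")
    n = len(rows)
    p = len(rows[0])
    # Column means, then the usual (n - 1)-normalised cross products.
    means = [sum(row[j] for row in rows) / n for j in range(p)]
    return [
        [
            sum((row[i] - means[i]) * (row[j] - means[j]) for row in rows) / (n - 1)
            for j in range(p)
        ]
        for i in range(p)
    ]
```

Calling `covariance_matrix` on complete data returns a nested list, while any NaN in the input raises `ValueError` before any arithmetic is attempted.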

@jorgebanomedina
Collaborator

Hi @matteodefelice and @jbedia, sorry for the delay in solving this issue. We have included an imputation approach that replaces missing data with the mean or median value, in line with what is commonly used in the literature.
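
As an illustration only, mean or median imputation could look roughly like the sketch below, written in Python with the standard library. It is not the code that was added to the package. The function name `fill_missing`, its `method` argument, and the choice to compute the statistic per column (that is, per grid point across time) are all assumptions, since the comment does not say over which dimension the mean or median is taken.

```python
import math
import statistics


def fill_missing(rows, method="mean"):
    """Return a copy of rows with NaN replaced by a per-column statistic.

    Hypothetical helper: each column is filled with the mean or median of
    its own observed (non-NaN) values. The input is left unmodified.
    """
    if method not in ("mean", "median"):
        raise ValueError("method must be 'mean' or 'median'")
    stat = statistics.mean if method == "mean" else statistics.median
    filled = [list(row) for row in rows]
    for j in range(len(filled[0])):
        observed = [row[j] for row in filled if not math.isnan(row[j])]
        if not observed:
            # Nothing to estimate from: refuse rather than invent a value.
            raise ValueError(f"column {j} has no observed values")
        fill = stat(observed)
        for row in filled:
            if math.isnan(row[j]):
                row[j] = fill
    return filled
```

Because the helper returns a new table, the filled data can be handed to an EOF/PCA routine while the original array, with its gaps, stays available for inspection.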

@jbedia
Member

jbedia commented Jul 16, 2018

Hi @jorgebanomedina, no worries, it is great that you took the lead on this old issue. I would suggest moving the NA-filling routine out of this function so that it can be reused in other applications when needed.

3 participants