
read_meta fails with large multi event data sets #82

Open
lukasjonkers opened this issue Oct 6, 2022 · 0 comments
lukasjonkers commented Oct 6, 2022

For data sets with many events (or with many variables), the metadata block at the top of the file is longer than 1,000 lines (e.g. doi 10.1594/PANGAEA.61061). Because only the first 1,000 lines are read in the read_meta function (in zzz.R):

lns <- readLines(x, n = 1000)
ln_no <- grep("\\*/", lns)

grep returns an empty vector, since */ only occurs at the end of the metadata block, and the subsequent

all_lns <- seq_len(ln_no)

fails.
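To illustrate, here is a minimal reproduction of the failure mode that needs no PANGAEA file; the lns vector stands in for a truncated metadata block:

```r
# Stand-in for the first n lines of a file whose metadata block is
# longer than n: the closing "*/" marker is not among them
lns <- c("/* DATA DESCRIPTION:", "Event(s): ...", "Parameter(s): ...")

ln_no <- grep("\\*/", lns)  # integer(0): no match found
length(ln_no)               # 0

# seq_len() requires a single non-negative integer, so passing the
# empty match result errors with "argument of length 0":
res <- try(seq_len(ln_no), silent = TRUE)
inherits(res, "try-error")  # TRUE
```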

I guess the easiest solution would be to increase or remove the limit on the number of lines that are read. Removing the limit entirely is perhaps impractical for very large data sets, and simply increasing it doesn't guarantee that the issue never occurs. Perhaps add a loop that sequentially increases n until */ is found? Something like

nlines <- 1000
lns <- readLines(x, n = nlines)
ln_no <- grep("\\*/", lns)
# readLines() returns fewer lines than requested at end of file, so the
# second condition stops the loop once the whole file has been read,
# avoiding an infinite loop if "*/" is missing altogether
while (length(ln_no) == 0 && length(lns) == nlines) {
  nlines <- nlines + 1000
  lns <- readLines(x, n = nlines)
  ln_no <- grep("\\*/", lns)
}

read_csv, also in zzz.R, has the same issue.
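Since both functions need the same behaviour, the loop could be factored into a shared helper. A minimal sketch (the function name read_upto_meta_end is hypothetical, not part of pangaear):

```r
# Hypothetical helper: read lines from file `x` in growing chunks until
# the end-of-metadata marker "*/" appears or the whole file is read.
read_upto_meta_end <- function(x, chunk = 1000) {
  n <- chunk
  lns <- readLines(x, n = n)
  # length(lns) == n means readLines() may have stopped short of EOF,
  # so there could be more lines (and possibly the marker) to read
  while (length(grep("\\*/", lns)) == 0 && length(lns) == n) {
    n <- n + chunk
    lns <- readLines(x, n = n)
  }
  lns
}
```

read_meta and read_csv could then call this helper, compute ln_no <- grep("\\*/", lns) on the returned vector, and raise an informative error if the marker is still not found.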
