Ignore bad data found #44
Comments
I'm not ignoring you, btw. I just haven't had time to respond yet. It would be really useful if you could write a unit test that demonstrates this not working. That would save me some time when I get to look into this feature. It's technically not valid CSV, but CSV very rarely follows any sort of standard. It makes sense to support this, though. As long as a quote isn't followed by a comma, I can see it being treated as part of the token. I am curious how I handle this today. I am guessing I just throw an exception, but I am not sure. A unit test would be really useful. |
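To make the rule in the comment above concrete, here is a minimal, self-contained sketch of a lenient splitter. It is not FlatFiles code: `LenientSplitter` and `Split` are hypothetical names invented for illustration. It assumes a single-character separator, treats a quote as significant only when it is the first character of a value, and accepts a quote as the closing quote only when it is followed by the separator or the end of the line. It does not unescape doubled quotes or handle values that span lines.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper for illustration only; not part of FlatFiles.
public static class LenientSplitter
{
    public static List<string> Split(string line, char separator = ',', char quote = '"')
    {
        var values = new List<string>();
        int i = 0;
        while (true)
        {
            bool handled = false;
            if (i < line.Length && line[i] == quote)
            {
                // Value starts with a quote: look for a quote that is
                // followed by the separator or by the end of the line.
                var sb = new StringBuilder();
                int j = i + 1;
                while (j < line.Length)
                {
                    bool atEnd = j + 1 == line.Length;
                    if (line[j] == quote && (atEnd || line[j + 1] == separator))
                    {
                        values.Add(sb.ToString());
                        i = j + 1; // now at the separator or the end
                        handled = true;
                        break;
                    }
                    sb.Append(line[j]); // any other quote is kept as data
                    j++;
                }
            }
            if (!handled)
            {
                // Unquoted value, or an opening quote that never closes:
                // take everything up to the next separator literally.
                int end = line.IndexOf(separator, i);
                if (end < 0) end = line.Length;
                values.Add(line.Substring(i, end - i));
                i = end;
            }
            if (i >= line.Length) break;
            i++; // skip the separator
        }
        return values;
    }
}
```

With this heuristic, the field from the original report, "LICEO ARTISTICO "GAUDENZIO FERRARI"", comes back with its inner quotes intact instead of raising an unmatched-quote error. The trade-off is ambiguity: an embedded quote that happens to sit directly before a separator is read as the closing quote.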
Encountered this challenge as well, a tab-delimited file whose values may contain quote characters as part of the value. For example ... with this record, the final value contains an unterminated quote that is part of the value. In reality, there should be a closing quote after noncash, but either way, I know this file does not quote values but instead includes quotes within values.
Within SeparatedValueRecordParser.cs (line 86) in function GetNextToken on line 101 this block exists:
In this case, it is encountering a quote " character and treats it as a quoted value. When it does not find the closing quote it throws an error within GetQuotedToken(). It would be great if an option existed that allowed quote characters to be ignored and treated as normal characters. Then the block above would be:
Without this option, I have worked around this by specifying a quote character that I know will not exist in the file hence turning actual quotes " into normal tokens and bypassing this check:
|
I'll take a look. The example data you posted seems a bit odd, being just one line. Can you provide some more background on the schema? I'm just going to operate under the assumption it's just tsv and go from there. |
I created a test with the data you posted above. Specifically, I tested how my library handles embedded quotes. The only time my library should care about quotes is if they are the first character at the start of a value. In your example, the quote seems to be in the middle of a value. This is different than what @shardick is running into, which is where a value is starting with a quote and an embedded quote is not a terminating quote. I am going to ask you to create a new ticket, providing expected vs actual data because, so far, I am not sure what issue you are reporting. Here is the test I wrote and it is passing:

    string source = "DebtConversionConvertedInstrumentAmount1\tus-gaap/2019" +
        "\t0\t0\tmonetary\tD\tC\tDebt Conversion, Converted Instrument, Amount" +
        "\tThe value of the financial instrument(s) that the original debt is being converted into in a noncash (or part noncash) transaction. \"Part noncash refers to that portion of the transaction not resulting in cash receipts or cash payments in the period.";
    StringReader stringReader = new StringReader(source);
    SeparatedValueOptions options = new SeparatedValueOptions()
    {
        IsFirstRecordSchema = false,
        Separator = "\t"
    };
    SeparatedValueReader reader = new SeparatedValueReader(stringReader, options);
    object[][] expected = new object[][]
    {
        new object[]
        {
            "DebtConversionConvertedInstrumentAmount1",
            "us-gaap/2019",
            "0",
            "0",
            "monetary",
            "D",
            "C",
            "Debt Conversion, Converted Instrument, Amount",
            "The value of the financial instrument(s) that the original debt is being converted into in a noncash (or part noncash) transaction. \"Part noncash refers to that portion of the transaction not resulting in cash receipts or cash payments in the period."
        }
    };
    assertRecords(expected, reader);

Also please include any version information to help me diagnose this. I just tried with your exact Btw, I think using |
Hi,
I'm trying to import a CSV with quotes. It works with CsvHelper, but FlatFiles is much better and I would like to use it.
A field has this value = "LICEO ARTISTICO "GAUDENZIO FERRARI"" (the inner quotes are not escaped).
SeparatedValueReader throws the error "SeparatedValueSyntaxException: A syntax error was encountered: Unmatched quote."
I would like to ignore this error and import the field anyway.
Is there a way to achieve this? Maybe with a Preprocessor? I can't figure out how.
Thank you