Add Unescape Support (#61)
* Add `SepReaderOptions.Unescape` option; set it to `true` to enable
automatic unescaping of columns, that is, removing the outer quotes and
every second inner quote. The default is `false`, so columns are not unescaped.
* Update benchmarks to include performance when `Unescape = true` as a
separate `Sep_Unescape` method.
* Internally, parsers have been made generic to support unescaping
without any performance impact when it is not used.
* Fixes #19
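
The unescaping rule described above (drop the outer quotes, then drop every second inner quote) can be illustrated with a minimal Python sketch. This is not Sep's implementation, which is written in C# and may treat malformed input differently; the function name `unescape` and its `quote` parameter are hypothetical and exist only for this illustration.

```python
def unescape(col: str, quote: str = '"') -> str:
    """Illustrative only: strip outer quotes and collapse doubled inner quotes.

    Assumes well-formed input. Columns that are not wrapped in quotes
    are returned unchanged.
    """
    # A column must start and end with a quote (and hold at least two
    # characters) to be treated as quoted.
    if len(col) < 2 or not (col.startswith(quote) and col.endswith(quote)):
        return col
    inner = col[1:-1]
    # Each escaped quote is written as two quotes; keeping one of every
    # pair is the same as removing every second inner quote.
    return inner.replace(quote * 2, quote)


if __name__ == "__main__":
    print(unescape('"a""b"'))   # a"b
    print(unescape('plain'))    # plain
```

With `Unescape` left at its default of `false`, the equivalent behavior would be to return the column text untouched, quotes included.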
nietras authored Nov 16, 2023
1 parent 66a8e1a commit 30962ef
Showing 39 changed files with 1,608 additions and 558 deletions.
259 changes: 166 additions & 93 deletions README.md

Large diffs are not rendered by default.

54 changes: 27 additions & 27 deletions benchmarks/AMD.Ryzen.9.5950X/FloatsReaderBench.md
@@ -1,41 +1,41 @@
```
BenchmarkDotNet v0.13.9+228a464e8be6c580ad9408e98f18813f6407fb5a, Windows 10 (10.0.19044.3086/21H2/November2021Update)
BenchmarkDotNet v0.13.10, Windows 10 (10.0.19044.3086/21H2/November2021Update)
AMD Ryzen 9 5950X, 1 CPU, 32 logical and 16 physical cores
.NET SDK 8.0.100-rc.2.23502.2
[Host] : .NET 8.0.0 (8.0.23.47906), X64 RyuJIT AVX2
Job-VQBAPT : .NET 7.0.12 (7.0.1223.47720), X64 RyuJIT AVX2
Job-QUTMRR : .NET 8.0.0 (8.0.23.47906), X64 RyuJIT AVX2
Job-WYCHAH : .NET 7.0.13 (7.0.1323.51816), X64 RyuJIT AVX2
Job-OXIFBK : .NET 8.0.0 (8.0.23.47906), X64 RyuJIT AVX2
InvocationCount=Default IterationTime=300.0000 ms MaxIterationCount=15
MinIterationCount=5 WarmupCount=6 Reader=String
```
| Method | Runtime | Scope | Rows | Mean | Ratio | MB | MB/s | ns/row | Allocated | Alloc Ratio |
|---------- |--------- |------- |------ |-----------:|------:|---:|--------:|-------:|------------:|------------:|
| Sep______ | .NET 7.0 | Row | 25000 | 2.642 ms | 1.00 | 27 | 10318.8 | 105.7 | 1.56 KB | 1.00 |
| Sylvan___ | .NET 7.0 | Row | 25000 | 3.025 ms | 1.14 | 27 | 9013.6 | 121.0 | 10.55 KB | 6.78 |
| ReadLine_ | .NET 7.0 | Row | 25000 | 14.068 ms | 5.31 | 27 | 1938.0 | 562.7 | 89986.82 KB | 57,808.35 |
| CsvHelper | .NET 7.0 | Row | 25000 | 47.926 ms | 18.16 | 27 | 568.9 | 1917.1 | 20.74 KB | 13.32 |
| Sep______ | .NET 8.0 | Row | 25000 | 2.571 ms | 0.97 | 27 | 10604.7 | 102.8 | 1.56 KB | 1.00 |
| Sylvan___ | .NET 8.0 | Row | 25000 | 2.925 ms | 1.11 | 27 | 9320.4 | 117.0 | 10.55 KB | 6.78 |
| ReadLine_ | .NET 8.0 | Row | 25000 | 13.122 ms | 5.01 | 27 | 2077.7 | 524.9 | 89986.83 KB | 57,808.35 |
| CsvHelper | .NET 8.0 | Row | 25000 | 33.886 ms | 12.83 | 27 | 804.6 | 1355.4 | 20.61 KB | 13.24 |
| Sep______ | .NET 7.0 | Row | 25000 | 2.649 ms | 1.00 | 27 | 10291.1 | 106.0 | 1.2 KB | 1.00 |
| Sylvan___ | .NET 7.0 | Row | 25000 | 3.073 ms | 1.16 | 27 | 8873.0 | 122.9 | 10.59 KB | 8.84 |
| ReadLine_ | .NET 7.0 | Row | 25000 | 13.823 ms | 5.09 | 27 | 1972.3 | 552.9 | 89986.84 KB | 75,160.30 |
| CsvHelper | .NET 7.0 | Row | 25000 | 40.128 ms | 15.14 | 27 | 679.4 | 1605.1 | 20.74 KB | 17.32 |
| Sep______ | .NET 8.0 | Row | 25000 | 2.579 ms | 0.98 | 27 | 10571.6 | 103.2 | 1.2 KB | 1.00 |
| Sylvan___ | .NET 8.0 | Row | 25000 | 3.059 ms | 1.15 | 27 | 8911.7 | 122.4 | 10.59 KB | 8.84 |
| ReadLine_ | .NET 8.0 | Row | 25000 | 13.625 ms | 5.13 | 27 | 2000.9 | 545.0 | 89986.83 KB | 75,160.29 |
| CsvHelper | .NET 8.0 | Row | 25000 | 34.168 ms | 12.90 | 27 | 797.9 | 1366.7 | 20.61 KB | 17.22 |
| | | | | | | | | | | |
| Sep______ | .NET 7.0 | Cols | 25000 | 3.077 ms | 1.00 | 27 | 8860.0 | 123.1 | 1.56 KB | 1.00 |
| Sylvan___ | .NET 7.0 | Cols | 25000 | 4.852 ms | 1.58 | 27 | 5618.9 | 194.1 | 10.55 KB | 6.77 |
| ReadLine_ | .NET 7.0 | Cols | 25000 | 14.240 ms | 4.58 | 27 | 1914.5 | 569.6 | 89986.84 KB | 57,735.92 |
| CsvHelper | .NET 7.0 | Cols | 25000 | 42.056 ms | 13.67 | 27 | 648.3 | 1682.2 | 28451.27 KB | 18,254.45 |
| Sep______ | .NET 8.0 | Cols | 25000 | 3.095 ms | 1.00 | 27 | 8809.6 | 123.8 | 1.56 KB | 1.00 |
| Sylvan___ | .NET 8.0 | Cols | 25000 | 4.609 ms | 1.50 | 27 | 5915.6 | 184.3 | 10.55 KB | 6.77 |
| ReadLine_ | .NET 8.0 | Cols | 25000 | 13.020 ms | 4.22 | 27 | 2093.9 | 520.8 | 89986.83 KB | 57,735.91 |
| CsvHelper | .NET 8.0 | Cols | 25000 | 36.009 ms | 11.70 | 27 | 757.1 | 1440.4 | 28451.15 KB | 18,254.37 |
| Sep______ | .NET 7.0 | Cols | 25000 | 3.574 ms | 1.00 | 27 | 7628.8 | 142.9 | 1.2 KB | 1.00 |
| Sylvan___ | .NET 7.0 | Cols | 25000 | 4.968 ms | 1.39 | 27 | 5488.1 | 198.7 | 10.59 KB | 8.83 |
| ReadLine_ | .NET 7.0 | Cols | 25000 | 14.102 ms | 3.90 | 27 | 1933.3 | 564.1 | 89986.84 KB | 75,037.89 |
| CsvHelper | .NET 7.0 | Cols | 25000 | 42.826 ms | 11.98 | 27 | 636.6 | 1713.0 | 28451.27 KB | 23,724.84 |
| Sep______ | .NET 8.0 | Cols | 25000 | 3.294 ms | 0.92 | 27 | 8275.5 | 131.8 | 1.2 KB | 1.00 |
| Sylvan___ | .NET 8.0 | Cols | 25000 | 4.717 ms | 1.32 | 27 | 5779.3 | 188.7 | 10.59 KB | 8.83 |
| ReadLine_ | .NET 8.0 | Cols | 25000 | 13.991 ms | 3.92 | 27 | 1948.7 | 559.6 | 89986.83 KB | 75,037.88 |
| CsvHelper | .NET 8.0 | Cols | 25000 | 39.277 ms | 10.94 | 27 | 694.1 | 1571.1 | 28451.15 KB | 23,724.74 |
| | | | | | | | | | | |
| Sep______ | .NET 7.0 | Floats | 25000 | 32.440 ms | 1.00 | 27 | 840.4 | 1297.6 | 8.89 KB | 1.00 |
| Sylvan___ | .NET 7.0 | Floats | 25000 | 68.701 ms | 2.12 | 27 | 396.8 | 2748.1 | 18.86 KB | 2.12 |
| ReadLine_ | .NET 7.0 | Floats | 25000 | 79.471 ms | 2.45 | 27 | 343.1 | 3178.8 | 89993.42 KB | 10,122.28 |
| CsvHelper | .NET 7.0 | Floats | 25000 | 133.372 ms | 4.13 | 27 | 204.4 | 5334.9 | 22039.48 KB | 2,478.96 |
| Sep______ | .NET 8.0 | Floats | 25000 | 21.978 ms | 0.68 | 27 | 1240.5 | 879.1 | 9.11 KB | 1.02 |
| Sylvan___ | .NET 8.0 | Floats | 25000 | 65.359 ms | 2.01 | 27 | 417.1 | 2614.4 | 18.84 KB | 2.12 |
| ReadLine_ | .NET 8.0 | Floats | 25000 | 72.653 ms | 2.24 | 27 | 375.3 | 2906.1 | 89990.3 KB | 10,121.93 |
| CsvHelper | .NET 8.0 | Floats | 25000 | 110.129 ms | 3.39 | 27 | 247.6 | 4405.2 | 22036.58 KB | 2,478.63 |
| Sep______ | .NET 7.0 | Floats | 25000 | 33.288 ms | 1.00 | 27 | 819.0 | 1331.5 | 8.18 KB | 1.00 |
| Sylvan___ | .NET 7.0 | Floats | 25000 | 78.853 ms | 2.37 | 27 | 345.8 | 3154.1 | 18.89 KB | 2.31 |
| ReadLine_ | .NET 7.0 | Floats | 25000 | 87.688 ms | 2.62 | 27 | 310.9 | 3507.5 | 89993.42 KB | 11,002.06 |
| CsvHelper | .NET 7.0 | Floats | 25000 | 143.571 ms | 4.29 | 27 | 189.9 | 5742.8 | 22039.48 KB | 2,694.42 |
| Sep______ | .NET 8.0 | Floats | 25000 | 23.568 ms | 0.71 | 27 | 1156.8 | 942.7 | 8.13 KB | 0.99 |
| Sylvan___ | .NET 8.0 | Floats | 25000 | 70.200 ms | 2.10 | 27 | 388.4 | 2808.0 | 18.87 KB | 2.31 |
| ReadLine_ | .NET 8.0 | Floats | 25000 | 81.667 ms | 2.45 | 27 | 333.8 | 3266.7 | 89990.3 KB | 11,001.68 |
| CsvHelper | .NET 8.0 | Floats | 25000 | 121.367 ms | 3.65 | 27 | 224.6 | 4854.7 | 22035.94 KB | 2,693.98 |
68 changes: 37 additions & 31 deletions benchmarks/AMD.Ryzen.9.5950X/PackageAssetsBench.md
@@ -1,42 +1,48 @@
```
BenchmarkDotNet v0.13.9+228a464e8be6c580ad9408e98f18813f6407fb5a, Windows 10 (10.0.19044.3086/21H2/November2021Update)
BenchmarkDotNet v0.13.10, Windows 10 (10.0.19044.3086/21H2/November2021Update)
AMD Ryzen 9 5950X, 1 CPU, 32 logical and 16 physical cores
.NET SDK 8.0.100-rc.2.23502.2
[Host] : .NET 8.0.0 (8.0.23.47906), X64 RyuJIT AVX2
Job-VQBAPT : .NET 7.0.12 (7.0.1223.47720), X64 RyuJIT AVX2
Job-QUTMRR : .NET 8.0.0 (8.0.23.47906), X64 RyuJIT AVX2
Job-WYCHAH : .NET 7.0.13 (7.0.1323.51816), X64 RyuJIT AVX2
Job-OXIFBK : .NET 8.0.0 (8.0.23.47906), X64 RyuJIT AVX2
InvocationCount=Default IterationTime=300.0000 ms MaxIterationCount=15
MinIterationCount=5 WarmupCount=6 Quotes=False
Reader=String
```
| Method | Runtime | Scope | Rows | Mean | Ratio | MB | MB/s | ns/row | Allocated | Alloc Ratio |
|---------- |--------- |------ |------ |-----------:|------:|---:|--------:|-------:|-------------:|------------:|
| Sep______ | .NET 7.0 | Row | 50000 | 2.481 ms | 1.00 | 29 | 11761.3 | 49.6 | 1.13 KB | 1.00 |
| Sylvan___ | .NET 7.0 | Row | 50000 | 3.117 ms | 1.26 | 29 | 9360.8 | 62.3 | 7.17 KB | 6.34 |
| ReadLine_ | .NET 7.0 | Row | 50000 | 13.023 ms | 5.20 | 29 | 2240.8 | 260.5 | 88608.25 KB | 78,287.19 |
| CsvHelper | .NET 7.0 | Row | 50000 | 51.579 ms | 20.76 | 29 | 565.8 | 1031.6 | 20.65 KB | 18.25 |
| Sep______ | .NET 8.0 | Row | 50000 | 2.436 ms | 0.98 | 29 | 11978.3 | 48.7 | 1.13 KB | 1.00 |
| Sylvan___ | .NET 8.0 | Row | 50000 | 2.929 ms | 1.18 | 29 | 9962.0 | 58.6 | 7.17 KB | 6.33 |
| ReadLine_ | .NET 8.0 | Row | 50000 | 11.788 ms | 4.76 | 29 | 2475.5 | 235.8 | 88608.24 KB | 78,287.18 |
| CsvHelper | .NET 8.0 | Row | 50000 | 42.562 ms | 17.15 | 29 | 685.6 | 851.2 | 20.59 KB | 18.19 |
| | | | | | | | | | | |
| Sep______ | .NET 7.0 | Cols | 50000 | 3.166 ms | 1.00 | 29 | 9218.1 | 63.3 | 1.13 KB | 1.00 |
| Sylvan___ | .NET 7.0 | Cols | 50000 | 5.460 ms | 1.72 | 29 | 5344.6 | 109.2 | 7.18 KB | 6.33 |
| ReadLine_ | .NET 7.0 | Cols | 50000 | 13.603 ms | 4.26 | 29 | 2145.1 | 272.1 | 88608.25 KB | 78,152.32 |
| CsvHelper | .NET 7.0 | Cols | 50000 | 83.833 ms | 26.47 | 29 | 348.1 | 1676.7 | 446.31 KB | 393.65 |
| Sep______ | .NET 8.0 | Cols | 50000 | 3.142 ms | 0.99 | 29 | 9288.8 | 62.8 | 1.13 KB | 1.00 |
| Sylvan___ | .NET 8.0 | Cols | 50000 | 5.181 ms | 1.64 | 29 | 5632.6 | 103.6 | 7.18 KB | 6.33 |
| ReadLine_ | .NET 8.0 | Cols | 50000 | 12.208 ms | 3.85 | 29 | 2390.4 | 244.2 | 88608.24 KB | 78,152.32 |
| CsvHelper | .NET 8.0 | Cols | 50000 | 70.302 ms | 22.22 | 29 | 415.1 | 1406.0 | 446.35 KB | 393.68 |
| | | | | | | | | | | |
| Sep______ | .NET 7.0 | Asset | 50000 | 38.120 ms | 1.00 | 29 | 765.5 | 762.4 | 13800.21 KB | 1.00 |
| Sylvan___ | .NET 7.0 | Asset | 50000 | 44.675 ms | 1.17 | 29 | 653.2 | 893.5 | 14025 KB | 1.02 |
| ReadLine_ | .NET 7.0 | Asset | 50000 | 113.648 ms | 2.98 | 29 | 256.8 | 2273.0 | 102133.41 KB | 7.40 |
| CsvHelper | .NET 7.0 | Asset | 50000 | 105.184 ms | 2.77 | 29 | 277.4 | 2103.7 | 13971.28 KB | 1.01 |
| Sep______ | .NET 8.0 | Asset | 50000 | 30.393 ms | 0.80 | 29 | 960.1 | 607.9 | 13799.66 KB | 1.00 |
| Sylvan___ | .NET 8.0 | Asset | 50000 | 38.855 ms | 1.02 | 29 | 751.0 | 777.1 | 14025.03 KB | 1.02 |
| ReadLine_ | .NET 8.0 | Asset | 50000 | 121.473 ms | 3.19 | 29 | 240.2 | 2429.5 | 102133.36 KB | 7.40 |
| CsvHelper | .NET 8.0 | Asset | 50000 | 93.300 ms | 2.45 | 29 | 312.8 | 1866.0 | 13972.05 KB | 1.01 |
| Method | Runtime | Scope | Rows | Mean | Ratio | MB | MB/s | ns/row | Allocated | Alloc Ratio |
|------------ |--------- |------ |------ |-----------:|------:|---:|--------:|-------:|------------:|------------:|
| Sep______ | .NET 7.0 | Row | 50000 | 2.537 ms | 1.00 | 29 | 11503.7 | 50.7 | 935 B | 1.00 |
| Sep_Unescape| .NET 7.0 | Row | 50000 | 2.569 ms | 1.01 | 29 | 11360.9 | 51.4 | 935 B | 1.00 |
| Sylvan___ | .NET 7.0 | Row | 50000 | 3.197 ms | 1.26 | 29 | 9128.8 | 63.9 | 7383 B | 7.90 |
| ReadLine_ | .NET 7.0 | Row | 50000 | 13.549 ms | 5.27 | 29 | 2153.8 | 271.0 | 90734847 B | 97,042.62 |
| CsvHelper | .NET 7.0 | Row | 50000 | 61.317 ms | 24.28 | 29 | 475.9 | 1226.3 | 21150 B | 22.62 |
| Sep______ | .NET 8.0 | Row | 50000 | 2.446 ms | 0.96 | 29 | 11931.8 | 48.9 | 934 B | 1.00 |
| Sep_Unescape| .NET 8.0 | Row | 50000 | 2.521 ms | 0.99 | 29 | 11576.6 | 50.4 | 934 B | 1.00 |
| Sylvan___ | .NET 8.0 | Row | 50000 | 2.965 ms | 1.17 | 29 | 9842.7 | 59.3 | 7382 B | 7.90 |
| ReadLine_ | .NET 8.0 | Row | 50000 | 12.622 ms | 5.00 | 29 | 2311.9 | 252.4 | 90734841 B | 97,042.61 |
| CsvHelper | .NET 8.0 | Row | 50000 | 43.642 ms | 17.19 | 29 | 668.6 | 872.8 | 21081 B | 22.55 |
| | | | | | | | | | | |
| Sep______ | .NET 7.0 | Cols | 50000 | 3.705 ms | 1.00 | 29 | 7875.9 | 74.1 | 938 B | 1.00 |
| Sep_Unescape| .NET 7.0 | Cols | 50000 | 4.391 ms | 1.19 | 29 | 6645.3 | 87.8 | 941 B | 1.00 |
| Sylvan___ | .NET 7.0 | Cols | 50000 | 5.598 ms | 1.51 | 29 | 5213.0 | 112.0 | 7389 B | 7.88 |
| ReadLine_ | .NET 7.0 | Cols | 50000 | 13.541 ms | 3.61 | 29 | 2155.1 | 270.8 | 90734847 B | 96,732.25 |
| CsvHelper | .NET 7.0 | Cols | 50000 | 77.553 ms | 20.93 | 29 | 376.3 | 1551.1 | 457022 B | 487.23 |
| Sep______ | .NET 8.0 | Cols | 50000 | 3.628 ms | 0.98 | 29 | 8043.9 | 72.6 | 937 B | 1.00 |
| Sep_Unescape| .NET 8.0 | Cols | 50000 | 3.984 ms | 1.08 | 29 | 7325.4 | 79.7 | 938 B | 1.00 |
| Sylvan___ | .NET 8.0 | Cols | 50000 | 5.157 ms | 1.39 | 29 | 5658.3 | 103.1 | 7386 B | 7.87 |
| ReadLine_ | .NET 8.0 | Cols | 50000 | 13.149 ms | 3.54 | 29 | 2219.3 | 263.0 | 90734841 B | 96,732.24 |
| CsvHelper | .NET 8.0 | Cols | 50000 | 70.761 ms | 19.10 | 29 | 412.4 | 1415.2 | 457060 B | 487.27 |
| | | | | | | | | | | |
| Sep______ | .NET 7.0 | Asset | 50000 | 34.415 ms | 1.00 | 29 | 847.9 | 688.3 | 14130898 B | 1.00 |
| Sep_Unescape| .NET 7.0 | Asset | 50000 | 34.273 ms | 1.00 | 29 | 851.4 | 685.5 | 14130898 B | 1.00 |
| Sylvan___ | .NET 7.0 | Asset | 50000 | 42.214 ms | 1.22 | 29 | 691.3 | 844.3 | 14296698 B | 1.01 |
| ReadLine_ | .NET 7.0 | Asset | 50000 | 113.604 ms | 3.29 | 29 | 256.9 | 2272.1 | 104584612 B | 7.40 |
| CsvHelper | .NET 7.0 | Asset | 50000 | 103.655 ms | 3.02 | 29 | 281.5 | 2073.1 | 14307286 B | 1.01 |
| Sep______ | .NET 8.0 | Asset | 50000 | 30.453 ms | 0.88 | 29 | 958.3 | 609.1 | 14130846 B | 1.00 |
| Sep_Unescape| .NET 8.0 | Asset | 50000 | 30.480 ms | 0.89 | 29 | 957.4 | 609.6 | 14130886 B | 1.00 |
| Sylvan___ | .NET 8.0 | Asset | 50000 | 38.244 ms | 1.11 | 29 | 763.0 | 764.9 | 14296692 B | 1.01 |
| ReadLine_ | .NET 8.0 | Asset | 50000 | 105.568 ms | 2.96 | 29 | 276.4 | 2111.4 | 104584668 B | 7.40 |
| CsvHelper | .NET 8.0 | Asset | 50000 | 85.633 ms | 2.49 | 29 | 340.8 | 1712.7 | 14306936 B | 1.01 |
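
As a reading aid for the PackageAssetsBench table above, the derived columns appear to follow directly from `Mean` and `Rows`. The short Python sketch below recomputes `ns/row` and `Ratio` for the .NET 7.0 `Sep______` and `Sep_Unescape` rows. The helper names are hypothetical and the formulas are inferred from the published numbers, not taken from the benchmark code.

```python
def ns_per_row(mean_ms: float, rows: int) -> float:
    """Mean time per row in nanoseconds (1 ms = 1e6 ns)."""
    return mean_ms * 1e6 / rows


def ratio(mean_ms: float, baseline_ms: float) -> float:
    """Mean relative to the baseline method in the same scope."""
    return mean_ms / baseline_ms


if __name__ == "__main__":
    # Row scope, .NET 7.0, Sep______: 2.537 ms over 50000 rows.
    print(round(ns_per_row(2.537, 50000), 1))  # 50.7
    # Cols scope, .NET 7.0: Sep_Unescape (4.391 ms) vs Sep______ (3.705 ms).
    print(round(ratio(4.391, 3.705), 2))       # 1.19
```

Read this way, in these runs unescaping adds roughly 8 to 19 percent in the `Cols` scope and about 3 percent or less in the `Row` and `Asset` scopes.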