-
Notifications
You must be signed in to change notification settings - Fork 64
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Docs: clarify thread-safety #66
Comments
When I was working on this for 4.8, at first tried to do some sort of equals/hashing to detect if something changed to the type mapper that'd require me to generate new code to reduce the amount of code generation going on. I was concerned that generating an assembly was a slow operation. After benchmarking and profiling a bit, I found it's not that costly. Prior to 4.8 I was also creating generated code each time you create a reader/writer; the only thing that's different is that I create a new dynamic assembly each time now. Unless your files are small, this shouldn't be noticeably slower than it was before. In your situation, probably the best thing to do is stick your type mapper somewhere you can access it across threads, then just have each thread call GetReader or GetWriter. Don't quote me on it but I'm pretty sure calling these methods results in a Mapper getting created, which has its own code generator instance, so each thread would generate it's own assembly. If you try this out and something awful happens, just let me know. My paranoia is that having the same assembly name might cause issues. |
The issue here is that finding concurrency issues is quite finicky and this is in an environment where I can't afford failures. It would be nice being able to generate a Mapper per file format in a static variable and then get readers out of it on multiple threads. What I understand from your answer is that thread-safety of mappers is not guaranteed, so I'll stick to one mapper per thread. Feel free to close this issue or keep it open if you like the suggestion to guarantee thread safety of mappers. |
Yeah, I similarly have no magical way to tell if there'd be unexpected threading issues with this code. All I can say for certain is that there's no longer any shared state (static values) among type mappers/readers/writers, as each reader/writer instance will get its own code generator now. The only thread-safety guarantee prior to 4.8.0 was that the shared dynamic assembly was created in a static constructor (essentially) which guarantees that only one assembly would be created and shared among instances. That static constructor nonsense isn't even necessary anymore and any other threading concerns would be the same as code prior to 4.8. If you look here: https://github.com/jehugaleahsa/FlatFiles/blob/master/FlatFiles/TypeMapping/SeparatedValueTypeMapper.cs#L1484 you can see that every new "read" operation gets its own "mapper", which gets its own "code generator". The only thing I see that's suspicious is this line here: https://github.com/jehugaleahsa/FlatFiles/blob/master/FlatFiles/TypeMapping/Mapper.cs#L49 where I am caching the reader. This style of caching is not thread-safe except I am pretty sure (not 100%) that I always call this code in single-threaded context. I am not sure I even need this caching mechanism anymore but VS's debugging is not working at the moment in order to see if those if-statements ever even get hit. Nonetheless, that code shouldn't even be callable across threads, so I think its a moot point. My only fear is related to how .NET handles two or more dynamic assemblies have the same name. It doesn't seem to cause any issues in a single-threaded context so I doubt it will cause issues in a multi-threaded context. I never need to look up the assembly and simply use the generated code in it directly, so the name is sort of irrelevant. I did a quick look to see where all I am using |
I dug a little more into the code and have some observations to share.
2.a. Have you considered using Lightweight Code Generation (LCG), aka DynamicMethod? You go through a lot of hassle with dynamically creating assemblies, assigning public keys and unique names, when you could just create a method? 2.b. Codegen for a typed mapper is 100% derived from |
Your comment about mutating your config after starting reading got me thinking. I ran into this problem before with people modifying the "schema options" objects a long time ago. It seems like I will have the same problem with people modifying columns and schema after-the-fact too. In the past I solved this by cloning the objects - I probably need to do something similar here. Then I should wrap my schema object and the returned column definitions in immutable wrappers with read-only interfaces. This would solve the issue related to the way I create the property mapping objects right now: I am just storing a |
If you use the overload CreateDelegate(Type,Object) the second parameter is a capture: it can be whatever you want and will be bound to the 1st argument of your dynamic method (so the delegate type has to have one less argument than the IL code).
I saw that and I was surprised. I was expecting a single dynamic method that would straight read a whole line instead of a loop over the metadata.
If you rewrite the whole thing (which is not required to use (Going off track here: C# source generators seem even better for this use case but that's the future and not really usable by most devs right now).
My thoughts on this: it's a weird use case to mutate your file format once defined. If you really want to support crazy use cases, I'd say the best design is a builder pattern. Maybe you can do this without breaking the public API. A final unrelated thought I had while reading the code: parsing flat files may benefit from using |
This library is great, thanks! ❤️
Could you please clarify proper usage if we're going to parse many files with the same format?
As we don't want to generate dynamic code for each file, I was wondering if it is possible to create static mappers/selectors for the file format we need and then to use
GetReader
on possibly concurrent threads.I guess the question is: are Mappers and Selectors thread-safe?
If not I would like to know if there is a way to re-use the dynamically generated code (esp. after the changes in 4.8) without putting the parsing in a critical section.
The text was updated successfully, but these errors were encountered: