Skip to content

Use IEnumerable<KeyValuePair<string, object>> for Log() methods (again) #91

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
May 19, 2018

Conversation

cwe1ss
Copy link
Member

@cwe1ss cwe1ss commented Apr 26, 2018

In v0.10.0 we used Log(IEnumerable<KeyValuePair<string, object>> fields). With our big java parity-change in v0.11.0 we changed the signature to Log(IDictionary<string, object> fields). This was an attempt to translate Java's Map<K, V> interface.

Java's Map interface does not implement any other interfaces. But in C# IDictionary<K,V> implements ICollection<KeyValuePair<K, V>> and ICollection<T> implements IEnumerable<T> so moving back to IEnumerable<KeyValuePair<string, object>> would be the most general type and a reasonable deviation from Java.

Using IDictionary does not give any guarantees regarding uniqueness anyway as people could call Log() with the same key multiple times or they could also use a non-unique implementation of IDictionary. This means that tracers have to deal with duplicate keys anyway and define their own behavior (e.g. last wins)

This change would also make it easier to use more efficient data structures (e.g. linked lists).

This is a breaking change for tracers, but it will NOT break instrumentation code as it will just use a more general interface.

Copy link
Contributor

@ndrwrbgs ndrwrbgs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitely better approach, I don't see any scenarios that would be broken by this (tracer can check the type if they need a dictionary, and if it isn't one someone would have to make one anyway).

Only concern would have existed before anyway - in that IDictionary.GetEnumerator and IEnumerable.GetEnumerator are not performance optimal methods (both use a CallVirt and allocate objects, even if the enumerator is a struct since it has to be boxed into IEnumerator).

However, I doubt we'd be able to gain consensus on using Dictionary or KeyValuePair[] which implement more efficient access options.

@cwe1ss
Copy link
Member Author

cwe1ss commented May 7, 2018

@ndrwrbgs thanks for the review.

Only concern would have existed before anyway - in that IDictionary.GetEnumerator and IEnumerable.GetEnumerator are not performance optimal methods (both use a CallVirt and allocate objects, even if the enumerator is a struct since it has to be boxed into IEnumerator).

As noted here, it would be great if you could provide proof/a benchmark about the impact of CallVirt. I included a few links there that indicate that CallVirt is used almost always anyway.

However, I doubt we'd be able to gain consensus on using Dictionary or KeyValuePair[] which implement more efficient access options.

What would be your ideal signature?

@ndrwrbgs
Copy link
Contributor

ndrwrbgs commented May 7, 2018

Oi, I lost track of the other thread, I'm sorry. I cannot speak to the theory of this but I know first hand from measuring actual programs and improving performance of them at work that CallVirt is used for interface calls, but not (¿always?) concrete object calls.

For the first link - I'm thinking it must just be outdated.
For the second two - SO is down for my area so I can't comment :P

It's worth noting though that call and callvirt aren't really the question - both require a control flow jump. The question is more on what can be inlined and what cannot - a CallVirt can never be inlined because it does not know where the call will go.

My "ideal" signature would be to use interfaces as you've done, because they're better "code style". However if I were tasked with performance here then the story changes, as I've learned that for the sake of performance a lot of nice code (like interfaces) has to go away in C# :(

If we were to take perf as a top most important thing, then I probably would've avoided interfaces altogether for the types being passed around (ISpan and such) and tried to make a concrete sealed data-only-type implementation that could be handed off to a tracer only on write events (like Finish) so that a call like using (startSpan) { /* code */ } can be almost fully inlined (omitting of course that final call to something like tracer.LogSpan(span)). As it is today there has to be quite a few program jumps to instrument any piece of code which means performance wise (if we want tracing to be less costly than the code it's tracing) it's best suited only for code that already is containing jumps. The jumps today required by the interfaces:

  1. ITracer.BuildSpan
  2. ISpanBuilder.StartActive
  3. IScope.Dispose

This is too big a change to make just for C# though, and given the language interop I don't think OT will progress down such a route any time soon. Thus far I've been holding hope that it doesn't matter in the end, and the usage characteristics of users make it so that it's unimportant (they only instrument high enough granularity).

All in all, I didn't expect to write a book here but seems I have. It's been something I've wanted to bring up for a while not to suggest change - since I think that'll be too hard - but to at least in the back of your mind when working on the API as perf concerns are something I neglected completely when porting over the java code.

@ndrwrbgs
Copy link
Contributor

ndrwrbgs commented May 7, 2018

^ I got a little off subject here. To answer the question more directly...

In terms of style and proper design, my ideal would be ISpan Log(IEnumerable<KeyValuePair<string, object>> fields); - it's the most permissive and doesn't prevent the implementation from taking Dictionary even if it so chose.

In terms of performance, something more like ISpan Log(KeyValuePair<string, object>[] fields); would be best. Dictionary (and I think even List) override GetEnumerator with a non-interface implementation that returns a struct (thereby avoiding the CallVirt because you can only actually access their virtual interface method implementation via casting - and avoiding allocation by using a struct) but if I were designing an API I would rather tie people to array (a primitive) than a specific class implementation (even if that class is from the BCL).

@cwe1ss
Copy link
Member Author

cwe1ss commented May 8, 2018

@ndrwrbgs We should definitely have a general performance-related discussion in a separate issue. I think it's important to discuss the goals and non-goals of this API and how it relates to other tracing APIs like ETW.

My concern with forcing an array type is that this will lead to people doing stuff like myFieldList.ToArray() (which will copy the whole list) because they are too lazy to deal with arrays. Using IEnumerable<...> will allow performance-sensitive instrumentation code to pass arrays while still supporting the "user friendly" lists. Tracers could optimize for the array-case by checking the type (at the cost of one cast of course) and then doing a regular for-loop instead of using the enumerator.

@opentracing/opentracing-c-maintainers More feedback would be appreciated so that we can include this in the next release. thank you :)

@carlosalberto
Copy link
Contributor

@cwe1ss I think this PR makes lots of sense (and I've always wondered why wasn't this the preferred way in Java, as opposed to using Maps).

@cwe1ss cwe1ss force-pushed the cweiss/LogEnumerable branch from cff3847 to 48bb56c Compare May 14, 2018 18:28
@cwe1ss
Copy link
Member Author

cwe1ss commented May 14, 2018

I'll merge this on/after Friday (May 18th) if there's no objections.

@cwe1ss cwe1ss force-pushed the cweiss/LogEnumerable branch from 48bb56c to 426e7b7 Compare May 19, 2018 20:25
@cwe1ss
Copy link
Member Author

cwe1ss commented May 19, 2018

I'll merge this now. I want to do an RC-release anyway so if there's any concerns, feel free to create a new issue!

@cwe1ss cwe1ss merged commit d9b5244 into master May 19, 2018
@cwe1ss cwe1ss deleted the cweiss/LogEnumerable branch May 19, 2018 20:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants