Use IEnumerable<KeyValuePair<string, object>> for Log() methods (again) #91

cwe1ss · 2018-04-26T20:07:09Z

In v0.10.0 we used Log(IEnumerable<KeyValuePair<string, object>> fields). With our big java parity-change in v0.11.0 we changed the signature to Log(IDictionary<string, object> fields). This was an attempt to translate Java's Map<K, V> interface.

Java's Map interface does not implement any other interfaces. But in C# IDictionary<K,V> implements ICollection<KeyValuePair<K, V>> and ICollection<T> implements IEnumerable<T> so moving back to IEnumerable<KeyValuePair<string, object>> would be the most general type and a reasonable deviation from Java.

Using IDictionary does not give any guarantees regarding uniqueness anyway as people could call Log() with the same key multiple times or they could also use a non-unique implementation of IDictionary. This means that tracers have to deal with duplicate keys anyway and define their own behavior (e.g. last wins)

This change would also make it easier to use more efficient data structures (e.g. linked lists).

This is a breaking change for tracers, but it will NOT break instrumentation code as it will just use a more general interface.

ndrwrbgs

Definitely better approach, I don't see any scenarios that would be broken by this (tracer can check the type if they need a dictionary, and if it isn't one someone would have to make one anyway).

Only concern would have existed before anyway - in that IDictionary.GetEnumerator and IEnumerable.GetEnumerator are not performance optimal methods (both use a CallVirt and allocate objects, even if the enumerator is a struct since it has to be boxed into IEnumerator).

However, I doubt we'd be able to gain consensus on using Dictionary or KeyValuePair[] which implement more efficient access options.

cwe1ss · 2018-05-07T19:33:44Z

@ndrwrbgs thanks for the review.

Only concern would have existed before anyway - in that IDictionary.GetEnumerator and IEnumerable.GetEnumerator are not performance optimal methods (both use a CallVirt and allocate objects, even if the enumerator is a struct since it has to be boxed into IEnumerator).

As noted here, it would be great if you could provide proof/a benchmark about the impact of CallVirt. I included a few links there that indicate that CallVirt is used almost always anyway.

However, I doubt we'd be able to gain consensus on using Dictionary or KeyValuePair[] which implement more efficient access options.

What would be your ideal signature?

ndrwrbgs · 2018-05-07T20:38:04Z

Oi, I lost track of the other thread, I'm sorry. I cannot speak to the theory of this but I know first hand from measuring actual programs and improving performance of them at work that CallVirt is used for interface calls, but not (¿always?) concrete object calls.

For the first link - I'm thinking it must just be outdated.
For the second two - SO is down for my area so I can't comment :P

It's worth noting though that call and callvirt aren't really the question - both require a control flow jump. The question is more on what can be inlined and what cannot - a CallVirt can never be inlined because it does not know where the call will go.

My "ideal" signature would be to use interfaces as you've done, because they're better "code style". However if I were tasked with performance here then the story changes, as I've learned that for the sake of performance a lot of nice code (like interfaces) has to go away in C# :(

If we were to take perf as a top most important thing, then I probably would've avoided interfaces altogether for the types being passed around (ISpan and such) and tried to make a concrete sealed data-only-type implementation that could be handed off to a tracer only on write events (like Finish) so that a call like using (startSpan) { /* code */ } can be almost fully inlined (omitting of course that final call to something like tracer.LogSpan(span)). As it is today there has to be quite a few program jumps to instrument any piece of code which means performance wise (if we want tracing to be less costly than the code it's tracing) it's best suited only for code that already is containing jumps. The jumps today required by the interfaces:

ITracer.BuildSpan
ISpanBuilder.StartActive
IScope.Dispose

This is too big a change to make just for C# though, and given the language interop I don't think OT will progress down such a route any time soon. Thus far I've been holding hope that it doesn't matter in the end, and the usage characteristics of users make it so that it's unimportant (they only instrument high enough granularity).

All in all, I didn't expect to write a book here but seems I have. It's been something I've wanted to bring up for a while not to suggest change - since I think that'll be too hard - but to at least in the back of your mind when working on the API as perf concerns are something I neglected completely when porting over the java code.

ndrwrbgs · 2018-05-07T20:41:51Z

^ I got a little off subject here. To answer the question more directly...

In terms of style and proper design, my ideal would be ISpan Log(IEnumerable<KeyValuePair<string, object>> fields); - it's the most permissive and doesn't prevent the implementation from taking Dictionary even if it so chose.

In terms of performance, something more like ISpan Log(KeyValuePair<string, object>[] fields); would be best. Dictionary (and I think even List) override GetEnumerator with a non-interface implementation that returns a struct (thereby avoiding the CallVirt because you can only actually access their virtual interface method implementation via casting - and avoiding allocation by using a struct) but if I were designing an API I would rather tie people to array (a primitive) than a specific class implementation (even if that class is from the BCL).

cwe1ss · 2018-05-08T19:33:56Z

@ndrwrbgs We should definitely have a general performance-related discussion in a separate issue. I think it's important to discuss the goals and non-goals of this API and how it relates to other tracing APIs like ETW.

My concern with forcing an array type is that this will lead to people doing stuff like myFieldList.ToArray() (which will copy the whole list) because they are too lazy to deal with arrays. Using IEnumerable<...> will allow performance-sensitive instrumentation code to pass arrays while still supporting the "user friendly" lists. Tracers could optimize for the array-case by checking the type (at the cost of one cast of course) and then doing a regular for-loop instead of using the enumerator.

@opentracing/opentracing-c-maintainers More feedback would be appreciated so that we can include this in the next release. thank you :)

carlosalberto · 2018-05-08T23:25:08Z

@cwe1ss I think this PR makes lots of sense (and I've always wondered why wasn't this the preferred way in Java, as opposed to using Maps).

cwe1ss · 2018-05-14T18:31:01Z

I'll merge this on/after Friday (May 18th) if there's no objections.

cwe1ss · 2018-05-19T20:44:09Z

I'll merge this now. I want to do an RC-release anyway so if there's any concerns, feel free to create a new issue!

cwe1ss added enhancement deviation-from-java breaks-tracers A breaking change for tracers labels Apr 26, 2018

This was referenced Apr 26, 2018

v0.12.0 Release Plan #92

Closed

Use IReadonlyDictionary<TKey, TValue> instead of IEnumerable<KeyValuePair<TKey, TValue>> #44

Closed

cwe1ss added this to the 0.12.0 milestone May 1, 2018

cwe1ss mentioned this pull request May 1, 2018

Support single key:value pair for ISpan.Log #85

Open

ndrwrbgs approved these changes May 7, 2018

View reviewed changes

cwe1ss force-pushed the cweiss/LogEnumerable branch from cff3847 to 48bb56c Compare May 14, 2018 18:28

Use IEnumerable<KeyValuePair<string, object>> for Log() methods (again)

426e7b7

cwe1ss force-pushed the cweiss/LogEnumerable branch from 48bb56c to 426e7b7 Compare May 19, 2018 20:25

cwe1ss merged commit d9b5244 into master May 19, 2018

cwe1ss deleted the cweiss/LogEnumerable branch May 19, 2018 20:44

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Use IEnumerable<KeyValuePair<string, object>> for Log() methods (again) #91

Use IEnumerable<KeyValuePair<string, object>> for Log() methods (again) #91

Uh oh!

cwe1ss commented Apr 26, 2018

Uh oh!

ndrwrbgs left a comment

Uh oh!

cwe1ss commented May 7, 2018

Uh oh!

ndrwrbgs commented May 7, 2018

Uh oh!

ndrwrbgs commented May 7, 2018

Uh oh!

cwe1ss commented May 8, 2018

Uh oh!

carlosalberto commented May 8, 2018

Uh oh!

cwe1ss commented May 14, 2018

Uh oh!

cwe1ss commented May 19, 2018

Uh oh!

Uh oh!

Use IEnumerable<KeyValuePair<string, object>> for Log() methods (again) #91

Use IEnumerable<KeyValuePair<string, object>> for Log() methods (again) #91

Uh oh!

Conversation

cwe1ss commented Apr 26, 2018

Uh oh!

ndrwrbgs left a comment

Choose a reason for hiding this comment

Uh oh!

cwe1ss commented May 7, 2018

Uh oh!

ndrwrbgs commented May 7, 2018

Uh oh!

ndrwrbgs commented May 7, 2018

Uh oh!

cwe1ss commented May 8, 2018

Uh oh!

carlosalberto commented May 8, 2018

Uh oh!

cwe1ss commented May 14, 2018

Uh oh!

cwe1ss commented May 19, 2018

Uh oh!

Uh oh!