SOFATracer support report trace data to jaeger and skywalking #443

Open
glmapper opened this issue Jul 15, 2021 · 3 comments · May be fixed by #449

Comments

@glmapper
Contributor

@glmapper glmapper added this to the 3.1.2 milestone Jul 15, 2021
@chenzhao11

Span conversion for reporting to SkyWalking (SW)

Wrapping each span in its own segment for sending

Key data attributes

For trace format, note that:

  1. The segment is a unique concept in SkyWalking. It should include all spans for each request in a single OS process, which is usually a single language-based thread.
  2. There are three types of spans.
     • EntrySpan: represents a service provider, which is also the endpoint on the server end. As an APM system, SkyWalking targets the application servers. Therefore, almost all the services and MQ-consumers are EntrySpans.
     • LocalSpan: represents a typical Java method which is not related to remote services. It is neither a MQ producer/consumer nor a provider/consumer of a service (e.g. HTTP service).
     • ExitSpan: represents a client of service or MQ-producer. It is known as the LeafSpan in the early stages of SkyWalking. For example, accessing DB by JDBC, and reading Redis/Memcached are classified as ExitSpans.
  3. Cross-thread/process span parent information is called “reference”. Reference carries the trace ID, segment ID, span ID, service name, service instance name, endpoint name, and target address used on the client end (note: this is not required in cross-thread operations) of this request in the parent. See Cross Process Propagation Headers Protocol v3 for more details.
  4. Span#skipAnalysis may be TRUE, if this span doesn’t require backend analysis.

Segment

public class Segment {
    private String     traceId;
    private String     traceSegmentId;
    // All spans in this segment
    private List<Span> spans = new LinkedList<>();
    // Name of a group of services that perform the same operations. Each service name is shown as a separate node in the topology graph, and the metrics derived from spans under the same service name are aggregated as the metrics of that service.
    private String     service;
    // Instance name
    private String     serviceInstance;
    // Whether the segment includes all tracked spans.
    // In the production environment tracked, some tasks could include too many spans for one request context, such as a batch update for a cache, or an async job.
    // The agent/SDK could optimize or ignore some tracked spans for better performance.
    // In this case, the value should be flagged as TRUE.
    private boolean    isSizeLimited;

}

Span

public class Span {
    private int                      spanId;
    private int                      parentSpanId;
    private Long                     startTime;
    private Long                     endTime;
    private List<SegmentReference>   refs = new LinkedList<>();
    private String                   operationName;
    // peer is used in exit spans and is critical for building the topology graph
    private String                   peer;
    private SpanType                 spanType;
    // Span layer represent the component tech stack, related to the network tech.
    private SpanLayer                spanLayer;
    private int                      componentId;
    private boolean                  isError;
    private List<KeyStringValuePair> tags = new LinkedList<>();
    private List<Log>                logs = new LinkedList<>();
    private boolean                  skipAnalysis;
}

SegmentReference

Represents the relationship between segments.

public class SegmentReference {
    private RefType refType;
    private String  traceId;
    private String  parentTraceSegmentId;
    private int     parentSpanId;
    private String  parentService;
    private String  parentServiceInstance;
    private String  parentEndpoint;
    // The network address, including ip/hostname and port, which is used in the client side.
    // Such as Client --> use 127.0.11.8:913 -> Server
    // then, in the reference of entry span reported by Server, the value of this field is 127.0.11.8:913.
    // This plays the important role in the SkyWalking STAM(Streaming Topology Analysis Method)
    // For more details, read https://wu-sheng.github.io/STAM/
    private String  networkAddressUsedAtPeer;

}

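Wrapping each span in its own segment

SkyWalking has three span types: EntrySpan (a service provider), LocalSpan (a method that is not related to any remote service), and ExitSpan (a client of a service). During conversion, the span.kind of the SofaTracerSpan decides whether the result is an ExitSpan or an EntrySpan: server is converted to EntrySpan, and client is converted to ExitSpan.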
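
The span.kind rule above can be sketched as a small mapping function. This is only an illustration: `SwSpanType` and `SpanKindMapper` are hypothetical names rather than SkyWalking or SOFATracer classes, and falling back to LOCAL for a missing or unknown kind is an assumption the thread does not make.

```java
import java.util.Locale;

// Hypothetical stand-in for the SkyWalking span type; not the real protocol class.
enum SwSpanType { ENTRY, EXIT, LOCAL }

final class SpanKindMapper {
    private SpanKindMapper() {}

    // Maps the value of the span.kind tag to a SkyWalking span type.
    // Assumption: a missing or unrecognized kind is treated as LOCAL.
    static SwSpanType toSwSpanType(String spanKind) {
        if (spanKind == null) {
            return SwSpanType.LOCAL;
        }
        switch (spanKind.toLowerCase(Locale.ROOT)) {
            case "server":
                return SwSpanType.ENTRY; // server side: service provider
            case "client":
                return SwSpanType.EXIT;  // client side: caller of a service
            default:
                return SwSpanType.LOCAL;
        }
    }
}
```

For example, `SpanKindMapper.toSwSpanType("server")` yields `SwSpanType.ENTRY`.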

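Field conversion

segment

| Field | How it is derived |
| --- | --- |
| traceId | Taken from the traceId of the SofaTracerSpan; both are strings |
| traceSegmentId | Concatenation of the SpanId and TraceId of the SofaTracerSpan (the length should be capped, otherwise it may become too long); both are Strings |
| spans | The span.kind of the SofaTracerSpan decides which Span type it is converted to |
| service | Uses local.app |
| serviceInstance | Built as serviceName@IP |
| isSizeLimited | false |
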
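
A minimal sketch of the two derived segment fields, under stated assumptions: the class name `SegmentFields`, the concatenation order (traceId first), the cap of 64 characters, and the choice to keep the trailing characters when the ID is too long are all illustrative, since the thread only says the length should be limited.

```java
final class SegmentFields {
    // Assumed cap; the thread does not give a concrete limit.
    static final int MAX_SEGMENT_ID_LENGTH = 64;

    private SegmentFields() {}

    // Concatenates traceId and spanId. If the result is too long, the
    // trailing characters are kept so that the spanId part survives.
    static String traceSegmentId(String traceId, String spanId) {
        String id = traceId + spanId;
        if (id.length() <= MAX_SEGMENT_ID_LENGTH) {
            return id;
        }
        return id.substring(id.length() - MAX_SEGMENT_ID_LENGTH);
    }

    // Builds the instance name in the serviceName@IP form.
    static String serviceInstance(String serviceName, String ip) {
        return serviceName + "@" + ip;
    }
}
```

With the values used later in this thread, `SegmentFields.serviceInstance("dubbo-consumer", "172.28.16.26")` gives `dubbo-consumer@172.28.16.26`.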
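EntrySpan

| Field | How it is derived |
| --- | --- |
| spanId | An integer in [0, maxspannum]; there is only one span here, so it is fixed at 0 |
| parentSpanId | The segment holds a single span, so there is no parent; use -1 |
| startTime | Taken from the SofaTracerSpan |
| endTime | Taken from the SofaTracerSpan |
| refs | SegmentReference. A root span has no ref; whether the span is the root can be determined by checking whether the Ref list of the SofaTracerSpan is empty |
| operationName | Taken from the SofaTracerSpan |
| peer | Must be set when the type is ExitSpan, and is required for building the topology graph. Set from REMOTE_HOST and REMOTE_PORT |
| spanType | Mapped from whether span.kind is server or client |
| spanLayer | Definition: "Span layer represent the component tech stack, related to the network tech." Can be mapped from tracerType to the corresponding SW type; a mapping class is needed |
| componentId | The component the span belongs to. Every SOFATracer category except datasource has a corresponding conversion; which database a span belongs to can be determined further from its tags |
| isError | Determined by whether the RESULT_CODE tag equals SofaTracerConstant.RESULT_CODE_SUCCESS |
| tags | Taken from the SofaTracerSpan |
| logs | Taken from the SofaTracerSpan |
| skipAnalysis | false |
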
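
The peer and isError rules from the table can be sketched as follows. The tags are modelled as a plain `Map<String, String>`, and the tag keys and success code are placeholders standing in for the SOFATracer constants (REMOTE_HOST, REMOTE_PORT, RESULT_CODE, SofaTracerConstant.RESULT_CODE_SUCCESS); their literal values here are assumptions, as is treating a missing result code as "not an error".

```java
import java.util.Map;

final class SpanFieldRules {
    // Placeholder values standing in for the SOFATracer constants (assumed).
    static final String REMOTE_HOST = "remote.host";
    static final String REMOTE_PORT = "remote.port";
    static final String RESULT_CODE = "result.code";
    static final String RESULT_CODE_SUCCESS = "00";

    private SpanFieldRules() {}

    // Builds the exit-span peer as host:port. Returns the host alone when
    // no port is tagged, and an empty string when no host is tagged.
    static String peer(Map<String, String> tags) {
        String host = tags.get(REMOTE_HOST);
        String port = tags.get(REMOTE_PORT);
        if (host == null || host.isEmpty()) {
            return "";
        }
        return (port == null || port.isEmpty()) ? host : host + ":" + port;
    }

    // A span is an error when its result code is present and not the success code.
    static boolean isError(Map<String, String> tags) {
        String code = tags.get(RESULT_CODE);
        return code != null && !RESULT_CODE_SUCCESS.equals(code);
    }
}
```

Returning the bare host when the port is missing mirrors the SofaRPC case discussed later in the thread, where only an IP is available.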

refs in Span

Cross-thread/process span parent information is called “reference”. Reference carries the trace ID, segment ID, span ID, service name, service instance name, endpoint name, and target address used on the client end (note: this is not required in cross-thread operations) of this request in the parent.

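| Field | How it is derived |
| --- | --- |
| refType | Cross-thread or cross-process (the conversion currently always uses cross-thread) |
| traceId | The traceId of the input span |
| parentTraceSegmentId | Obtained via getParentId() |
| parentSpanId | With a single span per segment, this is always 0 |
| parentService | Cannot be obtained from a SofaTracerSpan across threads; it is unclear what data the baggage carries |
| parentServiceInstance | Cannot be obtained from a SofaTracerSpan across threads; it is unclear what data the baggage carries |
| parentEndpoint | Cannot be obtained from a SofaTracerSpan across threads; it is unclear what data the baggage carries |
| networkAddressUsedAtPeer | The network address the parent span used to call the current span. During conversion it is set depending on whether the LOCAL_HOST and LOCAL_PORT fields are available locally |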

Some fields cannot be populated yet and are left empty, and the tests so far are only simple examples.

Results

Dashboard

image

Trace data

image

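Wrapping all spans of one request within a thread in a single segment for sending

Maintain one segment internally. When a span finishes, check whether it is a client span or a server span. A server span means that all the work needed in the current thread is done, so the segment can be sent. A client span is added to the segment and processing continues.

segmentId: traceId + the spanId of the current EntrySpan
parentSegmentId: traceId + the parentSpanId stored in the SofaTracerSpanContext in the refs array of the current SofaTracerSpan
spanId: the last integer can be used, e.g. for 0.1.2 the spanId is 2
The other fields are converted in much the same way as above.
This approach matches the original design intent of segments in SkyWalking and can be tested later.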
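
The buffering scheme described above might look like the sketch below. It is a simplified illustration, not the code from the linked PR: `SegmentBuffer` and `FinishedSpan` are hypothetical types, spans are buffered per thread with a `ThreadLocal`, and the sender is any callback that accepts the list of spans making up one segment.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

final class SegmentBuffer {
    // Hypothetical minimal view of a finished SofaTracerSpan.
    static final class FinishedSpan {
        final String traceId;
        final String spanId;
        final boolean server;

        FinishedSpan(String traceId, String spanId, boolean server) {
            this.traceId = traceId;
            this.spanId = spanId;
            this.server = server;
        }
    }

    // Spans finished so far on the current thread.
    private final ThreadLocal<List<FinishedSpan>> pending =
            ThreadLocal.withInitial(ArrayList::new);
    private final Consumer<List<FinishedSpan>> sender;

    SegmentBuffer(Consumer<List<FinishedSpan>> sender) {
        this.sender = sender;
    }

    // Client spans are only buffered; a server span closes the segment
    // for this thread and hands all buffered spans to the sender.
    void onSpanFinished(FinishedSpan span) {
        List<FinishedSpan> spans = pending.get();
        spans.add(span);
        if (span.server) {
            sender.accept(new ArrayList<>(spans));
            spans.clear();
        }
    }

    // Takes the last integer of a dotted span ID, e.g. "0.1.2" -> 2.
    static int lastSpanIndex(String sofaSpanId) {
        int dot = sofaSpanId.lastIndexOf('.');
        return Integer.parseInt(sofaSpanId.substring(dot + 1));
    }
}
```

A `ThreadLocal` keeps the sketch close to the "one segment per thread" idea; work handed off to other threads would need its own segment and a reference back to the parent, which this sketch does not cover.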

@chenzhao11

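Discussion: why displaying the topology graph in SkyWalking is not feasible

How SkyWalking builds the topology graph: STAM, an automatic topology detection method for large-scale distributed application systems

Parameters required to display the topology graph correctly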

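  • The Exit Span must have its peer set. In Dubbo it can be obtained by concatenating remote.host and remote.port; in SofaRPC the IP and port information is obtained from remote.ip.

  • The peer in the Entry Span (the address of the Exit Span that called it) can be obtained in Dubbo by concatenating remote.host and remote.port. In SofaRPC, remote.ip contains only the IP and no port, and there is no local.ip either, so the local IP has to be looked up by the converter itself.

  • The SegmentReference in the Entry Span is defined as follows: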

    public class SegmentReference {
        private RefType refType;
        private String  traceId;
        private String  parentTraceSegmentId;
        private int     parentSpanId;
        private String  parentService;
        private String  parentServiceInstance;
        private String  parentEndpoint;
        // The network address, including ip/hostname and port, which is used in the client side.
        private String  networkAddressUsedAtPeer;
    }

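    Of these, traceId, parentTraceSegmentId, and parentSpanId are the key fields for building a trace, and all of them can be obtained. The networkAddressUsedAtPeer field is what links calls together when the topology graph is built: the chain is connected when the peer address of an Exit Span equals some networkAddressUsedAtPeer. In Dubbo this field can be obtained from local.host and local.port, but in a SofaRPC server span the local IP and port are not available. The only option is to use the network API to get the first valid local IP address, with the client span likewise using only the IP. The IP returned by the network API may differ from the actual IP address, however, which breaks the chain, as shown in Figure 1.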

image

Figure 1: In SofaRPC, the IP obtained via the network API differs from the one in the client span, and no parent service information is set in the SegmentReference
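
The matching rule behind the broken chain can be illustrated with a deliberately simplified sketch. It is not SkyWalking's STAM implementation: `PeerResolver` is a hypothetical class that only shows why an exact string match between an Exit Span's peer and an Entry Span's networkAddressUsedAtPeer is needed, and the service name `dubbo-provider` used in the note below is made up.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

final class PeerResolver {
    // Address announced by an entry span's reference -> service that reported it.
    private final Map<String, String> addressToService = new HashMap<>();

    // Called for each entry span: its SegmentReference states which
    // address the caller used to reach this service.
    void registerEntry(String networkAddressUsedAtPeer, String service) {
        addressToService.put(networkAddressUsedAtPeer, service);
    }

    // Called for each exit span: the link exists only when the peer
    // string matches a registered address exactly.
    Optional<String> resolveExitPeer(String peer) {
        return Optional.ofNullable(addressToService.get(peer));
    }
}
```

If the server registers `172.28.16.26:20880` for `dubbo-provider`, an exit span with that exact peer resolves to the service, while a peer carrying a different IP for the same host resolves to nothing and the chain stays disconnected.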

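Why it is not feasible

  • The three fields parentService, parentServiceInstance, and parentEndpoint are indispensable for building the topology graph. Without them, a blank instance is displayed on the server side, as shown in Figure 2. In SOFATracer, however, the only values propagated in the context are traceId, spanId, parentId, sysBaggage, and bizBaggage, and none of them yields those three fields, so the topology graph cannot be built.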

image

Figure 2: A blank instance displayed on the server side

After the three fields parentService, parentServiceInstance, and parentEndpoint are set, the graph is displayed correctly, as shown in Figure 3.

                segmentReference.setParentService("dubbo-consumer");
                segmentReference.setParentServiceInstance("dubbo-consumer@172.28.16.26");
                segmentReference.setParentEndpoint("HelloService#SayHello");

image

Figure 3: The topology graph displayed correctly

chenzhao11 added a commit to chenzhao11/sofa-tracer that referenced this issue Sep 7, 2021
@chenzhao11 chenzhao11 linked a pull request Sep 7, 2021 that will close this issue
@glmapper
Contributor Author

@nobodyiam @xzchaoo Please evaluate whether this PR can be merged and released.
