List of bugs encountered while using SeaTunnel 2.3.8, most of them already fixed #8785

Open
wangxiaoliang001 opened this issue Feb 21, 2025 · 7 comments

Comments

wangxiaoliang001 commented Feb 21, 2025

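It is a good project, but unfortunately it has far too many problems. I struggled with it for nearly three months, from December 2024 to February 2025, going from getting started to giving up.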

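TiDB 6.1.3:
1. Only usable for syncing static data; incremental sync does not work (incremental sync broke after a Region split/merge; fixed)
2. The incremental task stops after running for a while (a sync lock made the TiKV gRPC connection pool unavailable, "Check for unhealthy stores"; fixed)
3. Columns with default values arrive as null after sink (the schema default value check was missing when assembling the SeaTunnelRow instance; fixed)
4. Out-of-memory during the full snapshot phase of large tables (data was sunk only after the whole table had been loaded; fixed)
5. Partitioned tables cannot be synced (fixed)
6. Tables with composite primary keys cannot be synced (primary key column data was missing; fixed)
7. Incremental sync is abnormal for some tables: TiKV does push data, but very little, and it does not match the actual number of changes (cause not found; not fixed)
8. Incremental sync drives CPU usage to 100% (fixed)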
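
For illustration only, item 3 can be pictured with a minimal sketch. This is not the actual patch: `DefaultValueFill`, `toRowValues`, and the map-based row representation are hypothetical names, and it assumes the bug was that columns absent from the decoded row were emitted as null instead of falling back to the schema default.

```java
import java.util.Map;

final class DefaultValueFill {
    // Hypothetical helper: builds row values in schema order. A column that is
    // absent from the decoded row falls back to its schema default; a column
    // that is present keeps its value, even if that value is null.
    static Object[] toRowValues(String[] columns,
                                Map<String, Object> defaults,
                                Map<String, Object> decoded) {
        Object[] values = new Object[columns.length];
        for (int i = 0; i < columns.length; i++) {
            String column = columns[i];
            if (decoded.containsKey(column)) {
                values[i] = decoded.get(column);   // explicit value, may be null
            } else {
                values[i] = defaults.get(column);  // null when no default is defined
            }
        }
        return values;
    }
}
```

Distinguishing "absent" from "explicitly null" with `containsKey` is a design choice of this sketch; a real connector would take that information from the row decoder.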

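The TiDB connector is completely unusable. It should be removed from the officially advertised support list so that prospective users are not misled into costly decisions.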

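Kafka 2.8:
1. Kafka incremental data cannot be consumed (the offset was set incorrectly; fixed)
2. After a restart, a Kafka job cannot resume from the last offset; only group_offset supported this (all other modes are now supported; fixed)
3. Kafka does not support JSON fields (fixed)
4. Out-of-memory when Kafka consumes a large amount of data (fixed)
5. Kafka Sink statistics are inaccurate: AT_LEAST_ONCE semantics have been added, but the actual count still differs considerably from the reported statistics, and no log of lost records is provided (not fixed)
6. Kafka Sink converts tinyint(1) incorrectly: null => 0, true => 0 (fixed)
7. Kafka supports sinking by partition but not sourcing by partition (fixed)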
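
To make item 6 concrete, here is a minimal sketch of the mapping one would expect for a nullable tinyint(1) value. `TinyIntConversion` and `toTinyInt` are hypothetical names and not part of the SeaTunnel code base; the sketch only shows the intended results (null stays null, true becomes 1).

```java
final class TinyIntConversion {
    // Hypothetical converter: maps a nullable Boolean to the integer form of
    // tinyint(1) without collapsing null or true into 0.
    static Integer toTinyInt(Boolean value) {
        if (value == null) {
            return null;       // keep SQL NULL as null
        }
        return value ? 1 : 0;  // true -> 1, false -> 0
    }
}
```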

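MySQL:
1. The full snapshot phase is very slow for large tables (we do not use MySQL in many scenarios, so few problems have been found so far)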

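Zeta 2.3.8 engine defects:
1. Scheduling does not take resource usage into account. In practice, heavily loaded nodes keep receiving tasks while idle nodes stay idle
2. Tasks lack an isolation mechanism. A failure in some tasks restarts the node, its tasks migrate to other nodes, and the whole cluster then restarts node after node
3. Engine components make heavy use of unbounded queues. The problem is common to most components, so I am skeptical of the production deployments currently being advertised
4. Development and debugging are difficult. Starting from a main function is blocked unless you adapt to the SPI startup mechanism, and debugging gets much harder when the same component has to connect to many kinds of data sources
5. A good default GC strategy should be provided. GC pauses during task execution are too long, and in container environments health probes easily fail and force a restart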
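
The concern in item 3 can be illustrated with plain JDK queues. This is a generic sketch, not code from the Zeta engine; `BoundedQueueSketch` and `fill` are hypothetical names. It shows that an unbounded queue accepts every record when the consumer is slow, so memory grows without limit, while a bounded queue pushes back on the producer.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

final class BoundedQueueSketch {
    // Offers `count` records with no consumer running and returns how many
    // the queue accepted. offer() returns false immediately when a bounded
    // queue is full; a real producer would typically block with put() instead.
    static int fill(BlockingQueue<byte[]> queue, int count) {
        int accepted = 0;
        for (int i = 0; i < count; i++) {
            if (queue.offer(new byte[16])) {
                accepted++;
            }
        }
        return accepted;
    }

    public static void main(String[] args) {
        // Unbounded: every record is buffered in memory.
        System.out.println(fill(new LinkedBlockingQueue<>(), 1000));   // prints 1000
        // Bounded to 100: the producer is pushed back once the queue is full.
        System.out.println(fill(new ArrayBlockingQueue<>(100), 1000)); // prints 100
    }
}
```

With a bounded queue, the rejected or blocked producer is what turns a slow sink into backpressure rather than an out-of-memory error.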

Monitoring:
The monitoring template is fairly crude and has no job-level monitoring; this has been improved.

Contributor

davidzollo commented Feb 21, 2025

Thanks for the detailed information.
Many users are using the MySQL connector. Can you paste your MySQL job conf?
TiDB CDC is still under development.

By the way, developing and debugging is not very difficult. You can refer to this example:
seatunnel-examples/seatunnel-engine-examples/src/main/java/org/apache/seatunnel/example/engine/SeaTunnelEngineLocalExample.java. I think we can connect (my LinkedIn: davidzollo or WeChat: davidzollo)

Author

wangxiaoliang001 commented Feb 21, 2025

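Hello, here is what I am seeing:

When starting in local mode, the relevant connectors and their dependencies have to be added to the pom, yet the connectors still cannot be found at startup. When launched through the Main method of LocalExample, the connectors are not registered or initialized, so the connector being debugged only runs once its registration and initialization are wired up through SPI: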
org.apache.seatunnel.api.sink.SeaTunnelSink
org.apache.seatunnel.api.source.SeaTunnelSource

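When starting in server mode, the connectors have to be compiled first and SEATUNNEL_HOME has to be set so that they can be loaded at startup. The same connectors also have to be removed from the pom; otherwise they are loaded by different classloaders and the service fails to start. Debugging works this way, but changes to the source code cannot be hot-reloaded.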

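We have run into very few MySQL problems. Our tables are just fairly large: the full snapshot phase checks indexes and runs JdbcQuery, and the binlog that follows is also large because the MySQL server hosts a great many databases and tables. So this is not really a problem, though it would be even better if it could be optimized somehow.
Here is the relevant configuration: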

{
  "env": {
    "parallelism": 1,
    "job.mode": "STREAMING",
    "job.name": "mysql=>kafka1 : realtimecrowdlog_sv.log_crowdlog",
    "checkpoint.interval": "60000",
    "checkpoint.timeout": "60000",
    "flush.timeout.ms": "10000"
  },
  "source": [
    {
      "plugin_name": "MySQL-CDC",
      "driver": "com.mysql.cj.jdbc.Driver",
      "base-url": "jdbc:mysql://xxx:3306?useSSL=false",
      "username": "xxx",
      "password": "xxx",
      "database-names": [
        "realtimecrowdlog_sv"
      ],
      "table-names": [
        "realtimecrowdlog_sv.log_crowdlog"
      ],
      "startup.mode": "earliest",
      "server-time-zone": "Asia/Shanghai",
      "connect.max-retries": 3,
      "format": "compatible_debezium_json",
      "debezium_record_include_schema": "false",
      "debezium": {
        "database.server.name": "oracle"
      }
    }
  ],
  "sink": [
    {
      "plugin_name": "Kafka",
      "bootstrap.servers": "xxxx",
      "topic": "xxxx",
      "reroute": [
        {
          "pattern": "oracle.realtimecrowdlog_sv.log_crowdlog",
          "topic": "short_video_composite",
          "partition": 1
        }
      ],
      "kafka.config": {
        "acks": "1",
        "request.timeout.ms": 240000,
        "batch.size": 10240,
        "buffer.memory": 67108864,
        "send.buffer.bytes": 262144,
        "compression.gzip.level": 6,
        "compression.type": "gzip"
      },
      "semantics": "NON",
      "format": "compatible_debezium_json"
    }
  ]
}

Member

hailin0 commented Feb 21, 2025

This PR improved the task allocation strategy: #8233

Author

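Sorry, but the example job syncs a single table, so this scan strategy does not solve the slowness. Also, for MySQL-CDC I have added support for writing data from different databases and tables in the same binlog to different topics and different partitions.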

Member

hailin0 commented Feb 21, 2025

Increasing parallelism can increase read speed.
parallelism: the number of concurrent worker threads

Author

OK, thank you very much.
