-
Notifications
You must be signed in to change notification settings - Fork 18
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: avoid turning huge filepaths to atoms #18
fix: avoid turning huge filepaths to atoms #18
Conversation
efbcc00
to
1989efa
Compare
Specially when running with CT, used filepaths may become larger than the maximum atom size of 255 bytes: ``` 2024-10-17T14:18:10.117220+00:00 [error] GenServer #PID<0.31263.0> terminating, ** (SystemLimitError) a system limit has been reached, :erlang.binary_to_atom("/emqx/_build/emqx-enterprise-test/logs/ct_run.test@127.0.0.1.2024-10-17_14.18.07/apps.emqx_bridge_kafka.emqx_bridge_v2_kafka_producer_SUITE.t_fixed_topic_recovers_in_disk_mode.logs/run.2024-10-17_14.18.07/log_private/emqx_bridge_v2_kafka_producer_SUITE_data/kafka/kafka:fixed_topic_disk_recover:test@127.0.0.1/action=3akafka_producer=3afixed_topic_disk_recover=3aconnector=3akafka_producer=3ac_test-topic-one-partition/0/committer", :utf8), (replayq 0.3.7) /emqx/deps/replayq/src/replayq.erl:520: :replayq.spawn_committer/2, (replayq 0.3.7) /emqx/deps/replayq/src/replayq.erl:103: :replayq.open/1, (wolff 4.0.2) /emqx/deps/wolff/src/wolff_producer.erl:258: :wolff_producer.do_init/1, (wolff 4.0.2) /emqx/deps/wolff/src/wolff_producer.erl:298: :wolff_producer.handle_info/2, (stdlib 5.2.3.1) gen_server.erl:1095: :gen_server.try_handle_info/3, (stdlib 5.2.3.1) gen_server.erl:1183: :gen_server.handle_msg/6, (stdlib 5.2.3.1) proc_lib.erl:241: :proc_lib.init_p_do_apply/3, Last message: {:do_init, %{config: %{group: "action:kafka_producer:fixed_topic_disk_recover:connector:kafka_producer:c", max_linger_bytes: 10485760, max_batch_bytes: 917504, required_acks: :all_isr, compression: :no_compression, replayq_dir: "/emqx/_build/emqx-enterprise-test/logs/ct_run.test@127.0.0.1.2024-10-17_14.18.07/apps.emqx_bridge_kafka.emqx_bridge_v2_kafka_producer_SUITE.t_fixed_topic_recovers_in_disk_mode.logs/run.2024-10-17_14.18.07/log_private/emqx_bridge_v2_kafka_producer_SUITE_data/kafka/kafka:fixed_topic_disk_recover:test@127.0.0.1", replayq_offload_mode: false, replayq_max_total_bytes: 2147483648, replayq_seg_bytes: 104857600, drop_if_highmem: false, ack_timeout: 10000, partitioner: :random, partition_count_refresh_interval_seconds: 60, max_linger_ms: 0, max_send_ahead: 9, telemetry_meta_data: %{bridge_id: "action:kafka_producer:fixed_topic_disk_recover:connector:kafka_producer:c"}, max_partitions: :all_partitions, reconnect_delay_ms: 2000}, partition: 0, topic: "test-topic-one-partition", client_id: "connector:kafka_producer:c", conn: #PID<0.31253.0>, linger_expire_timer: false}} ```
1989efa
to
1708143
Compare
src/replayq.erl
Outdated
@@ -515,13 +519,17 @@ ensure_deleted(Filename) -> | |||
%% The committer writes consumer's acked segmeng number + item ID | |||
%% to a file. The file is only read at start/restart. | |||
spawn_committer(ReaderSegno, Dir) -> | |||
Name = iolist_to_binary(filename:join([Dir, committer])), | |||
%% register a name to avoid having two committers spawned for the same dir |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this was a lazy implementation to guard bugs in caller modules.
e.g. open without close.
we can maybe find a better way for this exclusiveness check.
e.g. put pid in app env, and assert if pid is alive.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
having a named process sounds like a simple implementation supported by the vm itself, with automatic cleanup of the name in case the committer terminates.
what would be the motivation to get rid of this mechanism?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed in the latest commit
src/replayq_registry.erl
Outdated
#{committers := Committers0} = State0, | ||
case Committers0 of | ||
#{Dir := Pid} -> | ||
%% Already registered to the same pid. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this should not be ok, because it means the same process called open twice without closing first.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
forgot the pid is actually the committer pid.
then the question is: is this possible to happen ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no, in the current implementation.
thinking of this as a general registry, repeating the exact same registration should be idempotent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
after we discussed, I changed the last commit to attempt to register the called of open
as soon as possible to avoid side-effects.
730ad81
to
3ab41ae
Compare
{ok, State}. | ||
|
||
handle_call(#register_committer{dir = Dir, pid = Pid}, _From, State0) -> | ||
%% Should we expand the directory path to avoid tricks with links and relative paths? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good point.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
noticed that we should also normalize the dir (i.e.: don't treat an iolist, string and binary representations of the same path differently). pushed a small fix for that
tagged 0.3.9 |
Specially when running with CT, used filepaths may become larger than the maximum atom
size of 255 bytes: