The DDS API execution takes a long time and cannot meet real-time requirements. #2107

Open
yanzhang920817 opened this issue Oct 11, 2024 · 4 comments

yanzhang920817 commented Oct 11, 2024

The DDS introduction suggests that its real-time performance should be good, but in my tests the execution time of the DDS API calls themselves is not stable, as the following example shows:



int DDSUtil::CreatePublisher(PublisherInfo &pub)
{
    // create dds publisher
    dds_return_t rc;
    // Start as NULL so the cleanup path never frees an uninitialized pointer.
    dds_qos_t *qos = NULL;
    /* Create a Participant. */
    pub.participant = dds_create_participant(DDS_DOMAIN_DEFAULT, NULL, NULL);
    if (pub.participant < 0)
    {
        DDS_FATAL("dds_create_participant: %s\n", dds_strretcode(-pub.participant));
        goto err_free_pub;
    }
    printf("%s: create participant successfully\n", __FUNCTION__);

    /* Create a Topic. */
    // pub.topic = dds_create_topic(pub.participant, &pub.topicInfo.topicDesc, pub.topicInfo.topicName, NULL, NULL);
    if (!strcmp(pub.topicName, "rt/lowstate"))
    {
        pub.topic = dds_create_topic(pub.participant, &LowState__desc, pub.topicName, NULL, NULL);
        qos = dds_create_qos();
        dds_qset_history(qos, DDS_HISTORY_KEEP_LAST, 1);
    }
    else if (!strcmp(pub.topicName, "humaniod/state"))
    {
        pub.topic = dds_create_topic(pub.participant, &HumaniodState__desc, pub.topicName, NULL, NULL);
        qos = dds_create_qos();
        dds_qset_history(qos, DDS_HISTORY_KEEP_LAST, 1);
    }
    else if (!strcmp(pub.topicName, "rt/inspire/state"))
    {
        pub.topic = dds_create_topic(pub.participant, &InspireState__desc, pub.topicName, NULL, NULL);
        qos = dds_create_qos();
        dds_qset_history(qos, DDS_HISTORY_KEEP_LAST, 1);
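    }
    else
    {
        // Unknown topic name: make the check below fail instead of
        // continuing with an uninitialized topic handle and QoS pointer.
        qos = NULL;
        pub.topic = DDS_RETCODE_BAD_PARAMETER;
    }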
    if (pub.topic < 0)
    {
        DDS_FATAL("dds_create_topic: %s\n", dds_strretcode(-pub.topic));
        goto err_delete_participant;
    }
    printf("%s: create topic successfully\n", __FUNCTION__);

    /* Create a Writer. */
    pub.writer = dds_create_writer(pub.participant, pub.topic, qos, NULL);
    if (pub.writer < 0)
    {
        DDS_FATAL("dds_create_writer: %s\n", dds_strretcode(-pub.writer));
        goto err_delete_topic;
    }
    printf("%s: create writer successfully\n", __FUNCTION__);
    fflush(stdout);
    dds_delete_qos(qos);
    return 0;

err_delete_writer:
    dds_delete(pub.writer);
err_delete_topic:
    dds_delete(pub.topic);
err_delete_participant:
    dds_delete(pub.participant);
    dds_delete_qos(qos);
err_free_pub:
    return -1;
}
int DDSUtil::PublishLowState(void)
{
    struct timespec begin, end1, end2, end3;
    long timer1 = 0, timer2 = 0, timer3 = 0;
    static long maxTimer1 = 0, maxTimer2 = 0, maxTimer3 = 0;
    dds_return_t rc = 0;
    // get data
    clock_gettime(CLOCK_MONOTONIC, &begin);
    // 1 game pad
    GamepadHandler::getInstance().DealGPData(gLowState.wireless_remote);
    clock_gettime(CLOCK_MONOTONIC, &end1);
    // 2 axis
    // for (int i = 0; i < MAX_AXIS; i++) {
    //     gLowState.motor_state[i].mode = 1;
    //     gLowState.motor_state[i].q = 2;
    //     gLowState.motor_state[i].dq = 3;
    // }
    // 3 imu
    IMUHandler::getInstance().GetData(&gLowState.imu_state);
    clock_gettime(CLOCK_MONOTONIC, &end2);
    // publish
    rc = dds_write(gLowStatePub.writer, &gLowState);
    if (rc != DDS_RETCODE_OK)
    {
        DDS_FATAL("dds_write: %s\n", dds_strretcode(-rc));
        return -1;
    }
    // printf("%s: write successfully!\n", __FUNCTION__);
    clock_gettime(CLOCK_MONOTONIC, &end3);

    timer1 = (end1.tv_sec - begin.tv_sec) * 1000000 +
             (end1.tv_nsec - begin.tv_nsec) / 1000;
    timer2 = (end2.tv_sec - end1.tv_sec) * 1000000 +
             (end2.tv_nsec - end1.tv_nsec) / 1000;
    timer3 = (end3.tv_sec - end2.tv_sec) * 1000000 +
             (end3.tv_nsec - end2.tv_nsec) / 1000;

    if (timer1 > maxTimer1)
    {
        maxTimer1 = timer1;
    }
    if (timer2 > maxTimer2)
    {
        maxTimer2 = timer2;
    }
    if (timer3 > maxTimer3)
    {
        maxTimer3 = timer3;
    }
    static int i = 0;
    if (i++ % 50000 == 0)
    {
        printf("a part of lowstate, timer1=%ld, maxTimer1=%ld\n", timer1, maxTimer1);
        printf("a part of lowstate, timer2=%ld, maxTimer2=%ld\n", timer2, maxTimer2);
        printf("a part of lowstate, timer3=%ld, maxTimer3=%ld\n", timer3, maxTimer3);
    }
    return 0;
}

When tested on Ubuntu 20.04 with the RT patch, CPU utilization was about 10%, and the execution time of dds_write and dds_take fluctuated between 40 us and 700 us, which is very unstable. I am using cyclonedds-0.10.5.
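
The code above records only the maximum of each timer, which cannot show whether values near 700 us are rare outliers or happen often. Below is a minimal sketch of a fixed-size latency histogram that could be used alongside the maxTimer variables. `ElapsedUs`, `LatencyStats`, and the bucket bounds are hypothetical names and values chosen for illustration; they are not part of Cyclone DDS or of the original program.

```cpp
#include <cstddef>
#include <cstdint>
#include <ctime>

// Elapsed time between two CLOCK_MONOTONIC readings, in microseconds.
// The difference is formed in nanoseconds first so that a negative
// tv_nsec difference does not skew the truncation.
static inline int64_t ElapsedUs(const timespec &begin, const timespec &end)
{
    const int64_t ns = (static_cast<int64_t>(end.tv_sec) - begin.tv_sec) * 1000000000LL +
                       (end.tv_nsec - begin.tv_nsec);
    return ns / 1000;
}

// Upper bucket bounds in microseconds (arbitrary values, for illustration only).
static const int64_t kBoundsUs[] = {50, 100, 200, 400, 800};
static const size_t kNumBounds = sizeof(kBoundsUs) / sizeof(kBoundsUs[0]);

// Allocation-free statistics, cheap enough to update inside a 1 kHz loop.
struct LatencyStats
{
    int64_t min_us = INT64_MAX;
    int64_t max_us = 0;
    int64_t sum_us = 0;
    uint64_t count = 0;
    uint64_t buckets[kNumBounds + 1] = {}; // last bucket holds samples >= 800 us

    void Record(int64_t us)
    {
        if (us < min_us) min_us = us;
        if (us > max_us) max_us = us;
        sum_us += us;
        ++count;
        size_t b = 0;
        while (b < kNumBounds && us >= kBoundsUs[b]) ++b; // first bound above the sample
        ++buckets[b];
    }
};
```

In PublishLowState, one would call `Record(ElapsedUs(end2, end3))` on a static `LatencyStats` instance right after dds_write, and print the bucket counts every 50000 iterations in the same place the maxima are printed now.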

 cat /dobot/userdata/project/dds/cyclonedds.xml
<?xml version="1.0" encoding="UTF-8" ?>
<CycloneDDS xmlns="https://cdds.io/config" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="https://cdds.io/config https://raw.githubusercontent.com/eclipse-cyclonedds/cyclonedds/master/etc/cyclonedds.xsd">
    <Domain Id="any">
        <General>
            <Interfaces>
                <NetworkInterface autodetermine="false" address="192.168.5.1" priority="default" multicast="false" />
            </Interfaces>
            <AllowMulticast>default</AllowMulticast>
            <MaxMessageSize>65500B</MaxMessageSize>
        </General>
        <Discovery>
            <EnableTopicDiscoveryEndpoints>true</EnableTopicDiscoveryEndpoints>
        </Discovery>
        <Internal>
            <Watermarks>
                <WhcHigh>500kB</WhcHigh>
            </Watermarks>
        </Internal>
    </Domain>
</CycloneDDS>


@yanzhang920817
Author

@hansvanthag Hello, do you have any questions or suggestions? I will run some related tests. Thank you.

@hansvanthag
Member

As was stated (on the Discord support channel), there are multiple topics (of various sizes) being communicated alongside the 1 kHz writes of 300-byte samples (for which the write and read execution times are being measured).
So I have the following suggestions:

  1. Can you retry the test without those other topics being communicated 'in parallel'?
  2. I assume a network is involved, and if so: what network (100 Mbps, 1 Gbps, ...) is being used (to rule out congestion)?
  3. The top output shown didn't indicate any threads with high per-core CPU usage, which is puzzling to us.
  4. If I'm right that you're using KEEP_LAST(1) writer history and (supposedly) best-effort reliability, that makes it even stranger.

Therefore, the request to contribute a reproducer stands, as we're pretty sure that on 'your' machine (2.4 GHz i5) a 1 kHz writer of 300 bytes shouldn't be that slow. Note that jitter (on a non-realtime OS) can be caused by many things, so you might want to try running the app at an RT priority (nice --20) to see if that reduces the write-time jitter.
Finally, there are bundled performance tests with Cyclone (pubsub/roundtrip) that you could try to see if those also behave strangely.
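
For suggestion 2, the payload rate of the measured topic alone can be estimated as below. This is only a sketch: `PayloadMbps` is a hypothetical helper, and it ignores protocol header overhead as well as the other topics mentioned above, whose sizes and rates are not given in this thread.

```cpp
#include <cstdint>

// Payload bandwidth in megabits per second for a periodic writer.
// Counts sample bytes only; protocol headers are NOT included.
static double PayloadMbps(uint32_t rate_hz, uint32_t sample_bytes)
{
    return static_cast<double>(rate_hz) * sample_bytes * 8.0 / 1e6;
}
```

`PayloadMbps(1000, 300)` evaluates to 2.4, so the 1 kHz, 300-byte writer by itself uses roughly 2.4% of a 100 Mbps link before overhead.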

@yanzhang920817
Author

My operating system is a real-time system with the RT patch applied:
uname -a
Linux dobot-IB-ITLU-TW01B 6.1.0-rt5 #1 SMP PREEMPT_RT Sun Oct 6 13:47:58 CST 2024 x86_64 x86_64 x86_64 GNU/Linux
I used chrt to give my program the highest priority and confirmed that this reduces the timing jitter. (I had previously isolated cores 0 and 1, but that had no obvious effect.)
chrt -f 99 taskset -c 0,1 ./host
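
The same settings can also be applied from inside the program instead of through chrt and taskset. The following is a minimal sketch for Linux, assuming the process has the privileges needed for SCHED_FIFO (for example root or CAP_SYS_NICE). `MakeCpuSet` and `EnterRealtimeMode` are hypothetical helper names, and the `mlockall` call is an extra step not mentioned in this thread, added on the assumption that page faults are another possible source of jitter.

```cpp
#include <cerrno>
#include <sched.h>
#include <sys/mman.h>

// Build a CPU mask from a list of core indices (e.g. {0, 1}).
static cpu_set_t MakeCpuSet(const int *cpus, int n)
{
    cpu_set_t set;
    CPU_ZERO(&set);
    for (int i = 0; i < n; i++)
        CPU_SET(cpus[i], &set);
    return set;
}

// Pin the calling thread, switch it to SCHED_FIFO, and lock memory.
// Returns 0 on success, or the errno value of the first step that failed.
static int EnterRealtimeMode(int fifo_priority, const cpu_set_t &cpus)
{
    // Equivalent of "taskset -c ..." for the calling thread.
    if (sched_setaffinity(0, sizeof(cpus), &cpus) != 0)
        return errno;

    // Equivalent of "chrt -f <priority>" for the calling thread.
    sched_param sp = {};
    sp.sched_priority = fifo_priority;
    if (sched_setscheduler(0, SCHED_FIFO, &sp) != 0)
        return errno;

    // Assumption: locking pages avoids page-fault stalls in the hot path.
    // May fail without CAP_IPC_LOCK or a sufficient RLIMIT_MEMLOCK.
    if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0)
        return errno;

    return 0;
}
```

Calling `EnterRealtimeMode(99, MakeCpuSet(cores, 2))` with `int cores[] = {0, 1}` at the start of main would correspond to the command line above. Both calls act on the calling thread only; whether threads created later pick up the same settings depends on how they are created.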

@yanzhang920817
Author

1. I will retry the test.
2. I am communicating locally, and the network interface configured in the XML is a Gigabit port. Even for local communication, would the execution time be affected if the interface configured in the XML were a 100 Mbps port?
3. After running chrt -f 99 taskset -c 0,1 ./host, the timing jitter is within 100 us.
4. I have turned on Turbo Boost, and all test data so far was collected with Turbo Boost enabled.
