Removing LibC dependency, improving search algorithms, simplifying API #54

Merged 8 commits on Oct 9, 2023

@ashvardanian ashvardanian commented Oct 9, 2023

This is a large PR, but most importantly, it removes the LibC dependency 🥳

#include <ctype.h>  // `tolower`
#include <search.h> // `qsort_s`
#include <stddef.h> // `sz_size_t`
#include <stdint.h> // `uint8_t`
#include <stdlib.h> // `qsort_r`
#include <string.h> // `memcpy`

All of those headers are gone.
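
To illustrate what dropping those headers can involve, here is a minimal sketch of header-free stand-ins for `tolower` and `memcpy`. The names `ex_size_t`, `ex_tolower_ascii`, and `ex_copy_bytes` are hypothetical and are not taken from this PR; the sketch also assumes ASCII-only case folding and that `unsigned long long` is wide enough for any length.

```c
/* Hypothetical names for illustration; not the library's actual API. */
typedef unsigned long long ex_size_t; /* assumption: wide enough for any length */

/* ASCII-only stand-in for `tolower`: deliberately ignores locales. */
static char ex_tolower_ascii(char c) {
    return (c >= 'A' && c <= 'Z') ? (char)(c - 'A' + 'a') : c;
}

/* Byte-wise stand-in for `memcpy`; the two regions must not overlap. */
static void ex_copy_bytes(char *target, char const *source, ex_size_t length) {
    for (ex_size_t i = 0; i != length; ++i) target[i] = source[i];
}
```

Neither helper includes a standard header, which is the property the removed `#include` list above was standing in the way of.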

// Define a type for the comparison function, depending on the platform.
#if defined(WIN32) || defined(_WIN32) || defined(__WIN32__) || defined(__NT__) || defined(__APPLE__)
typedef int (*sz_qsort_comparison_func_t)(void *, void const *, void const *);
#else
typedef int (*sz_qsort_comparison_func_t)(void const *, void const *, void *);
#endif


inline static int _sz_sort_sequence_strncmp(
#if defined(WIN32) || defined(_WIN32) || defined(__WIN32__) || defined(__NT__) || __APPLE__
    void *sequence_raw, void const *a_raw, void const *b_raw
#else
    void const *a_raw, void const *b_raw, void *sequence_raw
#endif
) {
    // https://man.freebsd.org/cgi/man.cgi?query=qsort_s&sektion=3&n=1
    // https://www.man7.org/linux/man-pages/man3/strcmp.3.html
    sz_sequence_t *sequence = (sz_sequence_t *)sequence_raw;
    sz_size_t a = *(sz_size_t *)a_raw;
    sz_size_t b = *(sz_size_t *)b_raw;
    sz_size_t a_len = sequence->get_length(sequence->handle, a);
    sz_size_t b_len = sequence->get_length(sequence->handle, b);
    int res = strncmp( //
        sequence->get_start(sequence->handle, a),
        sequence->get_start(sequence->handle, b),
        a_len > b_len ? b_len : a_len);
    return res ? res : a_len - b_len;
}

inline static int _sz_sort_sequence_strncasecmp(
#if defined(WIN32) || defined(_WIN32) || defined(__WIN32__) || defined(__NT__) || __APPLE__
    void *sequence_raw, void const *a_raw, void const *b_raw
#else
    void const *a_raw, void const *b_raw, void *sequence_raw
#endif
) {
    // https://man.freebsd.org/cgi/man.cgi?query=qsort_s&sektion=3&n=1
    // https://www.man7.org/linux/man-pages/man3/strcmp.3.html
    sz_sequence_t *sequence = (sz_sequence_t *)sequence_raw;
    sz_size_t a = *(sz_size_t *)a_raw;
    sz_size_t b = *(sz_size_t *)b_raw;
    sz_size_t a_len = sequence->get_length(sequence->handle, a);
    sz_size_t b_len = sequence->get_length(sequence->handle, b);
    int res = strncasecmp( //
        sequence->get_start(sequence->handle, a),
        sequence->get_start(sequence->handle, b),
        a_len > b_len ? b_len : a_len);
    return res ? res : a_len - b_len;
}

...

        // Perform sorts on smaller chunks instead of the whole handle
#if defined(WIN32) || defined(_WIN32) || defined(__WIN32__) || defined(__NT__)
        // https://stackoverflow.com/a/39561369
        // https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/qsort-s?view=msvc-170
        qsort_s(sequence->order, split, sizeof(sz_size_t), qsort_comparator, (void *)sequence);
        qsort_s(sequence->order + split,
                sequence->count - split,
                sizeof(sz_size_t),
                qsort_comparator,
                (void *)sequence);
#elif __APPLE__
        qsort_r(sequence->order, split, sizeof(sz_size_t), (void *)sequence, qsort_comparator);
        qsort_r(sequence->order + split,
                sequence->count - split,
                sizeof(sz_size_t),
                (void *)sequence,
                qsort_comparator);
#else
        // https://linux.die.net/man/3/qsort_r
        qsort_r(sequence->order, split, sizeof(sz_size_t), qsort_comparator, (void *)sequence);
        qsort_r(sequence->order + split,
                sequence->count - split,
                sizeof(sz_size_t),
                qsort_comparator,
                (void *)sequence);
#endif

...

The ugly qsort_r and qsort_s mess is also gone!
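
One way to avoid the three platform-specific argument orders is to own the sort loop and pass the context as an ordinary parameter. The following is a minimal sketch under that assumption, not the implementation from this PR: `ex_sequence_t`, `ex_less`, and `ex_sort_order` are hypothetical names, and insertion sort is used only because it is short.

```c
typedef unsigned long long ex_size_t; /* assumption: wide enough for any length */

/* Hypothetical stand-in for a string collection addressed by index. */
typedef struct ex_sequence_t {
    char const *const *strings;
    ex_size_t const *lengths;
} ex_sequence_t;

/* Returns 1 if string `a` orders before string `b`, comparing bytes as unsigned. */
static int ex_less(ex_sequence_t const *seq, ex_size_t a, ex_size_t b) {
    ex_size_t a_len = seq->lengths[a], b_len = seq->lengths[b];
    ex_size_t min_len = a_len < b_len ? a_len : b_len;
    for (ex_size_t i = 0; i != min_len; ++i) {
        unsigned char ca = (unsigned char)seq->strings[a][i];
        unsigned char cb = (unsigned char)seq->strings[b][i];
        if (ca != cb) return ca < cb;
    }
    /* Shared prefix: the shorter string goes first. Comparing the lengths
       avoids squeezing an unsigned difference into an `int`. */
    return a_len < b_len;
}

/* Insertion sort over the `order` permutation; quadratic, chosen for brevity. */
static void ex_sort_order(ex_sequence_t const *seq, ex_size_t *order, ex_size_t count) {
    for (ex_size_t i = 1; i < count; ++i) {
        ex_size_t key = order[i];
        ex_size_t j = i;
        while (j > 0 && ex_less(seq, key, order[j - 1])) {
            order[j] = order[j - 1];
            --j;
        }
        order[j] = key;
    }
}
```

Because the sequence pointer is a regular argument, the comparator has a single signature on every platform and no `#if` ladder is needed.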

@ashvardanian ashvardanian merged commit c8a5b14 into main-dev Oct 9, 2023
1 of 5 checks passed
@ashvardanian

🎉 This PR is included in version 2.0.0 🎉

The release is available on GitHub release

Your semantic-release bot 📦🚀

@ashvardanian ashvardanian deleted the 53-remove-libc-dependency branch November 4, 2023 16:57
vmanot pushed a commit to vmanot/StringZilla that referenced this pull request Jan 29, 2024
…ependency

Removing LibC dependency, improving search algorithms, simplifying API