Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

broken the sort #9

Open
pixie-grasper opened this issue Oct 13, 2015 · 2 comments
Open

broken the sort #9

pixie-grasper opened this issue Oct 13, 2015 · 2 comments

Comments

@pixie-grasper
Copy link

Hello, I'm using libdivsufsort on the GitHub,
and I've tried to use it.

But, when input is a 'appellee$' (without quotation-mark), I've wanted 'e$elplepa', but we'll get '$eelplepa' as an output (see the slide https://www.cs.cmu.edu/~ckingsf/bioinfo-lectures/bwt.pdf page 6).

snippet is there.

  auto buffer = static_cast<sauchar_t*>(malloc(sizeof(sauchar_t) * source.size()));
  for (auto i = 0ul; i < source.size(); i++) {
    buffer[i] = static_cast<sauchar_t>(source[i]);
  }
  auto index = divbwt(buffer, buffer, NULL, static_cast<saidx_t>(source.size()));

What happend?

@akamiru
Copy link

akamiru commented Jun 10, 2016

That's because libdivsufsort (like most other bwt libraries) adds the sentinel (smallest character in the string / the $ in your version) itself. So it actually sorts something like "appellee$$". To get the bwt without sentinel you can rotate the first char of the minimal lexicographical rotation to the back and sort source.size()-1 (I do this in my compression tool https://github.com/akamiru/bce if you need an example, just search for bwt and rotate).

@pixie-grasper
Copy link
Author

Thank you for tell me why the sort broken and how to hack.

Sorry, but I've decided to write the BWT myself at last year to solve this problem. I've wrote the functions in c++14 with Larsson-Sadakane's method. And finally, I've packed into a new library with others method.
https://github.com/pixie-grasper/research-library
https://github.com/pixie-grasper/research-library/blob/master/includes/burrows-wheeler-transform.h
I'm sorry for the readme of my library has been wrote in the Japanese. that because I'm a Japanese. :-|

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants