Skip to content

Conversation

@Meigyoku-Thmn
Copy link

@Meigyoku-Thmn Meigyoku-Thmn commented Dec 4, 2018

I would like to add a fix for this issue: #31
Also I would like to expose 3 more methods, they are has_value(), root() and follow(). They will be useful if we want to scan a string then highlight every matched keywords, for example:

Target string: "One two three!"
List of keywords: ["one", "three"]

import dawg
import re

targetedStr = 'one two three!'
targetLen = len(targetedStr)
dic = dawg.DAWG(['one', 'three'])
index = dic.root()
hasMore = False
reportedStr = ""
bufferedStr = ""
isContainingKeywords = False
for i, chr in enumerate(targetedStr):
    willRedoInNextLoop = True
    while (willRedoInNextLoop):
        willRedoInNextLoop = False
        bufferedStr += targetedStr[i]
        hasMore, index = dic.follow(chr, index)
        isEgdeCase = (hasMore == True and i + 1 >= targetLen)
        if (hasMore == False or isEgdeCase == True):
            if (dic.has_value(index)):
                # Keyword detected!
                keyword = bufferedStr if isEgdeCase else bufferedStr[0:(
                    len(bufferedStr) - 1)]
                reportedStr += "{" + keyword + "}"
                isContainingKeywords = True
                if not isEgdeCase:
                    willRedoInNextLoop = True
            else:
                # Not a keyword
                reportedStr += bufferedStr
            bufferedStr = ""
            index = dic.root()
if (isContainingKeywords == True):
    reportedStr += bufferedStr 
    print(reportedStr)

Output:

{one} two {three}!

Meigyoku Thmn added 2 commits December 4, 2018 11:40
useful for keyword scanning and highlighting
@pikaliov
Copy link

HI.
I'm trying to compile your branch of DAWG with python3.7 and Cython 0.27.3.
sudo python3.7 -m pip install git+https://github.com/Meigyoku-Thmn/DAWG --upgrade
Can u guide me how to compile? Thank you.

Collecting git+https://github.com/Meigyoku-Thmn/DAWG
  Cloning https://github.com/Meigyoku-Thmn/DAWG to /tmp/pip-req-build-76e2ndpc
    running install
    running build
    running build_ext
    building 'dawg' extension
    creating build
    creating build/temp.linux-x86_64-3.7
    creating build/temp.linux-x86_64-3.7/src
    creating build/temp.linux-x86_64-3.7/lib
    creating build/temp.linux-x86_64-3.7/lib/b64
    gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -fPIC -Ilib -I/usr/local/include/python3.7m -c src/_dawg.cpp -o build/temp.linux-x86_64-3.7/src/_dawg.o
    gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -fPIC -Ilib -I/usr/local/include/python3.7m -c src/iostream.cpp -o build/temp.linux-x86_64-3.7/src/iostream.o
    gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -fPIC -Ilib -I/usr/local/include/python3.7m -c src/_dictionary.cpp -o build/temp.linux-x86_64-3.7/src/_dictionary.o
    gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -fPIC -Ilib -I/usr/local/include/python3.7m -c src/_completer.cpp -o build/temp.linux-x86_64-3.7/src/_completer.o
    gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -fPIC -Ilib -I/usr/local/include/python3.7m -c src/_guide_unit.cpp -o build/temp.linux-x86_64-3.7/src/_guide_unit.o
    gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -fPIC -Ilib -I/usr/local/include/python3.7m -c src/_guide.cpp -o build/temp.linux-x86_64-3.7/src/_guide.o
    gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -fPIC -Ilib -I/usr/local/include/python3.7m -c src/b64_decode.cpp -o build/temp.linux-x86_64-3.7/src/b64_decode.o
    gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -fPIC -Ilib -I/usr/local/include/python3.7m -c src/dawg.cpp -o build/temp.linux-x86_64-3.7/src/dawg.o
    In file included from src/dawg.cpp:266:0:
    src/../lib/dawgdic/dictionary-builder.h: In member function ‘bool dawgdic::DictionaryBuilder::BuildDictionary(dawgdic::BaseType, dawgdic::BaseType)’:
    src/../lib/dawgdic/dictionary-builder.h:138:5: warning: this ‘if’ clause does not guard... [-Wmisleading-indentation]
         if (dawg_.is_merging(dawg_child_index))
         ^~
    src/../lib/dawgdic/dictionary-builder.h:139:53: note: ...this statement, but the latter is misleadingly indented as if it were guarded by the ‘if’
           link_table_.Insert(dawg_child_index, offset); {
                                                         ^
    src/dawg.cpp: In function ‘PyObject* __pyx_f_4dawg_9BytesDAWG_items(__pyx_obj_4dawg_BytesDAWG*, int, __pyx_opt_args_4dawg_9BytesDAWG_items*)’:
    src/dawg.cpp:11011:37: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
         for (__pyx_t_10 = 0; __pyx_t_10 < __pyx_t_9; __pyx_t_10+=1) {
                              ~~~~~~~~~~~^~~~~~~~~~~
    src/dawg.cpp: In function ‘PyObject* __pyx_gb_4dawg_9BytesDAWG_24generator2(__pyx_GeneratorObject*, PyObject*)’:
    src/dawg.cpp:11485:35: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
         for (__pyx_t_6 = 0; __pyx_t_6 < __pyx_t_5; __pyx_t_6+=1) {
                             ~~~~~~~~~~^~~~~~~~~~~
    src/dawg.cpp: In function ‘PyObject* __pyx_f_4dawg_9BytesDAWG_keys(__pyx_obj_4dawg_BytesDAWG*, int, __pyx_opt_args_4dawg_9BytesDAWG_keys*)’:
    src/dawg.cpp:11814:37: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
         for (__pyx_t_10 = 0; __pyx_t_10 < __pyx_t_9; __pyx_t_10+=1) {
                              ~~~~~~~~~~~^~~~~~~~~~~
    src/dawg.cpp: In function ‘PyObject* __pyx_gb_4dawg_9BytesDAWG_29generator3(__pyx_GeneratorObject*, PyObject*)’:
    src/dawg.cpp:12222:35: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
         for (__pyx_t_6 = 0; __pyx_t_6 < __pyx_t_5; __pyx_t_6+=1) {
                             ~~~~~~~~~~^~~~~~~~~~~
    src/dawg.cpp: In function ‘int __Pyx_GetException(PyObject**, PyObject**, PyObject**)’:
    src/dawg.cpp:21523:24: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_type’; did you mean ‘curexc_type’?
         tmp_type = tstate->exc_type;
                            ^~~~~~~~
                            curexc_type
    src/dawg.cpp:21524:25: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_value’; did you mean ‘curexc_value’?
         tmp_value = tstate->exc_value;
                             ^~~~~~~~~
                             curexc_value
    src/dawg.cpp:21525:22: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_traceback’; did you mean ‘curexc_traceback’?
         tmp_tb = tstate->exc_traceback;
                          ^~~~~~~~~~~~~
                          curexc_traceback
    src/dawg.cpp:21526:13: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_type’; did you mean ‘curexc_type’?
         tstate->exc_type = local_type;
                 ^~~~~~~~
                 curexc_type
    src/dawg.cpp:21527:13: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_value’; did you mean ‘curexc_value’?
         tstate->exc_value = local_value;
                 ^~~~~~~~~
                 curexc_value
    src/dawg.cpp:21528:13: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_traceback’; did you mean ‘curexc_traceback’?
         tstate->exc_traceback = local_tb;
                 ^~~~~~~~~~~~~
                 curexc_traceback
    src/dawg.cpp: In function ‘void __Pyx_ExceptionSwap(PyObject**, PyObject**, PyObject**)’:
    src/dawg.cpp:21550:24: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_type’; did you mean ‘curexc_type’?
         tmp_type = tstate->exc_type;
                            ^~~~~~~~
                            curexc_type
    src/dawg.cpp:21551:25: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_value’; did you mean ‘curexc_value’?
         tmp_value = tstate->exc_value;
                             ^~~~~~~~~
                             curexc_value
    src/dawg.cpp:21552:22: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_traceback’; did you mean ‘curexc_traceback’?
         tmp_tb = tstate->exc_traceback;
                          ^~~~~~~~~~~~~
                          curexc_traceback
    src/dawg.cpp:21553:13: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_type’; did you mean ‘curexc_type’?
         tstate->exc_type = *type;
                 ^~~~~~~~
                 curexc_type
    src/dawg.cpp:21554:13: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_value’; did you mean ‘curexc_value’?
         tstate->exc_value = *value;
                 ^~~~~~~~~
                 curexc_value
    src/dawg.cpp:21555:13: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_traceback’; did you mean ‘curexc_traceback’?
         tstate->exc_traceback = *tb;
                 ^~~~~~~~~~~~~
                 curexc_traceback
    src/dawg.cpp: In function ‘void __Pyx_ExceptionSave(PyObject**, PyObject**, PyObject**)’:
    src/dawg.cpp:21568:21: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_type’; did you mean ‘curexc_type’?
         *type = tstate->exc_type;
                         ^~~~~~~~
                         curexc_type
    src/dawg.cpp:21569:22: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_value’; did you mean ‘curexc_value’?
         *value = tstate->exc_value;
                          ^~~~~~~~~
                          curexc_value
    src/dawg.cpp:21570:19: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_traceback’; did you mean ‘curexc_traceback’?
         *tb = tstate->exc_traceback;
                       ^~~~~~~~~~~~~
                       curexc_traceback
    src/dawg.cpp: In function ‘void __Pyx_ExceptionReset(PyObject*, PyObject*, PyObject*)’:
    src/dawg.cpp:21582:24: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_type’; did you mean ‘curexc_type’?
         tmp_type = tstate->exc_type;
                            ^~~~~~~~
                            curexc_type
    src/dawg.cpp:21583:25: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_value’; did you mean ‘curexc_value’?
         tmp_value = tstate->exc_value;
                             ^~~~~~~~~
                             curexc_value
    src/dawg.cpp:21584:22: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_traceback’; did you mean ‘curexc_traceback’?
         tmp_tb = tstate->exc_traceback;
                          ^~~~~~~~~~~~~
                          curexc_traceback
    src/dawg.cpp:21585:13: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_type’; did you mean ‘curexc_type’?
         tstate->exc_type = type;
                 ^~~~~~~~
                 curexc_type
    src/dawg.cpp:21586:13: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_value’; did you mean ‘curexc_value’?
         tstate->exc_value = value;
                 ^~~~~~~~~
                 curexc_value
    src/dawg.cpp:21587:13: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_traceback’; did you mean ‘curexc_traceback’?
         tstate->exc_traceback = tb;
                 ^~~~~~~~~~~~~
                 curexc_traceback
    error: command 'gcc' failed with exit status 1
    
    ----------------------------------------
  Can't roll back DAWG; was not uninstalled
Command "/usr/local/bin/python3.7 -u -c "import setuptools, tokenize;__file__='/tmp/pip-req-build-76e2ndpc/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-record-6iiqpsev/install-record.txt --single-version-externally-managed --compile" failed with error code 1 in /tmp/pip-req-build-76e2ndpc/

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants