This repository has been archived by the owner on May 4, 2021. It is now read-only.

candidates2corpus.py: verify that splitters exist #10

Open
achimr opened this issue Jul 15, 2017 · 4 comments

achimr commented Jul 15, 2017

If the splitters specified on the command line do not exist, the script fails silently and still runs through the entire corpus download.

@achimr achimr added the bug label Jul 15, 2017
@achimr achimr self-assigned this Jul 15, 2017

achimr commented Oct 16, 2017

In https://github.com/ModernMT/DataCollection/blob/65e4700ed46910c1e7e7e7e1bf5a541511650a64/baseline/external_processor.py#L19
subprocess.Popen should raise an OSError exception if the splitter command doesn't exist. Curious why it doesn't in this case.
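
One possible explanation, not confirmed against external_processor.py, is that the command is run through a shell. The sketch below assumes Python 3 and uses a made-up command name (`no-such-splitter-xyz`) and a hypothetical helper (`try_spawn`). It shows that Popen raises OSError for a missing command given as a list, but raises nothing when the same command goes through `shell=True`, because the shell itself starts successfully and only reports the failure through its exit code.

```python
import subprocess


def try_spawn(cmd, shell=False):
    """Return "raised" if Popen raises OSError, otherwise the child's exit code.

    Hypothetical helper for illustration only; not part of the repository.
    """
    try:
        proc = subprocess.Popen(
            cmd,
            shell=shell,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except OSError:
        # In Python 3 this is usually FileNotFoundError, a subclass of OSError.
        return "raised"
    return proc.wait()


missing = "no-such-splitter-xyz"  # assumed not to exist on this machine

# Command given as a list: Popen tries to execute it directly and raises.
list_result = try_spawn([missing])

# Command given as a string with shell=True: the shell starts fine, so no
# exception is raised; the failure only shows up as a non-zero exit code
# (typically 127 on POSIX shells).
shell_result = try_spawn(missing, shell=True)

print(list_result, shell_result)
```

If this is what happens here, checking the exit code of the splitter process would surface the failure even when no exception is raised.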


achimr commented Oct 16, 2017

In the constructor above, self.cmd is an array, because the ExternalTextProcessor object is constructed like this:
https://github.com/ModernMT/DataCollection/blob/65e4700ed46910c1e7e7e7e1bf5a541511650a64/baseline/external_processor.py#L82


achimr commented Nov 30, 2017


achimr commented Nov 30, 2017

self.cmd in the Popen call above can be either a command string or an array. Only if it is an array can one check whether its first element is actually an available command (by checking the full path and/or searching for it in PATH), so such a check would only partially fix the problem. Not fixing this for now.
How the splitters should be called from the script needs to be looked into in a larger context: https://stackoverflow.com/questions/2629680/deciding-among-subprocess-multiprocessing-and-thread-in-python
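
For reference, the partial check described above could look roughly like the following. This is a minimal sketch assuming Python 3.3+ (for `shutil.which`); the function name `splitter_exists` and the example commands are hypothetical and not taken from the repository. It deliberately returns None for command strings, since those cannot be checked reliably this way.

```python
import shutil
import sys


def splitter_exists(cmd):
    """Check whether the command in `cmd` looks runnable.

    Returns True or False when `cmd` is a list (argv style), and None when
    `cmd` is a plain command string that this check cannot handle.
    Hypothetical helper for illustration only.
    """
    if isinstance(cmd, str):
        # A command string may contain shell syntax, so do not guess.
        return None
    if not cmd:
        return False
    # shutil.which checks a path with a directory component directly and
    # otherwise searches the directories listed in PATH.
    return shutil.which(cmd[0]) is not None


# Usage with made-up commands:
print(splitter_exists(["no-such-splitter-xyz", "-l", "en"]))  # missing command
print(splitter_exists([sys.executable, "-c", "pass"]))        # full path to an existing executable
print(splitter_exists("some-splitter -l en"))                 # string: not checkable
```

Running such a check before the corpus download starts would let the script stop early for the list case, while the string case would still need a different approach.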

@achimr achimr added enhancement and removed bug labels Nov 30, 2017