Merge branch 'testing'
severinsimmler committed Feb 26, 2018
2 parents 0b54f18 + bb4d83c commit f105259
Showing 31 changed files with 3,000 additions and 590 deletions.
2 changes: 2 additions & 0 deletions .gitignore
@@ -3,3 +3,5 @@ venv/*
.idea/*
webapp.log
static/topicmodeling.zip
__pycache__/*
Pipfile.lock
1 change: 1 addition & 0 deletions Pipfile
@@ -14,6 +14,7 @@ Flask = "==0.12.2"
lxml = "==4.1.1"
pandas = "==0.21.1"
numpy = "==1.14.0"
pyqt = "==5.9.2"


[dev-packages]
80 changes: 30 additions & 50 deletions README.md
@@ -1,25 +1,24 @@
# Topics Explorer: A GUI for Topics – Easy Topic Modeling
This application introduces a user-friendly Topic Modeling workflow comprising text data preprocessing, the actual modeling using [latent Dirichlet allocation](http://www.jmlr.org/papers/volume3/blei03a/blei03a.pdf) (LDA), and various interactive visualizations.

**If you do not know anything about Topic Modeling or programming in general, this is where you start.**
If you do not know anything about Topic Modeling or programming in general, this is where you start.

**Topics Explorer** aims for *simplicity* and *usability*. If you are working with a large corpus (let's say more than 200 documents, 5000 tokens each document) you may wish to use more sophisticated Topic Models such as those implemented in [MALLET](http://mallet.cs.umass.edu/topics.php), which is known to be more robust than standard LDA. Have a look at our Jupyter notebook [introducing Topic Modeling with MALLET](https://github.com/DARIAH-DE/Topics/IntroducingMallet.ipynb).

![Demonstrator Screenshot](screenshot.png)
## Getting started with the standalone executable
You **do not** have to install a Python interpreter or anything else. There is currently one standalone build each for Windows and macOS. **At the moment, Linux users will have to use the development version**.

1. Go to the [release-section](https://github.com/DARIAH-DE/TopicsExplorer/releases) and download the ZIP archive for your OS.
2. Open it by double-clicking.
3. Run the app by double-clicking the file `DARIAH Topics Explorer`. (The files in the folder `src` are source code. You do not need to worry about them.)

## Getting started
Although this application is built with Python and some JavaScript, it can be run like a native application, without having to install Python or any related packages. There is currently one build each for Windows and macOS.
**Topics Explorer** aims for simplicity and usability. If you are working with a large corpus (let's say more than 200 documents, 5000 tokens each document) you may wish to use more sophisticated topic models such as those implemented in [MALLET](http://mallet.cs.umass.edu/topics.php), which is known to be more robust than standard LDA. Have a look at our Jupyter notebook [introducing Topic Modeling with MALLET](https://github.com/DARIAH-DE/Topics/blob/master/IntroducingMallet.ipynb).

1. Download `demonstrator-0.0.1-windows.zip` or `demonstrator-0.0.1-mac.zip` from the [release-section](https://github.com/DARIAH-DE/Topics/releases).
2. Open it by double-clicking.
3. Run the app by double-clicking the file `DARIAH Topics Explorer.exe` or `DARIAH Topics Explorer.app`, respectively.
![Demonstrator Screenshot](screenshot.png)


### Troubleshooting
* Please be patient. Depending on corpus size and number of iterations, the process may take anywhere from a few seconds to several hours.
* If you are on a Mac and get an error message saying that the file is from an “unidentified developer”, you can override it by holding control while double-clicking. The error message will still appear, but you will be given an option to run the file anyway.
* Please use [GitHub Issues](https://github.com/DARIAH-DE/TopicsExplorer/issues).
* Please use [GitHub issues](https://github.com/DARIAH-DE/TopicsExplorer/issues).


## Working with the development version
@@ -32,69 +31,50 @@ Although this application is built with Python and some JavaScript, it is possib

### Requirements
Besides the standalone executables, you have the ability to run the development version. In this case, you will have to install some dependencies, but first of all:

* At least Python 3.6, from [here](https://www.python.org/downloads/). Python 2 is *not* supported.
* If you wish to use *Layer 3* (which is not necessary at all): Node.js, from [here](https://nodejs.org/en/download/).

For Python, you will need the following libraries:
* [`dariah_topics`](https://github.com/DARIAH-DE/Topics) 0.0.5.
* [`lda`](https://github.com/lda-project/lda) 1.0.5.
* [`bokeh`](https://github.com/bokeh/bokeh) 0.12.13.
* [`flask`](https://github.com/pallets/flask) 0.12.2.
* [`lxml`](https://github.com/lxml/lxml) 4.1.1.
* [`pandas`](https://github.com/pandas-dev/pandas) 0.21.1.
* [`numpy`](https://github.com/numpy/numpy) 1.14.0.
You will need the following libraries:
* [`dariah_topics`](https://github.com/DARIAH-DE/Topics) 0.0.6
* [`lda`](https://github.com/lda-project/lda) 1.0.5
* [`bokeh`](https://github.com/bokeh/bokeh) 0.12.13
* [`flask`](https://github.com/pallets/flask) 0.12.2
* [`lxml`](https://github.com/lxml/lxml) 4.1.1
* [`pandas`](https://github.com/pandas-dev/pandas) 0.21.1
* [`numpy`](https://github.com/numpy/numpy) 1.14.0
* [`pyqt5`](https://github.com/baoboa/pyqt5) 5.9.2

You can install all dependencies using [`pipenv`](http://pipenv.readthedocs.io/en/latest/):

```
pipenv install
```
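
The exact versions listed above are pinned in the `Pipfile` as entries such as `Flask = "==0.12.2"`. As a minimal sketch (not part of the repository), the pins could be read back with a small regular expression to compare them against the README list. The function name `parse_pins` and the sample text are illustrative assumptions, and the pattern only handles simple `name = "==version"` lines, not the full Pipfile syntax.

```python
import re

# Matches simple Pipfile entries of the form: name = "==version"
# (assumption: tables, extras, and markers are not handled here)
PIN_PATTERN = re.compile(r'^\s*([A-Za-z0-9_.-]+)\s*=\s*"==([^"]+)"\s*$')


def parse_pins(pipfile_text):
    """Return {package: version} for every exactly pinned entry."""
    pins = {}
    for line in pipfile_text.splitlines():
        match = PIN_PATTERN.match(line)
        if match:
            pins[match.group(1)] = match.group(2)
    return pins


# Sample text mirroring the pins visible in this commit's Pipfile hunk.
sample = '''
[packages]
Flask = "==0.12.2"
lxml = "==4.1.1"
pandas = "==0.21.1"
numpy = "==1.14.0"
pyqt = "==5.9.2"

[dev-packages]
'''

pins = parse_pins(sample)
for name, version in sorted(pins.items()):
    print(name, version)
```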

> If you are on a UNIX-based machine, remember to use `pip3` and `python3` instead of `pip` and `python`.
At this point, you can run the application via `python webapp.py` and go to `http://127.0.0.1:5000` in any web browser. If you want a more desktop app-like feeling, you can build *Layer 3* on top with [Electron](https://electronjs.org/), a JavaScript framework for creating native applications with web technologies like JavaScript, HTML, and CSS. The dependencies are:

* [`electron`](https://github.com/electron/electron) 1.7.10.
* [`request-promise`](https://github.com/request/request-promise) 4.2.2.
* [`request`](https://github.com/request/request) 2.83.1.
> If you are on a UNIX-based machine, remember to use `python3` instead of `python`.
Run the following command via [`npm`](https://www.npmjs.com/get-npm):
At this point, you can run the application via `python webapp.py` and go to `http://127.0.0.1:5000` in any web browser. If you want a more desktop app-like feeling, you can build *Layer 3* on top and run:

```
npm install
python topicsexplorer.py
```
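
When `webapp.py` is started by hand, the page at `http://127.0.0.1:5000` is only reachable once the server has finished starting. The following is a minimal sketch, using only the standard library, of how one might wait for that port before opening a browser. The helper name `wait_for_server` and its default timeout are assumptions for illustration; this is not how `topicsexplorer.py` is implemented.

```python
import socket
import time


def wait_for_server(host="127.0.0.1", port=5000, timeout=10.0, interval=0.1):
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Not reachable yet: wait briefly and retry.
            time.sleep(interval)
    return False
```

A caller could then guard the browser launch with `if wait_for_server(): webbrowser.open("http://127.0.0.1:5000")`, using `webbrowser` from the standard library.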


### Contents
* [`bokeh_templates`](bokeh_templates): HTML templates for `bokeh`. This is only relevant if you want to freeze the Python part with `pyinstaller`.
* [`hooks`](hooks): Necessary hook files. This is only relevant if you want to freeze the Python part with `pyinstaller`.
* [`main.js`](main.js): Basically the GUI.
* [`package.json`](package.json): Metadata, dependencies, and scripts for the GUI.
* [`bokeh_templates`](bokeh_templates): HTML templates for `bokeh`. This is only relevant if you want to freeze the scripts with PyInstaller.
* [`hooks`](hooks): Necessary hook files. This is only relevant if you want to freeze the Python part with PyInstaller.
* [`static`](static) and [`templates`](templates): Static files (e.g. images, CSS, etc.) and HTML templates for the `flask` template engine.
* [`test`](test): Unit tests for `webapp.py`, testing all functions of the application.
* [`webapp.py`](webapp.py): Contains 3rd party functions and communicates with the webserver.
* [`webapp.spec`](webapp.spec): The build script for `pyinstaller` containing metadata.
* [`topicsexplorer.py`](topicsexplorer.py): A Qt-based UI displaying the contents of the app by running `webapp.py`.
* [`topicsexplorer.spec`](topicsexplorer.spec): The build script for PyInstaller containing metadata.


### Troubleshooting
* When installing `electron` fails, try `sudo npm install -g electron --unsafe-perm=true --allow-root`.
* Please use [GitHub Issues](https://github.com/DARIAH-DE/TopicsExplorer/issues).

* Please use [GitHub issues](https://github.com/DARIAH-DE/TopicsExplorer/issues).

## Creating a build for Layer 1 and 2
To freeze the Python part with `pyinstaller`, run on macOS:

```
pyinstaller --onefile --add-data static:static --add-data templates:templates --add-data bokeh_templates:bokeh_templates --additional-hooks-dir hooks webapp.py
```

or, for Windows:
```
pyinstaller --onefile --add-data static;static --add-data templates;templates --add-data bokeh_templates;bokeh_templates --additional-hooks-dir hooks webapp.py
```
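
The two commands above differ only in the separator inside each `--add-data` value: `:` on macOS and `;` on Windows. As a hedged sketch, assuming that separator corresponds to Python's `os.pathsep` (which is what the two commands suggest for the PyInstaller release used here), a single script could assemble the argument list for either platform. The names `DATA_DIRS`, `add_data_args`, and `build_command` are illustrative and do not appear in the repository.

```python
import os

# Directories bundled in the commands above.
DATA_DIRS = ["static", "templates", "bokeh_templates"]


def add_data_args(directories, separator=os.pathsep):
    """Build the --add-data SRC<sep>DEST pairs, using the same name for SRC and DEST."""
    args = []
    for directory in directories:
        args += ["--add-data", "{0}{1}{0}".format(directory, separator)]
    return args


def build_command(script="webapp.py", separator=os.pathsep):
    """Assemble the full argument list for the current (or a given) platform."""
    return (["pyinstaller", "--onefile"]
            + add_data_args(DATA_DIRS, separator)
            + ["--additional-hooks-dir", "hooks", script])


# Prints the command line for the platform this script runs on.
print(" ".join(build_command()))
```

The resulting list could be passed to `subprocess.run` as is, which avoids shell quoting differences between the two systems.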
## Creating a build for the whole application
To freeze the Electron part with `electron-builder`, run:
## Creating a standalone build
To freeze the Python scripts with [PyInstaller](http://www.pyinstaller.org/), simply run:

```
electron-builder
pyinstaller topicsexplorer.spec
```
113 changes: 36 additions & 77 deletions bokeh_templates/autoload_js.js
@@ -1,54 +1,43 @@
{#
Renders JavaScript code
for "autoloading".
Renders JavaScript code for "autoloading".

The code automatically and asynchronously loads BokehJS(
if necessary) and
then replaces the AUTOLOAD_TAG `` < script > ``
tag that
calls it with the rendered model.
The code automatically and asynchronously loads BokehJS (if necessary) and
then replaces the AUTOLOAD_TAG ``<script>`` tag that
calls it with the rendered model.

: param elementid: the unique id
for the script tag
: type elementid: str
:param elementid: the unique id for the script tag
:type elementid: str

: param js_urls: URLs of JS files making up Bokeh library: type js_urls: list
:param js_urls: URLs of JS files making up Bokeh library
:type js_urls: list

: param css_urls: CSS urls to inject: type css_urls: list
:param css_urls: CSS urls to inject
:type css_urls: list

#
}
#}
(function(root) {
function now() {
return new Date();
}

var force = {
{
force |
default (False) | json
}
};
var force = {{ force|default(False)|json }};

if (typeof(root._bokeh_onload_callbacks) === "undefined" || force === true) {
if (typeof (root._bokeh_onload_callbacks) === "undefined" || force === true) {
root._bokeh_onload_callbacks = [];
root._bokeh_is_loading = undefined;
}

{ % block register_mimetype %
} { % endblock %
}
{% block register_mimetype %}
{% endblock %}

{ % block autoload_init %
} { % endblock %
}
{% block autoload_init %}
{% endblock %}

function run_callbacks() {
try {
root._bokeh_onload_callbacks.forEach(function(callback) {
callback()
});
} finally {
root._bokeh_onload_callbacks.forEach(function(callback) { callback() });
}
finally {
delete root._bokeh_onload_callbacks
}
console.info("Bokeh: all callbacks have finished");
@@ -86,70 +75,40 @@
}
};

{ % -
if elementid - %
}
var element = document.getElementById({
{
elementid | json
}
});
{%- if elementid -%}
var element = document.getElementById({{ elementid|json }});
if (element == null) {
console.log("Bokeh: ERROR: autoload.js configured with elementid '{{ elementid }}' but no matching script tag was found. ")
return false;
} { % -endif %
}
{%- endif %}

var js_urls = {
{
js_urls | json
}
};

var inline_js = [{ % -
for js in js_raw %
}
var js_urls = {{ js_urls|json }};

var inline_js = [
{%- for js in js_raw %}
function(Bokeh) {
{
{
js | indent(6)
}
}
{{ js|indent(6) }}
},
{ % endfor - %
}

{% endfor -%}
function(Bokeh) {
{ % -
for url in css_urls %
}
{%- for url in css_urls %}
console.log("Bokeh: injecting CSS: {{ url }}");
Bokeh.embed.inject_css({
{
url | json
}
}); { % -endfor %
} { % -
for css in css_raw %
}
Bokeh.embed.inject_css({{ url|json }});
{%- endfor %}
{%- for css in css_raw %}
console.log("Bokeh: injecting raw CSS");
Bokeh.embed.inject_raw_css({
{
css
}
}); { % -endfor %
}
Bokeh.embed.inject_raw_css({{ css }});
{%- endfor %}
}
];

function run_inline_js() {
{ % block run_inline_js %
}
{% block run_inline_js %}
for (var i = 0; i < inline_js.length; i++) {
inline_js[i].call(root, root.Bokeh);
} { % endblock %
}
{% endblock %}
}

if (root._bokeh_is_loading === 0) {