dataspects for MediaWiki

dataspects for MediaWiki is based on Meilisearch and instant-meilisearch.

dataspects TDM Documentation

flowchart BT

  subgraph Extension:Dataspects
    mediawikiAPI("<b>MediaWiki API</b>
    - LocalSettings.php
    - <a href='https://mwstakeorg.dataspects.com/w/api.php?action=help&modules=dataspectsapi'>dataspectsapi</a>")
    sQLite("<b>SQLite</b><br/>for managing search facet configs")
    Cypress("<b><a href='https://www.cypress.io/'>Cypress</a></b><ul><li><a href='https://github.com/dataspects/mediawiki-extensions-Dataspects/tree/main/cypress/e2e'>end-to-end and component tests</a></li><li><a href='https://htmlpreview.github.io/?https://github.com/dataspects/mediawiki-extensions-Dataspects/blob/master/doc/search-facets.cy.js.html'>automatic documentation</a></li></ul>")
    AnalysisPipelines("<a href='https://github.com/dataspects/mediawiki-extensions-Dataspects/tree/main/src/jobs'>Analysis Pipelines</a><ol><li>read a facet from storage</li><li>use <span style='color:orange;'>modules/services</span> to conclude annotations</li><li>write altered documents back to storage</li></ol>")
  end

  storage("<b>Storage</b><ul><li>Meilisearch</li><li>Neo4j</li></ul>")
  analyzers("<b style='color:orange;'>Services</b><ul><li>Tika</li><li>spaCy</li></ul>")

  subgraph Internet
    userAgent("<b>Special:Dataspects</b> (Algolia <a href='https://www.algolia.com/doc/guides/building-search-ui/what-is-instantsearch/js/'>InstantSearch</a>)<ul><li><a href='https://github.com/dataspects/mediawiki-extensions-Dataspects/blob/main/resources/ext.dataspectsSearch/profiles.json'>hit profile <-> searchResultClass matching</a></li></ul><b>Special:DataspectsBackstage</b>")
    internetSources("<b>- mediawiki.org</b><br/><b>- semantic-mediawiki.org</b><br/><b>- riot.im</b><br/>...")
  end

  subgraph Workstation
    DataspectsCLI("<b><a href='https://github.com/dataspects/dataspects'>dataspects (Go CLI)</a></b>
    - export DS_MEILI_MASTERKEY=
    - export INDEX=")
  end
  DataspectsCLI-.-|configure/manage|storage
  userAgent-->|<b>Search</b><br/>wgDataspectsSearchKey|storage
  userAgent-->mediawikiAPI
  mediawikiAPI-->|<b>CRUD</b><br/>wgDataspectsWriteKey|storage
  AnalysisPipelines-.-analyzers
  AnalysisPipelines-.-storage
  DataspectsCLI-->|<b>Read</b>|internetSources
  mediawikiAPI-->|<b>CRUD</b>|sQLite

classDef default text-align:left;
linkStyle 0,3,6 stroke:#ff0000
linkStyle 1,4,5 stroke:#00ff00

Features

PENDING

  • Delete docs from indexes

Indexing (feed data)

  • COMMAND: sudo docker exec canasta-dockercompose_web_1 bash -c 'php extensions/Dataspects/maintenance/feedAll.php'
  • COMMAND: dataspects__feed-mediawiki-category-to-index.sh
  • MONITOR: mwstakeorg__localhost__debug-log.sh
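
The feed command above hardcodes the container name. A minimal sketch of a wrapper that makes the name configurable and can preview the command first; WEB_CONTAINER, DRY_RUN and feed_all are hypothetical names for illustration and are not part of dataspects:

```shell
#!/bin/bash
# Sketch only: WEB_CONTAINER, DRY_RUN and feed_all are hypothetical names.
WEB_CONTAINER="${WEB_CONTAINER:-canasta-dockercompose_web_1}"

# Print the feed command when DRY_RUN=1; otherwise run it inside the container.
feed_all() {
  local script='php extensions/Dataspects/maintenance/feedAll.php'
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "sudo docker exec ${WEB_CONTAINER} bash -c '${script}'"
  else
    sudo docker exec "${WEB_CONTAINER}" bash -c "${script}"
  fi
}

# Preview the command without touching Docker.
DRY_RUN=1 feed_all
```

Calling feed_all without DRY_RUN runs the same command as the COMMAND bullet above.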

Example: configure dataspects for Canasta

Fixme

  1. Add to Canasta MediaWiki container: composer require --with-all-dependencies meilisearch/meilisearch-php:0.25.0 symfony/http-client laudis/neo4j-php-client

TEST

  1. RESET: data storage backends (see "CONFIGURE: the data storage backends" below)
  2. LOAD (from the w directory inside the container): php tests/phpunit/phpunit.php --filter testResetTestData extensions/Dataspects/tests/phpunit/unit/DataspectsTest.php
  3. RUN:
    • Cypress
      • E2E tests
      • Component tests
    • Services tests (TIKA)
    • PHP unit tests
sudo docker exec -it canasta-dockercompose_web_1 /bin/bash
root@95e3ef5ecc17:/var/www/mediawiki/w# php tests/phpunit/phpunit.php \
  extensions/Dataspects/tests/phpunit/unit/DataspectsTest.php

Debug API: https://localhost/w/api.php

DOCUMENT

  • ACTION: mwstakeorg__localhost__make-test-documentation-TDM.sh

DEVELOP

CHECK: system status

CHECK: docker-compose.override.yml

CONFIGURE: the environment for Extension:Dataspects

  • OPTION: temporarily change $wgDataspects* variables in LocalSettings.php:
    • ADVANTAGES:
      • no need to restart the Docker compose stack
      • preserve proper development .env
    • STORED-PROCEDURE: mwstakeorg__localhost__TEST-load-test-data_and_Cypress.php.sh
  • OPTION: temporarily change envs in docker-compose.override.yml

CONFIGURE: the data storage backends

  • PREPARE: source the *.config files (e.g. localhost.config and production.config), which export the environment variables
    • RESET Meilisearch: run meilisearch__index_reset.sh, which applies src/indexsettings.json
  • RESET SQLite:
    1. delete sqlite/dataspects.sqlite
    2. run php extensions/Dataspects/maintenance/manageSQlite3.php --initialize
  • RESET Neo4j:
    • MATCH(n) DETACH DELETE n
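
The three resets above could be collected into one script along the following lines. This is a sketch under stated assumptions, not the content of meilisearch__index_reset.sh: MEILI_URL, SQLITE_FILE, DRY_RUN and the function names are hypothetical, the Meilisearch calls assume a server version that accepts PATCH on the settings route (v0.28 or later), and the Neo4j step assumes cypher-shell is on the PATH with credentials already configured. DS_MEILI_MASTERKEY and INDEX are the variables named in the diagram above.

```shell
#!/bin/bash
# Sketch only. MEILI_URL, SQLITE_FILE and DRY_RUN are hypothetical names;
# DS_MEILI_MASTERKEY and INDEX are expected to come from the sourced *.config file.
MEILI_URL="${MEILI_URL:-http://localhost:7700}"
SQLITE_FILE="${SQLITE_FILE:-sqlite/dataspects.sqlite}"

# Run a command, or only print it when DRY_RUN=1.
# Note: the preview prints arguments as-is, including the master key.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Meilisearch: drop the index, then apply src/indexsettings.json
# (assumption: the settings update recreates the missing index).
reset_meilisearch() {
  run curl -sS -X DELETE "${MEILI_URL}/indexes/${INDEX}" \
    -H "Authorization: Bearer ${DS_MEILI_MASTERKEY}"
  run curl -sS -X PATCH "${MEILI_URL}/indexes/${INDEX}/settings" \
    -H "Authorization: Bearer ${DS_MEILI_MASTERKEY}" \
    -H "Content-Type: application/json" \
    --data-binary @src/indexsettings.json
}

# SQLite: delete the database file, then re-initialize it.
reset_sqlite() {
  run rm -f "${SQLITE_FILE}"
  run php extensions/Dataspects/maintenance/manageSQlite3.php --initialize
}

# Neo4j: delete all nodes and relationships.
reset_neo4j() {
  run cypher-shell "MATCH (n) DETACH DELETE n"
}

# Preview the SQLite reset without changing anything.
DRY_RUN=1 reset_sqlite
```

Running with DRY_RUN=1 first makes it easy to check which index and file a sourced *.config points at before anything is deleted.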

Tika

#!/bin/bash

# https://cwiki.apache.org/confluence/display/TIKA/TikaServerEndpointsCompared
curl \
    -T /home/lex/python-regular-expressions-cheat-sheet.pdf \
    http://localhost:9998/rmeta
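
A small variation on the call above, sketched as a reusable function. TIKA_URL and tika_rmeta are hypothetical names; the /rmeta/text endpoint is the Tika server variant of /rmeta that returns the extracted content as plain text rather than XHTML.

```shell
#!/bin/bash
# Sketch only: TIKA_URL and tika_rmeta are hypothetical names.
TIKA_URL="${TIKA_URL:-http://localhost:9998}"

# PUT a file to Tika's /rmeta/text endpoint and print the JSON response.
# Fails early if the file does not exist, so curl is never called with a bad path.
tika_rmeta() {
  local file="$1"
  if [ ! -f "${file}" ]; then
    echo "no such file: ${file}" >&2
    return 1
  fi
  curl -sS -T "${file}" -H "Accept: application/json" "${TIKA_URL}/rmeta/text"
}

# Usage (requires a running Tika server):
#   tika_rmeta /home/lex/python-regular-expressions-cheat-sheet.pdf
```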

Logs

sudo docker exec -it canasta-dockercompose_web_1 /bin/bash
tail -f apache2/error_log.current

See also

Upgrade JS libraries

yarn add/update the libraries and then copy the corresponding files into place.

  1. Install nvm/node:
       curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.39.2/install.sh | bash
       nvm ls-remote --lts
       nvm install v16.18.0
       npm install -g yarn
  2. yarn add the libs:
       lex@lexThinkPad:~/Downloads/dataspects-search-js-libraries$ yarn add @meilisearch/instant-meilisearch instantsearch.js vis-network
  3. Copy into place, e.g.:
       lex@lexThinkPad:~/Downloads/dataspects-search-js-libraries$ cp node_modules/vis-network/dist/vis-network.min.js ~/mwstakeorgdevclone/extensions/Dataspects/resources/ext.dataspectsSearch/
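
The copy step can be wrapped so that a missing build artifact is reported instead of silently skipped. A minimal sketch; copy_lib is a hypothetical helper, and only the vis-network path is taken from the example above (the dist paths of the other libraries are not listed here and would need to be checked in node_modules):

```shell
#!/bin/bash
# Sketch only: copy_lib is a hypothetical helper.
# Copy one built library file into a target directory, failing loudly if the
# source is missing (for example because `yarn add` has not been run yet).
copy_lib() {
  local src="$1" dest_dir="$2"
  if [ ! -f "${src}" ]; then
    echo "missing: ${src}" >&2
    return 1
  fi
  mkdir -p "${dest_dir}" && cp "${src}" "${dest_dir}/"
}

# Usage, from the directory where yarn add was run:
#   copy_lib node_modules/vis-network/dist/vis-network.min.js \
#     ~/mwstakeorgdevclone/extensions/Dataspects/resources/ext.dataspectsSearch
```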

https://datatables.net/download/