SNOMED-CT & ICD Codex

	Architecture	Status	Tests Coverage
Ubuntu Trusty 14.04	x86_64

Welcome to snoicd-codex

Snoicd-codex is a set of tools that allow to crawl, index and perform queries over the ICD-9, ICD-10 and SNOMED CT terminologies. Thanks to the previous crawl and indexation it is able to achieve very low latency queries and it occupies a no more than a Gb.

It's architecture its based on the well known Google search engine where they have a crawler that collects data about websites, and indexer to reduce all the information to a physical location in their databases/indexes and a search algorithm that given a query finds the best matching results in their indexes.

In the following illustration the descrived architecture can be perfectly seen.

Crawler

The crawler is in charge of finding the data among the different terminologies, add it to a common file-system for the indexer to look for and add the relations between the crawled data. At the beggining of the crawl process we have a set of unstructure and unlinked files meanwhile at the end of the process we have a unique JSON file containing all the crawled data. This file will be the one readed by the indexer.

Indexer

The indexer module starts by reading the file served by the crawler and it creates two index files, one where the concepts are indexed by its terminology unique code and another where concepts are text-indexed by its descriptions.

Search

The searcher is an api-structured module that by means of some algorithms it allows, trough REST requests to execute queries over the indexed data. At the time it allows queries by terminology codes and free text queries. Thanks to the previous indexation of the crawled data it performs free text queries over all the set of descriptions at less than 10ms. That is for unique words and n words with intersection search.

Getting started

These instructions give the most direct path to work with this module.

System Requirements

As the project is developed in java macOS, Windows and Linux distributions are natively supported. Of course you will need the latest JDK available and haveing Docker installed on your computer. Also, depending on where are you going to run the database, you will need internet connection or MongoDB installed and running on your machine.

Java Development Kit (JDK) ⚠️ No support for java 10 nor 11 ⚠️

A Java Development Kit (JDK) is a program development environment for writing Java applets and applications. It consists of a runtime environment that "sits on top" of the operating system layer as well as the tools and programming that developers need to compile, debug, and run applets and applications written in the Java programming language.

If you do not has the latest stable version download you can download it here.

Docker

Docker provides container software that is ideal for developers and teams looking to get started and experimenting with container-based applications. It can be downloaded here for your favourite OS.

Deploying snoicd-codex

These instructions give the most direct path to a working snoice-codex. First thing to do is to clone the repository on to your local computer, for that:

git clone https://github.com/thewilly/snoicd-codex

Then you will need to change your working directory to the snoicd-codex, build the api sources and finally deploy the docker container, to do so:

cd snoicd-codex;
cd api;
mvn package -DskipTests;
cd ..;
docker-compose up;

Notice that once the docker-compose services start it will take between 1 and 5 minutes for services to start. Once the services started you will be able to connect at http://localhost:8082. Find more information about the API here. This API has been documented with Swagger 2 so at http://localhost:8082/swagger-ui.html you can try / test the behaviour of the endpoint.

Did you find an issue?

If you find any issue or have any doubt with the system just ask by submitting an issue.

Versions included

Terminology Version	Internal code
ICD-9	`icd9`
ICD-10	`icd10`
SNOMED CT	`snomed`

Name		Name	Last commit message	Last commit date
Latest commit History 135 Commits
data		data
docs		docs
snoicd-core		snoicd-core
snoicd-crawler		snoicd-crawler
snoicd-glue		snoicd-glue
snoicd-indexer		snoicd-indexer
snoicd-search		snoicd-search
snoicd-server		snoicd-server
.gitattributes		.gitattributes
.gitignore		.gitignore
.travis.yml		.travis.yml
CODE_OWNERS.TXT		CODE_OWNERS.TXT
LICENSE		LICENSE
README.md		README.md
docker-compose.yml		docker-compose.yml
pom.xml		pom.xml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

SNOMED-CT & ICD Codex

Welcome to snoicd-codex

Crawler

Indexer

Search

Getting started

System Requirements

Java Development Kit (JDK) ⚠️ No support for java 10 nor 11 ⚠️

Docker

Deploying snoicd-codex

Did you find an issue?

Versions included

About

Releases 2

Packages

Languages

License

thewillyhuman/snoicd-codex

Folders and files

Latest commit

History

Repository files navigation

SNOMED-CT & ICD Codex

Welcome to snoicd-codex

Crawler

Indexer

Search

Getting started

System Requirements

Java Development Kit (JDK) ⚠️ No support for java 10 nor 11 ⚠️

Docker

Deploying snoicd-codex

Did you find an issue?

Versions included

About

Topics

Resources

License

Stars

Watchers

Forks

Releases 2

Packages 0

Languages

Packages