Learning Entities from Narratives of Skin Cancer (LENS)

Overview

Learning Entities from Narratives of Skin Cancer (LENS) is a Python library for Named Entity Recognition (NER) on narratives related to skin cancer. It recognizes and categorizes entities using 24 distinct tags (see the file annotation_guidelines.pdf), extracting key information from unstructured text. The extracted entities can be linked to biomedical resources such as SNOMED-CT and MedCAT, which supports structured data analysis in clinical and research settings.

Objective

The primary objective of LENS is to process input text, such as online narratives from platforms like Reddit, and return the corresponding LENS tags. These tags categorize the entities mentioned in the text, which supports further analysis and integration with biomedical ontologies.

Installation

To install the latest version of LENS, please run the following command:

pip install https://huggingface.co/4DPicture/OncoNER/Lens/resolve/main/onco_lens_ner-0.1.0-py3-none-any.whl
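
After installation, one quick way to confirm that the package is importable is to look up its module spec. This is a minimal sketch: the helper name `is_installed` is hypothetical and not part of LENS, and the module name `onco_lens_ner` is the import name used in this README.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be located on the import path."""
    # find_spec returns None when a top-level module cannot be found.
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # "onco_lens_ner" is the import name used in this README (hypothetical helper above).
    print("LENS importable:", is_installed("onco_lens_ner"))
```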

Usage Example

Below is an example of how to use LENS to extract entities from a skin cancer narrative:

import onco_lens_ner as lens

text = "I was diagnosed with melanoma last year. I'm currently undergoing immunotherapy and sometimes feel nauseous."
entities = lens.get_entities(text)
print(entities)
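
To annotate several narratives at once, the same call can be applied in a loop. The sketch below is only an illustration: `annotate_many` is a hypothetical helper, not part of LENS. It takes the extractor as a parameter so that `lens.get_entities` can be passed in without assuming anything about the structure of its return value.

```python
from typing import Any, Callable, Dict, Iterable, List


def annotate_many(
    texts: Iterable[str],
    extractor: Callable[[str], Any],
) -> List[Dict[str, Any]]:
    """Apply `extractor` to each non-blank text, keeping input order."""
    results: List[Dict[str, Any]] = []
    for text in texts:
        cleaned = text.strip()
        if not cleaned:
            # Skip blank narratives instead of sending them to the extractor.
            continue
        results.append({"text": cleaned, "entities": extractor(cleaned)})
    return results
```

With LENS installed, this might be called as `annotate_many(posts, lens.get_entities)`, where `posts` is a list of narrative strings.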

Functionalities

LENS provides a range of functionalities to meet diverse user needs:

  1. Extract all LENS entities: Identify and extract all recognized entities from a given text.
     entities = lens.get_entities(text)
     print(entities)
  2. Display all entities: Output the extracted entities with their corresponding tags.
     lens.display_entities(text)
  3. Extract entities for a specific label: Extract entities corresponding to a specific tag, such as INV (Investigation).
     entities = lens.get_entities(text, tag_list=['INV'])
     print(entities)
  4. Extract entities for a subset of labels: Focus on a subset of tags, for example, TRT and INV.
     entities = lens.get_entities(text, tag_list=['TRT', 'INV'])
     print(entities)
  5. Display entities for a subset of labels: Output entities for specific tags, such as TRT, SYM, and INV.
     lens.display_entities(text, tag_list=['TRT', 'SYM', 'INV'])
  6. Extract all MedCAT mappings: Link recognized entities to MedCAT biomedical concepts.
     lens2medcat = lens.lens2medcat(text)
     print(lens2medcat)
  7. Extract all SNOMED-CT mappings: Link recognized entities to SNOMED-CT concepts.
     lens2snomedct = lens.lens2snomedct(text)
     print(lens2snomedct)
  8. Save the annotations in JSON format: Save the extracted entities and mappings in a structured JSON file for further analysis.
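
One way to persist annotations as JSON is Python's standard `json` module. The sketch below assumes that the values returned by the LENS functions are JSON-serializable, which is not confirmed here. `save_annotations` and the output file name are hypothetical and not part of LENS.

```python
import json
from pathlib import Path
from typing import Any


def save_annotations(annotations: Any, path: str) -> Path:
    """Write `annotations` to `path` as UTF-8 JSON and return the path written."""
    out_path = Path(path)
    with out_path.open("w", encoding="utf-8") as handle:
        # default=str is a fallback for values json cannot encode natively
        # (an assumption, since the LENS return types are not documented here).
        json.dump(annotations, handle, ensure_ascii=False, indent=2, default=str)
    return out_path
```

With LENS installed, a call might look like `save_annotations(lens.get_entities(text), "annotations.json")`.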

Tutorial

A comprehensive tutorial on how to use LENS, including advanced features, is available here.

License

LENS is licensed under the MIT License. Please see the LICENSE file for further information.