goals

Keep It Simple is an AI tool that rewrites text into a more readable, understandable, and visually accessible format. It is made for everyone, whether you're new to English, a young learner, or someone who finds reading challenging because of learning issues.

Motivation behind Creating an Assistive Technology

motivation

Building upon the insights gained from these technological advancements, we have developed an AI-driven solution.

The goal is to create a tool that not only assists in overcoming the challenges posed by learning and attention issues but also enhances the overall learning experience for all users.

quote

We aim to make the digital realm more inclusive and information more easily digestible.


Table of Contents

  1. Our Goals
  2. Data
  3. Model
  4. Model Evaluation
  5. User Interface
  6. Findings
  7. How to Use
  8. About Us

Our Goals

goals


Data With Readability Levels

We collected open-source articles from 'News in Levels' and 'Wikipedia'/'Simple Wikipedia', as well as text from the 'OneStopEnglish' research dataset.

These sources provide the same text at multiple reading levels, which we define with the Common European Framework of Reference for Languages (CEFR).

CEFR defines six levels (A1 to C2), but we mapped our data to three major levels: CEFR C, B, and A, corresponding to Advanced, Intermediate, and Beginner.
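The collapse from six CEFR levels to three can be sketched as a lookup on the level's letter. This is a minimal illustration, not the project's actual preprocessing code: the names `CEFR_TO_READING_LEVEL` and `reading_level` are hypothetical, and grouping the sub-levels by letter is an assumption based on the C-B-A correspondence stated above.

```python
# Assumption: sub-levels are grouped by their letter (A1/A2 -> A, and so on).
# The README only states the C-B-A to Advanced-Intermediate-Beginner mapping.
CEFR_TO_READING_LEVEL = {
    "A": "Beginner",
    "B": "Intermediate",
    "C": "Advanced",
}


def reading_level(cefr_label: str) -> str:
    """Map a CEFR label such as 'B2' (or just 'B') to one of three levels."""
    band = cefr_label.strip().upper()[:1]
    if band not in CEFR_TO_READING_LEVEL:
        raise ValueError(f"Not a CEFR label: {cefr_label!r}")
    return CEFR_TO_READING_LEVEL[band]
```

For example, `reading_level("A2")` gives `"Beginner"` and `reading_level("C1")` gives `"Advanced"`.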

goals


Our AI Model: Safety Check + Classifier + Simplifier

Our model first classifies a text into one of the predefined CEFR levels and then simplifies the content to match the desired reading level. We also flag unsafe text, including profane language and hate speech.
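The three stages can be wired together roughly as follows. This is only a sketch under stated assumptions: `run_pipeline`, `PipelineResult`, and the three callables passed in are hypothetical stand-ins for the real safety check, classifier, and simplifier, and skipping simplification when the text is already at or below the target level is an assumption, not something the project documents.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Ordered from easiest to hardest so that list positions can be compared.
LEVELS = ["Beginner", "Intermediate", "Advanced"]


@dataclass
class PipelineResult:
    is_unsafe: bool                 # flag only; the text is still processed
    input_level: str                # level predicted for the input text
    simplified_text: Optional[str]  # None when no simplification was needed


def run_pipeline(
    text: str,
    target_level: str,
    is_unsafe: Callable[[str], bool],     # hypothetical safety check
    classify: Callable[[str], str],       # hypothetical CEFR classifier
    simplify: Callable[[str, str], str],  # hypothetical simplifier
) -> PipelineResult:
    unsafe = is_unsafe(text)
    level = classify(text)
    # Assumption: only simplify when the target is easier than the input.
    if LEVELS.index(target_level) >= LEVELS.index(level):
        return PipelineResult(unsafe, level, None)
    return PipelineResult(unsafe, level, simplify(text, target_level))
```

Passing the stages in as callables keeps the sketch independent of any particular model; each one could be backed by a fine-tuned model or a simple rule.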

goals

safetycheck classifier

simplifier

Model Evaluation

To evaluate the tool's performance robustly, we use several methods:

  1. CEFR (Common European Framework of Reference for Languages): Using our classifier, we generate labels for the produced text and compare them against the ground truth from our evaluation set.

Results:

cefr-eval
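The CEFR comparison amounts to checking predicted labels against expected ones. A minimal sketch, assuming the labels are plain strings in two parallel lists; the function names are hypothetical and the project may report different statistics:

```python
from collections import Counter


def accuracy(predicted: list[str], expected: list[str]) -> float:
    """Fraction of texts whose predicted level matches the ground truth."""
    if len(predicted) != len(expected):
        raise ValueError("predicted and expected must have the same length")
    if not expected:
        raise ValueError("cannot score an empty evaluation set")
    matches = sum(p == e for p, e in zip(predicted, expected))
    return matches / len(expected)


def confusion_counts(predicted: list[str], expected: list[str]) -> Counter:
    """Count (expected, predicted) pairs to show where the labels disagree."""
    return Counter(zip(expected, predicted))
```

The confusion counts show which way the errors go, for instance how often a text that should be Beginner is still labeled Advanced after simplification.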

  2. Aggregate of Gunning Fog Index, Flesch-Kincaid Reading Ease score, and Dale-Chall Readability Score from Python's textstat library: An aggregate of the following metrics is used to measure the complexity of the produced text.
    • Gunning Fog Index: Evaluates readability based on the density of complex words (three or more syllables) and average sentence length
    • Flesch-Kincaid Reading Ease score: Evaluates readability based on average syllables per word and average sentence length
    • Dale-Chall Readability Score: Evaluates readability based on average sentence length and the frequency of difficult words (words that are not on a list of about 3,000 easy words)

Results:

textstat-eval
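The three readability metrics can be computed with textstat as sketched below. Two assumptions are worth flagging: `textstat.flesch_reading_ease` is taken to be the function behind the reading ease score, and because the README does not say how the aggregate is formed, the sketch only checks which metrics rate a rewritten text as easier. The helper names are hypothetical.

```python
def readability_scores(text: str) -> dict[str, float]:
    """Score a text with the three textstat metrics listed above."""
    import textstat  # third-party dependency: pip install textstat

    return {
        "gunning_fog": textstat.gunning_fog(text),
        "flesch_reading_ease": textstat.flesch_reading_ease(text),
        "dale_chall": textstat.dale_chall_readability_score(text),
    }


# Flesch Reading Ease rises as text gets easier; the other two scores fall.
HIGHER_IS_EASIER = {
    "gunning_fog": False,
    "flesch_reading_ease": True,
    "dale_chall": False,
}


def metrics_showing_easier(
    before: dict[str, float], after: dict[str, float]
) -> list[str]:
    """Names of the metrics on which `after` is easier to read than `before`."""
    easier = []
    for name, higher_is_easier in HIGHER_IS_EASIER.items():
        delta = after[name] - before[name]
        improved = delta > 0 if higher_is_easier else delta < 0
        if improved:
            easier.append(name)
    return easier
```

A typical use would be `metrics_showing_easier(readability_scores(original), readability_scores(simplified))`, which lists the metrics that agree the rewrite is easier.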

  3. GPT-4 Score: GPT-4 is asked to rate the complexity of the output text on a scale of 1-100.

Results:

eval
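For the GPT-4 score, the prompt and the handling of the reply might look something like this. It is a sketch only: the prompt wording, the direction of the scale (1 as simplest), and both function names are assumptions, and the API call itself is left out.

```python
import re


def build_rating_prompt(text: str) -> str:
    """Build a rating request; the wording here is hypothetical."""
    return (
        "Rate the complexity of the following text on a scale of 1-100, "
        "where 1 is the simplest and 100 is the most complex. "
        "Reply with the number only.\n\n"
        f"Text:\n{text}"
    )


def parse_rating(reply: str) -> int:
    """Read the first integer in the model's reply and check its range."""
    match = re.search(r"\d+", reply)
    if match is None:
        raise ValueError(f"No number found in reply: {reply!r}")
    score = int(match.group())
    if not 1 <= score <= 100:
        raise ValueError(f"Score out of range: {score}")
    return score
```

Validating the range guards against replies that ignore the instructions, which is a common failure when a language model is used as a grader.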

Overall, we see that the fine-tuned Llama-2 7B chat model performs best on the simplification task. Comparison with the results of the out-of-the-box model shows that our fine-tuning greatly improved the quality of the generated text for this task.

User Interface

ui

The UI is built and deployed with Streamlit. The user's input text is classified into a reading level, shown above as "Input text is at [Advanced] Level." The user can then choose to simplify the text to a Beginner or Intermediate level with the "Simplify to:" option.

Inclusivity features include bionic reading, text-to-speech, font display adjustment, and PDF download.
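Bionic reading emphasizes the start of each word to guide the eye. One way to produce it as Markdown, which Streamlit can render, is sketched below. The function name, the default of bolding half of each word, and the plain ASCII word pattern are assumptions; the app's own implementation may differ.

```python
import math
import re


def bionic_markdown(text: str, fraction: float = 0.5) -> str:
    """Bold the leading part of every word using Markdown emphasis."""

    def emphasize(match: re.Match) -> str:
        word = match.group()
        # Always bold at least one letter, rounding up for odd lengths.
        cut = max(1, math.ceil(len(word) * fraction))
        return f"**{word[:cut]}**{word[cut:]}"

    # Only runs of ASCII letters are treated as words in this sketch, so
    # punctuation and digits pass through unchanged.
    return re.sub(r"[A-Za-z]+", emphasize, text)
```

For example, `bionic_markdown("Keep it simple")` returns `**Ke**ep **i**t **sim**ple`.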

Challenges, Findings, and Future Work

challengesandfindings

futurework

How to Use

  1. Access the tool via our web portal.
  2. Paste or type in the content you wish to simplify.
  3. Select the desired readability level.
  4. Adjust display and format.
  5. View the simplified content.
  6. (Optional) Provide feedback for continuous model improvement.

Getting Started for Developers:

  1. Clone the GitHub repository.
  2. Ensure all dependencies are installed.
  3. For local testing, run the Streamlit app.
  4. For deploying on your server, modify the necessary configuration settings.

About Us

Team:

  • Ankita Nambiar
  • Egehan Yorulmaz
  • Lavanya Srivastava
  • Prayut Jain

Conversational AI with Nick Kadochnikov @ University of Chicago M.S. in Applied Data Science

Contributions, feedback, and improvements are always welcome. Feel free to submit pull requests or raise issues. This project is licensed under the MIT License. Refer to the LICENSE file for more details.

                   Keep It Simple. Making Information Accessible with AI.