From 809b6c3fa32f12edc81c03a27deb2ab5ec19d660 Mon Sep 17 00:00:00 2001 From: Falah Gate Salieh Date: Fri, 21 Jul 2023 19:27:39 +0300 Subject: [PATCH 1/8] add dataset research_papers_dataset The "Research Papers Dataset 2023" contains information related to research papers. It includes the following features: - Title (dtype: string): The title of the research paper. - Abstract (dtype: string): The abstract of the research paper. ### Dataset Splits: The dataset is divided into one split: - Train Split: - Name: train - Number of Bytes: 2,363,569,633 - Number of Examples: 2,311,491 --- .../research_papers_dataset/ReadME.md | 93 +++++++++++++++++++ 1 file changed, 93 insertions(+) create mode 100644 data/datasets/research_papers_dataset/ReadME.md diff --git a/data/datasets/research_papers_dataset/ReadME.md b/data/datasets/research_papers_dataset/ReadME.md new file mode 100644 index 0000000000..0411125fbb --- /dev/null +++ b/data/datasets/research_papers_dataset/ReadME.md @@ -0,0 +1,93 @@ +--- +dataset_info: + features: + - name: title + dtype: string + - name: abstract + dtype: string + splits: + - name: train + num_bytes: 2363569633 + num_examples: 2311491 + download_size: 1423881564 + dataset_size: 2363569633 +--- +## Research Paper Dataset 2023 + +### Dataset Information: + +The "Research Paper Dataset 2023" contains information related to research papers. It includes the following features: + +- Title (dtype: string): The title of the research paper. +- Abstract (dtype: string): The abstract of the research paper. 
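As a quick sanity check on the split statistics above, the byte and example counts imply roughly 1 KB of text per paper, a plausible size for a title plus abstract. A stdlib-only sketch (the numbers are copied from the split table above; nothing is downloaded):

```python
# Split statistics quoted from the dataset card above.
num_bytes = 2_363_569_633
num_examples = 2_311_491

avg_bytes = num_bytes / num_examples
print(f"average record size: {avg_bytes:.1f} bytes")  # ~1022.5 bytes per title+abstract
```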
+ +### Dataset Splits: + +The dataset is divided into one split: + +- Train Split: + - Name: train + - Number of Bytes: 2,363,569,633 + - Number of Examples: 2,311,491 + +### Download Information: + +- Download Size: 1,423,881,564 bytes +- Dataset Size: 2,363,569,633 bytes + +### Dataset Citation: + +If you use this dataset in your research or project, please cite it as follows: + +``` +@dataset{Research Paper Dataset 2023, + author = {Falah.G.Salieh}, + title = {Research Paper Dataset 2023,}, + year = {2023}, + publisher = {Hugging Face}, + version = {1.0}, + location = {Online}, + url = {Falah/research_paper2023} +} + + +``` + + ### Apache License: +The "Research Paper Dataset 2023" is distributed under the Apache License 2.0. You can find a copy of the license in the LICENSE file of the dataset repository. + +The specific licensing and usage terms for this dataset can be found in the dataset repository or documentation. +Please make sure to review and comply with the applicable license and usage terms before downloading and using the dataset. + +### Example Usage: + +To load the "Research Paper Dataset 2023" using the Hugging Face Datasets Library in Python, you can use the following code: + +```python +from datasets import load_dataset + +dataset = load_dataset("Falah/research_paper2023") +``` +### Application of "Research Paper Dataset 2023" for NLP Text Classification and Chatbot Models + +The "Research Paper Dataset 2023" can be a valuable resource for various Natural Language Processing (NLP) tasks, including text classification and generating titles for books in the context of chatbot models. Here are some ways this dataset can be utilized for these applications: + +1. **Text Classification**: The dataset's features, such as the title and abstract of research papers, can be used to train a text classification model. 
By assigning appropriate labels to the research papers based on their topics or fields of study, the model can learn to classify new research papers into different categories. For example, the model can predict whether a research paper is related to computer science, biology, physics, etc. This text classification model can then be adapted for other applications that require categorizing text. + +2. **Book Title Generation for Chatbot Models**: By utilizing the research paper titles in the dataset, a natural language generation model, such as a sequence-to-sequence model or a transformer-based model, can be trained to generate book titles. The model can be fine-tuned on the research paper titles to learn patterns and structures in generating relevant and meaningful book titles. This can be a useful feature for chatbot models that recommend books based on specific research topics or areas of interest. + +### Potential Benefits: + +- Improved Chatbot Recommendations: With the ability to generate book titles related to specific research topics, chatbot models can provide more relevant and personalized book recommendations to users. +- Enhanced User Engagement: By incorporating the text classification model, the chatbot can better understand user queries and respond more accurately, leading to a more engaging user experience. +- Knowledge Discovery: Researchers and students can use the text classification model to efficiently categorize large collections of research papers, enabling quicker access to relevant information in specific domains. + +### Considerations: + +- Data Preprocessing: Before training the NLP models, appropriate data preprocessing steps may be required, such as text cleaning, tokenization, and encoding, to prepare the dataset for model input. 
+- Model Selection and Fine-Tuning: Choosing the right NLP model architecture and hyperparameters, and fine-tuning the model on the specific tasks, can significantly impact the model's performance and generalization ability. +- Ethical Use: Ensure that the generated book titles and text classification predictions are used responsibly and ethically, respecting copyright and intellectual property rights. + +### Conclusion: + +The "Research Paper Dataset 2023" holds great potential for enhancing NLP text classification models and chatbot systems. By leveraging the dataset's features and information, NLP applications can be developed to aid researchers, students, and readers in finding relevant research papers and generating meaningful book titles for their specific interests. Proper utilization of this dataset can lead to more efficient information retrieval and improved user experiences in the domain of research and academic literature exploration. From c4c5c877e2890d21b79e1ef28570b620049d1f20 Mon Sep 17 00:00:00 2001 From: Falah Gate Salieh Date: Fri, 21 Jul 2023 19:29:24 +0300 Subject: [PATCH 2/8] Create load_dataset.py --- data/datasets/research_papers_dataset/load_dataset.py | 2 ++ 1 file changed, 2 insertions(+) create mode 100644 data/datasets/research_papers_dataset/load_dataset.py diff --git a/data/datasets/research_papers_dataset/load_dataset.py b/data/datasets/research_papers_dataset/load_dataset.py new file mode 100644 index 0000000000..36ce97e040 --- /dev/null +++ b/data/datasets/research_papers_dataset/load_dataset.py @@ -0,0 +1,2 @@ +from datasets import load_dataset +dataset = load_dataset("Falah/research_paper2023") From fd3f41f008b8d50df1f0a275725ad78be47f5afd Mon Sep 17 00:00:00 2001 From: your name Date: Sat, 22 Jul 2023 15:34:18 +0300 Subject: [PATCH 3/8] add_new_dataset_sentiments_381_classes --- .../sentiments-dataset-381-classes/README.md | 346 ++++++++++++++++++ .../load_dataset.py | 3 + 2 files changed, 349 insertions(+) create mode 100644 
data/datasets/sentiments-dataset-381-classes/README.md create mode 100644 data/datasets/sentiments-dataset-381-classes/load_dataset.py diff --git a/data/datasets/sentiments-dataset-381-classes/README.md b/data/datasets/sentiments-dataset-381-classes/README.md new file mode 100644 index 0000000000..1a18a15e1f --- /dev/null +++ b/data/datasets/sentiments-dataset-381-classes/README.md @@ -0,0 +1,346 @@ +--- +dataset_info: + features: + - name: text + dtype: string + - name: sentiment + dtype: string + splits: + - name: train + num_bytes: 104602 + num_examples: 1061 + download_size: 48213 + dataset_size: 104602 +license: apache-2.0 +task_categories: +- text-classification +language: +- en +pretty_name: sentiments-dataset-381-classes +size_categories: +- 1K Date: Sat, 22 Jul 2023 17:09:48 +0300 Subject: [PATCH 4/8] updata dataset --- data/datasets/medium_articles_posts/README.md | 39 ++++++++ .../medium_articles_posts/load_dataset.py | 3 + .../research_papers_dataset/ReadME.md | 94 ++++++++++++++----- .../research_papers_dataset/load_dataset.py | 1 + .../research_papers_dataset/package-lock.json | 26 +++++ 5 files changed, 138 insertions(+), 25 deletions(-) create mode 100644 data/datasets/medium_articles_posts/README.md create mode 100644 data/datasets/medium_articles_posts/load_dataset.py create mode 100644 data/datasets/research_papers_dataset/package-lock.json diff --git a/data/datasets/medium_articles_posts/README.md b/data/datasets/medium_articles_posts/README.md new file mode 100644 index 0000000000..d355795186 --- /dev/null +++ b/data/datasets/medium_articles_posts/README.md @@ -0,0 +1,39 @@ +# Medium Articles Posts Dataset + +## Description + +The Medium Articles Posts dataset contains a collection of articles published on the Medium platform. Each article entry includes information such as the article's title, main content or text, associated URL or link, authors' names, timestamps, and tags or categories. 
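Since every field in this dataset is a plain string, ordinary Python filtering works on the loaded records. A toy sketch with invented articles (the titles and tags here are made up for illustration, not drawn from the real dataset):

```python
# Toy records mimicking the Medium dataset schema (all fields are strings);
# the articles themselves are invented for illustration.
articles = [
    {"title": "Intro to Transformers", "tags": "nlp, deep-learning"},
    {"title": "Sourdough Basics", "tags": "baking, food"},
    {"title": "Fine-Tuning BERT", "tags": "nlp, bert"},
]

# Select articles whose comma-separated tag string mentions "nlp".
nlp_posts = [a["title"] for a in articles if "nlp" in a["tags"].split(", ")]
print(nlp_posts)
```

The same comprehension applies unchanged to rows yielded by `load_dataset("Falah/medium_articles_posts")`, since each row is a dict with these keys.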
+ +## Dataset Info + +The dataset consists of the following features: + +- **title**: *(string)* The title of the Medium article. +- **text**: *(string)* The main content or text of the Medium article. +- **url**: *(string)* The URL or link to the Medium article. +- **authors**: *(string)* The authors or contributors of the Medium article. +- **timestamp**: *(string)* The timestamp or date when the Medium article was published. +- **tags**: *(string)* Tags or categories associated with the Medium article. + +## Dataset Size + +- **Total Dataset Size**: 1,044,746,687 bytes (approximately 1000 MB) + +## Splits + +The dataset is split into the following part: + +- **Train**: + - Number of examples: 192,368 + - Size: 1,044,746,687 bytes (approximately 1000 MB) + +## Download Size + +- **Compressed Download Size**: 601,519,297 bytes (approximately 600 MB) +### Usage example +```python +from datasets import load_dataset +#Load the dataset +dataset = load_dataset("Falah/medium_articles_posts") + +``` \ No newline at end of file diff --git a/data/datasets/medium_articles_posts/load_dataset.py b/data/datasets/medium_articles_posts/load_dataset.py new file mode 100644 index 0000000000..1cc8027b1d --- /dev/null +++ b/data/datasets/medium_articles_posts/load_dataset.py @@ -0,0 +1,3 @@ +from datasets import load_dataset +#Load the dataset +dataset = load_dataset("Falah/medium_articles_posts") diff --git a/data/datasets/research_papers_dataset/ReadME.md b/data/datasets/research_papers_dataset/ReadME.md index 0411125fbb..f927f272f0 100644 --- a/data/datasets/research_papers_dataset/ReadME.md +++ b/data/datasets/research_papers_dataset/ReadME.md @@ -1,22 +1,26 @@ --- dataset_info: features: - - name: title - dtype: string - - name: abstract - dtype: string + - name: title + dtype: string + - name: abstract + dtype: string splits: - - name: train - num_bytes: 2363569633 - num_examples: 2311491 + - name: train + num_bytes: 2363569633 + num_examples: 2311491 download_size: 1423881564 
dataset_size: 2363569633 --- + ## Research Paper Dataset 2023 +[Check out this website](https://huggingface.co/datasets/Falah/research_paper2023) + ### Dataset Information: -The "Research Paper Dataset 2023" contains information related to research papers. It includes the following features: +The "Research Paper Dataset 2023" contains information related to research +papers. It includes the following features: - Title (dtype: string): The title of the research paper. - Abstract (dtype: string): The abstract of the research paper. @@ -53,41 +57,81 @@ If you use this dataset in your research or project, please cite it as follows: ``` - ### Apache License: -The "Research Paper Dataset 2023" is distributed under the Apache License 2.0. You can find a copy of the license in the LICENSE file of the dataset repository. +### Apache License: + +The "Research Paper Dataset 2023" is distributed under the Apache License 2.0. +You can find a copy of the license in the LICENSE file of the dataset +repository. -The specific licensing and usage terms for this dataset can be found in the dataset repository or documentation. -Please make sure to review and comply with the applicable license and usage terms before downloading and using the dataset. +The specific licensing and usage terms for this dataset can be found in the +dataset repository or documentation. Please make sure to review and comply with +the applicable license and usage terms before downloading and using the dataset. 
### Example Usage: -To load the "Research Paper Dataset 2023" using the Hugging Face Datasets Library in Python, you can use the following code: +To load the "Research Paper Dataset 2023" using the Hugging Face Datasets +Library in Python, you can use the following code: ```python from datasets import load_dataset dataset = load_dataset("Falah/research_paper2023") ``` -### Application of "Research Paper Dataset 2023" for NLP Text Classification and Chatbot Models - -The "Research Paper Dataset 2023" can be a valuable resource for various Natural Language Processing (NLP) tasks, including text classification and generating titles for books in the context of chatbot models. Here are some ways this dataset can be utilized for these applications: -1. **Text Classification**: The dataset's features, such as the title and abstract of research papers, can be used to train a text classification model. By assigning appropriate labels to the research papers based on their topics or fields of study, the model can learn to classify new research papers into different categories. For example, the model can predict whether a research paper is related to computer science, biology, physics, etc. This text classification model can then be adapted for other applications that require categorizing text. +### Application of "Research Paper Dataset 2023" for NLP Text Classification and Chatbot Models -2. **Book Title Generation for Chatbot Models**: By utilizing the research paper titles in the dataset, a natural language generation model, such as a sequence-to-sequence model or a transformer-based model, can be trained to generate book titles. The model can be fine-tuned on the research paper titles to learn patterns and structures in generating relevant and meaningful book titles. This can be a useful feature for chatbot models that recommend books based on specific research topics or areas of interest. 
+The "Research Paper Dataset 2023" can be a valuable resource for various Natural +Language Processing (NLP) tasks, including text classification and generating +titles for books in the context of chatbot models. Here are some ways this +dataset can be utilized for these applications: + +1. **Text Classification**: The dataset's features, such as the title and + abstract of research papers, can be used to train a text classification + model. By assigning appropriate labels to the research papers based on their + topics or fields of study, the model can learn to classify new research + papers into different categories. For example, the model can predict whether + a research paper is related to computer science, biology, physics, etc. This + text classification model can then be adapted for other applications that + require categorizing text. + +2. **Book Title Generation for Chatbot Models**: By utilizing the research paper + titles in the dataset, a natural language generation model, such as a + sequence-to-sequence model or a transformer-based model, can be trained to + generate book titles. The model can be fine-tuned on the research paper + titles to learn patterns and structures in generating relevant and meaningful + book titles. This can be a useful feature for chatbot models that recommend + books based on specific research topics or areas of interest. ### Potential Benefits: -- Improved Chatbot Recommendations: With the ability to generate book titles related to specific research topics, chatbot models can provide more relevant and personalized book recommendations to users. -- Enhanced User Engagement: By incorporating the text classification model, the chatbot can better understand user queries and respond more accurately, leading to a more engaging user experience. 
-- Knowledge Discovery: Researchers and students can use the text classification model to efficiently categorize large collections of research papers, enabling quicker access to relevant information in specific domains. +- Improved Chatbot Recommendations: With the ability to generate book titles + related to specific research topics, chatbot models can provide more relevant + and personalized book recommendations to users. +- Enhanced User Engagement: By incorporating the text classification model, the + chatbot can better understand user queries and respond more accurately, + leading to a more engaging user experience. +- Knowledge Discovery: Researchers and students can use the text classification + model to efficiently categorize large collections of research papers, enabling + quicker access to relevant information in specific domains. ### Considerations: -- Data Preprocessing: Before training the NLP models, appropriate data preprocessing steps may be required, such as text cleaning, tokenization, and encoding, to prepare the dataset for model input. -- Model Selection and Fine-Tuning: Choosing the right NLP model architecture and hyperparameters, and fine-tuning the model on the specific tasks, can significantly impact the model's performance and generalization ability. -- Ethical Use: Ensure that the generated book titles and text classification predictions are used responsibly and ethically, respecting copyright and intellectual property rights. +- Data Preprocessing: Before training the NLP models, appropriate data + preprocessing steps may be required, such as text cleaning, tokenization, and + encoding, to prepare the dataset for model input. +- Model Selection and Fine-Tuning: Choosing the right NLP model architecture and + hyperparameters, and fine-tuning the model on the specific tasks, can + significantly impact the model's performance and generalization ability. 
+- Ethical Use: Ensure that the generated book titles and text classification + predictions are used responsibly and ethically, respecting copyright and + intellectual property rights. ### Conclusion: -The "Research Paper Dataset 2023" holds great potential for enhancing NLP text classification models and chatbot systems. By leveraging the dataset's features and information, NLP applications can be developed to aid researchers, students, and readers in finding relevant research papers and generating meaningful book titles for their specific interests. Proper utilization of this dataset can lead to more efficient information retrieval and improved user experiences in the domain of research and academic literature exploration. +The "Research Paper Dataset 2023" holds great potential for enhancing NLP text +classification models and chatbot systems. By leveraging the dataset's features +and information, NLP applications can be developed to aid researchers, students, +and readers in finding relevant research papers and generating meaningful book +titles for their specific interests. Proper utilization of this dataset can lead +to more efficient information retrieval and improved user experiences in the +domain of research and academic literature exploration. 
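The text-classification application described above can be illustrated without any model library: a keyword-overlap "classifier" over invented paper titles shows the shape of the task. The titles and keyword sets below are assumptions made up for the sketch; a real classifier would learn such associations from labeled examples of the dataset.

```python
# Minimal keyword-overlap "classifier" over invented paper titles.
# A real system would learn these associations from labeled training data.
FIELD_KEYWORDS = {
    "computer science": {"neural", "algorithm", "network", "learning"},
    "biology": {"gene", "protein", "cell", "genome"},
    "physics": {"quantum", "particle", "relativity", "boson"},
}

def classify(title: str) -> str:
    words = set(title.lower().split())
    # Pick the field whose keyword set overlaps the title the most
    # (ties resolve to the first field in insertion order).
    return max(FIELD_KEYWORDS, key=lambda field: len(words & FIELD_KEYWORDS[field]))

print(classify("A Neural Network Approach to Protein Folding"))  # computer science
```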
diff --git a/data/datasets/research_papers_dataset/load_dataset.py b/data/datasets/research_papers_dataset/load_dataset.py index 36ce97e040..4602f0d253 100644 --- a/data/datasets/research_papers_dataset/load_dataset.py +++ b/data/datasets/research_papers_dataset/load_dataset.py @@ -1,2 +1,3 @@ from datasets import load_dataset + dataset = load_dataset("Falah/research_paper2023") diff --git a/data/datasets/research_papers_dataset/package-lock.json b/data/datasets/research_papers_dataset/package-lock.json new file mode 100644 index 0000000000..f370609afd --- /dev/null +++ b/data/datasets/research_papers_dataset/package-lock.json @@ -0,0 +1,26 @@ +{ + "husky": { + "hooks": { + "pre-commit": "lint-staged" + } + }, + "lint-staged": { + "*.{js,jsx,ts,tsx,json,css,scss,md}": [ + "prettier --write", + "git add" + ] + } +} +{ + "husky": { + "hooks": { + "pre-commit": "lint-staged" + } + }, + "lint-staged": { + "*.{js,jsx,ts,tsx,json,css,scss,md}": [ + "prettier --write", + "git add" + ] + } +} From c32b7da2ef8ab3611a734803e6a685ced561b567 Mon Sep 17 00:00:00 2001 From: your name Date: Sun, 23 Jul 2023 08:33:09 +0300 Subject: [PATCH 5/8] updata dataset and add new dataset --- data/datasets/medium_articles_posts/README.md | 4 ++-- data/datasets/medium_articles_posts/__init__.py | 0 data/datasets/medium_articles_posts/requirements.txt | 1 + data/datasets/research_papers_dataset/__init__.py | 0 data/datasets/research_papers_dataset/requirements.txt | 1 + data/datasets/semantics_ws_qna_oa/__init__.py | 0 data/datasets/sentiments-dataset-381-classes/README.md | 2 +- data/datasets/sentiments-dataset-381-classes/__init__.py | 0 data/datasets/sentiments-dataset-381-classes/requirements.txt | 1 + 9 files changed, 6 insertions(+), 3 deletions(-) create mode 100644 data/datasets/medium_articles_posts/__init__.py create mode 100644 data/datasets/medium_articles_posts/requirements.txt create mode 100644 data/datasets/research_papers_dataset/__init__.py create mode 100644 
data/datasets/research_papers_dataset/requirements.txt create mode 100644 data/datasets/semantics_ws_qna_oa/__init__.py create mode 100644 data/datasets/sentiments-dataset-381-classes/__init__.py create mode 100644 data/datasets/sentiments-dataset-381-classes/requirements.txt diff --git a/data/datasets/medium_articles_posts/README.md b/data/datasets/medium_articles_posts/README.md index d355795186..65b8211e5d 100644 --- a/data/datasets/medium_articles_posts/README.md +++ b/data/datasets/medium_articles_posts/README.md @@ -23,14 +23,14 @@ The dataset consists of the following features: The dataset is split into the following part: -- **Train**: +- **Train**: - Number of examples: 192,368 - Size: 1,044,746,687 bytes (approximately 1000 MB) ## Download Size - **Compressed Download Size**: 601,519,297 bytes (approximately 600 MB) -### Usage example +### Usage example ```python from datasets import load_dataset #Load the dataset diff --git a/data/datasets/medium_articles_posts/__init__.py b/data/datasets/medium_articles_posts/__init__.py new file mode 100644 index 0000000000..e69de29bb2 diff --git a/data/datasets/medium_articles_posts/requirements.txt b/data/datasets/medium_articles_posts/requirements.txt new file mode 100644 index 0000000000..e9f023c9e0 --- /dev/null +++ b/data/datasets/medium_articles_posts/requirements.txt @@ -0,0 +1 @@ +datasets==2.9.0 \ No newline at end of file diff --git a/data/datasets/research_papers_dataset/__init__.py b/data/datasets/research_papers_dataset/__init__.py new file mode 100644 index 0000000000..e69de29bb2 diff --git a/data/datasets/research_papers_dataset/requirements.txt b/data/datasets/research_papers_dataset/requirements.txt new file mode 100644 index 0000000000..e9f023c9e0 --- /dev/null +++ b/data/datasets/research_papers_dataset/requirements.txt @@ -0,0 +1 @@ +datasets==2.9.0 \ No newline at end of file diff --git a/data/datasets/semantics_ws_qna_oa/__init__.py b/data/datasets/semantics_ws_qna_oa/__init__.py new file mode 
100644 index 0000000000..e69de29bb2 diff --git a/data/datasets/sentiments-dataset-381-classes/README.md b/data/datasets/sentiments-dataset-381-classes/README.md index 1a18a15e1f..63d712f2af 100644 --- a/data/datasets/sentiments-dataset-381-classes/README.md +++ b/data/datasets/sentiments-dataset-381-classes/README.md @@ -312,7 +312,7 @@ The dataset includes the following sentiment class names as examples: - Whimsical - Intertwining - - and more -## Usage example +## Usage example ```python from datasets import load_dataset #Load the dataset diff --git a/data/datasets/sentiments-dataset-381-classes/__init__.py b/data/datasets/sentiments-dataset-381-classes/__init__.py new file mode 100644 index 0000000000..e69de29bb2 diff --git a/data/datasets/sentiments-dataset-381-classes/requirements.txt b/data/datasets/sentiments-dataset-381-classes/requirements.txt new file mode 100644 index 0000000000..e9f023c9e0 --- /dev/null +++ b/data/datasets/sentiments-dataset-381-classes/requirements.txt @@ -0,0 +1 @@ +datasets==2.9.0 \ No newline at end of file From 6b1cbe02d64503c71ef0b438c62c83c68b9d1fa7 Mon Sep 17 00:00:00 2001 From: your name Date: Sun, 23 Jul 2023 08:38:43 +0300 Subject: [PATCH 6/8] updata dataset add new dataset for research_papers_and_medium_articles+post --- data/datasets/semantics_ws_qna_oa/__init__.py | 0 1 file changed, 0 insertions(+), 0 deletions(-) delete mode 100644 data/datasets/semantics_ws_qna_oa/__init__.py diff --git a/data/datasets/semantics_ws_qna_oa/__init__.py b/data/datasets/semantics_ws_qna_oa/__init__.py deleted file mode 100644 index e69de29bb2..0000000000 From b86354b770a0db46c932c2dd9dcf03e737e33f01 Mon Sep 17 00:00:00 2001 From: your name Date: Sun, 23 Jul 2023 09:06:52 +0300 Subject: [PATCH 7/8] updata_to_medium_post_dataset --- data/datasets/medium_articles_posts/README.md | 24 ++++++---- .../medium_articles_posts/load_dataset.py | 3 +- .../medium_articles_posts/requirements.txt | 2 +- .../research_papers_dataset/package-lock.json | 26 
---------- .../research_papers_dataset/requirements.txt | 2 +- .../sentiments-dataset-381-classes/README.md | 47 ++++++++++++------- .../requirements.txt | 2 +- 7 files changed, 51 insertions(+), 55 deletions(-) delete mode 100644 data/datasets/research_papers_dataset/package-lock.json diff --git a/data/datasets/medium_articles_posts/README.md b/data/datasets/medium_articles_posts/README.md index 65b8211e5d..1b915cc296 100644 --- a/data/datasets/medium_articles_posts/README.md +++ b/data/datasets/medium_articles_posts/README.md @@ -2,18 +2,22 @@ ## Description -The Medium Articles Posts dataset contains a collection of articles published on the Medium platform. Each article entry includes information such as the article's title, main content or text, associated URL or link, authors' names, timestamps, and tags or categories. +The Medium Articles Posts dataset contains a collection of articles published on +the Medium platform. Each article entry includes information such as the +article's title, main content or text, associated URL or link, authors' names, +timestamps, and tags or categories. ## Dataset Info The dataset consists of the following features: -- **title**: *(string)* The title of the Medium article. -- **text**: *(string)* The main content or text of the Medium article. -- **url**: *(string)* The URL or link to the Medium article. -- **authors**: *(string)* The authors or contributors of the Medium article. -- **timestamp**: *(string)* The timestamp or date when the Medium article was published. -- **tags**: *(string)* Tags or categories associated with the Medium article. +- **title**: _(string)_ The title of the Medium article. +- **text**: _(string)_ The main content or text of the Medium article. +- **url**: _(string)_ The URL or link to the Medium article. +- **authors**: _(string)_ The authors or contributors of the Medium article. +- **timestamp**: _(string)_ The timestamp or date when the Medium article was + published. 
+- **tags**: _(string)_ Tags or categories associated with the Medium article. ## Dataset Size @@ -30,10 +34,12 @@ The dataset is split into the following part: ## Download Size - **Compressed Download Size**: 601,519,297 bytes (approximately 600 MB) -### Usage example + +### Usage example + ```python from datasets import load_dataset #Load the dataset dataset = load_dataset("Falah/medium_articles_posts") -``` \ No newline at end of file +``` diff --git a/data/datasets/medium_articles_posts/load_dataset.py b/data/datasets/medium_articles_posts/load_dataset.py index 1cc8027b1d..d8b750a3b8 100644 --- a/data/datasets/medium_articles_posts/load_dataset.py +++ b/data/datasets/medium_articles_posts/load_dataset.py @@ -1,3 +1,4 @@ from datasets import load_dataset -#Load the dataset + +# Load the dataset dataset = load_dataset("Falah/medium_articles_posts") diff --git a/data/datasets/medium_articles_posts/requirements.txt b/data/datasets/medium_articles_posts/requirements.txt index e9f023c9e0..76de43c3ed 100644 --- a/data/datasets/medium_articles_posts/requirements.txt +++ b/data/datasets/medium_articles_posts/requirements.txt @@ -1 +1 @@ -datasets==2.9.0 \ No newline at end of file +datasets==2.9.0 diff --git a/data/datasets/research_papers_dataset/package-lock.json b/data/datasets/research_papers_dataset/package-lock.json deleted file mode 100644 index f370609afd..0000000000 --- a/data/datasets/research_papers_dataset/package-lock.json +++ /dev/null @@ -1,26 +0,0 @@ -{ - "husky": { - "hooks": { - "pre-commit": "lint-staged" - } - }, - "lint-staged": { - "*.{js,jsx,ts,tsx,json,css,scss,md}": [ - "prettier --write", - "git add" - ] - } -} -{ - "husky": { - "hooks": { - "pre-commit": "lint-staged" - } - }, - "lint-staged": { - "*.{js,jsx,ts,tsx,json,css,scss,md}": [ - "prettier --write", - "git add" - ] - } -} diff --git a/data/datasets/research_papers_dataset/requirements.txt b/data/datasets/research_papers_dataset/requirements.txt index e9f023c9e0..76de43c3ed 100644 --- 
a/data/datasets/research_papers_dataset/requirements.txt +++ b/data/datasets/research_papers_dataset/requirements.txt @@ -1 +1 @@ -datasets==2.9.0 \ No newline at end of file +datasets==2.9.0 diff --git a/data/datasets/sentiments-dataset-381-classes/README.md b/data/datasets/sentiments-dataset-381-classes/README.md index 63d712f2af..23a5526354 100644 --- a/data/datasets/sentiments-dataset-381-classes/README.md +++ b/data/datasets/sentiments-dataset-381-classes/README.md @@ -1,37 +1,45 @@ --- dataset_info: features: - - name: text - dtype: string - - name: sentiment - dtype: string + - name: text + dtype: string + - name: sentiment + dtype: string splits: - - name: train - num_bytes: 104602 - num_examples: 1061 + - name: train + num_bytes: 104602 + num_examples: 1061 download_size: 48213 dataset_size: 104602 license: apache-2.0 task_categories: -- text-classification + - text-classification language: -- en + - en pretty_name: sentiments-dataset-381-classes size_categories: -- 1K Date: Sun, 23 Jul 2023 09:16:11 +0300 Subject: [PATCH 8/8] update dataset --- data/datasets/medium_articles_posts/requirements.txt | 1 + data/datasets/research_papers_dataset/requirements.txt | 1 + data/datasets/sentiments-dataset-381-classes/requirements.txt | 1 + 3 files changed, 3 insertions(+) diff --git a/data/datasets/medium_articles_posts/requirements.txt b/data/datasets/medium_articles_posts/requirements.txt index 76de43c3ed..7883858ca7 100644 --- a/data/datasets/medium_articles_posts/requirements.txt +++ b/data/datasets/medium_articles_posts/requirements.txt @@ -1 +1,2 @@ datasets==2.9.0 + diff --git a/data/datasets/research_papers_dataset/requirements.txt b/data/datasets/research_papers_dataset/requirements.txt index 76de43c3ed..7883858ca7 100644 --- a/data/datasets/research_papers_dataset/requirements.txt +++ b/data/datasets/research_papers_dataset/requirements.txt @@ -1 +1,2 @@ datasets==2.9.0 + diff --git a/data/datasets/sentiments-dataset-381-classes/requirements.txt 
b/data/datasets/sentiments-dataset-381-classes/requirements.txt index 76de43c3ed..7883858ca7 100644 --- a/data/datasets/sentiments-dataset-381-classes/requirements.txt +++ b/data/datasets/sentiments-dataset-381-classes/requirements.txt @@ -1 +1,2 @@ datasets==2.9.0 +