999 ai coder rev1 #456

Merged
merged 12 commits into from
Oct 2, 2024
54 changes: 54 additions & 0 deletions ANSWERS_UTILS.md
@@ -0,0 +1,54 @@
# Answer Editor and Cleaner

This project consists of two main Python scripts: `answer_editor.py` and `cleanse_answers.py`. These scripts work together to manage and clean a set of questions and answers stored in JSON format.

## answer_editor.py

This script is a Flask web application that provides a user interface for viewing and editing a set of questions and answers.

### Key Features:
- Uses Flask and Flask-Bootstrap for the web interface
- Reads and writes data to a JSON file (`answers.json`)
- Allows viewing all questions and answers
- Supports editing answers
- Handles both radio button and text input answers
- Allows deletion of individual question-answer pairs

### How it works:
1. The main route (`/`) displays all questions and answers when accessed via GET request
2. When a POST request is made (i.e., when the form is submitted), it updates the answers in the JSON file
3. It uses a template (`index.html`, not shown in the provided code) to render the web interface
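
The update step can be sketched independently of Flask. The snippet below is a minimal illustration, assuming each entry is a dict with `type`, `question`, and `answer` keys and that the form uses the field names `answer_{i}`, `answer_{i}_radio`, and `delete_{i}` read by `answer_editor.py`. The helper name `apply_form` and the sample data are hypothetical.

```python
from typing import Mapping


def apply_form(data: list[dict], form: Mapping[str, str]) -> list[dict]:
    """Return the entries that survive a form submission, with answers updated.

    Hypothetical helper: it mirrors the field-name convention used by the
    editor but is not part of the project itself.
    """
    updated = []
    for i, item in enumerate(data):
        if f"delete_{i}" in form:
            continue  # entry was marked for deletion, so drop it
        # Radio answers are submitted under a separate field name.
        field = f"answer_{i}_radio" if item["type"] == "radio" else f"answer_{i}"
        # Fall back to the stored answer when the field is absent.
        updated.append({**item, "answer": form.get(field, item["answer"])})
    return updated


# Sample data, invented for illustration.
answers = [
    {"type": "radio", "question": "authorized to work", "answer": "yes"},
    {"type": "text", "question": "years of experience", "answer": "3"},
    {"type": "text", "question": "old question", "answer": "n/a"},
]
form = {"answer_0_radio": "no", "answer_1": "5", "delete_2": "on"}
result = apply_form(answers, form)
print(result)
```

Keeping this logic in a plain function makes it possible to check the edit and delete behavior without starting the web server.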

## cleanse_answers.py

This script is designed to clean and sanitize the questions and answers stored in the JSON file.

### Key Features:
- Removes duplicate words in questions
- Converts text to lowercase
- Removes common suffixes such as `(required)` and `(yes/no)`, along with question marks, double quotes, and backslashes
- Eliminates non-ASCII characters
- Removes duplicate questions

### How it works:
1. Reads the input JSON file (`answers.json`)
2. Sanitizes each question using the `sanitize_text` function
3. Removes duplicate questions
4. Writes the cleansed data to a new JSON file (`cleansed_answers.json`)
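
As a rough illustration of steps 2 and 3, the sketch below condenses the sanitizing rules into one function and shows two differently worded questions collapsing to the same key. It is a simplified approximation, not the script itself: the non-ASCII step uses the explicit range `[^\x00-\x7F]`, because Python's `re` module does not support POSIX classes such as `[:ascii:]`, and the sample questions are invented.

```python
import re


def sanitize_text(text: str) -> str:
    # Collapse whitespace and drop question marks, double quotes, and backslashes.
    text = re.sub(r"\s+", " ", text.rstrip())
    text = text.replace("?", "").replace('"', "").replace("\\", "")
    # Lowercase, then keep only the first occurrence of each word.
    text = " ".join(dict.fromkeys(text.lower().split()))
    # Drop "(required)" anywhere and a trailing "(yes/no)", "(yes)", or "(no)".
    text = re.sub(r"\s*\(?required\)?", "", text)
    text = re.sub(r"(\s*\(?yes/no\)?|\s*\(?yes\)?|\s*\(?no\)?)$", "", text)
    # Strip any character outside the ASCII range.
    return re.sub(r"[^\x00-\x7F]", "", text)


# Invented sample questions that differ only in case, spacing, and suffixes.
first = sanitize_text("Do you you have a  driver's license? (Yes/No)")
second = sanitize_text("do you have a driver's license (required)?")
print(first)
print(first == second)
```

Because both inputs reduce to the same string, only the first of the two entries would be kept when duplicates are removed.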

## Usage

1. Run `answer_editor.py` to start the web application for viewing and editing answers:
```
python answer_editor.py
```
Then open a web browser and navigate to `http://localhost:5000`

2. After editing answers, run `cleanse_answers.py` to clean the data:
```
python cleanse_answers.py
```

This will create a new file `cleansed_answers.json` with the sanitized data.

Note: Make sure Flask and Flask-Bootstrap are installed before running `answer_editor.py`. Both are included in `requirements.txt`; to install them separately, run `pip install flask flask-bootstrap`.
22 changes: 22 additions & 0 deletions README.md
@@ -153,8 +153,25 @@ Auto_Jobs_Applier_AIHawk steps in as a game-changing solution to these challenge
pip install -r requirements.txt
```

6. **Copy the example files into data_folder for configuration:**
```bash
cp data_folder_example/*.yaml data_folder/
```

## Configuration


### 0. Data Folder

The `data_folder` directory contains all the files necessary for the bot to operate. This folder should be structured as follows:

```bash
data_folder/
├── config.yaml
├── plain_text_resume.yaml
└── secrets.yaml
```
Examples of each file are provided in the `data_folder_example` directory.
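
A quick way to confirm the folder is complete is a small check like the one below. This is only a sketch: the helper name `missing_data_files` is hypothetical and not part of the project, and it assumes the script is run from the repository root so that `data_folder` resolves correctly.

```python
from pathlib import Path

# File names taken from the folder layout described above.
REQUIRED_FILES = ("config.yaml", "plain_text_resume.yaml", "secrets.yaml")


def missing_data_files(folder: Path) -> list[str]:
    """Return the names of required files that are absent from folder."""
    return [name for name in REQUIRED_FILES if not (folder / name).is_file()]


if __name__ == "__main__":
    missing = missing_data_files(Path("data_folder"))
    print("Missing files:", ", ".join(missing) if missing else "none")
```
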
### 1. secrets.yaml

This file contains sensitive information. Never share or commit this file to version control.
@@ -624,6 +641,10 @@ yaml.scanner.ScannerError: while scanning a simple key

For further assistance, please create an issue on the [GitHub repository](https://github.com/feder-cr/Auto_Jobs_Applier_AIHawk/issues) with detailed information about your problem, including error messages and your configuration (with sensitive information removed).

**Answer Editor and Cleaner**

See ANSWERS_UTILS.md for more information on the Answer Editor and Cleaner.

## Setup Documents

### Ollama & Gemini Setup
@@ -677,3 +698,4 @@ This project is licensed under the MIT License - see the [LICENSE](LICENSE) file

## Disclaimer
This tool, Auto_Jobs_Applier_AIHawk, is intended for educational purposes only. The creator assumes no responsibility for any consequences arising from its use. Users are advised to comply with the terms of service of relevant platforms and adhere to all applicable laws, regulations, and ethical guidelines. The use of automated tools for job applications may carry risks, including potential impacts on user accounts. Proceed with caution and at your own discretion.

47 changes: 47 additions & 0 deletions answer_editor.py
@@ -0,0 +1,47 @@
from flask import Flask, render_template, request, jsonify, redirect, url_for
import json
import os
from pathlib import Path
from flask_bootstrap import Bootstrap

app = Flask(__name__)
Bootstrap(app)

JSON_FILE = Path(__file__).parent / 'answers.json'

@app.route('/', methods=['GET', 'POST'])
def index():
    if request.method == 'POST':
        return update()
    else:
        if not JSON_FILE.exists():
            data = []  # Default empty list if file doesn't exist
        else:
            with open(JSON_FILE, 'r') as f:
                data = json.load(f)
        print(data)
        return render_template('index.html', data=data if isinstance(data, list) else [])

def update():
    if not JSON_FILE.exists():
        data = []
    else:
        with open(JSON_FILE, 'r') as f:
            data = json.load(f)

    updated_data = []
    for i, item in enumerate(data):
        if f'delete_{i}' not in request.form:
            if item['type'] == 'radio':
                item['answer'] = request.form.get(f'answer_{i}_radio', item['answer'])
            else:
                item['answer'] = request.form.get(f'answer_{i}', item['answer'])
            updated_data.append(item)

    with open(JSON_FILE, 'w') as f:
        json.dump(updated_data, f, indent=2)

    return redirect(url_for('index'))

if __name__ == '__main__':
    app.run(debug=True)
47 changes: 47 additions & 0 deletions cleanse_answers.py
@@ -0,0 +1,47 @@
import json
import re

def sanitize_text(text: str) -> str:
    # Normalize whitespace and strip question marks, double quotes, and backslashes
    text = text.rstrip()
    text = re.sub(r'\s+', ' ', text)
    text = text.replace('?', '').replace('"', '').replace('\\', '')

    # Lowercase, then remove duplicate words while preserving their order
    words = text.lower().split()
    unique_words = []
    for word in words:
        if word not in unique_words:
            unique_words.append(word)
    text = ' '.join(unique_words)

    # Remove common suffixes
    text = re.sub(r'\s*\(?required\)?', '', text, flags=re.IGNORECASE)
    text = re.sub(r'(\s*\(?yes\/no\)?|\s*\(?yes\)?|\s*\(?no\)?|\?)$', '', text, flags=re.IGNORECASE)

    # Remove non-ASCII characters; Python's re module has no POSIX [:ascii:]
    # class, so match on the explicit code point range instead
    sanitized_text = re.sub(r'[^\x00-\x7F]', '', text)
    return sanitized_text

def cleanse_answers_json(input_file: str, output_file: str):
    with open(input_file, 'r') as f:
        data = json.load(f)

    cleansed_data = []
    seen_questions = set()

    for item in data:
        sanitized_question = sanitize_text(item['question'])
        # Keep only the first entry for each sanitized question
        if sanitized_question not in seen_questions:
            seen_questions.add(sanitized_question)
            cleansed_item = {
                'type': item['type'],
                'question': sanitized_question,
                'answer': item['answer']
            }
            cleansed_data.append(cleansed_item)

    with open(output_file, 'w') as f:
        json.dump(cleansed_data, f, indent=4)

if __name__ == "__main__":
    input_file = "answers.json"
    output_file = "cleansed_answers.json"
    cleanse_answers_json(input_file, output_file)
    print(f"Cleansed answers have been saved to {output_file}")
50 changes: 0 additions & 50 deletions data_folder/config.yaml

This file was deleted.

129 changes: 0 additions & 129 deletions data_folder/plain_text_resume.yaml

This file was deleted.

1 change: 0 additions & 1 deletion data_folder/secrets.yaml

This file was deleted.
