Commit

Merge pull request #83 from eli64s/refactor/repo-processor
refactor: Improve repository preprocessing design and metadata extraction.
eli64s authored Jan 5, 2024
2 parents 67261d6 + f7b8224 commit 5096f17
Showing 52 changed files with 2,618 additions and 1,586 deletions.
36 changes: 12 additions & 24 deletions README.md
@@ -281,9 +281,9 @@ See the <a href="#-configuration">Configuration</a> section below for the comple
</tr>
<tr>
<td>3️⃣</td>
<td><a href="https://github.com/eli64s/readme-ai/blob/main/examples/markdown/readme-javascript.md">readme-javascript.md</a></td>
<td><a href="https://github.com/idosal/assistant-chat-gpt-javascript">(repository deleted)</a></td>
<td>JavaScript, React</td>
<td><a href="https://github.com/eli64s/readme-ai/blob/main/examples/markdown/readme-postgres.md">readme-postgres.md</a></td>
<td><a href="https://github.com/jwills/buenavista">postgres-proxy-server</a></td>
<td>Python, Postgres, Duckdb, Docker</td>
</tr>
<tr>
<td>4️⃣</td>
@@ -351,19 +351,7 @@ A repository URL or local path to your codebase is required run readme-ai. The f

**OpenAI API Key**

An OpenAI API account and API key are needed to use *readme-ai*. The following steps outline the process.

<details closed>
<summary>🔐 OpenAI API Account Setup</summary>
<ol>
<li>Go to the <a href="https://platform.openai.com/">OpenAI website</a>.</li>
<li>Click the "Sign up for free" button.</li>
<li>Fill out the registration form with your information and agree to the terms of service.</li>
<li>Once logged in, click on the "API" tab.</li>
<li>Follow the instructions to create a new API key.</li>
<li>Copy the API key and keep it in a secure place.</li>
</ol>
</details>
An OpenAI API account and API key are needed to use *readme-ai*. Get started by creating an account [here](https://platform.openai.com/docs/quickstart/account-setup). Once you have an account, you can create an API key on the [API settings page](https://platform.openai.com/api-keys).

> [!WARNING]
>
@@ -395,11 +383,16 @@ conda install -c conda-forge readmeai
Alternatively, clone the readme-ai repository and build from source.

```sh
git clone https://github.com/eli64s/readme-ai && \
git clone https://github.com/eli64s/readme-ai
```

Change into the project directory.

```sh
cd readme-ai
```

Then use one of the methods below to install the project's dependencies (Bash, Conda, Pipenv, or Poetry).
And install the dependencies using one of the methods below.

Using `bash`
```sh
@@ -420,7 +413,7 @@ poetry shell

---

### 👩‍💻 Running *README-AI*
### 👩‍💻 Running *readme-ai*

Before running the application, ensure you have an OpenAI API key and that it is set as an environment variable.

@@ -554,11 +547,6 @@ The readme-ai tool is designed with flexibility in mind, allowing users to confi
<details closed><summary>🔠 Configuration Models</summary>
<br>

<!--
# README-AI Configuration and Settings
This documentation provides an overview of the configuration and settings for the README.ai CLI tool. It details various data models and functions that are used to configure the tool, making it adaptable for different environments and use cases.
-->

***GitService Enum***

- **Purpose**: Defines Git service details.
132 changes: 0 additions & 132 deletions docs/architecture.md

This file was deleted.

106 changes: 106 additions & 0 deletions docs/concepts.md
@@ -0,0 +1,106 @@
# readme-ai Core Concepts

readme-ai is a tool for auto-generating README files for code repositories using AI. Here are some of its key concepts:

## Repository Analysis

- Traverses the repository directory tree to build a code structure overview
- Extracts metadata like dependencies and languages used
- Analyzes characteristics to inform content generation
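
As a rough illustration of the traversal and metadata steps above, the sketch below walks a directory tree and tallies languages by file extension. The function name `summarize_repository`, the extension map, and the ignore list are all assumptions made for this example; they are not taken from the readme-ai source.

```python
from collections import Counter
from pathlib import Path

# Hypothetical extension-to-language map; the tool's real mapping is not shown here.
LANGUAGE_BY_SUFFIX = {".py": "Python", ".js": "JavaScript", ".md": "Markdown"}
# Hypothetical set of directories to skip during traversal.
IGNORED_DIRS = {".git", "__pycache__", "node_modules"}


def summarize_repository(root: Path) -> dict:
    """Walk the tree under root and collect relative file paths and a language tally."""
    files = []
    languages = Counter()
    for path in sorted(root.rglob("*")):
        # Only regular files contribute to the overview.
        if not path.is_file():
            continue
        relative = path.relative_to(root)
        # Skip anything that lives under an ignored directory.
        if any(part in IGNORED_DIRS for part in relative.parts):
            continue
        files.append(relative.as_posix())
        languages[LANGUAGE_BY_SUFFIX.get(path.suffix, "Other")] += 1
    return {"files": files, "languages": dict(languages)}
```

A structure like this could then feed the content-generation step with the file list and language counts.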

## AI-Powered Content Creation

- Uses GPT language models via the OpenAI API
- Structured prompts injected with repository details
- Generates sections like project overview and technical features
- Summarizes code files in markdown tables
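
One way to picture "structured prompts injected with repository details" is a template whose placeholders are filled from the analysis results. The template text and the `build_overview_prompt` helper below are hypothetical and written only for illustration; readme-ai's actual prompts are not reproduced here.

```python
from string import Template

# Hypothetical prompt template; placeholder names are assumptions for this sketch.
OVERVIEW_PROMPT = Template(
    "Write a short project overview for the repository '$name'.\n"
    "Languages: $languages\n"
    "Dependencies: $dependencies\n"
)


def build_overview_prompt(name: str, languages: list, dependencies: list) -> str:
    """Fill the template with repository metadata gathered during analysis."""
    return OVERVIEW_PROMPT.substitute(
        name=name,
        languages=", ".join(languages),
        # Fall back to a readable marker when no dependencies were found.
        dependencies=", ".join(dependencies) or "none detected",
    )
```

The resulting string would be sent to the language model; keeping the template separate from the code is what makes it editable without touching the logic.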

## Customization

- Flexible configuration system
- CLI options to tweak badge icons, images, model settings
- Supports different badge styles like flat, plastic, skills
- Can provide custom images and set text alignment
- Edit prompt templates to influence content
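
To make the badge-style option concrete, here is a minimal sketch that builds a Shields.io static badge in Markdown. The helper name `badge_markdown` and the style whitelist are assumptions for this example, and the `skills` style mentioned above is not covered because it comes from a different icon service.

```python
from urllib.parse import quote

# Styles accepted by this sketch; an assumed subset of what Shields.io offers.
SUPPORTED_STYLES = {"flat", "flat-square", "plastic", "for-the-badge"}


def badge_markdown(label: str, message: str, color: str = "blue", style: str = "flat") -> str:
    """Return a Markdown image link for a Shields.io static badge."""
    if style not in SUPPORTED_STYLES:
        raise ValueError(f"unsupported badge style: {style}")

    def escape(text: str) -> str:
        # Static badge paths use '--' for a literal dash and '__' for a literal underscore.
        return quote(text.replace("-", "--").replace("_", "__"), safe="")

    url = (
        f"https://img.shields.io/badge/{escape(label)}-{escape(message)}-{color}"
        f"?style={style}"
    )
    return f"![{label}]({url})"
```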

## Modular Design

- Components and parsers decoupled from core logic
- Built using factory and strategy patterns
- Easily extend functionality with new parsers
- Abstracts services like file handling and git ops
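
The decoupling described above can be sketched as a small parser registry: each manifest format gets its own parsing function (the strategy), and a lookup picks the right one by file name (the factory). All names here (`PARSERS`, `register_parser`, `parse_dependencies`) are hypothetical and do not mirror the project's classes.

```python
from typing import Callable, Dict, List

# Registry mapping a manifest file name to the function that parses it.
PARSERS: Dict[str, Callable[[str], List[str]]] = {}


def register_parser(file_name: str):
    """Decorator that registers a parsing strategy for one manifest file name."""
    def decorator(func: Callable[[str], List[str]]):
        PARSERS[file_name] = func
        return func
    return decorator


@register_parser("requirements.txt")
def parse_requirements(content: str) -> List[str]:
    """Extract bare package names from requirements.txt-style text."""
    names = []
    for line in content.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        # Cut the name at the first version specifier, marker, or extras bracket.
        for sep in ("==", ">=", "<=", "~=", "!=", ">", "<", ";", "["):
            line = line.split(sep, 1)[0]
        names.append(line.strip())
    return names


def parse_dependencies(file_name: str, content: str) -> List[str]:
    """Dispatch to the registered parser, or return nothing for unknown files."""
    parser = PARSERS.get(file_name)
    return parser(content) if parser else []
```

Supporting a new manifest format under this design means adding one decorated function; the dispatch code stays unchanged.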

## Asynchronous Workflows

- Leverages Python asyncio for non-blocking I/O
- Concurrent networking, disk and CPU bound tasks
- Manages OpenAI rate limits for optimal performance
- Resource management via async context managers
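
A minimal sketch of the concurrency idea, assuming a semaphore is used to cap the number of in-flight requests. The coroutine names are hypothetical and `asyncio.sleep` stands in for the real network call. Note that a semaphore bounds concurrency only; it does not enforce a requests-per-minute quota.

```python
import asyncio


async def summarize_file(name: str, limiter: asyncio.Semaphore) -> str:
    """Pretend to request a summary for one file while holding a concurrency slot."""
    async with limiter:
        await asyncio.sleep(0.01)  # placeholder for the API request
        return f"summary of {name}"


async def summarize_all(names, max_concurrent: int = 2):
    """Run all summaries concurrently, at most max_concurrent at a time."""
    limiter = asyncio.Semaphore(max_concurrent)
    tasks = [summarize_file(name, limiter) for name in names]
    # gather returns results in the same order as the input tasks.
    return await asyncio.gather(*tasks)


if __name__ == "__main__":
    print(asyncio.run(summarize_all(["a.py", "b.py", "c.py"])))
```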

## Robustness

- Exponential backoff retry logic for resilience
- Caching frequently used responses
- Handles Unicode encoding errors gracefully
- Secure temp directories to isolate repository
- Configurable logging for debuggability
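
The retry behavior in the first bullet can be sketched as follows. This is a generic exponential backoff helper written for illustration, not the project's implementation; the function name, default delays, and jitter amount are assumptions.

```python
import random
import time


def retry_with_backoff(func, max_attempts: int = 4, base_delay: float = 0.5, sleep=time.sleep):
    """Call func, doubling the wait after each failure, and re-raise on the last attempt."""
    for attempt in range(max_attempts):
        try:
            return func()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the original error
            delay = base_delay * (2 ** attempt)
            # A little random jitter keeps many clients from retrying in lockstep.
            sleep(delay + random.uniform(0, delay / 10))
```

Passing `sleep` as a parameter is a design choice that keeps the helper testable without real waiting.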

By leveraging these concepts and more, readme-ai aims to offer a flexible platform for auto-generating documentation to boost developer productivity.

---


# README-AI Core Concepts

README-AI is a tool for auto-generating detailed README files for software projects using AI. It utilizes several core concepts and components to analyze codebases and produce high-quality documentation.

## Codebase Analysis

README-AI performs an in-depth analysis of the provided codebase to extract key information.

- **File traversal**: Recursively traverse the codebase directory to identify all files. Special cases like ignoring certain files or handling GitHub workflows are handled programmatically.

- **Metadata extraction**: File metadata like name, path, content, language, dependencies etc. are extracted and stored. Popular dependency manifest formats are parsed to detect dependencies.

- **Content preprocessing**: File contents are tokenized to allow smarter content generation tailored to codebase complexity.

The output is a structured `FileData` object that encapsulates file details.
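
For illustration, a `FileData`-style container might look like the sketch below. The field names follow the metadata list above, but the exact attributes of the project's class are an assumption, and whitespace splitting is only a crude stand-in for a real tokenizer.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FileData:
    """Illustrative container; fields mirror the metadata listed above, not the real class."""
    name: str
    path: str
    content: str
    language: str
    dependencies: List[str] = field(default_factory=list)

    @property
    def token_count(self) -> int:
        # Whitespace splitting approximates tokenization for this sketch only.
        return len(self.content.split())


def truncate_to_budget(file: FileData, max_tokens: int) -> str:
    """Keep only the first max_tokens whitespace-delimited tokens of the content."""
    return " ".join(file.content.split()[:max_tokens])
```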

## LLM API Integration

Language Models like GPT-3 are leveraged to generate fluent text for documentation.

- **Modular design**: The LLM API client is abstracted into a separate `ModelHandler` class to allow swapping out different AI providers.

- **Prompt engineering**: Carefully crafted prompt templates are populated with codebase metadata to produce accurate, relevant content.

- **Batching & caching**: Requests are batched and caching used to optimize performance and costs. Exponential backoff retries handle errors.

Generated text is inserted into Markdown templates to build a full-fledged README.
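
The provider abstraction and caching described above can be sketched with a small protocol and a caching wrapper. `TextBackend`, `EchoBackend`, and `CachingHandler` are hypothetical names for this example; they are not the project's `ModelHandler`, and the echo backend replaces a real API client so the sketch runs offline.

```python
from typing import Dict, Protocol


class TextBackend(Protocol):
    """Anything that can turn a prompt into text can serve as a backend."""
    def complete(self, prompt: str) -> str: ...


class EchoBackend:
    """Offline stand-in for a real LLM client, used only for illustration."""
    def __init__(self) -> None:
        self.calls = 0

    def complete(self, prompt: str) -> str:
        self.calls += 1
        return f"[generated text for: {prompt}]"


class CachingHandler:
    """Wraps a backend and reuses responses for prompts it has already seen."""
    def __init__(self, backend: TextBackend) -> None:
        self.backend = backend
        self._cache: Dict[str, str] = {}

    def generate(self, prompt: str) -> str:
        if prompt not in self._cache:
            self._cache[prompt] = self.backend.complete(prompt)
        return self._cache[prompt]
```

Swapping providers under this design means supplying a different object with a `complete` method; the caching logic is unaffected.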

## Configuration-driven

The tool relies extensively on configuration using Pydantic models.

- **Settings**: Central settings file with common constants and file paths. Helper configuration provides additional customization.

- **Validation**: Rigorous validations are performed on settings like repository URL to prevent errors.

- **Extensibility**: Adding new features or functionality requires minimal code changes due to config-driven design.

Overall, this promotes maintainability, testability and flexibility.
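
The validation idea can be sketched with a standard-library dataclass standing in for a Pydantic model, so the example runs without extra dependencies. The class name, field defaults, allowed hosts, and temperature range are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from pathlib import Path
from urllib.parse import urlparse

# Assumed set of accepted Git hosts for this sketch.
ALLOWED_HOSTS = {"github.com", "gitlab.com", "bitbucket.org"}


@dataclass(frozen=True)
class RepositorySettings:
    """Validated settings; a dataclass stand-in for a Pydantic model."""
    repository: str
    temperature: float = 1.0  # placeholder default
    max_tokens: int = 1000    # placeholder default

    def __post_init__(self) -> None:
        parsed = urlparse(self.repository)
        if parsed.scheme in ("http", "https"):
            # Remote repositories must live on a known Git host.
            if parsed.netloc not in ALLOWED_HOSTS:
                raise ValueError(f"unsupported Git host: {parsed.netloc}")
        elif not Path(self.repository).exists():
            # Anything else is treated as a local path that must exist.
            raise ValueError(f"local path not found: {self.repository}")
        if not 0.0 <= self.temperature <= 2.0:
            raise ValueError("temperature must be between 0.0 and 2.0")
        if self.max_tokens <= 0:
            raise ValueError("max_tokens must be positive")
```

Failing at construction time means invalid input is rejected before any cloning or API calls happen.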

## Customizable Output

Users can customize the look and feel of the generated README by providing a range of CLI options.

- **Appearance**: Choose badge styles, header images, alignment options and more for unique styling.

- **Content**: Control language model behavior with parameters like temperature and max tokens. Toggle emojis in text.

- **Templates**: (WIP) Generate focused READMEs for domains like machine learning, webdev etc.

In summary, README-AI aims to simplify documentation through intelligent automation, while keeping the user in control.

---