contextualize

contextualize is a package to quickly retrieve and format file contents for use with LLMs.

Installation

You can install the package using pip:

pip install contextualize

or pipx for using the CLI globally:

pipx install contextualize

Usage (`reference.py`)

Define FileReference objects for specified file paths and optional ranges.

- Set range to a tuple of line numbers to include only a portion of the file, e.g. range=(1, 10).
- Set format to "md" (default) or "xml" to wrap file contents in Markdown code blocks or <file> tags.
- Set label to "relative" (default), "name", or "ext" to choose the label attached to the enclosing Markdown/XML string:
  - "relative" uses the path relative to the current working directory
  - "name" uses the file name only
  - "ext" uses the file extension only

Retrieve wrapped contents from the output attribute.
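
To make the range, format, and label options concrete, here is a minimal, self-contained sketch of the wrapping behavior described above. It is not the package's code: wrap_contents is a hypothetical stand-in for a FileReference and its output attribute, and the details it fills in are assumptions (line ranges treated as 1-indexed and inclusive, the label used as the Markdown fence info string, and a path attribute on the <file> tag).

```python
import os


def wrap_contents(path, line_range=None, fmt="md", label="relative"):
    """Hypothetical stand-in for illustration; not the package's actual API."""
    with open(path, encoding="utf-8") as f:
        lines = f.read().splitlines()

    # Assumption: the range is 1-indexed and inclusive, e.g. (1, 10).
    if line_range is not None:
        start, end = line_range
        lines = lines[start - 1:end]

    # Pick the label according to the three documented styles.
    if label == "relative":
        tag = os.path.relpath(path)
    elif label == "name":
        tag = os.path.basename(path)
    elif label == "ext":
        tag = os.path.splitext(path)[1].lstrip(".")
    else:
        raise ValueError(f"unknown label style: {label!r}")

    body = "\n".join(lines)
    if fmt == "md":
        fence = "`" * 3  # built at runtime to keep this example readable
        return f"{fence}{tag}\n{body}\n{fence}"
    if fmt == "xml":
        # Assumption: the label is carried in a "path" attribute.
        return f'<file path="{tag}">\n{body}\n</file>'
    raise ValueError(f"unknown format: {fmt!r}")


if __name__ == "__main__":
    # Wrap the first ten lines of this script in <file> tags.
    print(wrap_contents(__file__, line_range=(1, 10), fmt="xml", label="name"))
```

With the real package, the equivalent step is to construct a FileReference with the same options and read its output attribute.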

CLI

A CLI (cli.py) is provided to print wrapped file contents and token counts from the command line.

cat: Prepare and concatenate file references
- paths: Positional arguments for target file(s) or directories
- --ignore: File(s) to ignore (optional)
- --format: Output format (md or xml, default is md)
- --label: Label style (relative for relative file path, name for file name only, ext for file extension only; default is relative)
- --output: Output target (console or clipboard; default is console)
- --output-file: Output file path (optional, compatible with --output clipboard)
ls: List token counts
- paths: Positional arguments for target file(s) or directories to process
- --openai-encoding: OpenAI encoding to use for tokenization, e.g., cl100k_base (default), p50k_base, r50k_base
- --openai-model: OpenAI model used to select the tokenization encoding, e.g., gpt-3.5-turbo/gpt-4 (default), text-davinci-003, code-davinci-002
- --anthropic-model: Anthropic model to use for token counting
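
The relationship between --openai-model and --openai-encoding can be sketched as a simple lookup. The model-to-encoding pairs below match those published for OpenAI's tiktoken library; the resolve_encoding helper and its precedence rule (an explicit encoding wins over a model) are assumptions made for illustration, not the CLI's confirmed behavior.

```python
# Model-to-encoding pairs as published for OpenAI's tiktoken library.
MODEL_TO_ENCODING = {
    "gpt-4": "cl100k_base",
    "gpt-3.5-turbo": "cl100k_base",
    "text-davinci-003": "p50k_base",
    "code-davinci-002": "p50k_base",
}

DEFAULT_ENCODING = "cl100k_base"


def resolve_encoding(openai_model=None, openai_encoding=None):
    """Hypothetical helper: choose an encoding name from the two ls options."""
    # Assumption: an explicitly requested encoding takes precedence.
    if openai_encoding is not None:
        return openai_encoding
    if openai_model is not None:
        # Raises KeyError for models missing from the table above.
        return MODEL_TO_ENCODING[openai_model]
    return DEFAULT_ENCODING


if __name__ == "__main__":
    print(resolve_encoding(openai_model="text-davinci-003"))
```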

Examples

cat:
- contextualize cat README.md will print the wrapped contents of README.md to the console with default settings (Markdown format, relative path label).
- contextualize cat README.md --format xml will print the wrapped contents of README.md to the console with XML format.
- contextualize cat contextualize/ dev/ README.md --format xml will prepare file references for files in the contextualize/ and dev/ directories and README.md, and print each file's contents (wrapped in corresponding XML tags) to the console.
ls:
- contextualize ls README.md will count and print the number of tokens in README.md using the default cl100k_base encoding. If ANTHROPIC_API_KEY is set, the Anthropic token counting API is used instead.
- contextualize ls contextualize/ --openai-model text-davinci-003 will count and print the number of tokens in each file in the contextualize/ directory using the p50k_base encoding associated with the text-davinci-003 model, then print the total tokens for all processed files.

Related projects

lumpenspace/jamall
