- [Message Queue System for Alfred Server] Alfred now implements a lean Kafka-like message queue system for server-client communication, enabling the Alfred server to communicate with multiple clients concurrently and asynchronously.
- [Google Gemini Support] Gemini is here!
gemini_pro = Client(model_type="google", model="gemini-pro", api_key="<your_api_key>")
- [GPT-4V Support] Alfred now supports GPT-4V(ision). Use it to streamline your image annotation tasks! For example:
openai = Client(model_type="openai", model="gpt-4-vision-preview")
image = ...  # load your image
openai((image, f"What type is this document? Please choose from {label_space}"))
- [Embedding with Alfred] Get a vector representation for any input string! Alfred now supports embeddings from locally hosted Hugging Face models or API-based calls from Cohere and OpenAI (see the usage sketch right after this list). To use:
Client.encode(Union[str, List[str]]) -> Union[torch.Tensor, List[torch.Tensor]]
- [Chat with GPTs, Gemini or Claude on Alfred] Alfred now supports chat with Anthropic, Google Gemini and OpenAI API-based models. To use, simply type:
from alfred import Client
gpt = Client(model_type="openai", model="gpt-3.5-turbo")
gpt.chat()
# Or chat with Claude from Anthropic!
claude = Client(model_type="anthropic", model="claude-2")
claude.chat()
gemini = Client(model_type="google", model="gemini-pro")
gemini.chat()
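As referenced in the embedding item above, here is a minimal usage sketch. The OpenAI embedding model name below is an assumption; substitute any embedding model Alfred supports.
from alfred.client import Client

# "text-embedding-ada-002" is an assumed OpenAI embedding model name,
# not prescribed by Alfred; Cohere or local Hugging Face models work too.
encoder = Client(model_type="openai", model="text-embedding-ada-002", api_key="<api_key>")
single = encoder.encode("Alfred reduces annotation cost")   # -> torch.Tensor
batch = encoder.encode(["first string", "second string"])   # -> List[torch.Tensor]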
Our Alfred paper is available here! Alfred is a prototype framework for integrating large pretrained models into programmatic weak supervision pipelines. Alfred provides an intuitive and user-friendly interface, enabling users to quickly create and refine prompts as supervision sources and interact with large models. Furthermore, Alfred includes tools for label modeling, allowing the mixed signals from prompted model responses to be combined, distilled, and denoised. Additionally, Alfred enables memory- and computation-intensive models to be run on cloud or computing clusters with optimized batching mechanisms, significantly increasing query throughput. Alfred aims to reduce annotation cost and time by making efficient use of LLMs, allowing users to make the most of their resources.
If you find Alfred useful, please cite the following work. Thank you!
Peilin Yu, Stephen H. Bach. "Alfred: A System for Prompted Weak Supervision". ACL Demo, 2023.
@inproceedings{yu2023alfred,
title = {Alfred: A System for Prompted Weak Supervision},
author = {Yu, Peilin and Bach, Stephen H.},
booktitle = {ACL System Demonstrations},
year = 2023,
}
pip install -r requirements.txt
(Optional) It is highly recommended to use Anaconda to create a virtual environment as an alternative to the command above:
conda create --name alfred anaconda
conda activate alfred
pip install -r requirements.txt
Run Alfred directly from its root directory, or install it as a pip package at the end of the setup process:
pip install -e .
from alfred.client import Client
AlfredT0pp = Client(model_type="huggingface", model="bigscience/T0pp",
local_path='/data/models/huggingface/')
# Or use API-based AI21/Cohere/OpenAI Models
GPTClient = Client(model_type="openai", model="gpt-3.5-turbo", api_key="<api_key>")
# Get the model's predictions for given queries:
AlfredT0pp("What is the capital of France?")
# This is equivalent to running a CompletionQuery:
from alfred.fm.query import CompletionQuery
AlfredT0pp(CompletionQuery("What is the capital of France?"))
# Or you can run the model on a list of queries:
AlfredT0pp(["What is the capital of France?", "What is the capital of Germany?"])
# For ranking prompts, use RankedQuery
from alfred.fm.query import RankedQuery
query = RankedQuery("What is the capital of France?", ["Paris", "Berlin", "London"])
AlfredT0pp(query)
pip install -r requirements.txt
python -m alfred.run_server --model_type <model_type> --model <model_name> --local_path <model_ckpt_dir> --port <port_number>
NOTE: Check the standard output logs to make sure the server is using the given port number. If the port is unavailable, Alfred will automatically find the nearest available port.
python -m alfred.run_server --model_type "huggingface" --model "bigscience/T0pp" --local_path "/data/models/huggingface/" --port 10719
You may launch the server with a cluster manager (e.g., SLURM) and use the login node as a jump host. An example SLURM batch script:
#!/bin/bash
#SBATCH --job-name=alfred_server_session
#SBATCH --nodes=1
#SBATCH --partition=<partition> --gres=gpu:4
python -m alfred.run_server --port 10719 --model_type "huggingface" --model "bigscience/T0pp" --local_path '/data/models/huggingface/'
NOTE: If you want to serve the model to multiple users who may not have credentials for your jump node, you may use a third-party TCP tunneling service (e.g., ngrok) to get a public URL and port for your server. You can then use the public URL and port to connect to the server from your local machine.
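For example, a hypothetical sketch of connecting through such a public endpoint; the endpoint format for non-SSH connections is an assumption, so check the behavior of your Alfred version:
from alfred.client import Client

# "<public_host>" and "<public_port>" are placeholders for the address your
# tunneling service assigns; no SSH tunnel is needed in this case.
t0pp = Client(model_type="huggingface", model="bigscience/T0pp",
              end_point="<public_host>:<public_port>")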
from alfred.client import Client
t0pp = Client(model_type="huggingface", model="bigscience/T0pp", end_point="", ssh_tunnel=True, ssh_node="")
NOTE: end_point contains the user name, server address, and port number in the form [username]@[server]:[port]. The ssh_tunnel flag indicates whether the client should use an SSH tunnel to connect to the server. The ssh_node is only used when the server running the model is a compute node sitting behind a jump server. For example, if the server is running on gpu1404 and the jump server is ssh.ccv.brown.edu, then ssh_node should be gpu1404 and end_point should be [username]@ssh.ccv.brown.edu:[port_number].
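Putting the example values above together (the [username] and [port_number] placeholders are yours to fill in):
from alfred.client import Client

# Connect through the jump server ssh.ccv.brown.edu to the compute node gpu1404.
t0pp = Client(model_type="huggingface", model="bigscience/T0pp",
              end_point="[username]@ssh.ccv.brown.edu:[port_number]",
              ssh_tunnel=True, ssh_node="gpu1404")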
For completions, the simplest way is to use model(query) or model.run(query):
t0pp("Who are you?")
# Or for ranking prompts, use RankedQuery
from alfred.fm.query import RankedQuery
query = RankedQuery("What is the capital of France?", ["Paris", "Berlin", "London"])
t0pp(query)
from alfred.template import StringTemplate
example_template = StringTemplate(
    template = """Context: [text]\n\nIs the above message about weather?""",
    answer_choices = None, # -> None for completion; add "|||"-delimited strings for candidate scoring
)
example = {'text': "Finally a pleasant day with sunny sky"}
prompt = example_template.apply(example)
# Now the prompt should be a CompletionQuery:
# CompletionQuery(content=Context: Finally a pleasant day with sunny sky Is the above message about weather?)
t0pp(prompt)
The whole process can be simplified and distilled into one line as:
t0pp(example_template(example))
from alfred.client import Client
t0pp = Client(...)
2. Define a dataset class; you may use Hugging Face datasets classes directly! Here we use a Wrench benchmark dataset:
from alfred.data.wrench import WrenchBenchmarkDataset
spouse_test = WrenchBenchmarkDataset(
dataset_name='spouse',
split='test',
local_path="/users/pyu12/data/pyu12/datasets/wrench/"
)
from alfred.template import StringTemplate
mention_template = StringTemplate(
    template = """Context: [text]\n\nIs there any mention of "spouse" between the entities [entity1] and [entity2]?""",
    answer_choices = None, # -> None for completion; add "|||"-delimited strings for candidate scoring
)
from alfred.voter import Voter # import path is an assumption; it may differ across Alfred versions
mention_voter = Voter(
    label_map = {'yes': 2},
)
prompts = [mention_template.apply(instance) for instance in spouse_test]
responses = t0pp(prompts)
votes = mention_voter.vote(responses)
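Finally, the votes from one or more voters can be combined with Alfred's label modeling tools. As a minimal illustration only (plain NumPy, not a specific Alfred API; the abstain convention of 0 is an assumption), a simple majority vote might look like:
import numpy as np

def majority_vote(vote_matrix, abstain=0):
    # vote_matrix: (num_examples, num_voters) integer votes; `abstain`
    # (assumed 0 here) marks voters that did not fire on an example.
    labels = []
    for row in vote_matrix:
        fired = row[row != abstain]
        if fired.size == 0:
            labels.append(abstain)  # no voter fired on this example
        else:
            values, counts = np.unique(fired, return_counts=True)
            labels.append(values[np.argmax(counts)])
    return np.array(labels)

# With a single voter this just returns its votes; with several
# prompt/voter pairs, stack their vote arrays column-wise first.
predictions = majority_vote(np.asarray(votes).reshape(len(responses), -1))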