UCFE: A User-Centric Financial Expertise Benchmark for Large Language Models

[Figure: Simulator Overview]

Overview

The UCFE Benchmark provides a user-centric framework for evaluating how well large language models (LLMs) handle complex financial tasks. The complete benchmark dataset is available in UCFE_bench.json.
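A minimal sketch for loading and inspecting the dataset is shown below. The record schema is not documented in this README, so the snippet assumes nothing about individual entries and only reports the top-level structure.

```python
import json

# Load the benchmark file; the path assumes the script is run from the repo root.
with open("UCFE_bench.json", "r", encoding="utf-8") as f:
    bench = json.load(f)

# The top-level layout (list of records vs. dict keyed by task type) is not
# documented here, so inspect it generically.
if isinstance(bench, list):
    print(f"Loaded {len(bench)} benchmark entries")
    if bench and isinstance(bench[0], dict):
        print("First entry keys:", sorted(bench[0].keys()))
else:
    print(f"Loaded {len(bench)} top-level keys:", sorted(bench)[:10])
```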

How to Run the Simulator

Follow these steps to set up and run the simulator:

  1. Set your API key in the config folder (a sketch of an assumed config layout follows this list).
  2. Run the simulator with the following command: python run_ckpt.py
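The exact config file format is not documented here, so the following is only a sketch. It assumes a hypothetical config/config.json containing an api_key field; adjust the path and field name to match the actual files in the config folder.

```python
import json
import subprocess
import sys
from pathlib import Path

# Hypothetical location and field name -- check the config folder for the
# actual file(s) that run_ckpt.py expects.
CONFIG_PATH = Path("config") / "config.json"

def check_api_key() -> None:
    """Fail early with a clear message if the API key is not configured."""
    if not CONFIG_PATH.exists():
        sys.exit(f"Missing {CONFIG_PATH}; add your API key there first (step 1).")
    cfg = json.loads(CONFIG_PATH.read_text(encoding="utf-8"))
    if not cfg.get("api_key"):
        sys.exit(f"No api_key value found in {CONFIG_PATH}.")

if __name__ == "__main__":
    check_api_key()
    # Step 2: launch the simulator.
    subprocess.run([sys.executable, "run_ckpt.py"], check=True)
```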

How to Evaluate the Model

You can evaluate a single model or run the evaluation for all models (an optional Python wrapper is sketched after this list):

  1. Evaluate a single model: bash scripts/eval_model.sh
  2. Evaluate all models: bash scripts/eval_all.sh
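If you prefer to drive the evaluation from Python (for example, inside a larger pipeline), a minimal wrapper around the provided scripts might look like the sketch below. Whether eval_model.sh accepts a model name as an argument is an assumption; check the script before relying on it.

```python
import subprocess
from typing import Optional

def evaluate(model: Optional[str] = None) -> None:
    """Run the evaluation scripts via bash.

    Passing the model name as a positional argument to eval_model.sh is an
    assumption; the script may instead read the model from its own config.
    """
    if model is None:
        cmd = ["bash", "scripts/eval_all.sh"]           # evaluate every model
    else:
        cmd = ["bash", "scripts/eval_model.sh", model]  # evaluate one model
    subprocess.run(cmd, check=True)

if __name__ == "__main__":
    evaluate()  # defaults to evaluating all models
```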
