NexusSample
NexusBenchmark
NexusRaven V2 Function Calling Benchmark
<https://huggingface.co/collections/Nexusflow/nexusraven-v2-function-calling-benchmark-657a597fb84dbe7a09ebfc3e>.
Parameters:
- data_dir (str): The directory to save the data.
- save_to (str): The file to save the results.
- processes (int, optional): The number of processes to use. (default: 1)
init
- data_dir (str): The directory to save the data.
- save_to (str): The file to save the results.
- processes (int, optional): The number of processes to use for parallel processing. (default: 1)
download
load
- dataset_name (str): Name of the specific dataset to be loaded.
- force_download (bool): Whether to force download the data.
train
run
- agent (ChatAgent): The agent to run the benchmark.
- task (Literal["NVDLibrary", "VirusTotal", "OTX", "PlacesAPI", "ClimateAPI", "VirusTotal-ParallelCalls", "VirusTotal-NestedCalls", "NVDLibrary-NestedCalls"]): The task to run the benchmark.
- randomize (bool, optional): Whether to randomize the data. (default: False)
- subset (Optional[int], optional): The subset of data to run. (default: None)
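The interaction between randomize and subset can be illustrated with a small standalone sketch. It assumes that subset means "keep the first N samples after any shuffling"; the reference above does not confirm this. The helper name select_samples and its seed argument are hypothetical and are not part of NexusBenchmark.

```python
import random
from typing import List, Optional


def select_samples(
    samples: List[dict],
    randomize: bool = False,
    subset: Optional[int] = None,
    seed: Optional[int] = None,  # hypothetical, only for reproducible shuffles
) -> List[dict]:
    """Return the samples a run would iterate over (illustrative only)."""
    selected = list(samples)  # copy so the caller's list is left untouched
    if randomize:
        random.Random(seed).shuffle(selected)
    if subset is not None:
        # Assumption: subset keeps the first N samples after shuffling.
        selected = selected[:subset]
    return selected


data = [{"id": i} for i in range(5)]
print(select_samples(data, subset=2))
```

With randomize left at False and subset set to 2, the sketch keeps the first two samples in their original order.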
construct_tool_descriptions
construct_prompt
parse_function_call
- call (str): A string in the format func(arg1, arg2, kwarg=value).
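One way to parse a string of this shape is with Python's ast module. The following is a minimal sketch, not the library's implementation: the name parse_call_sketch and the (name, args, kwargs) return shape are assumptions, and it only handles literal arguments, so nested calls used as arguments are rejected. It requires Python 3.9 or later for ast.unparse.

```python
import ast
from typing import Any, Dict, List, Optional, Tuple

ParsedCall = Tuple[Optional[str], Optional[List[Any]], Optional[Dict[str, Any]]]


def parse_call_sketch(call: str) -> ParsedCall:
    """Parse 'func(arg1, arg2, kwarg=value)' into (name, args, kwargs).

    Returns (None, None, None) if the string is not a single call
    whose arguments are all Python literals.
    """
    try:
        tree = ast.parse(call.strip(), mode="eval")
    except SyntaxError:
        return None, None, None

    node = tree.body
    if not isinstance(node, ast.Call):
        return None, None, None

    # ast.unparse also renders dotted names such as module.func.
    name = ast.unparse(node.func)
    try:
        args = [ast.literal_eval(arg) for arg in node.args]
        kwargs = {kw.arg: ast.literal_eval(kw.value) for kw in node.keywords}
    except ValueError:
        # A non-literal argument (variable, nested call, **unpacking).
        return None, None, None
    return name, args, kwargs


print(parse_call_sketch("func(1, 'a', key=2.5)"))
```

Using ast.literal_eval rather than eval keeps the sketch from executing arbitrary code contained in a model's output.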
compare_function_calls
- agent_call (str): Function call by agent.
- ground_truth_call (str): Ground truth function call.
Returns True if the function names and arguments match, False otherwise.
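The comparison can be sketched as "parse both strings, then compare names, positional arguments, and keyword arguments". The code below is an illustration under that assumption and not the library's implementation; calls_match_sketch and _parse are hypothetical names, and only literal arguments are supported. It requires Python 3.9 or later for ast.unparse.

```python
import ast
from typing import Any, Dict, List, Optional, Tuple


def _parse(call: str) -> Optional[Tuple[str, List[Any], Dict[str, Any]]]:
    """Return (name, args, kwargs) for a literal-only call, else None."""
    try:
        node = ast.parse(call.strip(), mode="eval").body
        if not isinstance(node, ast.Call):
            return None
        args = [ast.literal_eval(arg) for arg in node.args]
        kwargs = {kw.arg: ast.literal_eval(kw.value) for kw in node.keywords}
    except (SyntaxError, ValueError):
        return None
    return ast.unparse(node.func), args, kwargs


def calls_match_sketch(agent_call: str, ground_truth_call: str) -> bool:
    """True if both calls parse and share name, args, and kwargs."""
    parsed_agent = _parse(agent_call)
    parsed_truth = _parse(ground_truth_call)
    if parsed_agent is None or parsed_truth is None:
        return False
    # kwargs are dicts, so keyword order does not affect equality.
    return parsed_agent == parsed_truth


print(calls_match_sketch("f(1, b=2, c=3)", "f(1, c=3, b=2)"))
```

Comparing parsed values instead of raw strings means that differences in whitespace or keyword order do not count as mismatches in this sketch.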