Improve your agent’s tool-calling accuracy with SFT and DPO on Amazon SageMaker AI

This AWS post shows how to combine Supervised Fine-Tuning (SFT) and Direct Preference Optimization (DPO) to improve tool-calling accuracy in small language models (SLMs). Training runs as Amazon SageMaker AI training jobs, so developers do not have to manage training infrastructure themselves. The post also covers how to evaluate tool-calling accuracy and compare a base model against fine-tuned variants.

Original article excerpt

In this post, you learn how to use Supervised Fine-Tuning (SFT) and Direct Preference Optimization (DPO) together to improve the tool-calling accuracy of a small language model (SLM). The example uses Amazon SageMaker AI training jobs, so you can focus on training code instead of managing your own training infrastructure. You also learn how to evaluate tool-calling accuracy and compare a base model to several fine-tuned variants, so you can make data-driven decisions about model quality.

AI agents can autonomously handle complex, multi-step tasks, but their effectiveness depends on calling the right tools to retrieve information or take action. When an agent picks the wrong tool, formats parameters incorrectly, or breaks a workflow chain, task completion times grow, error rates rise, support costs increase, and user experiences degrade. As more organizations move agentic applications from pilot to production, having agents that select the right tool for each request is essential for reliable automation.
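The excerpt does not show how tool-calling accuracy is measured, so the following is only a minimal sketch of one plausible approach, not the method used in the original post. It assumes each model output is a JSON string with `name` and `arguments` fields and uses exact-match scoring; the tool names, the sample outputs, and the helper functions `score_call` and `tool_call_accuracy` are all hypothetical. It sorts each prediction into the failure modes described above (wrong tool, badly formed parameters) so that a base model and a fine-tuned variant can be compared on the same references.

```python
import json
from collections import Counter


def score_call(predicted_json: str, expected: dict) -> str:
    """Classify one predicted tool call against the expected call.

    Assumes (hypothetically) that a call is a JSON object with
    "name" and "arguments" keys, compared by exact match.
    """
    try:
        predicted = json.loads(predicted_json)
    except json.JSONDecodeError:
        return "malformed"
    if not isinstance(predicted, dict):
        return "malformed"
    if predicted.get("name") != expected["name"]:
        return "wrong_tool"
    if predicted.get("arguments") != expected["arguments"]:
        return "wrong_arguments"
    return "correct"


def tool_call_accuracy(predictions, references):
    """Return (accuracy, outcome counts) over paired predictions and references."""
    outcomes = Counter(score_call(p, r) for p, r in zip(predictions, references))
    total = sum(outcomes.values())
    accuracy = outcomes["correct"] / total if total else 0.0
    return accuracy, dict(outcomes)


# Hypothetical reference calls and model outputs, for illustration only.
references = [
    {"name": "get_weather", "arguments": {"city": "Paris"}},
    {"name": "search_orders", "arguments": {"customer_id": 42}},
    {"name": "get_weather", "arguments": {"city": "Tokyo"}},
    {"name": "cancel_order", "arguments": {"order_id": "A17"}},
]
base_outputs = [
    '{"name": "get_weather", "arguments": {"city": "Paris"}}',
    '{"name": "get_weather", "arguments": {"city": "42"}}',
    '{"name": "get_weather", "arguments": {"location": "Tokyo"}}',
    "cancel_order(order_id=A17)",
]
tuned_outputs = [
    '{"name": "get_weather", "arguments": {"city": "Paris"}}',
    '{"name": "search_orders", "arguments": {"customer_id": 42}}',
    '{"name": "get_weather", "arguments": {"city": "Tokyo"}}',
    '{"name": "cancel_order", "arguments": {"order_id": "A-17"}}',
]

base_acc, base_outcomes = tool_call_accuracy(base_outputs, references)
tuned_acc, tuned_outcomes = tool_call_accuracy(tuned_outputs, references)
print("base :", base_acc, base_outcomes)
print("tuned:", tuned_acc, tuned_outcomes)
```

Exact matching is deliberately strict: a call with the right tool but a slightly different argument value still counts as an error. A real evaluation might relax this, for example by normalizing argument values or validating them against each tool's schema.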
