AI/ML

Fine-Tuning GPT-4o-mini on Your Own Data: A Practical Guide

Step-by-step guide to fine-tuning OpenAI's GPT-4o-mini model on your custom dataset for better domain-specific performance at lower cost.

March 18, 2026 · 3 min read

Why Fine-Tune Instead of Using Prompts?

Prompt engineering works well for general tasks, but fine-tuning is better when:

  • You need consistent output format (always returning JSON in a specific schema)
  • You have a domain-specific style (legal writing, medical notes, brand voice)
  • You want to reduce token usage — fine-tuned models need shorter prompts
  • You need better accuracy on niche tasks the base model struggles with

Fine-tuning GPT-4o-mini costs $3.00 per million training tokens, and billed tokens scale with the number of epochs: a single pass over 100 examples of roughly 1,000 tokens each runs about $0.30. It's surprisingly affordable for the improvement you get.
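A back-of-the-envelope estimator makes the math above concrete. This sketch uses the $3.00-per-million-training-tokens price quoted here and a rough heuristic of ~4 characters per token in place of a real tokenizer; treat the numbers as estimates, not a quote:

```python
# Rough fine-tuning cost estimate. Assumes the $3.00 / 1M training-token
# price quoted above; ~4 characters per token is a crude heuristic, not
# a real tokenizer.

PRICE_PER_MILLION_TRAINING_TOKENS = 3.00

def estimate_training_cost(example_texts, n_epochs=3):
    """Billed tokens = (tokens in one pass over the data) * n_epochs."""
    approx_tokens = sum(len(t) // 4 for t in example_texts)
    billed_tokens = approx_tokens * n_epochs
    return billed_tokens / 1_000_000 * PRICE_PER_MILLION_TRAINING_TOKENS

# 100 examples of ~4,000 characters (~1,000 tokens) each, 3 epochs:
examples = ["x" * 4000] * 100
print(f"${estimate_training_cost(examples):.2f}")  # about $0.30 per epoch
```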

Step 1 — Prepare Your Training Data

OpenAI requires training data in JSONL format. Each line is a conversation with messages:

{"messages": [{"role": "system", "content": "You are a legal document summarizer."}, {"role": "user", "content": "Summarize this contract clause: [clause text]"}, {"role": "assistant", "content": "This clause establishes..."}]}

{"messages": [{"role": "system", "content": "You are a legal document summarizer."}, {"role": "user", "content": "Summarize this contract clause: [another clause]"}, {"role": "assistant", "content": "This provision requires..."}]}

Key guidelines:
  • Minimum 10 examples, recommended 50-100 for noticeable improvement
  • Each example should represent the exact input/output you want
  • Keep the system message consistent across all examples
  • Include edge cases and diverse examples

Save this as training_data.jsonl.
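Hand-writing JSONL is error-prone, so it's often easier to generate the file from structured pairs. A minimal sketch, where the (user, assistant) pairs are placeholders for your own data:

```python
import json

SYSTEM_PROMPT = "You are a legal document summarizer."

# Placeholder (user_text, assistant_text) pairs; substitute your own data.
pairs = [
    ("Summarize this contract clause: [clause text]", "This clause establishes..."),
    ("Summarize this contract clause: [another clause]", "This provision requires..."),
]

with open("training_data.jsonl", "w") as f:
    for user_text, assistant_text in pairs:
        record = {"messages": [
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": user_text},
            {"role": "assistant", "content": assistant_text},
        ]}
        # json.dumps guarantees each line is valid JSON, one record per line
        f.write(json.dumps(record) + "\n")
```

Keeping the system prompt in one constant also enforces the "consistent system message" guideline automatically.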

Step 2 — Validate and Upload

Validate your data format before uploading:

import json

def validate_training_data(filepath):
    with open(filepath) as f:
        lines = f.readlines()

    errors = []
    for i, line in enumerate(lines):
        try:
            data = json.loads(line)
            if "messages" not in data:
                errors.append(f"Line {i+1}: missing 'messages' key")
            elif len(data["messages"]) < 2:
                errors.append(f"Line {i+1}: need at least user + assistant messages")
        except json.JSONDecodeError:
            errors.append(f"Line {i+1}: invalid JSON")

    if errors:
        for e in errors:
            print(f"ERROR: {e}")
    else:
        print(f"Valid! {len(lines)} training examples.")

validate_training_data("training_data.jsonl")

Upload to OpenAI:

from openai import OpenAI

client = OpenAI()

file = client.files.create(
    file=open("training_data.jsonl", "rb"),
    purpose="fine-tune"
)
print(f"File ID: {file.id}")

Step 3 — Start Fine-Tuning

Create a fine-tuning job:

job = client.fine_tuning.jobs.create(
    training_file=file.id,
    model="gpt-4o-mini-2024-07-18",
    hyperparameters={
        "n_epochs": 3,  # 2-4 epochs is usually optimal
    }
)
print(f"Job ID: {job.id}")
print(f"Status: {job.status}")

Monitor progress:

import time

while True:
    job = client.fine_tuning.jobs.retrieve(job.id)
    print(f"Status: {job.status}")
    if job.status in ["succeeded", "failed", "cancelled"]:
        break
    time.sleep(60)

if job.status == "succeeded":
    print(f"Model: {job.fine_tuned_model}")

Training typically takes 10-30 minutes for 50-100 examples.

Step 4 — Use Your Fine-Tuned Model

Once training completes, use your model like any other OpenAI model:

response = client.chat.completions.create(
    model=job.fine_tuned_model,  # ft:gpt-4o-mini-2024-07-18:your-org::xxx
    messages=[
        {"role": "system", "content": "You are a legal document summarizer."},
        {"role": "user", "content": "Summarize this clause: ..."}
    ]
)
print(response.choices[0].message.content)

Fine-tuned gpt-4o-mini costs $0.30 per million input tokens and $1.20 per million output tokens — about 2x the base model price, but you'll typically use fewer tokens because you need shorter prompts.

Tips for Better Results

After fine-tuning dozens of models, here's what actually makes a difference:

  1. Quality > quantity — 50 perfect examples beat 500 sloppy ones
  2. Be consistent — same system prompt, same output format in every example
  3. Include negative examples — show the model what NOT to do
  4. Start with 3 epochs — more epochs can cause overfitting on small datasets
  5. Test against base model — run the same 20 test cases on both models to measure improvement
  6. Iterate — add examples where the fine-tuned model fails, then retrain

Fine-tuning is not a one-shot process. The best models go through 3-5 iterations of training data refinement.
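The base-vs-fine-tuned comparison in tip 5 boils down to running the same test cases through both models and scoring the outputs. A minimal sketch of the scoring side, assuming you've already collected each model's answers along with an expected keyword per case (the model-calling loop is omitted, and the sample outputs below are made up for illustration):

```python
def score_outputs(outputs, expected_keywords):
    """Fraction of outputs that contain their expected keyword (case-insensitive)."""
    hits = sum(
        1 for out, kw in zip(outputs, expected_keywords)
        if kw.lower() in out.lower()
    )
    return hits / len(outputs)

# Hypothetical collected answers for the same two test cases:
base_outputs = ["The parties agree to...", "Payment is due monthly."]
ft_outputs = ["This clause establishes...", "This provision requires..."]
keywords = ["clause", "provision"]

print(f"base: {score_outputs(base_outputs, keywords):.0%}")
print(f"fine-tuned: {score_outputs(ft_outputs, keywords):.0%}")
```

Keyword matching is a crude metric; swap in whatever check matches your task (JSON schema validation, exact match, an LLM judge), but keep the test set fixed across iterations so the numbers stay comparable.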
