
Fine-Tune BERT for Sentiment Classification

End-to-end fine-tuning of bert-base-uncased on SST-2 using the HuggingFace Trainer API. Covers tokenisation, TrainingArguments, evaluation with the GLUE metric, and a ready-to-use inference pipeline.


Fine-tune bert-base-uncased on SST-2 sentiment in under 60 lines. Reaches ~92% validation accuracy in 3 epochs using the HuggingFace Trainer with AdamW + linear warmup.

# Fine-Tune BERT for Sentiment Classification

Fine-tunes `bert-base-uncased` on SST-2 using the HuggingFace Trainer API. Reaches ~92% validation accuracy in 3 epochs.

## Requirements

```bash
pip install transformers datasets evaluate accelerate
```

## Run

```bash
python finetune_bert_sst2.py
```

Outputs:

- `./bert-sst2/best/` — the best checkpoint (by validation accuracy)
- `./logs/` — TensorBoard training logs
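
Once training has finished, the best checkpoint can be loaded for inference. The following is a minimal sketch, assuming the script saved both the model and the tokenizer to `./bert-sst2/best/`. The helper names `load_classifier` and `readable_label` are illustrative and not part of the script, and the label mapping is only needed if the checkpoint config still carries the default `LABEL_0` / `LABEL_1` names.

```python
from pathlib import Path

# Assumed location of the best checkpoint, taken from the outputs listed above.
CHECKPOINT_DIR = Path("./bert-sst2/best")

# SST-2 uses 0 = negative, 1 = positive. These keys are the default names a
# checkpoint reports when no id2label mapping was stored in its config.
SST2_LABELS = {"LABEL_0": "negative", "LABEL_1": "positive"}


def readable_label(prediction):
    """Map a pipeline prediction dict to a human-readable label."""
    raw = prediction["label"]
    return SST2_LABELS.get(raw, raw)


def load_classifier(checkpoint_dir=CHECKPOINT_DIR):
    """Build a text-classification pipeline from a saved checkpoint."""
    # Imported lazily so the helpers above work without transformers installed.
    from transformers import pipeline

    return pipeline("text-classification", model=str(checkpoint_dir))


# Only runs when the checkpoint directory actually exists.
if CHECKPOINT_DIR.is_dir():
    classifier = load_classifier()
    examples = ["A gripping, beautifully shot film.", "Two hours I will never get back."]
    for pred in classifier(examples):
        print(readable_label(pred), round(pred["score"], 3))
```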

## Expected results

| Epoch | Val accuracy |
| ----- | ------------ |
| 1     | ~90%         |
| 2     | ~92%         |
| 3     | ~92–93%      |

For reference, the original BERT paper reports 93.5% for BERT-base on the SST-2 test set, with the fine-tuning learning rate selected on the dev set.
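
The validation accuracy above is the fraction of validation sentences whose highest-scoring class matches the gold label. A minimal sketch of a `compute_metrics` callback that produces this number is shown below. It uses plain NumPy rather than the GLUE metric from `evaluate`, which is a substitution made here to keep the example offline; for SST-2 both report accuracy.

```python
import numpy as np


def compute_metrics(eval_pred):
    """Accuracy for a (logits, labels) pair, the shape Trainer hands to compute_metrics."""
    logits, labels = eval_pred
    # Pick the class with the highest logit for each sentence.
    preds = np.argmax(logits, axis=-1)
    accuracy = (preds == np.asarray(labels)).mean()
    return {"accuracy": float(accuracy)}
```

A function like this would be passed to the Trainer as `compute_metrics=compute_metrics`.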

## Key hyperparameters

| Param           | Value  | Notes                                                               |
| --------------- | ------ | ------------------------------------------------------------------- |
| `learning_rate` | `2e-5` | Common default for BERT fine-tuning                                 |
| `warmup_steps`  | `500`  | Linear warmup; avoids large early updates to pre-trained weights    |
| `weight_decay`  | `0.01` | Decoupled weight decay, as applied by AdamW                         |
| `max_length`    | `128`  | SST-2 sentences are short; 4× fewer tokens per batch than 512       |
| `batch_size`    | `32`   | 8 GB GPU: use 8 with `gradient_accumulation_steps=4` (effective 32) |
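
To make these values concrete, the sketch below writes them as `TrainingArguments` keyword arguments and works out the learning rate that linear warmup and linear decay (the Trainer's default schedule) would give at a particular step. The `output_dir` value, the helper names, and the single-GPU step count are assumptions made for illustration; the actual script may organise this differently.

```python
import math

# Table values expressed with TrainingArguments parameter names.
# output_dir is an assumption inferred from the ./bert-sst2/best/ output path.
HPARAMS = {
    "output_dir": "./bert-sst2",
    "num_train_epochs": 3,
    "learning_rate": 2e-5,
    "warmup_steps": 500,
    "weight_decay": 0.01,
    "per_device_train_batch_size": 32,
}
TRAIN_EXAMPLES = 67_349  # size of the GLUE SST-2 training split


def for_small_gpu(hparams):
    """Return a copy that trades batch size for gradient accumulation."""
    out = dict(hparams)
    out["per_device_train_batch_size"] = 8
    out["gradient_accumulation_steps"] = 4
    return out


def effective_batch_size(hparams):
    """Examples consumed per optimizer update on a single GPU."""
    accumulation = hparams.get("gradient_accumulation_steps", 1)
    return hparams["per_device_train_batch_size"] * accumulation


def total_steps(hparams, n_examples=TRAIN_EXAMPLES):
    """Approximate number of optimizer updates over the whole run (single GPU)."""
    steps_per_epoch = math.ceil(n_examples / effective_batch_size(hparams))
    return steps_per_epoch * hparams["num_train_epochs"]


def lr_at_step(step, base_lr, warmup_steps, n_total):
    """Learning rate under linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = (n_total - step) / max(1, n_total - warmup_steps)
    return base_lr * max(0.0, remaining)


def build_training_args(hparams):
    """Create TrainingArguments; imported lazily because it needs transformers and torch."""
    from transformers import TrainingArguments

    return TrainingArguments(**hparams)
```

With the default settings, 500 warmup steps cover roughly the first 8% of the run, and the low-memory variant keeps the effective batch size at 32.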

## Swap to DistilBERT

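Switching to DistilBERT gives roughly 60% faster inference (the speed-up reported by the DistilBERT authors) for an accuracy drop of about 2 points. Change the model name:

```python
MODEL_NAME = "distilbert-base-uncased"
# If the script loads the model through the Auto* classes, nothing else changes.
# If it uses the explicit BERT classes, also switch to
# DistilBertForSequenceClassification / DistilBertTokenizer.
```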
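
One way to keep the swap to a single line is to load the model through the `Auto*` classes, which pick the right architecture from the checkpoint name. The sketch below assumes a `MODEL_NAME` constant like the one shown above; `load_model_and_tokenizer` and the label mapping are illustrative names, not taken from the script.

```python
MODEL_NAME = "distilbert-base-uncased"  # or "bert-base-uncased"

# SST-2 is a binary task: 0 = negative, 1 = positive.
ID2LABEL = {0: "negative", 1: "positive"}
LABEL2ID = {name: idx for idx, name in ID2LABEL.items()}


def load_model_and_tokenizer(model_name=MODEL_NAME):
    """Load a sequence classifier and its tokenizer for any supported checkpoint."""
    # Imported lazily so the constants above can be used without transformers installed.
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSequenceClassification.from_pretrained(
        model_name,
        num_labels=len(ID2LABEL),
        id2label=ID2LABEL,  # stored in the config so predictions carry readable labels
        label2id=LABEL2ID,
    )
    return model, tokenizer
```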