Text Classification with Transformers Pipeline

nlp

transformers

text-classification

spam-detection

Using Hugging Face transformers pipeline for text classification and spam detection.

Author

Mohammed Adil Siraju

Published

September 26, 2025

This notebook demonstrates how to use Hugging Face’s pipeline API for text classification tasks, specifically for spam detection.

Importing Pipeline

Import the pipeline function from transformers to create pre-configured models for specific tasks.

from transformers import pipeline

Creating Text Classification Pipeline

Create a text classification pipeline using a multilingual DistilBERT model fine-tuned for sentiment analysis. The model was not trained on spam data; this notebook repurposes its sentiment labels as a rough spam signal.

spam_classifier = pipeline(
    'text-classification',
    model='philschmid/distilbert-base-multilingual-cased-sentiment'
)

Device set to use cuda:0
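Because the sentiment labels are reused as spam categories later in this notebook, it can help to confirm which labels the model is able to emit. The following is a minimal sketch, assuming the label names are negative, neutral and positive (the keys used in the mapping further down). find_unmapped_labels is a hypothetical helper, not part of transformers. With a loaded pipeline, the real label set should be readable from spam_classifier.model.config.id2label.

```python
def find_unmapped_labels(id2label, mapping):
    """Return model labels that have no entry in the mapping, sorted."""
    return sorted(set(id2label.values()) - set(mapping))


# Assumed label set, standing in for spam_classifier.model.config.id2label.
assumed_id2label = {0: "negative", 1: "neutral", 2: "positive"}

# A deliberately incomplete mapping to show what the check reports.
partial_mapping = {"negative": "SPAM", "positive": "NOT SPAM"}

missing = find_unmapped_labels(assumed_id2label, partial_mapping)
print(missing)  # ['neutral']
```

An empty list means every label the model can return has a spam category assigned, so a dictionary lookup on the label will not raise a KeyError.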

Testing with Sample Documents

Test the classifier with various text samples including potential spam messages and normal communications.

docs = [
    "Congartulations! You've won 5oo INR Amazon gift voucher",
    'Hey Amit. Lets have a meeting tomorrow',
    'URGENT: Youre gmail has been comprimsed. Click this link to revive it',
    'URGENT: Youre Credit Card has been comprimsed. Click this link to revive it'
]

results = spam_classifier(docs)
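When a list of strings is passed in, the pipeline returns one dictionary per input, in the same order, each with a label and a score. A small sketch of pairing every document with its prediction follows. The sample texts and scores are illustrative stand-ins shaped like the pipeline output, not real model results, and pair_docs_with_results is a hypothetical helper.

```python
def pair_docs_with_results(docs, results):
    """Attach each input text to its prediction dict, preserving order."""
    if len(docs) != len(results):
        raise ValueError("docs and results must have the same length")
    return [{"text": doc, **res} for doc, res in zip(docs, results)]


# Illustrative data only: same shape as the pipeline output, made-up values.
sample_docs = ["Win a free voucher now", "Lets have a meeting tomorrow"]
sample_results = [
    {"label": "negative", "score": 0.91},
    {"label": "neutral", "score": 0.78},
]

paired = pair_docs_with_results(sample_docs, sample_results)
for row in paired:
    print(f"{row['text']!r} -> {row['label']} ({row['score']:.2f})")
```

Keeping the text next to its prediction makes it easier to spot which message produced which label when the batch grows.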

Interpreting Results for Spam Detection

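Map each sentiment label to a spam category and print it with the model's score. The score is the model's confidence in the predicted sentiment label, not a calibrated spam probability.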

label_mapping = {
    'negative': 'SPAM',
    'neutral': 'NOT SPAM',
    'positive': 'NOT SPAM'
}

for res in results:
    label = label_mapping[res['label']]
    score = res['score']
    print(f"Label: {label}, Confidence: {score:.2f}")

Label: SPAM, Confidence: 0.96
Label: NOT SPAM, Confidence: 0.80
Label: SPAM, Confidence: 0.92
Label: SPAM, Confidence: 0.94
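The mapping above treats every negative prediction as spam, whatever its score. One possible refinement, sketched here under assumptions, is to flag low-confidence negatives for manual review and to handle labels missing from the mapping without raising an error. The 0.9 threshold, the REVIEW and UNKNOWN categories, and the to_spam_label helper are all hypothetical choices for illustration; the sample results are made-up values shaped like the pipeline output.

```python
label_mapping = {
    "negative": "SPAM",
    "neutral": "NOT SPAM",
    "positive": "NOT SPAM",
}


def to_spam_label(result, mapping, threshold=0.9, fallback="UNKNOWN"):
    """Map one prediction dict to a spam category.

    Labels missing from the mapping become `fallback`; SPAM predictions
    scoring below `threshold` are downgraded to REVIEW.
    """
    mapped = mapping.get(result["label"], fallback)
    if mapped == "SPAM" and result["score"] < threshold:
        return "REVIEW"
    return mapped


# Illustrative predictions only, not real model output.
sample_results = [
    {"label": "negative", "score": 0.96},
    {"label": "negative", "score": 0.55},
    {"label": "neutral", "score": 0.80},
    {"label": "mixed", "score": 0.70},  # label absent from the mapping
]

labels = [to_spam_label(res, label_mapping) for res in sample_results]
print(labels)  # ['SPAM', 'REVIEW', 'NOT SPAM', 'UNKNOWN']
```

Any threshold would need tuning on labelled spam examples, since the score reflects sentiment confidence rather than spam likelihood.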

Summary

This notebook demonstrated:

- Using Hugging Face pipelines for text classification
- Adapting a sentiment model for spam detection
- Processing multiple documents at once
- Interpreting and mapping model outputs

The pipeline API makes it easy to use pre-trained models for various NLP tasks!