Text Classification with Transformers Pipeline
This notebook demonstrates how to use Hugging Face’s pipeline API for text classification tasks, specifically for spam detection.
Importing Pipeline
Import the pipeline function from transformers to create pre-configured models for specific tasks.
from transformers import pipeline
Creating Text Classification Pipeline
Create a text classification pipeline using a multilingual DistilBERT model fine-tuned for sentiment analysis. We’ll adapt it for spam detection by mapping its sentiment labels to spam categories.
spam_classifier = pipeline(
    'text-classification',
    model='philschmid/distilbert-base-multilingual-cased-sentiment'
)
Device set to use cuda:0
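Because the spam mapping used later in this notebook depends on the exact label strings the model emits, it can help to check them before classifying anything. The block below is a minimal sketch: `missing_labels` is a hypothetical helper, not part of transformers, and `assumed_id2label` is a stand-in so the example runs without downloading the model. Its label names come from this notebook, but the index order is an assumption.

```python
def missing_labels(id2label, label_mapping):
    """Return model labels that have no entry in label_mapping, sorted."""
    return sorted(set(id2label.values()) - set(label_mapping))

# Assumed stand-in for spam_classifier.model.config.id2label.
# Only the label names are taken from this notebook; the index order is a guess.
assumed_id2label = {0: "negative", 1: "neutral", 2: "positive"}

label_mapping = {"negative": "SPAM", "neutral": "NOT SPAM", "positive": "NOT SPAM"}

print(missing_labels(assumed_id2label, label_mapping))         # []
print(missing_labels(assumed_id2label, {"negative": "SPAM"}))  # ['neutral', 'positive']
```

With the real pipeline, passing `spam_classifier.model.config.id2label` as the first argument should perform the same check against the loaded model.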
Testing with Sample Documents
Test the classifier with various text samples including potential spam messages and normal communications.
docs = [
    "Congartulations! You've won 5oo INR Amazon gift voucher",
    'Hey Amit. Lets have a meeting tomorrow',
    'URGENT: Youre gmail has been comprimsed. Click this link to revive it',
    'URGENT: Youre Credit Card has been comprimsed. Click this link to revive it'
]
results = spam_classifier(docs)
Interpreting Results for Spam Detection
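Map sentiment labels to spam categories (negative becomes SPAM; neutral and positive become NOT SPAM) and display the results. The score printed with each label is the model’s confidence in the sentiment label, not a spam probability.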
label_mapping = {
    'negative': 'SPAM',
    'neutral': 'NOT SPAM',
    'positive': 'NOT SPAM',
}
for res in results:
    label = label_mapping[res['label']]
    score = res['score']
    print(f"Label: {label}, Confidence: {score:.2f}")
Label: SPAM, Confidence: 0.96
Label: NOT SPAM, Confidence: 0.80
Label: SPAM, Confidence: 0.92
Label: SPAM, Confidence: 0.94
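The printed lines above do not show which message each label belongs to, and they treat every negative prediction as spam regardless of confidence. Here is a minimal sketch of one way to handle both, assuming results shaped like the pipeline output (a list of dicts with `label` and `score` keys). The `flag_spam` helper and its `threshold` default are hypothetical, and the sample inputs are illustrative placeholders rather than real model output.

```python
LABEL_MAPPING = {"negative": "SPAM", "neutral": "NOT SPAM", "positive": "NOT SPAM"}

def flag_spam(docs, results, threshold=0.9):
    """Pair each document with a spam label.

    A document is marked SPAM only when its mapped label is SPAM and the
    score reaches the threshold; unknown labels fall back to NOT SPAM.
    """
    flagged = []
    for doc, res in zip(docs, results):
        mapped = LABEL_MAPPING.get(res["label"].lower(), "NOT SPAM")
        if mapped == "SPAM" and res["score"] < threshold:
            mapped = "NOT SPAM"
        flagged.append((doc, mapped, res["score"]))
    return flagged

# Illustrative placeholder inputs, not real model output.
sample_docs = ["Click this link now", "Lets have a meeting tomorrow", "Your order is late"]
sample_results = [
    {"label": "negative", "score": 0.95},
    {"label": "neutral", "score": 0.80},
    {"label": "negative", "score": 0.60},
]

for doc, label, score in flag_spam(sample_docs, sample_results):
    print(f"{label:8} {score:.2f}  {doc}")
```

The 0.9 threshold is an arbitrary starting point: raising it flags fewer messages, lowering it flags more, and a sensible value would need to be chosen against labeled examples.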
Summary
This notebook demonstrated:
- Using Hugging Face pipelines for text classification
- Adapting a sentiment model for spam detection
- Processing multiple documents at once
- Interpreting and mapping model outputs
The pipeline API makes it easy to use pre-trained models for various NLP tasks!