Definition

Confidence Score

/ˈkɒnfɪdəns skɔː/

A numerical value (typically 0-1 or 0-100%) indicating the AI model's certainty that an image contains a particular type of content, used to determine moderation thresholds and actions.

What is a Confidence Score?

A confidence score represents the probability that an AI model's prediction is correct. In content moderation, it indicates how certain the system is that an image contains specific content types like nudity, violence, or hate symbols.

Scores typically range from 0 (no confidence) to 1 (complete confidence). Higher scores indicate stronger certainty in the detection.
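Because providers report scores on either a 0-1 or 0-100% scale, it can help to normalize before comparing against a threshold. A minimal sketch (the function name and heuristic are illustrative, not any specific API's behavior):

```python
# Minimal sketch: map a percentage-style score (0-100) onto the 0-1 range
# so thresholds can be compared consistently. Assumes any value above 1.0
# was reported as a percentage.

def normalize(score: float) -> float:
    """Return a confidence score on the 0-1 scale."""
    return score / 100.0 if score > 1.0 else score

print(normalize(95.0))  # 0.95
print(normalize(0.95))  # 0.95
```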

Confidence Score Examples

NSFW: 0.95
Violence: 0.12
Hate Symbol: 0.03

Setting Thresholds

Platforms set custom thresholds based on their policies. A children's app might block content above 0.3, while an adult platform might only flag at 0.9. The right threshold balances catching harmful content against false positives.
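The threshold logic above can be sketched in a few lines. This is an illustrative example, not a specific API's response format; the category names and scores are hypothetical:

```python
# Illustrative sketch of threshold-based moderation. The decision compares
# the highest category confidence score against a per-platform threshold.

def moderate(scores: dict[str, float], threshold: float) -> str:
    """Return a moderation action based on the top confidence score."""
    top_category, top_score = max(scores.items(), key=lambda kv: kv[1])
    if top_score >= threshold:
        return f"block ({top_category}: {top_score:.2f})"
    return "allow"

scores = {"nsfw": 0.42, "violence": 0.12, "hate_symbol": 0.03}

print(moderate(scores, threshold=0.3))  # strict policy, e.g. a children's app
print(moderate(scores, threshold=0.9))  # permissive policy, e.g. an adult platform
```

The same image is blocked under the strict threshold but allowed under the permissive one, which is exactly the trade-off a platform tunes when choosing its threshold.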

Using Confidence Scores

Multi-Category Scores

Modern APIs return confidence scores for multiple categories simultaneously, allowing platforms to evaluate images against various policy criteria in a single request. This enables nuanced moderation decisions based on combined risk factors.
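A multi-category response can be checked against per-category thresholds in one pass. A sketch, assuming a flat category-to-score response shape (field names are hypothetical, not a specific API's schema):

```python
# Hypothetical multi-category moderation response: one confidence score
# per category, evaluated against category-specific policy thresholds.
response = {
    "nsfw": 0.95,
    "violence": 0.12,
    "hate_symbol": 0.03,
}

# Per-category thresholds let one request drive several policy rules.
thresholds = {"nsfw": 0.80, "violence": 0.70, "hate_symbol": 0.50}

violations = [
    category
    for category, score in response.items()
    if score >= thresholds[category]
]

print(violations)  # categories exceeding their policy threshold
```

Keeping thresholds per category rather than global lets a platform be stricter about some content types than others without extra API calls.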
