
Build Your Own AI Checker With Open Source Models 2026

Build your own AI checker with open-source classifiers in 2026. Step-by-step tutorial on HuggingFace detection models, fine-tuning, and deployment.

License MIT
TL;DR
  • Build a FastAPI endpoint that runs an open-source RoBERTa classifier plus GPT-2 perplexity ensemble for AI content detection.
  • Uses HuggingFace models with MIT (roberta-base-openai-detector) or Apache 2.0 (fakespot-ai/roberta-base-ai-text-detection-v1) licenses.
  • Honest about false positives; ships with a bias warning in the code and caveats for non-native English text.
System Requirements
RAM: 4 GB
GPU: CPU or any NVIDIA
VRAM: 2 GB (optional)
✓ Apple Silicon

You want an AI checker you actually control. One that runs on your own GPU, logs what you want, and does not mail every user submission to a third-party vendor. Good news: the open-source pieces to build a working AI content checker have been on HuggingFace for over a year, and you can glue them together into a production FastAPI endpoint in an afternoon.

This tutorial walks through building your own AI checker in 2026 using two open models as an ensemble: a RoBERTa-large classifier for a learned score, plus a GPT-2 perplexity signal for a zero-shot second opinion. We wrap both behind a FastAPI endpoint, containerize it with Docker, and test it end to end. Fair warning before we start: your detector will have the same false-positive problem as every commercial AI checker. We will be honest about that at the end.

TL;DR: what you are building

  • Model A: SuperAnnotate/roberta-large-llm-content-detector for a supervised AI-probability score (F1 around 0.87 across four domains per the model card).
  • Model B: GPT-2 small for a zero-shot perplexity signal, the same family of feature used by early GPTZero and by the DetectGPT paper.
  • API: FastAPI with a single /check endpoint, Pydantic request validation, JSON response.
  • Deploy: Docker image, CPU-runnable, GPU-optional.

Total build time: 45 to 90 minutes depending on download speed and how much you hate Python dependency errors.

Prerequisites

You need Python 3.11 or newer, pip, git, and about 5 GB of free disk for the model weights. A GPU is nice but not required: roberta-large inference on CPU takes roughly 400 ms per paragraph on a modern laptop, which is fine for a single-user API.

Create a clean working directory and a virtual environment first.

mkdir open-ai-checker && cd open-ai-checker
python3 -m venv .venv
source .venv/bin/activate
pip install --upgrade pip

Pick your classifier model

The AI checker world on HuggingFace has three reasonable starting points in 2026. Pick one based on license and size:

  • SuperAnnotate/roberta-large-llm-content-detector: the strongest of the three (F1 around 0.87 across four domains per its model card), but released under SuperAnnotate's own SAIPL license.
  • fakespot-ai/roberta-base-ai-text-detection-v1: a lighter roberta-base model under Apache 2.0, the easiest license to ship.
  • roberta-base-openai-detector: OpenAI's original GPT-2 output detector under MIT; the oldest and weakest against modern LLMs, but a fine baseline.

Read the license file for whichever model you ship. SAIPL in particular has restrictions on competing with SuperAnnotate's own products, which may not matter to you but which your legal team will want to know about.

Install the dependencies

We need transformers for both models, torch as the backbone, fastapi and uvicorn for the API, and pydantic for the request schema.

pip install "transformers>=4.48" torch fastapi "uvicorn[standard]" pydantic

If you are on a CUDA box, install the matching torch build from the PyTorch site instead of the default CPU wheel. Everything else stays the same.

Write the inference module

Save the following as checker.py. It loads both models once at import time, exposes two scoring functions, and combines them into an ensemble score.

import math
import torch
import torch.nn.functional as F
from transformers import (
    AutoTokenizer,
    AutoModelForSequenceClassification,
    GPT2LMHeadModel,
    GPT2TokenizerFast,
)

DEVICE = "cuda" if torch.cuda.is_available() else "cpu"

# Model A: supervised classifier
CLASSIFIER_ID = "SuperAnnotate/roberta-large-llm-content-detector"
clf_tok = AutoTokenizer.from_pretrained(CLASSIFIER_ID)
clf_model = AutoModelForSequenceClassification.from_pretrained(
    CLASSIFIER_ID
).to(DEVICE).eval()

# Model B: perplexity reference
PPL_ID = "gpt2"
ppl_tok = GPT2TokenizerFast.from_pretrained(PPL_ID)
ppl_model = GPT2LMHeadModel.from_pretrained(PPL_ID).to(DEVICE).eval()

@torch.inference_mode()
def classifier_score(text: str) -> float:
    """Return P(AI) from the RoBERTa classifier, in [0, 1]."""
    enc = clf_tok(
        text,
        return_tensors="pt",
        truncation=True,
        max_length=512,
    ).to(DEVICE)
    logits = clf_model(**enc).logits
    probs = F.softmax(logits, dim=-1).squeeze(0)
    # Label index 1 is the AI-generated class for this model card.
    return float(probs[1].item())

@torch.inference_mode()
def perplexity(text: str) -> float:
    """Compute GPT-2 perplexity of the input string."""
    enc = ppl_tok(text, return_tensors="pt", truncation=True, max_length=1024).to(DEVICE)
    input_ids = enc.input_ids
    if input_ids.size(1) < 2:
        return float("inf")
    out = ppl_model(input_ids, labels=input_ids)
    return math.exp(out.loss.item())

def perplexity_score(text: str) -> float:
    """Map perplexity to a 0..1 AI-likelihood using a simple logistic."""
    ppl = perplexity(text)
    # Lower perplexity = more AI-like. Tune the midpoint for your corpus.
    midpoint = 40.0
    slope = 0.08
    return 1.0 / (1.0 + math.exp(slope * (ppl - midpoint)))

def ensemble_score(text: str, w_clf: float = 0.65, w_ppl: float = 0.35) -> dict:
    clf = classifier_score(text)
    ppl = perplexity_score(text)
    final = w_clf * clf + w_ppl * ppl
    return {
        "classifier_ai_prob": round(clf, 4),
        "perplexity_ai_prob": round(ppl, 4),
        "ensemble_ai_prob": round(final, 4),
        "verdict": "ai" if final > 0.7 else "mixed" if final > 0.4 else "human",
    }

Three notes. First, the classifier label index (0 = human, 1 = AI) is true for the SuperAnnotate card we linked. If you swap in fakespot-ai or the OpenAI detector, read the label mapping from clf_model.config.id2label and adjust. Second, the perplexity midpoint (40) is a rough starting point. Calibrate it on a held-out set of human vs AI paragraphs from your own corpus before you trust the number. Third, both tokenizers truncate (512 tokens for the classifier, 1024 for GPT-2), so on long documents only the opening gets scored; chunk and aggregate if whole-document coverage matters to you.
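Because label conventions differ between detector checkpoints, you can guess the AI-class index from id2label instead of hardcoding 1. The helper below is a sketch, not part of any model's API, and its keyword list is an assumption; always confirm against the model card:

```python
def ai_label_index(id2label: dict) -> int:
    """Best-effort guess at which label index is the AI/generated class.

    Falls back to index 1, the convention used by the detectors in this
    post. Verify against the model card before trusting the result.
    """
    for idx, label in id2label.items():
        if any(key in label.lower() for key in ("ai", "fake", "generated", "machine")):
            return idx
    return 1

# Hypothetical mappings, for illustration:
print(ai_label_index({0: "Human", 1: "AI"}))         # 1
print(ai_label_index({0: "generated", 1: "human"}))  # 0
```

In the inference module you would call it as ai_label_index(clf_model.config.id2label) and index probs with the result.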

Build the FastAPI endpoint

Save this as app.py. It exposes one POST route, validates the request body with Pydantic, and returns the ensemble result.

from fastapi import FastAPI, HTTPException
from pydantic import BaseModel, Field
from checker import ensemble_score

app = FastAPI(title="Open AI Checker", version="0.1.0")

class CheckRequest(BaseModel):
    text: str = Field(..., min_length=20, max_length=20000)

class CheckResponse(BaseModel):
    classifier_ai_prob: float
    perplexity_ai_prob: float
    ensemble_ai_prob: float
    verdict: str

@app.get("/health")
def health():
    return {"status": "ok"}

@app.post("/check", response_model=CheckResponse)
def check(req: CheckRequest):
    try:
        return ensemble_score(req.text)
    except Exception as e:
        raise HTTPException(status_code=500, detail=str(e))

The 20-character minimum is deliberate. Every open-source AI checker gets worse as the input shrinks, and a single sentence is statistically useless. Under 20 characters you are guessing; under 100 you are still guessing but with extra steps.
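You can exercise that guard outside FastAPI too: Pydantic raises on short input, and FastAPI turns the exception into a 422 response. A quick standalone check, assuming Pydantic v2:

```python
from pydantic import BaseModel, Field, ValidationError

class CheckRequest(BaseModel):
    text: str = Field(..., min_length=20, max_length=20000)

try:
    CheckRequest(text="hi")  # under the 20-character floor
except ValidationError as exc:
    # FastAPI would serialize this into a 422 Unprocessable Entity.
    print(exc.errors()[0]["type"])
```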

Test it with curl

Run the server with uvicorn in one terminal:

uvicorn app:app --host 0.0.0.0 --port 8000

The first start downloads the models (about 1.8 GB total) and will take a minute, because checker.py loads both at import time. Subsequent starts hit the local cache, and individual calls are fast. Hit the endpoint from a second terminal.

curl -s http://localhost:8000/check \
  -H "Content-Type: application/json" \
  -d '{"text": "Large language models have transformed the landscape of natural language processing by enabling scalable text generation across many domains, with notable implications for education, publishing, and software development."}' | python3 -m json.tool

You should see a response like this:

{
    "classifier_ai_prob": 0.9412,
    "perplexity_ai_prob": 0.8233,
    "ensemble_ai_prob": 0.9000,
    "verdict": "ai"
}

Now send a messy human paragraph (a real Slack rant, a paragraph from your own journal) and compare. If you see a clean split, your calibration is close to right. If human text is also getting flagged, tune the perplexity midpoint up, or lower w_ppl.
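If you want a less hand-wavy midpoint, a tiny helper can set it from labeled samples. This is a sketch of one simple heuristic, not a library function; it assumes you have already run perplexity() from checker.py over a handful of known-human and known-AI paragraphs:

```python
from statistics import median

def calibrate_midpoint(human_ppls, ai_ppls):
    """Place the logistic midpoint halfway between the median human and
    median AI perplexity. Assumes AI text scores lower, since GPT-2
    usually rates fluent machine output as more predictable."""
    return (median(human_ppls) + median(ai_ppls)) / 2.0

# Hypothetical perplexities measured on your own corpus:
print(calibrate_midpoint([55.0, 62.0, 71.0], [22.0, 28.0, 31.0]))  # 45.0
```

Medians rather than means keep one weird outlier paragraph from dragging the midpoint around.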

Package it in Docker

For deployment, bake the whole thing into an image. Save as Dockerfile.

FROM python:3.11-slim

WORKDIR /app

RUN apt-get update && apt-get install -y --no-install-recommends \
    build-essential \
 && rm -rf /var/lib/apt/lists/*

COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Pre-download models at build time so cold starts are fast.
COPY warm.py .
RUN python warm.py

COPY checker.py app.py ./

EXPOSE 8000
CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "8000"]

The warm.py script is only a handful of lines: import the model classes and pull them by ID so HuggingFace caches the weights into the image layer.

from transformers import AutoTokenizer, AutoModelForSequenceClassification, GPT2LMHeadModel, GPT2TokenizerFast

AutoTokenizer.from_pretrained("SuperAnnotate/roberta-large-llm-content-detector")
AutoModelForSequenceClassification.from_pretrained("SuperAnnotate/roberta-large-llm-content-detector")
GPT2TokenizerFast.from_pretrained("gpt2")
GPT2LMHeadModel.from_pretrained("gpt2")

A minimal requirements.txt:

transformers>=4.48
torch
fastapi
uvicorn[standard]
pydantic

Build and run.

docker build -t open-ai-checker:0.1.0 .
docker run --rm -p 8000:8000 open-ai-checker:0.1.0

Image size lands around 3.5 GB because of the PyTorch wheel and the two models. To shrink it (and the pull time on cold starts), pin the torch CPU wheel explicitly in requirements.txt, since the default Linux wheel on PyPI bundles CUDA libraries, and drop the build-essential apt layer after pip install.
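One way to do that pin, assuming pip's requirements-file option syntax and a torch version that publishes +cpu wheels on the PyTorch index (check the current version before copying):

```
--extra-index-url https://download.pytorch.org/whl/cpu
transformers>=4.48
torch==2.5.1+cpu
fastapi
uvicorn[standard]
pydantic
```

The +cpu local version tag forces pip to take the CPU-only wheel from the extra index instead of the CUDA build on PyPI.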

Honest limitations

This AI checker is useful for triage and debugging. It is not a decision engine. Three things you should know before you connect it to anything that affects a human.

  • False positives on non-native English writing are real. The Stanford study by Liang et al. (2023) found that major detectors flagged more than 50 percent of TOEFL essays as AI-generated. Your ensemble inherits the same weakness because both signals (perplexity and the RoBERTa classifier) reward fluent, common-vocabulary writing.
  • It decays with every new model release. The classifier was trained on a specific distribution of LLM output. When Qwen4, GLM-6, or Llama 5 drops, your accuracy will drift until someone retrains the detector. Plan on quarterly evaluation at minimum.
  • It is trivially defeatable. A 2-prompt humanizer pipeline ("rewrite this with varied sentence length and casual punctuation") collapses both signals. If your threat model includes motivated evasion, this tool is the wrong tool.

Use the endpoint as a low-cost first pass, log every score, and send anything in the "mixed" band to a human reviewer. That is the only policy the research supports in 2026.
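As a concrete version of that policy, a thin routing layer on top of the /check response might look like the sketch below. The queue names are made up for illustration; wire them to whatever your review tooling actually calls things:

```python
def route(result: dict) -> str:
    """Triage a /check response. Nothing here blocks or punishes anyone:
    high scores get logged for audit, the gray zone goes to a person."""
    verdict = result["verdict"]
    if verdict == "mixed":
        return "human_review_queue"  # hypothetical queue name
    if verdict == "ai":
        return "audit_log"           # record and surface, never auto-act
    return "pass_through"

print(route({"verdict": "mixed", "ensemble_ai_prob": 0.55}))  # human_review_queue
```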

Your ten-minute action

Clone this scaffold, swap in the fakespot-ai Apache-2.0 model if SAIPL licensing is a problem for you, and run your own last blog post through it. The score you get on text you know is human will be a better calibration dataset than any benchmark, and it will teach you exactly why the "verdict" field needs a human in the loop.
