To help developers protect their applications against possible misuse, we're introducing the faster and more accurate Moderation endpoint. This endpoint provides OpenAI API developers with free access to GPT-based classifiers that detect undesired content, an instance of using AI systems to assist with human supervision of these systems. We have also released both a technical paper describing our methodology and the dataset used for evaluation.
When given a text input, the Moderation endpoint assesses whether the content is sexual, hateful, violent, or promotes self-harm, all of which are prohibited by our content policy. The endpoint has been trained to be quick, accurate, and robust across a range of applications. Importantly, this reduces the chances of products "saying" the wrong thing, even when deployed to users at scale. As a consequence, AI can unlock benefits in sensitive settings, like education, where it could not otherwise be used with confidence.
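To make the flow concrete, here is a minimal Python sketch of how an application might send text to the endpoint and read back which categories were flagged. Nothing in it comes from this post: the URL `https://api.openai.com/v1/moderations`, the `input` request field, the `results[0].categories` response layout, and the category names in the sample are assumptions about the API's shape, and the helper names (`build_request`, `flagged_categories`, `moderate`) are hypothetical. Check the API reference before relying on any of them.

```python
import json
import urllib.request

# Assumed endpoint URL; not stated in this post.
MODERATION_URL = "https://api.openai.com/v1/moderations"


def build_request(text, api_key):
    """Build (but do not send) a POST request carrying the text to classify."""
    body = json.dumps({"input": text}).encode("utf-8")  # assumed field name
    return urllib.request.Request(
        MODERATION_URL,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def flagged_categories(response):
    """Return the sorted names of categories marked true in the first result.

    Assumes a response shaped like {"results": [{"categories": {name: bool}}]}.
    """
    result = response["results"][0]
    return sorted(name for name, hit in result["categories"].items() if hit)


def moderate(text, api_key):
    """Send the request and decode the JSON reply (performs a network call)."""
    with urllib.request.urlopen(build_request(text, api_key)) as resp:
        return json.load(resp)


# Hand-written illustrative response, not real API output.
SAMPLE_RESPONSE = {
    "results": [
        {
            "flagged": True,
            "categories": {
                "sexual": False,
                "hate": False,
                "violence": True,
                "self-harm": False,
            },
        }
    ]
}
```

In an application, you might call `moderate(user_text, api_key)` before showing or storing the text, then block or escalate the content whenever `flagged_categories(...)` returns a non-empty list. Keeping request construction and response parsing in separate functions lets both be exercised offline, as with `SAMPLE_RESPONSE` above.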