We believe in transparency about our classification expectations. Below are our test suites, showing inputs, expected classifications, and actual results. They help you understand which risk classifications we consider "good" for a given input.
Each test case includes an input conversation, an expected risk classification, and the actual model output. We use these suites internally to validate changes and track classification quality over time. "Passing" means the actual classification matches our expected outcome within acceptable bounds.
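
As a rough illustration of the test-case shape and pass criterion described above, here is a minimal Python sketch. Everything in it is an assumption for illustration only: the `ClassificationCase` record, the `RISK_LEVELS` scale, and the `tolerance` parameter are hypothetical names, and "acceptable bounds" is modeled as a maximum distance between levels on an ordinal scale, which may differ from how the suites actually define it.

```python
from dataclasses import dataclass

# Hypothetical ordinal risk scale; the real label set is not specified here.
RISK_LEVELS = ["none", "low", "medium", "high"]


@dataclass
class ClassificationCase:
    """One test case: input conversation, expected label, actual model label."""
    conversation: list[str]
    expected: str
    actual: str


def passes(case: ClassificationCase, tolerance: int = 0) -> bool:
    """Return True if the actual label is within `tolerance` levels of the expected one.

    Assumption: "acceptable bounds" means distance on the ordinal scale above.
    """
    distance = abs(RISK_LEVELS.index(case.expected) - RISK_LEVELS.index(case.actual))
    return distance <= tolerance


def pass_rate(cases: list[ClassificationCase], tolerance: int = 0) -> float:
    """Fraction of cases that pass; 0.0 for an empty suite."""
    if not cases:
        return 0.0
    return sum(passes(c, tolerance) for c in cases) / len(cases)
```

With `tolerance=0` a case passes only on an exact match. Tracking `pass_rate` across model versions is one simple way to follow classification quality over time.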