Red Team Safety Classifier Evaluation AI Trainer, $55–$65/hour

Remote, USA

Posted Jun 13, 2026

Part-time, contract

Project Overview:

Join a growing community of professionals advancing the next wave of AI. As an AI Trainer, you’ll play a hands-on role by analyzing and providing feedback on data to improve LLM performance, helping ensure that the next generation of AI technology is accurate and trustworthy.

We are seeking a skilled AI Safety Evaluator / Red Team Prompt Engineer to work as a project consultant in our AI Labor Marketplace. This is not a full-time employment position — you will be engaged as an expert project consultant on a contract basis.

Location: U.S.-based experts only

Engagement: Part-time, project-based expert evaluation work

Work Type: Remote

Project Summary:

A fast-paced AI safety evaluation sprint focused on adversarial prompt generation and safety classification. Contributors will create and assess high-difficulty, edge-case scenarios, applying structured labeling, severity scoring, and policy-based reasoning to improve model safety performance.

Consultant Engagement Terms:

This is a project-based consultant role. Consultants are paid per project; the hourly rates shown are estimates based on anticipated completion time. Consultants control their own schedule, provide their own tools, and may provide services to other vendors or employers at the same time (subject to those parties’ policies).

Responsibilities:

Contributors will:
• Design adversarial prompts that expose edge cases in AI safety systems
• Apply structured safety classifications, including category and severity
• Write concise, policy-grounded rationales for decisions
• Review and validate peer submissions for accuracy and quality
• Identify ambiguous or difficult-to-classify scenarios
• Maintain consistency across high-volume evaluation tasks

Expected Outcomes:
• High-quality adversarial examples suitable for model evaluation
• Accurate and consistent safety labels and severity ratings
• Clear, defensible rationales aligned with policy guidelines
• Reliable QA feedback improving dataset quality

Qualifications:
• Experience in AI safety, LLM evaluation, red teaming, or trust & safety
• Strong prompt engineering and analytical reasoning skills
• Familiarity with safety taxonomies and policy-based classification
• Ability to work independently and maintain high-quality output
• Prior experience with annotation or evaluation platforms preferred
