Trust & Safety Evaluator with English from New Zealand
TELUS Digital
✨ Why this fits: You evaluate AI safety by creating and testing challenging queries. The role requires creativity, cultural knowledge, and the ability to identify risks.
Description
The Trust & Safety Evaluator conducts adversarial testing and safety evaluation of generative AI features. The main tasks are crafting queries, evaluating the safety of generated content, and providing critical feedback. This role requires creative thinking about potential misuse, deep cultural and linguistic knowledge, and the ability to identify subtle safety risks.
**KEY RESPONSIBILITIES**
- Write, review and evaluate diverse and challenging queries designed to test the system's limits and expose problematic outputs. Queries will target specific risk topics, including explicit and/or offensive content.
- Design and execute sequences of queries simulating realistic, unfolding conversations.
- Craft attack scenarios using techniques like crescendo attacks and context manipulation to test the system.
- Age-appropriate safety evaluation: apply age-appropriateness considerations to guide adversarial query crafting and safety evaluation.
- Assign risk ratings to AI-generated content based on safety guidelines.
- Detect and articulate potential biases in system-generated content.
- Identify subtle unsafe elements, inconsistencies, or problematic implications.
- Evaluate specific cultural and language knowledge across different cultural contexts.
- Leverage familiarity with relevant domains, genres, cultural references, and industry context.
- Complete adversarial testing tasks within time constraints.
- Maintain high accuracy and attention to detail while working at pace.
**REQUIRED SKILLS**
- Language skills: English from New Zealand as a primary language is required
- Deep awareness of New Zealand culture: context, social norms, and values
- Understanding of cultural references and industry context
- Knowledge of how different cultural groups interpret content
- Familiarity with behavioral patterns of younger audiences
- Ability to evaluate content across different cultural contexts
- Query crafting and scenario design
- Techniques like crescendo attacks and context manipulation
- Risk rating assignment based on guidelines
- Bias detection in AI outputs
- Attention to detail and the ability to work quickly under time constraints
- Ability to produce creative content
- Understanding of slang and the language used by younger audiences
**PREFERRED BACKGROUND**
Creative Writing, Social Media Management, Community Management, Content Moderation, Digital Marketing, Communications, Journalism, Public Relations, Copywriting, or related fields.
**PREFERRED SKILLS**
- Trust & Safety Background: Prior experience in trust and safety, or similar roles.
- Understanding of how safety systems work at scale and common failure modes.
- Fluency in additional languages beyond English is an advantage for this role.
- Familiarity with machine learning concepts.
**CONTENT WARNING & OPTIONAL PARTICIPATION**
This role involves exposure to sensitive and potentially disturbing content, including examples of toxic and vulgar language, sexually explicit content, descriptions of violence and self-harm, child abuse, exploitation, and other harmful material. You will encounter this content as part of adversarial testing and safety evaluation.
**WE OFFER**
- **Occupational health care at Terveystalo private clinic**
- **Lunch benefit with ePassi**
- **After 4 months, eligibility for Pohjola Comprehensive Health Insurance**
- **Comprehensive leisure-time accident insurance by Pohjola**
- **Sport, culture and health benefits through ePassi, with a yearly tax-free maximum value of 400€**
- **Multicultural environment/diversity and inclusiveness**
- **Charity & well-being activities**
- **Company events**
- **Employee referral bonuses & Recognition programs**
**_Only shortlisted candidates will be contacted/interviewed._ Kindly attach an updated copy of your CV in English.**
Source: Työmarkkinatori