Elena Liu

Hi, I'm Elena,

Trust & Safety Program Manager
who builds the trust layer between humans and AI
with Evals + LLMOps + HITL

Trust & Safety · AI Operations

At Flip, I used ML classifiers to automate 65% of Tier-1 reports — freeing human experts for the complex, judgment-heavy cases.

This is just the beginning.

Bigger teams. Harder problems. End-to-end ownership.

Ready for what's next.

Work Experience

End-to-end ownership across discovery → specification → delivery → adoption, collaborating closely with AI Research, Engineering, and cross-functional stakeholders.

Trust & Safety Infrastructure
AI/LLM Workflow Automation
Product Lifecycle Management
Data Strategy & Governance
LLMOps & Eval Systems
Cross-functional Stakeholder Alignment

Moody's Analytics

SF Bay Area, CA

Product Operation Specialist

2024 – Present

Leading Trust & Safety infrastructure modernization: unified 3 legacy moderation tools into a single Internal Safety OS, partnered with AI Research to deploy an LLM-assisted moderation agent, and defined the full automated evaluation strategy.

  • Unified 3 legacy moderation tools into Internal Safety OS — 40% reduction in moderator onboarding time
  • Deployed LLM-assisted moderation agent: 22% accuracy improvement, 15% AHT reduction
  • Defined automated evaluation strategy via a Safety Index system that tracks precision, recall, and false positive rate (FPR)
  • Led requirements and prioritization for multi-agent Trust & Safety workflows
  • Extended policy enforcement coverage across advertiser integrity and seller trust domains
Case Study: Moderation OS
40%↓ onboarding time
22%↑ moderation accuracy
15%↓ avg handle time
3→1 tool consolidation

Flip

Los Angeles, CA

Operations Analyst

2022 – 2024

Built ML classification pipelines to automate Tier-1 content report processing and supported GenAI policy enforcement tooling.

  • Automated 65% of Tier-1 content reports with ML classifiers
  • Improved moderator decision speed by 12%
  • Supported GenAI content policy enforcement tool pilot launch
  • Managed 100+ backlog items using MoSCoW prioritization
65% Tier-1 automation
12%↑ decision speed

LeanData

Sunnyvale, CA

Data Governance Analyst

2020 – 2021

Standardized JSON taxonomy and built Python classification automation, expanding automation coverage by 35% and reducing manual reconciliation errors by 30%.

  • Standardized JSON taxonomy — unified data definitions across teams
  • Built Python (Scikit-learn) auto-classification system: 35% automation coverage expansion
  • Reduced manual reconciliation errors by 30%
  • Established governance framework for data quality monitoring
35%↑ automation coverage
30%↓ manual errors

Modis (Adecco / Akkodis)

Remote, CA

Project Manager

2018 – 2020

Managed 4 concurrent IT infrastructure projects with 98% on-time delivery and built real-time Power BI executive dashboards.

  • 4 concurrent IT infrastructure projects — 98% on-time delivery
  • Built real-time executive dashboard in Power BI
  • Cross-functional agile delivery via Jira / Confluence

Projects

yingshill

Moderation OS · Moody's Analytics

Case Study · Production

Turned an LLM moderation assistant from 'it exists' to 'it works.' Unified 3 legacy tools and defined a Safety Index evaluation system: 22% accuracy improvement, 15% AHT reduction, 40% faster onboarding.

LLM Moderation · Python · Trust & Safety · Safety Index · Eval Framework
Read case study →
22%↑ moderation accuracy
15%↓ avg handle time
40%↓ onboarding time
3→1 tools consolidated

ai-content-moderation-edge-case-eval-framework

Open Source

Edge-case evaluation framework for LLM content moderation pipelines — standardized precision/recall tracking, failure pattern taxonomy, reproducible eval suites.

Python · LLM Evals · Trust & Safety · Precision/Recall
View on GitHub
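
To illustrate the kind of precision/recall tracking this project describes, here is a minimal sketch. It is not code from the repository: the names `ConfusionCounts`, `count_outcomes`, and `safety_metrics` are hypothetical, labels are assumed to be binary (1 = violating, 0 = benign), and false positive rate is included only because it is listed among the Safety Index metrics above.

```python
from dataclasses import dataclass
from typing import Dict, Sequence


@dataclass
class ConfusionCounts:
    """Hypothetical container for binary moderation outcomes."""
    tp: int = 0  # violating content correctly flagged
    fp: int = 0  # benign content incorrectly flagged
    tn: int = 0  # benign content correctly passed
    fn: int = 0  # violating content missed


def count_outcomes(labels: Sequence[int], predictions: Sequence[int]) -> ConfusionCounts:
    """Tally outcomes, assuming 1 = violating and 0 = benign in both sequences."""
    if len(labels) != len(predictions):
        raise ValueError("labels and predictions must have the same length")
    counts = ConfusionCounts()
    for label, pred in zip(labels, predictions):
        if label == 1 and pred == 1:
            counts.tp += 1
        elif label == 0 and pred == 1:
            counts.fp += 1
        elif label == 0 and pred == 0:
            counts.tn += 1
        else:
            counts.fn += 1
    return counts


def _ratio(numerator: int, denominator: int) -> float:
    """Return 0.0 instead of dividing by zero on an empty slice."""
    return numerator / denominator if denominator else 0.0


def safety_metrics(c: ConfusionCounts) -> Dict[str, float]:
    """Compute precision, recall, and false positive rate from the counts."""
    return {
        "precision": _ratio(c.tp, c.tp + c.fp),
        "recall": _ratio(c.tp, c.tp + c.fn),
        "fpr": _ratio(c.fp, c.fp + c.tn),
    }


if __name__ == "__main__":
    counts = count_outcomes([1, 1, 1, 0, 0, 0, 0, 1], [1, 1, 0, 1, 0, 0, 0, 1])
    print(safety_metrics(counts))
```

Keeping the raw counts separate from the derived ratios is one possible design choice: counts from several edge-case suites can be summed before the metrics are computed, which avoids averaging ratios across slices of different sizes.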

ai-governance-red-team-control-pipeline

Open Source

Red-team control pipeline for AI governance and policy enforcement — attack pattern classification, prompt injection defense, output consistency test suites.

Python · AI Governance · Red Team · Policy Enforcement
View on GitHub

incident-drill-kit

Open Source

Structured incident drill toolkit for T&S ops teams — customizable tabletop exercise scenarios, role assignment templates, and post-incident review frameworks.

Trust & Safety · Incident Management · Ops Tooling
View on GitHub

Education

In Progress · Georgia Institute of Technology

M.S. Computer Science (OMSCS)

Focus on machine learning and computing systems.

2022 · Pepperdine University

M.S. Policy Analytics

Policy analysis and data-driven decision making.

2021 · Fullstack Academy

Software Engineering

Full-stack web development.

2018 · Sun Yat-sen University

B.E. Communication Engineering

Communication engineering.

Skills

Languages

Mandarin Chinese · Native
English · Professional proficiency

Soft Skills

Communication · Cross-functional Leadership · Systems Thinking · E2E Ownership · Bias for Action · Influence w/o Authority · Dealing with Ambiguity

Tech Stack

AI / LLM
Claude (Anthropic) · OpenAI · Meta AI (LLaMA) · Databricks MLflow · Guardrail AI
Safety & Compliance
GDPR / CCPA · PII Security · Apache Ranger · Unity Catalog
Data & Analytics
SQL · Python (Pandas / Scikit-learn) · Scala · Tableau · Power BI
Data Engineering
dbt · Airflow · Snowflake · Databricks · AWS Glue · S3
Databases
PostgreSQL · MySQL · MongoDB · Cassandra
LLMOps
Eval Pipelines · Observability · Langfuse · Streamlit
Platform & Tools
Jira · Confluence · GCP · Terraform · Shell Scripting · Git · Vercel

Let's talk

Open to roles in Trust & Safety, AI Program Management, and AI Governance. Based in the SF Bay Area — no sponsorship needed.

© 2026 Elena Liu | Privacy