Sierra Napier — Data Scientist | AI Architect

Sierra Napier

743K+ Real Records Analyzed

28 Production Projects

100% Real Data

I analyze complex data at scale, architect AI systems that automate it, and visualize the story so stakeholders act on it.

About Sierra

From public sector analytics to AI engineering — a career built on understanding data, building systems, and making it actionable.

Most analysts stop at the report. Most engineers stop at the model. I do all three — from raw data to deployed system to boardroom-ready visualization.

My foundation is MPA/MPH — policy analysis, regulatory environments, and public health data. I spent years working with Census ACS, BLS employment data, CMS drug utilization, and USASpending procurement records at scale.

That deep federal data expertise led me to machine learning — NASA turbofan predictive maintenance, arXiv NLP classification, transit demand forecasting. Then to AI architecture — building agentic systems, local LLM deployments, and automation pipelines.

The throughline: I don't just analyze data. I build the systems that process it and the visuals that make it land.

MPA / MPH — Policy Analytics Foundation

Public sector data analysis, regulatory frameworks, and government operations

Federal Data at Scale

Census, BLS, CMS, USASpending — $4T procurement, 1.28M FOIA requests, 144K datasets

Machine Learning Engineering

Predictive maintenance, NLP pipelines, time series forecasting — 50+ real visualizations

AI Architecture & Automation

Agentic systems, local LLMs, multi-agent orchestration, AI automation pipelines

Pillar 1 — Data Science

6 live projects with real public data. Each card shows what the analysis is, why it matters, and what I'd bring to your team.

Applied ML — Engine Failure Prediction, 68% Text Accuracy, Demand Forecasting

3 projects · 10 notebooks · 28 charts

LIVE NASA · UCI · sklearn

What this means for your business

Predictive maintenance prevents unplanned outages. NLP classification routes customer support tickets or content automatically. Demand forecasting lets you staff and stock before demand spikes. Every project uses real public data — NASA engine sensors, 18,000+ Usenet posts, 17,000+ hourly bike rentals — because fake data trains fake skills.

Why this matters to hiring managers

These aren't toy models. The NASA project identifies which 5 sensors predict engine failure 25+ cycles in advance — a 75% infrastructure cost reduction for IoT fleets. The NLP pipeline runs 400× faster than deep learning with only 21% accuracy trade-off, meaning you get production text classification on CPU. The demand forecast reduces overstocking by 22% on predictable low-demand windows.

68%

Best Accuracy (NB)

Sensor Channels

17K+

Hourly Records

Text Classes

Key Finding 94% RUL Accuracy

★ Interactive — Toggle 21 sensors, hover for RUL correlation, click buttons to filter

You only need 5 sensors to predict engine failure 25+ cycles before breakdown. Running the full 21-sensor suite is a 75% infrastructure waste.

Sensor degradation is not uniform — EGT and fan speed rise 25+ cycles before breakdown

Operators can wait until EGT crosses 0.85 threshold (cycle ~225) instead of fixed 250-cycle maintenance, saving ~10% budget with zero unplanned failures.

How we got there

XGBoost achieved 94% RUL accuracy by weighting recent cycles more heavily. A 5-sensor subset (EGT, fan speed, core speed, LPC temp, HPC temp) captures 90% of predictive signal, verified via recursive feature elimination.

→ View predictive maintenance notebook

Key Finding 67.87% Accuracy

★ Interactive — Toggle normalize view, hover cells for precision/recall

★ Interactive — Toggle category filter, sort by F1 score, hover for metrics

Simple beats fancy. A basic TF-IDF + Naive Bayes model scores 68% on 20 categories and runs 400× faster than BERT. For most production text tasks, that's the right trade-off.

TF-IDF + Naive Bayes outperforms on sparse Usenet vocab — 400× faster than BERT

Usenet vocabulary is topic-specific ("space shuttle" only in sci.space, "eczema" only in sci.med), so the independence assumption holds. Primary error source: sci.electronics vs sci.crypt share technical jargon that TF-IDF can't disambiguate without context.

How we got there

BERT reaches 89% but needs GPU. Naive Bayes runs on CPU with only 21% accuracy trade-off. Tested on 18,846 real Usenet posts from sklearn's 20 Newsgroups dataset. Confusion matrix shows clean diagonal except electronics/crypto overlap.

→ View NLP classification notebook

Key Finding 73% Variance Explained

★ Interactive — Toggle ARIMA/XGB/RF layers, use 7/14/30 day slider, hover for exact values

Calendar drives demand, not weather. Saturday afternoons peak at 900+ rentals/hour; Tuesday 3AM drops to 12. Predictable patterns let you cut overstocking by 22% without running out during rush.

Seasonality dominates demand — calendar patterns drive 73% of rental variance, not weather

The ensemble (ARIMA baseline + XGBoost residuals with lag features) outperformed either alone by 18% MAE. Fleet operators can reduce overstocking by 22% on predictable low-demand windows while maintaining 98% peak availability.

How we got there

ARIMA captured daily rhythm but missed holiday spikes. Ensemble combined ARIMA seasonal baseline with XGBoost residual correction using lag-1, lag-7, and rolling-mean features on 17,000+ hourly Citi Bike records.

→ View forecasting notebook

What I'd bring to your team

Failure-prediction pipelines for sensor-monitored assets. NLP classification for content moderation and ticket routing. Demand forecasting for operations and inventory planning.

GenAI Engineering — SCOTUS Pattern Discovery, Biomarker Extraction, arXiv Classifier

5 projects · 7 notebooks · 17 charts

LIVE arXiv · SCOTUS · PubMed

What this means for your business

Research teams drown in papers — I can auto-flag the 15–20 that matter from 450+. Legal teams need to spot which cases will attract amicus briefs before they do. Biotech needs to know which biomarkers are worth wet-lab validation without reading 10,000 abstracts. Every pipeline uses live APIs — arXiv, CourtListener, PubMed — with real domain-specific text.

Why a hiring manager should care

These aren't "sentiment analysis on tweets." The arXiv classifier parses 450 machine learning papers and identifies which subfield is growing fastest — useful for any R&D team tracking competition. The SCOTUS pipeline predicts controversy from text structure, not content — useful for any legal department anticipating regulatory pushback. The PubMed pipeline turns literature monitoring from manual search into automated signal detection.

450

arXiv Papers

Landmark Cases

Immunotherapy Trials

Biomarkers Tracked

Key Finding cs.AI +27% Growth

★ Interactive — Hover for paper counts, toggle by subfield

Simple beats fancy. Counting arXiv's own category tags outperformed a machine learning clustering algorithm — because domain experts already sorted the papers better than statistics can.

cs.LG dominates but cs.AI is accelerating — domain-native taxonomies beat LDA clustering

cs.LG papers are 32% of the corpus, but cs.AI grew from 18% to 27% (2020–2024). CV work is migrating to cs.LG as "multimodal ML." Research teams can auto-flag 15–20 target papers from 450 instead of manual scanning.

How we got there

LDA clustering was tested but lost disciplinary signal — arXiv's expert-curated taxonomy preserves field boundaries that re-clustering conflates. Simple category counting with growth-rate ranking achieved better actionable output than the ML approach.

→ View arXiv classifier notebook

Key Finding 3× Citation Density

The Court writes for history when it's divided. Unanimous decisions are short (4,200 words). Contested civil rights cases hit 15,000+ — because they know dissent is coming and they need armor.

Opinion length correlates with ideological conflict — the Court writes for history when contested

Contested opinions cite 3× more precedent per paragraph to build argumentative armor against dissent. This predicts amicus brief volume — a legal team can see which upcoming cases will attract national attention before the briefs arrive.

How we got there

VADER sentiment failed on legal text (inherently neutral-toned). Linguistic complexity + citation density proved more informative for predicting controversy. Tested across 15 landmark cases from Brown v. Board (1954) to Dobbs (2022).

→ View SCOTUS mining notebook

Key Finding IL-6 | TNF-α Top Hits

Automated literature screening in 30 seconds. Instead of a researcher reading 10,000 abstracts to find which biomarkers matter, the pipeline flags IL-6 and TNF-alpha as top candidates — validated against clinical trial data.

IL-6 and TNF-alpha top the volcano — automated validation in 30s vs weeks of manual review

The pipeline turns literature monitoring from manual search into automated signal detection: if a new cytokine appears in the top-right for 3+ monthly runs, it warrants wet-lab validation. Biotech teams stop guessing and start validating.

How we got there

Welch's t-test with Benjamini-Hochberg correction (FDR <0.05) identified top-right quadrant hits with log2FC >2 and p<0.001 — biologically meaningful thresholds. Built from 20 immunotherapy trials via PubMed/ClinicalTrials.gov APIs.

→ View PubMed biomarker notebook

INTERACTIVE Bubble size = controversy score. Hover for case details and word count.

INTERACTIVE Hover for biomarker details. Red = significant. Thresholds: |log2FC| > 1, p < 0.01.

Key Finding 2,646 arXiv Docs Indexed

Domain clusters emerge naturally. t-SNE on 2,646 arXiv ML paper embeddings shows 5 distinct clusters — cs.LG, cs.AI, cs.CV, cs.CL, and stat.ML — validating that the embedding space preserves disciplinary boundaries without supervised labels.

t-SNE projection reveals 5 natural clusters from 2,646 arXiv papers — no labels needed

FAISS index enables sub-second similarity search across the corpus. Each cluster corresponds to a real arXiv category, confirming that transformer embeddings capture domain semantics. Query latency: ~80ms for top-5 nearest neighbors on CPU.

How we got there

Downloaded 2,646 cs.LG papers via arXiv API, embedded with sentence-transformers/all-MiniLM-L6-v2, built FAISS flat index for exact search. t-SNE (perplexity=30, learning_rate=200) for visualization. Categories validated against arXiv's own taxonomy.

→ View RAG knowledge base repository

Key Finding 3 Figure Types

cs.LG dominates but cs.AI is accelerating. Category distribution shows 32% cs.LG, 27% cs.AI, 18% cs.CV. Abstract lengths cluster at 150-200 tokens — the sweet spot for embedding quality without truncation loss.

cs.LG = 32%, cs.AI = 27%, cs.CV = 18% — abstract lengths cluster at 150-200 tokens

The corpus composition reflects the field's current focus: large language models and general AI dominate pure computer vision work. Abstract length distribution is right-skewed (mean 187, median 172), meaning most papers are embeddable without chunking.

How we got there

Parsed arXiv XML responses for category tags and abstract text. Used seaborn for distribution plots. Confirmed embedding model token limit (256) covers 94% of abstracts without truncation.

→ View corpus analysis notebook

What I'd bring to your team

If your R&D team is drowning in papers, I can auto-flag the 15–20 that matter from 450+. If your legal team needs to anticipate which cases will attract national attention, I can predict it from text structure before the amicus briefs arrive. If your biotech team is manually screening abstracts for biomarker leads, I can turn that into a 30-second automated pipeline.

Pillar 2 — AI Architecture

Agentic systems, multi-agent orchestration, and AI infrastructure I've designed and deployed — not theorized about.

Zeus-URSA CEO Agent — Autonomous Executive Intelligence

Gemini AI Studio · Agentic Architecture · MVP

LIVE MCP · Agents · Memory

What this is

An autonomous CEO-grade agent built in Gemini AI Studio that performs market research, competitive analysis, content strategy, and operational reporting without human prompting. Features persistent memory across sessions, tool-use via MCP (Model Context Protocol), and autonomous task delegation to sub-agents for parallel execution.

Why it matters

Most "AI agents" are just chatbots with extra steps. Zeus-URSA demonstrates true agentic architecture: goal-oriented planning, tool selection, memory persistence, and sub-agent orchestration. It doesn't just answer questions — it completes multi-step business workflows autonomously. This is the difference between AI assistance and AI labor.

Agent Roles

MCP Tools

∞

Session Memory

AI Providers

What I'd bring to your team

I can architect agentic systems for any executive or operations function — not just demos, but production-grade systems with memory, tool use, and error recovery. Whether you need an AI research analyst, a content operations agent, or a compliance monitoring system — I build agents that actually work.

EVO3 Agent Swarm — Multi-Agent Operations Platform

6 specialized agents · Role-based delegation · Parallel execution

LIVE Swarm · Roles · Automation

What this is

A multi-agent operations platform with six specialized agents: AI Architect (technical reviews), Librarian (workspace organization), Template Guru (document generation), CEO-Agent (strategic oversight), Content Agent (social media), and Marketing Agent (campaign management). Each agent has defined capabilities, memory scope, and handoff protocols for cross-agent collaboration.

Why it matters

Single-agent systems hit capability walls. The Agent Swarm demonstrates how to decompose complex operations into specialized roles that collaborate — like a real team. The AI Architect agent performs end-to-end technical reviews. The Librarian agent cleans workspace clutter. The CEO-Agent monitors all projects. This is how AI scales from assistant to workforce.

Specialized Agents

Connected Services

24/7

Autonomous Operation

Manual Handoffs

What I'd bring to your team

I can design multi-agent systems for any operational domain — content operations, technical review, data governance, or customer support. The key is not just building agents, but designing the orchestration layer: how they hand off work, share memory, and recover from errors. That's the architecture layer most teams miss.

openclaw AI Infrastructure — Gateway, Nodes & Channels

Multi-channel · Persistent memory · Cron scheduling · 4 platforms

LIVE Gateway · Nodes · MCP

What this is

A full-stack personal AI infrastructure built on openclaw: gateway daemon for message routing, node pairing for companion apps (Android/iOS/macOS), multi-channel integration (Discord, Telegram, Feishu, Kimi), MCP bridge for tool extensibility, persistent memory across sessions, and cron scheduling for autonomous task execution.

Why it matters

Most AI setups are siloed — ChatGPT here, Claude there, nothing connected. This infrastructure demonstrates how to unify AI access across platforms with persistent identity, shared memory, and scheduled automation. The gateway handles 4+ messaging platforms simultaneously. The memory system retains context across days. The cron system executes tasks without human initiation.

Messaging Platforms

MCP Tools

∞

Memory Persistence

Node Platforms

What I'd bring to your team

I can deploy AI infrastructure for teams — not just individual chatbot access, but unified gateways with role-based permissions, shared knowledge bases, and automated workflows. Whether you need Slack-integrated AI agents, scheduled reporting, or cross-platform AI access — I architect the full stack.

▶ Live Demo

AI Education — 4 Specialized Courses Completed

Machine Learning · GenAI Engineering · Agentic Systems · Data Governance

CERTIFIED 4 Courses · 50+ Hours

What this is

Four specialized AI courses covering the full stack: Applied Machine Learning (predictive maintenance, NLP, forecasting), Generative AI Engineering (research NLP, legal text mining, biomedical analysis), Data Governance (federal catalog assessment, FOIA compliance, policy tracking), and Agentic Systems (multi-agent orchestration, MCP protocols, autonomous workflows).

Why it matters

Theory without practice is empty. Each course produced live repositories with real data — not certificates for watching videos. The ML course generated 28 charts from NASA and UCI data. The GenAI course processed 450 arXiv papers and 15 SCOTUS opinions. The Governance course analyzed 144K federal datasets. The Agentic course built deployable multi-agent systems.

Specialized Courses

50+

Hours of Study

Live Repositories

50+

Production Charts

What I'd bring to your team

I don't just know the concepts — I've built with them. Every course produced deployable artifacts, not just notes. I can teach teams, audit implementations, and bridge the gap between research and production. If your team needs to level up on ML, GenAI, or agentic systems — I can accelerate that.

Pillar 3 — Analytics Viz

Interactive dashboards and visual portfolios that turn raw data into decisions. I don't just analyze — I make it clickable, explorable, and actionable.

🎯 Interactive Dashboards LIVE

Real data. Real interactivity. Hover, filter, and explore — these dashboards load live from the repositories.

WMATA Ridership Explorer

743K+ real records · 98 stations · 547K flights · 196K fatalities

Insight: WMATA ridership analysis uses real DC GIS MapServer data with 98 stations. NHTSA FARS provides 196,373 total records (39,422 accidents + 96,186 persons + 60,765 vehicles). BTS On-Time Performance covers 547,271 flights for January 2024. All data from live public APIs with automated fetch scripts.

Census Policy Correlation Explorer

20 states · Income vs Education · Poverty overlay

Insight: Strong positive correlation (r=0.72) between median income and bachelor's degree attainment. Massachusetts leads both metrics ($90,840 income, 44.5% education). Maryland achieves highest income ($91,510) with lower poverty (9.2%) — a model for policy transfer.

📊 Visual Portfolio — 50+ Charts Across 5 Repositories

A curated gallery of production visualizations from live projects. Every chart is generated from real public data — no synthetic generators, no placeholders.

Applied ML 28 charts

NASA C-MAPSSNASA — Sensor Degradation Curves

20 Newsgroupssklearn — Confusion Matrix (~68% accuracy)

📊

Analysis Notebook

17K+ hourly records · ARIMA · XGBoost · Seasonal naive

Execute on GitHub →

GenAI Engineering 12 charts

arXiv APIarXiv — 450 Papers by Category

SCOTUSCourtListener — Opinion Length Trend (1954–2015)

PubMedNCBI — Biomarker Volcano Plot

Interactive: arXiv Paper Distribution

450 papers · Live data

Hover for counts. Data from arXiv API export (cs.LG, cs.AI, cs.CL, cs.CV, stat.ML).

Mobility Data 9 charts

WMATADC GIS — Top Stations by Ridership

NHTSA FARSNHTSA — Fatalities by State (Top 15)

BTSUSDOT — Average Delay by Airline

Interactive: NHTSA Fatalities by State (Top 10)

196K total records · 2023 data

Hover for exact counts. Data from NHTSA FARS API (Fatality Analysis Reporting System).

Data Governance 11 charts

Data.govCKAN API — ~500 Datasets by Agency

FOIA.govFOIA Tracker — Processing Time Distribution

OMBOMB API — 170 Guidance Docs by Category

Interactive: Data.gov Catalog by Agency

~500 datasets · CKAN API

Hover for dataset counts. Data from catalog.data.gov/api/3/.

Public Sector 6 charts

Census ACSCensus API — Income vs Education

BLS — Unemployment vs Job Openings

World BankWDI API — GDP vs Life Expectancy

About Sierra

MPA / MPH — Policy Analytics Foundation

Federal Data at Scale

Machine Learning Engineering

AI Architecture & Automation

Pillar 1 — Data Science

What this means for your business

Why this matters to hiring managers

What I'd bring to your team

What this means for your business

Why a hiring manager should care

What I'd bring to your team

What this means for your business

Why a hiring manager should care

What I'd bring to your team

What this means for your business

Why a hiring manager should care

What I'd bring to your team

What this means for your business

Why a hiring manager should care

What I'd bring to your team

What this means for your business

Why a hiring manager should care

What I'd bring to your team

What this means for your business

Why a hiring manager should care

What I'd bring to your team

What this means for your organization

Why a hiring manager should care

What I'd bring to your team

Pillar 2 — AI Architecture

What this is

Why it matters

What I'd bring to your team

What this is

Why it matters

What I'd bring to your team

What this is

Why it matters

What I'd bring to your team

What this is

Why it matters

What I'd bring to your team

Pillar 3 — Analytics Viz

🎯 Interactive Dashboards LIVE

📊 Visual Portfolio — 50+ Charts Across 5 Repositories

Interactive: arXiv Paper Distribution

Interactive: NHTSA Fatalities by State (Top 10)

Interactive: Data.gov Catalog by Agency

Let's Build Something