Rahul Sharma


AI Systems Builder

I build autonomous agents, media generation pipelines, and interactive video systems — from architecture to production deployment.

Amazon AGI · USC PhD · IIT Kanpur
app.4klabs.ai/video-workspace

AI-Generated Interactive Video

Q4 Revenue Analysis

$2.4M Revenue (+18%)
847 New Accounts (+24%)
"Revenue grew 18% quarter over quarter..."
2:15 / 3:42
58 Tools in Production, MCP Server
10+ Published Papers, Top-Tier Venues
2 Foundation Models Launched
6 Cloud Services in Production
01 Audio-Visual Drift Correction
02 Shadow DOM Scene Isolation
03 AI Animation Choreography
04 Context-Aware TTS Shaping
05 Live DOM Data Collection
06 Autonomous Agent Pipeline
New Format · 6 Inventions · Solo-Built

A Video Format That Didn't Exist

I built a new kind of video: narrated by AI, animated with choreographed timing, and fully interactive in the browser. Not sealed pixels. Live HTML where viewers click, drag, and answer. It required solving 6 problems with no prior art: audio sync held under 45 ms, DOM isolation across slides, AI-directed animation choreography, presentation-aware voice shaping, in-video data collection at 2.7x the completion rate of surveys, and a 58-tool agent pipeline that produces finished video from a text prompt.

TypeScript · GSAP · Shadow DOM · Web Audio · AWS · Claude API · MCP Protocol
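
To make the 45 ms audio-sync budget concrete, here is a minimal sketch of one way a drift check could work. It is not the platform's actual implementation: the function and type names are hypothetical, and it assumes the audio clock is treated as the source of truth, with the animation timeline snapped back onto it whenever the gap exceeds the budget.

```typescript
// Sketch only: every name here is hypothetical. The 45 ms budget is the one
// figure taken from the text above.
const DRIFT_BUDGET_MS = 45;

interface SyncDecision {
  driftMs: number;       // animation time minus audio time, in milliseconds
  shouldResync: boolean; // true when |drift| exceeds the budget
  targetSeconds: number; // where the animation timeline should sit after the check
}

// Compare the two clocks and decide whether the animation must be re-seeked.
// Assumption: audio is authoritative, so a resync always targets the audio time.
function checkDrift(
  audioSeconds: number,
  animationSeconds: number,
  budgetMs: number = DRIFT_BUDGET_MS
): SyncDecision {
  const driftMs = (animationSeconds - audioSeconds) * 1000;
  const shouldResync = Math.abs(driftMs) > budgetMs;
  return {
    driftMs,
    shouldResync,
    targetSeconds: shouldResync ? audioSeconds : animationSeconds,
  };
}

// Roughly 20 ms of drift stays within budget; roughly 100 ms triggers a resync.
const withinBudget = checkDrift(10.0, 10.02);
const overBudget = checkDrift(10.0, 10.1);
console.log(withinBudget.shouldResync, overBudget.shouldResync); // logs: false true
```

In a browser, the audio time could plausibly come from `AudioContext.currentTime` (minus the playback start offset) and the animation time from a GSAP timeline's `time()`, with the check run once per animation frame. That wiring is an assumption, not something the page states.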
Foundation Models · Amazon AGI

Nova Canvas & Nova Reels

Core science team. Developed evaluation frameworks that determined production readiness for Amazon's image and video generation models. Launched December 2024.

Research · USC & Geena Davis Institute

Video Understanding & Media Analysis

10+ papers, h-index 9. Built systems to quantify gender representation in Hollywood at scale.

Google Scholar →
ivy-agent
$ create-video "Q4 Revenue Analysis"
Project created — id: f7a2b1c
Template: cinematic
Outline: 8 slides planned
Building slides in parallel…
  slide-001 ████████████
  slide-002 ████████████
  slide-003 ████████████
  slide-004 ██████████
Narration generated (4 voices)
Manifest built — 3:42 total
https://app.4klabs.ai/v/f7a2b1c
Infrastructure · Agentic AI

Multi-Agent Orchestration Platform

58 MCP tools expose the full pipeline as an API. Any AI agent can go from a text prompt to a playable narrated interactive video in a single autonomous session. 6 services on AWS ECS Fargate with service discovery, SQS queues, and PostgreSQL.

MCP Protocol · AWS ECS · RDS · ElastiCache · S3 · CloudFront · SQS
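
The agent session shown above (outline, parallel slide builds, manifest with a total runtime) can be sketched roughly as follows. This is an illustrative outline under stated assumptions, not the platform's real tool API: `Slide`, `Manifest`, `buildManifest`, and the `buildSlide` callback are hypothetical names, and the stubbed slide duration is made up so the total matches the 3:42 in the transcript.

```typescript
// Sketch only: all types and functions below are hypothetical, not real MCP tools.
interface Slide {
  id: string;
  durationSeconds: number;
}

interface Manifest {
  slides: Slide[];
  totalSeconds: number;
  totalLabel: string; // "m:ss" label, e.g. "3:42"
}

// "slide-001"-style ids, as in the transcript; assumes fewer than 1000 slides.
function slideId(index: number): string {
  return "slide-" + ("00" + (index + 1)).slice(-3);
}

// Format a duration in seconds as m:ss.
function formatDuration(totalSeconds: number): string {
  const whole = Math.round(totalSeconds);
  const seconds = whole % 60;
  return Math.floor(whole / 60) + ":" + (seconds < 10 ? "0" : "") + seconds;
}

// Build every slide concurrently, then assemble a manifest with the total runtime.
async function buildManifest(
  slideCount: number,
  buildSlide: (id: string) => Promise<Slide>
): Promise<Manifest> {
  const ids = Array.from({ length: slideCount }, (_, i) => slideId(i));
  const slides = await Promise.all(ids.map((id) => buildSlide(id)));
  const totalSeconds = slides.reduce((sum, s) => sum + s.durationSeconds, 0);
  return { slides, totalSeconds, totalLabel: formatDuration(totalSeconds) };
}

// Stub builder: 8 slides at 27.75 s each (an invented figure) gives 222 s.
buildManifest(8, async (id) => ({ id, durationSeconds: 27.75 })).then((manifest) =>
  console.log(manifest.totalLabel) // logs: 3:42
);
```

`Promise.all` keeps the slide order stable regardless of which build finishes first, so the manifest can be assembled without re-sorting. A real pipeline would also need per-slide error handling, which this sketch omits.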

AI Agent Systems

Autonomous agents that handle complex workflows — research, decisions, multi-step execution. I've built a 58-tool MCP server that runs full video creation pipelines without human intervention.

Good fit if you have a multi-step workflow that needs AI automation.

Media & Content Generation

End-to-end pipelines for generating and delivering media at scale. Video, audio, interactive content, TTS. I evaluated the models that became Amazon Nova Canvas and Nova Reels.

Good fit if you need AI to produce content at scale.

AI Products, Full-Stack

You have a problem, I build the product. Model selection, pipeline architecture, cloud infrastructure, frontend. Concept to deployed product.

Good fit if you need an AI product built and don't have a technical team.

Interactive Narrated Experiences

Narrated, animated, interactive video on the INX platform. Live HTML inside what feels like video — calculators, forms, widgets, 3D viewers.

Good fit if you want interactive, personalized video content.

Technical Due Diligence

For investors evaluating AI startups. I assess architecture, model choices, scalability, and defensibility. I have built at Amazon scale and published peer-reviewed research, so I can tell you if the AI is real.

Good fit if you're a VC or investor evaluating an AI company's technical foundation.

2025 — Now

Builder & Consultant

Created INX — the interactive narrated video platform. Built the entire stack solo. Taking on select consulting projects in parallel.

2023 — 2025

Applied Scientist, Amazon AGI

Inception science team. Responsible AI for LLMs, then video/image generation evaluation. Core team that launched Nova Canvas & Nova Reels.

2017 — 2022

PhD, USC Viterbi

Video understanding research. 10+ papers. Geena Davis Institute — quantifying gender representation in media. Viterbi Fellowship.

2012 — 2017

BTech-MTech, IIT Kanpur

Electrical Engineering, dual degree. Gold Medal — best master's thesis across all departments. GATE Fellowship.

10+ peer-reviewed papers across video understanding, multimodal AI, and media analysis. Google Scholar →

ICIP '19 · arXiv

Active Speaker Detection in Movies

Cross-modal identity association for detecting who's speaking in video. Unsupervised audio-visual framework.

ICASSP '20

Vocal Tract Contour Detection in RT-MRI

Convolutional LSTM for tracking articulatory boundaries in real-time MRI video of speech production.

USC SAIL

Video Analysis for Autism Prescreening

Automated analysis of child-interlocutor dynamics from video for ASD behavioral characterization.

USC SAIL

Eyetracking for Cortical Visual Impairment

Saliency map generation from gaze data to characterize visual patterns in CVI subjects.

Peer-reviewed

Visual Behavior in Public Speaking

CNN + attention-LSTM predicting TED talk popularity from visual cues. End-to-end trainable with interpretable attention.

IIT Kanpur · Gold Medal Thesis

Crowd Flow Segmentation

Trajectory clustering for pixel-wise motion pattern segmentation in high-density crowd video.

Have something worth building?

I take on a limited number of projects. Tell me about yours and I'll get back to you within 48 hours.