Ella - a real-time, voice-first Home Assistant

Low-latency streaming pipeline, LLM reasoning, and smart home integrations, all running completely locally

What is Ella?

Ella is a passion project of mine. It consolidates many individual projects related to voice assistants, home automation, and local language models.

Deep Learning, Homey API, S2S Pipeline, Local LLM

In a world oriented around machine learning and AI, data privacy is paramount. People have grown increasingly concerned about the security and confidentiality of the personal information they keep online. Ella is designed to operate completely locally, so your voice commands and data never leave your network.

Ella Showcase

Click into any module for more information

Project Overview + Demos
Outline, motivation and development timeline for Ella.

Designed with privacy, modularity, and transparency in mind. Ella runs locally as a secure and reliable alternative to cloud-based assistants like Alexa or Google Home.

Timeline, System Architecture, Prototype, Demo
Voice Pipeline & Synthetic Generation
An audio pipeline optimized for low latency and streaming.

To keep perceived latency under one second, playback starts as soon as the first audio chunk is ready. The rest of the audio is streamed continuously, with each newly generated chunk appended as it arrives.

PyTorch, Tacotron 2, VibeVoice Realtime, Custom KWS models
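
A minimal sketch of the chunked playback idea, assuming a generator-style synthesizer. `synthesize_chunks` and `stream_playback` are hypothetical names, and the "audio" here is placeholder bytes rather than real model output. The point is only that the consumer can start on the first chunk while later chunks are still being generated.

```python
import time
from typing import Dict, Iterator, List


def synthesize_chunks(text: str, chunk_delay_s: float = 0.01) -> Iterator[bytes]:
    """Hypothetical stand-in for a streaming TTS model: one chunk per word."""
    for word in text.split():
        time.sleep(chunk_delay_s)   # simulate per-chunk synthesis time
        yield word.encode("utf-8")  # placeholder bytes, not real audio


def stream_playback(chunks: Iterator[bytes]) -> Dict[str, object]:
    """Consume chunks as they arrive and record time-to-first-chunk."""
    start = time.perf_counter()
    first_chunk_s = None
    played: List[bytes] = []
    for chunk in chunks:
        if first_chunk_s is None:
            # Playback could begin here, before the rest is synthesized.
            first_chunk_s = time.perf_counter() - start
        played.append(chunk)  # a real player would write to the audio device
    total_s = time.perf_counter() - start
    return {
        "first_chunk_s": first_chunk_s,
        "total_s": total_s,
        "audio": b"".join(played),
    }


result = stream_playback(synthesize_chunks("hello from a streaming pipeline"))
print(result["first_chunk_s"] < result["total_s"])
```

Perceived latency tracks `first_chunk_s`, not `total_s`, which is why the time to the first chunk is the number worth tuning.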
Wake-word & DSCNNs
Depthwise separable CNNs for efficient keyword spotting

A real-time, on-device classifier trained on a custom dataset of wake-word examples. We explore various time-series techniques for low-latency, high-accuracy classification.

Keyword Spotting, Log-Mel Spectrograms, Audio Classification, STFT
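
To illustrate why depthwise separable convolutions suit on-device keyword spotting, the sketch below compares weight counts for a standard convolution and its depthwise separable counterpart. The layer size (3 x 3 kernel, 64 input and output channels) is an illustrative assumption, not Ella's actual configuration, and biases are omitted.

```python
def standard_conv_params(k: int, c_in: int, c_out: int) -> int:
    """Weights in a standard k x k convolution (bias omitted)."""
    return k * k * c_in * c_out


def depthwise_separable_params(k: int, c_in: int, c_out: int) -> int:
    """Weights in a k x k depthwise conv followed by a 1 x 1 pointwise conv."""
    depthwise = k * k * c_in  # one k x k filter per input channel
    pointwise = c_in * c_out  # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


# Illustrative layer size (an assumption, not taken from the project).
k, c_in, c_out = 3, 64, 64
std = standard_conv_params(k, c_in, c_out)
dsc = depthwise_separable_params(k, c_in, c_out)
print(std, dsc, round(std / dsc, 2))  # 36864 4672 7.89
```

For this layer size the separable version needs roughly an eighth of the weights, which is the kind of saving that matters under edge constraints.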
Hardware & Prototyping
An Echo Show-inspired design powered by a Raspberry Pi CM5

A clean and compact design with a 10.1-inch DSI capacitive touchscreen. The enclosure was 3D printed from a custom CAD prototype.

Linux, Compute Module 5, I2S Audio, CAD
Conversation Hub
A FastAPI backend that stores, streams, and orchestrates.

A lightweight UI for conversations and messages, with endpoints for streaming generation. It serves as the interface for Ella's outputs and user interaction.

FastAPI, SQLite, Uvicorn, Streaming
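
A rough sketch of the "store and stream" idea using only Python's built-in `sqlite3` module. The table layout and the helper names (`open_store`, `add_message`, `stream_reply`) are assumptions for illustration, not the hub's actual schema or API. A web framework endpoint is left out to keep the example self-contained.

```python
import sqlite3
from typing import Iterable, Iterator

# Hypothetical schema: one row per conversation, one row per message.
SCHEMA = """
CREATE TABLE conversations (
    id INTEGER PRIMARY KEY,
    title TEXT NOT NULL
);
CREATE TABLE messages (
    id INTEGER PRIMARY KEY,
    conversation_id INTEGER NOT NULL REFERENCES conversations(id),
    role TEXT NOT NULL,
    content TEXT NOT NULL
);
"""


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open a database and create the (assumed) tables."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def add_message(conn: sqlite3.Connection, conversation_id: int,
                role: str, content: str) -> int:
    """Persist one message and return its row id."""
    cur = conn.execute(
        "INSERT INTO messages (conversation_id, role, content) VALUES (?, ?, ?)",
        (conversation_id, role, content),
    )
    conn.commit()
    return cur.lastrowid


def stream_reply(conn: sqlite3.Connection, conversation_id: int,
                 tokens: Iterable[str]) -> Iterator[str]:
    """Yield tokens as they arrive, then store the assembled reply."""
    parts = []
    for token in tokens:
        parts.append(token)
        yield token  # the caller can forward this to the UI immediately
    add_message(conn, conversation_id, "assistant", "".join(parts))


conn = open_store()
conv_id = conn.execute(
    "INSERT INTO conversations (title) VALUES (?)", ("Demo",)
).lastrowid
add_message(conn, conv_id, "user", "Turn on the lights")
streamed = list(stream_reply(conn, conv_id, ["Turning ", "on ", "the ", "lights."]))
history = conn.execute(
    "SELECT role, content FROM messages WHERE conversation_id = ? ORDER BY id",
    (conv_id,),
).fetchall()
```

Persisting only after the stream finishes keeps the streaming path free of database writes. The trade-off is that an interrupted stream leaves no partial reply stored.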

Architecture (high level)

Flow

1. Audio In: Mic capture -> STFT -> Mel filterbanks
2. Wake Word: Lightweight on-device detection
3. STT: Streaming transcription, running on-device
4. LLM: Reasoning with local inference
5. TTS: Streaming audio synthesis, with sub-200 ms latency to the first audio chunk
6. Conversation Hub: UI for managing conversations and messages
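
The flow above can be sketched as a chain of stages. Everything here is a placeholder: audio frames are represented as plain strings, and `gate_on_wake_word`, `transcribe`, `reason`, `synthesize`, and `run_turn` are hypothetical stubs standing in for the real wake-word, STT, LLM, and TTS components. The sketch only shows how the stages hand data to one another.

```python
from typing import Callable, Iterable, Iterator, List


def gate_on_wake_word(frames: Iterable[str],
                      is_wake: Callable[[str], bool]) -> Iterator[str]:
    """Drop frames until the wake word fires, then pass the rest through."""
    awake = False
    for frame in frames:
        if awake:
            yield frame
        elif is_wake(frame):
            awake = True  # the wake frame itself is not forwarded


def transcribe(frames: Iterable[str]) -> str:
    """Placeholder STT: the 'frames' are already text, so just join them."""
    return " ".join(frames)


def reason(prompt: str) -> str:
    """Placeholder LLM: echoes the command back."""
    return f"You said: {prompt}"


def synthesize(reply: str) -> Iterator[str]:
    """Placeholder TTS: yields one 'audio chunk' per word."""
    for word in reply.split():
        yield word


def run_turn(frames: Iterable[str]) -> List[str]:
    """Run one wake -> STT -> LLM -> TTS turn and collect the output chunks."""
    command = transcribe(gate_on_wake_word(frames, lambda f: f == "ella"))
    return list(synthesize(reason(command)))


chunks = run_turn(["noise", "ella", "lights", "on"])
print(chunks)  # ['You', 'said:', 'lights', 'on']
```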

Skills

A concise set of strengths I use to build Ella end-to-end.

Core
Python, TypeScript, Git/Linux, FastAPI, SQLite
ML / Audio
PyTorch, Audio feature extraction, CNNs (efficient), Evaluation + datasets
Systems
Streaming pipelines, WebSockets, Latency tuning, Edge constraints

Other Projects

A few smaller things I’ve worked on over the years

Sentiment Trading Bot

Clean training/validation/test splits with stratification, and a simple baseline model workflow.

Python, Sentiment Analysis, Investments
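
For readers unfamiliar with stratification, here is a minimal pure-Python sketch of a stratified train/validation/test split. The function name, the 80/10/10 proportions, and the toy labels are assumptions for illustration and are not taken from the project. Libraries such as scikit-learn offer the same idea through the `stratify` argument of `train_test_split`.

```python
import random
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


def stratified_split(labels: Sequence[str], train_pct: int = 80,
                     val_pct: int = 10, seed: int = 0
                     ) -> Tuple[List[int], List[int], List[int]]:
    """Split indices so each class keeps the same share in every subset."""
    by_label: Dict[str, List[int]] = defaultdict(list)
    for idx, label in enumerate(labels):
        by_label[label].append(idx)

    rng = random.Random(seed)  # fixed seed for reproducible splits
    train: List[int] = []
    val: List[int] = []
    test: List[int] = []
    for indices in by_label.values():
        rng.shuffle(indices)
        n_train = len(indices) * train_pct // 100
        n_val = len(indices) * val_pct // 100
        train += indices[:n_train]
        val += indices[n_train:n_train + n_val]
        test += indices[n_train + n_val:]  # remainder goes to the test set
    return train, val, test


# Toy labels: 60 positive and 40 negative examples.
labels = ["pos"] * 60 + ["neg"] * 40
train, val, test = stratified_split(labels)
pos_share = sum(labels[i] == "pos" for i in train) / len(train)
print(len(train), len(val), len(test), pos_share)  # 80 10 10 0.6
```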
A Simple Neural Network

A sandbox repo for ML-from-first-principles notes, derivations, and small experiments.

Python, Math, NumPy
Personal

About

I am a Mathematics student with a strong focus on AI, machine learning, and software engineering. I have designed and built end-to-end AI pipelines, including a voice-based assistant with custom wake-word detection, real-time audio processing, and LLM reasoning. I am experienced in Python, neural networks, data handling, and modular system design, and I have a strong interest in building intelligent, real-world AI systems that solve meaningful problems.

Hobbies