The computer vision that turns a phone into a coach that sees.
For a US sports-tech startup, we're building the machine-learning vision layer of their consumer training app — models that read live phone video during real practice, judge the athlete's form and the outcome of each rep in real time, and feed the coaching and competition the whole product is built around.
Most apps tell you what happened. The hard part is explaining why.
Plenty of training apps can tell you the result of a rep. The product we're working on goes further: it coaches. To coach, the app has to read how the movement actually looked (the athlete's form and mechanics), not just whether the rep worked. That is a genuinely hard ML problem, and it's the one the whole product sits on.
Harder still: this has to work on a phone propped on a tripod during real practice, not a lab rig with controlled lighting and fixed cameras. Different angles, different light, different environments, and feedback that has to land between reps, not minutes later. Reading video that well, that fast, on consumer hardware is the layer that didn't exist yet.
A vision layer that reads mechanics and outcomes.
Not a demo that works in one corner. A real-time computer-vision system — models plus the pipeline around them — built to run on a phone in the field and feed everything the product does.
Reading the movement
We're building the models that track the athlete's body and the movement through each rep — capturing how the form looked, frame by frame, so the app can judge mechanics and not just count results.
What happened, and how
The vision layer detects both sides of every rep: the outcome — what the result was — and the mechanics — how the movement was executed. Reading both is what lets the product explain why a rep went the way it did.
Built for the phone, in the field
The models run on live video from a phone on a tripod — varied lighting, angles, and environments — and return their read fast enough that feedback lands between reps, inside a real training session, not after it.
The layer everything sits on
Instruction, AI coaching feedback, and the gamified competition layer all depend on what the camera understands. We're building the vision layer as the foundation for those experiences: get it right and the rest of the product becomes possible.
Embedded with the founders, building to be owned.
Define what to see
We worked with the founding team to pin down what the vision layer actually has to read in each rep (the mechanics that matter and the outcomes that count) and what "good enough" means in the field.
Build the models
We're building and training the vision models against real practice footage — the messy lighting, angles, and environments a phone on a tripod actually sees — not clean lab conditions.
Make it real-time
We tune the models and the pipeline around them so the read is fast and reliable enough to feed coaching feedback between reps, live, on consumer hardware.
Build to transfer
The models and pipeline are built to be owned and iterated by the client's team — so they can keep improving the vision layer alongside their own product roadmap.
A camera that understands the whole session.
As the engagement continues, the vision layer is becoming the thing the product can lean on: a phone that doesn't just record a session but understands it — reading the athlete's form and the outcome of each rep well enough that the app can explain why, not just report what. That read is what turns instruction, feedback, and competition from features into a coach.
We're embedded with the founding team and building alongside their roadmap, and the models and pipeline are being built to be theirs: the client's team can own the vision layer and keep iterating on it as the product grows. We build the layer; they're set up to run it.
If your product depends on reading video well, let's talk.
Book a 30-minute call. We'll map what your app needs to see, what it would take to build real-time computer vision on consumer hardware, and how to set your team up to own it.