diff --git a/docs/technical/hand-tracking.md b/docs/technical/hand-tracking.md new file mode 100644 index 0000000..45b3f96 --- /dev/null +++ b/docs/technical/hand-tracking.md @@ -0,0 +1,120 @@ +# Hand Tracking Technical Notes + +This document describes the hand tracking system that exists in the current codebase. + +## Purpose + +Hand tracking is a debug-stage interaction system used to test direct 3D object manipulation with a webcam. It allows a user to close their fist to grab a nearby object and move it in 3D space without relying on the center crosshair. + +The feature is currently scoped to the debug physics scene and is not yet a production gameplay input system. + +## Runtime Flow + +1. The browser captures webcam frames in `src/hooks/useRemoteHandTracking.ts`. +2. Frames are sent to the local Python backend over WebSocket. +3. The backend runs MediaPipe hand landmark detection. +4. The backend returns hand data including landmarks, handedness, score, center point, and `isFist`. +5. React stores the latest snapshot in the hand tracking provider. +6. `GrabbableObject` reads that snapshot each frame and uses fist state plus raycasting to grab objects. + +## Activation Rules + +Hand tracking is intentionally gated so the webcam and backend are not used all the time. + +The current activation conditions are: + +- debug mode is active with `?debug` +- scene mode is `physics` +- the player is near an interaction, is holding an object, or is hand-holding an object + +This prevents the previous issue where hand tracking depended on crosshair focus. The system now remains active while the player is inside an interaction zone, even if the camera is not aimed directly at the object. 
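The activation conditions above can be sketched as a single pure predicate. This is a minimal illustration, not the code in `src/hooks/useRemoteHandTracking.ts`: the `ActivationState` shape and the `shouldEnableHandTracking` name are hypothetical, and the sketch assumes the `?debug` flag and scene mode have already been resolved into plain values.

```typescript
// Hypothetical shape: the field names are assumptions made for this sketch.
interface ActivationState {
  debugEnabled: boolean; // true when the app was opened with ?debug
  sceneMode: string; // the feature is scoped to "physics"
  nearInteraction: boolean; // player is inside an interaction zone
  holdingObject: boolean; // player holds an object (non-hand grab)
  handHoldingObject: boolean; // player holds an object via hand tracking
}

// Returns true only when all gates described in "Activation Rules" pass.
// Crosshair focus is deliberately not an input.
function shouldEnableHandTracking(state: ActivationState): boolean {
  const engaged =
    state.nearInteraction || state.holdingObject || state.handHoldingObject;
  return state.debugEnabled && state.sceneMode === "physics" && engaged;
}
```

Keeping the gate free of camera or raycast inputs mirrors the intent stated above: tracking stays on while the player is inside an interaction zone, regardless of where the camera points.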
+ +## Backend + +The backend lives in `backend/` and exposes: + +- `GET /health` for health checks +- `WS /ws` for frame input and hand tracking output + +The Python process uses MediaPipe and the local model file: + +```txt +backend/hand_landmarker.task +``` + +The backend sends normalized hand coordinates and landmarks. The frontend treats the values as screen-space inputs, then maps them into world space with the active Three.js camera. + +## Frontend Data Shape + +The shared types live in `src/types/handTracking.ts`. + +```ts +interface HandTrackingHand { + x: number; + y: number; + z: number; + landmarks: HandTrackingLandmark[]; + handedness: string; + isFist: boolean; + score: number; +} +``` + +`x` and `y` are normalized camera coordinates. `z` is a relative depth value from MediaPipe, not an absolute world-space distance. + +## Grab Targeting + +The hand grab logic lives in `src/components/three/GrabbableObject.tsx`. + +The object is moved toward the visual center of the hand. That center is computed from the bounding box of all landmarks: + +```txt +centerX = (minX + maxX) / 2 +centerY = (minY + maxY) / 2 +``` + +Starting a grab uses a slightly wider virtual hit zone. Instead of raycasting only from one point, the code casts several rays around the hand center: + +- center +- left +- right +- up +- down + +If any ray hits the object while the object is within `INTERACTION_RADIUS`, the object enters hand-holding mode. + +## Depth Handling + +Because MediaPipe `z` is relative, the frontend captures the starting depth when the grab begins: + +```txt +initialHandZ = hand.z +initialHoldDistance = hit.distance +``` + +While holding, the object distance from the camera is adjusted by the change in hand depth: + +```txt +holdDistance = initialHoldDistance + (hand.z - initialHandZ) * sensitivity +``` + +The final hold distance is clamped between the configured grab minimum and maximum distances to avoid unstable movement. 
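The depth formulas above can be combined into a small sketch. The names `GrabDepthState`, `DepthConfig`, `beginGrab`, and `computeHoldDistance` are hypothetical and do not come from `GrabbableObject.tsx`; the sketch only assumes that the sensitivity and the minimum and maximum grab distances are available as plain numbers.

```typescript
// Values captured once, at the moment the grab begins.
interface GrabDepthState {
  initialHandZ: number; // relative MediaPipe z at grab start
  initialHoldDistance: number; // raycast hit distance at grab start
}

// Hypothetical config shape; the real values come from the debug controls.
interface DepthConfig {
  sensitivity: number; // scales relative hand depth into world units
  minDistance: number; // configured grab minimum distance
  maxDistance: number; // configured grab maximum distance
}

function clamp(value: number, min: number, max: number): number {
  return Math.min(Math.max(value, min), max);
}

// Records the reference depth when the object enters hand-holding mode.
function beginGrab(handZ: number, hitDistance: number): GrabDepthState {
  return { initialHandZ: handZ, initialHoldDistance: hitDistance };
}

// holdDistance = initialHoldDistance + (hand.z - initialHandZ) * sensitivity,
// then clamped to the configured range.
function computeHoldDistance(
  state: GrabDepthState,
  handZ: number,
  config: DepthConfig,
): number {
  const raw =
    state.initialHoldDistance +
    (handZ - state.initialHandZ) * config.sensitivity;
  return clamp(raw, config.minDistance, config.maxDistance);
}
```

Because only the difference from `initialHandZ` is used, the absolute value of `hand.z` never matters, which fits a depth signal that is relative rather than a world-space distance.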
+ +## UI And Debug + +The current debug UI includes: + +- `HandTrackingOverlay` for status, usage, server state, hand count, and fist state +- `HandTrackingVisualizer` for the SVG landmark wireframe +- `r3f-perf` for render performance +- `lil-gui` for scene, camera, lighting, interaction, and grab controls + +The hand tracking overlay is an HTML overlay outside the canvas. The hand wireframe is also HTML/SVG, not a 3D hand model. + +## Known Limitations + +- The feature is debug-only and currently focused on the physics test scene. +- MediaPipe depth is relative and can be noisy. +- The virtual hit zone is an approximation based on multiple raycasts, not a real 3D collider. +- There is no smoothing layer for hand position or depth yet. +- The hand visualization is a temporary SVG wireframe. diff --git a/docs/user/main-feature.md b/docs/user/main-feature.md new file mode 100644 index 0000000..06fb77e --- /dev/null +++ b/docs/user/main-feature.md @@ -0,0 +1,87 @@ +# Main Feature + +This document explains the main interactive feature currently being prototyped in La-Fabrik: grabbing and moving 3D objects with hand tracking. + +## What It Does + +In debug mode, the player can use their webcam to control object grabbing in the physics scene. + +The intended user flow is: + +1. Open the app with `?debug`. +2. Switch the scene to `Physics` in the debug panel. +3. Move close to a grabbable object. +4. Show a hand to the camera. +5. Close the hand into a fist near the object. +6. Move the hand to move the object. +7. Open the hand to release the object. + +## Why It Matters + +This prototype tests whether La-Fabrik interactions can feel more physical and embodied than a classic mouse or keyboard interaction. + +For the final experience, this can support low-tech repair gestures, object manipulation, and more expressive interaction sequences. + +## Current Behavior + +The feature works with one or more detected hands. 
A hand is considered active for grabbing when the backend detects a closed fist. + +When the fist starts close enough to a grabbable object, the object attaches to the hand target. The object then follows the hand center in screen space and also reacts to relative hand depth. + +Moving the hand left, right, up, or down moves the object in that direction. Moving the hand closer or farther from the camera changes the object's distance from the camera. + +## Debug Requirements + +Hand tracking currently requires: + +- Chrome or another browser that allows `getUserMedia()` reliably +- the local Python backend running +- the local MediaPipe model file available in `backend/hand_landmarker.task` +- the app opened with `?debug` +- the debug scene set to `Physics` + +Backend command: + +```bash +source backend/.venv/bin/activate +python -m backend.main +``` + +Frontend command: + +```bash +npm run dev +``` + +Debug URL: + +```txt +http://localhost:5173/?debug +``` + +## On-Screen Feedback + +The debug build shows several helpers: + +- a hand tracking status panel +- a hand landmark wireframe +- the `lil-gui` debug panel +- the `r3f-perf` performance panel +- optional interaction spheres + +The wireframe turns yellow when the detected hand is a fist. + +## Current Limitations + +- The feature is still a prototype. +- It is enabled only in the debug physics scene. +- The SVG hand wireframe is temporary. +- Depth movement depends on relative webcam tracking and may need tuning. +- The system has not yet been integrated into final mission gameplay. + +## Expected Next Improvements + +- Smooth the hand position and depth signal. +- Add a better 3D hand representation. +- Add calibration controls for grab radius and depth sensitivity. +- Connect hand gestures to final repair or transformation tasks. 
diff --git a/src/data/docs/docsSections.ts b/src/data/docs/docsSections.ts index 2fd4f9e..cbb8d90 100644 --- a/src/data/docs/docsSections.ts +++ b/src/data/docs/docsSections.ts @@ -38,6 +38,12 @@ export const docGroups: DocGroup[] = [ subtitle: "Implementation details", meta: "04", }, + { + path: "/docs/hand-tracking", + title: "Hand Tracking Technical Notes", + subtitle: "Webcam interaction pipeline", + meta: "05", + }, ], }, { @@ -47,13 +53,19 @@ export const docGroups: DocGroup[] = [ path: "/docs/features", title: "Features", subtitle: "Implemented scope", - meta: "05", + meta: "06", + }, + { + path: "/docs/main-feature", + title: "Main Feature", + subtitle: "Hand grab prototype", + meta: "07", }, { path: "/docs/editor", title: "Editor User Guide", subtitle: "Editing workflow", - meta: "06", + meta: "08", }, ], }, diff --git a/src/pages/docs/editor/page.tsx b/src/pages/docs/editor/page.tsx index 53c7236..c479c89 100644 --- a/src/pages/docs/editor/page.tsx +++ b/src/pages/docs/editor/page.tsx @@ -7,7 +7,7 @@ export function DocsEditorPage(): React.JSX.Element { ); diff --git a/src/pages/docs/features/page.tsx b/src/pages/docs/features/page.tsx index 4b41580..8010b5f 100644 --- a/src/pages/docs/features/page.tsx +++ b/src/pages/docs/features/page.tsx @@ -7,7 +7,7 @@ export function DocsFeaturesPage(): React.JSX.Element { ); diff --git a/src/pages/docs/hand-tracking/page.tsx b/src/pages/docs/hand-tracking/page.tsx new file mode 100644 index 0000000..26e7176 --- /dev/null +++ b/src/pages/docs/hand-tracking/page.tsx @@ -0,0 +1,13 @@ +import handTracking from "../../../../docs/technical/hand-tracking.md?raw"; +import { DocsDocument } from "@/components/docs/DocsDocument"; + +export function DocsHandTrackingPage(): React.JSX.Element { + return ( + + ); +} diff --git a/src/pages/docs/main-feature/page.tsx b/src/pages/docs/main-feature/page.tsx new file mode 100644 index 0000000..db324be --- /dev/null +++ b/src/pages/docs/main-feature/page.tsx @@ -0,0 +1,13 @@ +import 
mainFeature from "../../../../docs/user/main-feature.md?raw"; +import { DocsDocument } from "@/components/docs/DocsDocument"; + +export function DocsMainFeaturePage(): React.JSX.Element { + return ( + + ); +} diff --git a/src/router.tsx b/src/router.tsx index 4bb68a2..70730fc 100644 --- a/src/router.tsx +++ b/src/router.tsx @@ -10,7 +10,9 @@ import { DocsArchitectureRoute, DocsEditorRoute, DocsFeaturesRoute, + DocsHandTrackingRoute, DocsLayoutRoute, + DocsMainFeatureRoute, DocsReadmeRoute, DocsTargetArchitectureRoute, DocsTechnicalEditorRoute, @@ -43,7 +45,9 @@ const docsChildRoutes = [ { path: "architecture", component: DocsArchitectureRoute }, { path: "target-architecture", component: DocsTargetArchitectureRoute }, { path: "technical-editor", component: DocsTechnicalEditorRoute }, + { path: "hand-tracking", component: DocsHandTrackingRoute }, { path: "features", component: DocsFeaturesRoute }, + { path: "main-feature", component: DocsMainFeatureRoute }, { path: "editor", component: DocsEditorRoute }, ].map(({ path, component }) => createRoute({ diff --git a/src/routes/docs/DocsRouteComponents.tsx b/src/routes/docs/DocsRouteComponents.tsx index 01ef28c..6270213 100644 --- a/src/routes/docs/DocsRouteComponents.tsx +++ b/src/routes/docs/DocsRouteComponents.tsx @@ -30,12 +30,24 @@ const LazyDocsTechnicalEditorPage = lazy(() => })), ); +const LazyDocsHandTrackingPage = lazy(() => + import("@/pages/docs/hand-tracking/page").then((module) => ({ + default: module.DocsHandTrackingPage, + })), +); + const LazyDocsFeaturesPage = lazy(() => import("@/pages/docs/features/page").then((module) => ({ default: module.DocsFeaturesPage, })), ); +const LazyDocsMainFeaturePage = lazy(() => + import("@/pages/docs/main-feature/page").then((module) => ({ + default: module.DocsMainFeaturePage, + })), +); + const LazyDocsEditorPage = lazy(() => import("@/pages/docs/editor/page").then((module) => ({ default: module.DocsEditorPage, @@ -82,6 +94,14 @@ export function 
DocsTechnicalEditorRoute(): React.JSX.Element { ); } +export function DocsHandTrackingRoute(): React.JSX.Element { + return ( + + + + ); +} + export function DocsFeaturesRoute(): React.JSX.Element { return ( @@ -90,6 +110,14 @@ export function DocsFeaturesRoute(): React.JSX.Element { ); } +export function DocsMainFeatureRoute(): React.JSX.Element { + return ( + + + + ); +} + export function DocsEditorRoute(): React.JSX.Element { return (