Compare commits
44 Commits
| Author | SHA1 | Date |
|---|---|---|
| | 7f37f9a747 | |
| | 386abf06b6 | |
| | a73f9fb951 | |
| | d29b01e398 | |
| | 6edc5f7972 | |
| | ae35eb1dfb | |
| | 4de86f4e58 | |
| | 5b123f9704 | |
| | d1bf438465 | |
| | d2ce990165 | |
| | 7d2a257e84 | |
| | 58eb60292f | |
| | 73c6d7d50d | |
| | d9cacdad12 | |
| | ab88ab722f | |
| | a30a9a2d29 | |
| | d217c3376b | |
| | 864e075b42 | |
| | 3fe5b32de2 | |
| | 72cb9f5be6 | |
| | f708c4cd2e | |
| | 193fc8b4b6 | |
| | c61760dafd | |
| | 18dd2ae49d | |
| | def0609383 | |
| | 19a1d20a97 | |
| | 49ebacfbfb | |
| | 68253fae41 | |
| | 2dabb73d3d | |
| | 4f1b3b4ff3 | |
| | 627c8d4eb9 | |
| | 0b801888f0 | |
| | a180b89ee6 | |
| | 3e66e31117 | |
| | 2c194cdd2e | |
| | feaf502e44 | |
| | 489499f5d2 | |
| | 39b996eb31 | |
| | 134c0aecb7 | |
| | b144dc1c18 | |
| | 69c720b86b | |
| | 1b57a25e5f | |
| | f6db7d74e2 | |
| | a1798aecb3 | |
@@ -10,17 +10,23 @@ It is now also available to the production repair flow when a mission reaches a
|
|||||||
|
|
||||||
## Runtime Flow
|
## Runtime Flow
|
||||||
|
|
||||||
1. The browser captures webcam frames in `src/hooks/handTracking/useRemoteHandTracking.ts`.
|
The frontend can run hand tracking with two interchangeable sources, selected from the debug source controller:
|
||||||
2. Frames are sent to the local Python backend over WebSocket.
|
|
||||||
3. The backend runs MediaPipe hand landmark detection.
|
- **Browser JS** (`src/hooks/handTracking/useBrowserHandTracking.ts`) runs MediaPipe `hand_landmarker.task` directly in the browser via `@mediapipe/tasks-vision`. Default for debug.
|
||||||
4. The backend returns hand data including landmarks, handedness, score, center point, and `isFist`.
|
- **Backend** (`src/hooks/handTracking/useRemoteHandTracking.ts`) sends webcam frames as JPEG over WebSocket to a local Python process that runs MediaPipe and returns landmarks.
|
||||||
5. React stores the latest snapshot in the hand tracking provider.
|
|
||||||
6. `GrabbableObject` reads that snapshot each frame and uses fist state plus raycasting to grab objects.
|
Both sources funnel into the same `HandTrackingContext` so all consumers see one shared snapshot:
|
||||||
7. `HandTrackingGlove` reads the same snapshot and places the rigged `gant_l` and `gant_r` models on the detected hands when hand tracking is active.
|
|
||||||
|
1. The active source captures or receives landmarks.
|
||||||
|
2. The hook applies an EMA smoothing pass on the landmarks before publishing the snapshot.
|
||||||
|
3. `HandTrackingProvider` exposes that snapshot through React context.
|
||||||
|
4. `GrabbableObject` reads the snapshot each frame and uses the fist state plus raycasting to grab objects.
|
||||||
|
5. `HandTrackingGlove` reads the same snapshot and places a rigged glove on each detected hand.
|
||||||
|
6. `HandTrackingVisualizer` paints an SVG wireframe overlay on top of the canvas.
|
||||||
|
|
||||||
## Activation Rules
|
## Activation Rules
|
||||||
|
|
||||||
Hand tracking is intentionally gated so the webcam and backend are not used all the time.
|
Hand tracking is gated so the webcam and runtime are only spun up when actually needed.
|
||||||
|
|
||||||
The debug activation conditions are:
|
The debug activation conditions are:
|
||||||
|
|
||||||
@@ -28,16 +34,26 @@ The debug activation conditions are:
|
|||||||
- scene mode is `physics`
|
- scene mode is `physics`
|
||||||
- the player is near an interaction, is holding an object, or is hand-holding an object
|
- the player is near an interaction, is holding an object, or is hand-holding an object
|
||||||
|
|
||||||
This keeps hand tracking active while the player is inside an interaction zone, even if the camera is not aimed directly at the object.
|
|
||||||
|
|
||||||
The production repair activation conditions are:
|
The production repair activation conditions are:
|
||||||
|
|
||||||
- active `mainState` is `ebike`, `pylon`, or `farm`
|
- active `mainState` is `ebike`, `pylon`, or `farm`
|
||||||
- the active mission step is `inspected`, `repairing`, `reassembling`, or `done`
|
- the active mission step is `inspected`, `repairing`, `reassembling`, or `done`
|
||||||
|
|
||||||
This keeps the webcam off during `waiting`, `fragmented`, and `scanning`, then enables hand input only when the repair flow is expected to use hands.
|
This keeps the webcam off during `waiting`, `fragmented`, and `scanning`.
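The production gating above can be read as a pure predicate over the two inputs. The following is only a sketch: the function, type, and constant names are hypothetical, and only the string values come from the rules listed here.

```ts
// Hypothetical names; only the state and step values are taken from the rules above.
type MissionStep =
  | "waiting"
  | "fragmented"
  | "scanning"
  | "inspected"
  | "repairing"
  | "reassembling"
  | "done";

// `mainState` values that belong to a repair mission.
const REPAIR_MAIN_STATES: ReadonlySet<string> = new Set(["ebike", "pylon", "farm"]);

// Mission steps during which hand input is expected.
const HAND_TRACKING_STEPS: ReadonlySet<MissionStep> = new Set<MissionStep>([
  "inspected",
  "repairing",
  "reassembling",
  "done",
]);

// True only when both conditions hold, so the webcam stays off otherwise.
function isProductionHandTrackingActive(mainState: string, step: MissionStep): boolean {
  return REPAIR_MAIN_STATES.has(mainState) && HAND_TRACKING_STEPS.has(step);
}
```

Keeping the rule as a pure function would make it easy to unit-test separately from the webcam and the MediaPipe runtime.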
|
||||||
|
|
||||||
In the current production repair flow, `inspected` uses a two-fists hold gesture to advance to `fragmented`. The hold must last one second and is independent from local object interaction distance once the mission is in the correct state. Keyboard input for the same transition is handled separately by the repair case trigger, so pressing `E` requires the case to be focused through the shared interaction system.
|
### Linger
|
||||||
|
|
||||||
|
Once activation turns off (player walks back out of a trigger zone, or a mission step transitions away), the runtime stays alive for `HAND_TRACKING_LINGER_MS` (2000 ms) before being torn down. This gives MediaPipe enough time to finish initializing the webcam and load the model on a fresh entry — without the linger, a quick walk-through of a trigger zone never produces a detected hand.
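As an illustration of the linger rule, the decision to keep the runtime alive can be expressed as a function of the current activation flag and the last time activation was true. This is a sketch under assumptions: `shouldRuntimeStayAlive` and its parameters are hypothetical, and only the 2000 ms constant comes from the text.

```ts
// Value from the text; everything else in this block is a hypothetical sketch.
const HAND_TRACKING_LINGER_MS = 2000;

/**
 * Returns true while the runtime should stay alive.
 * `lastActiveAt` is the timestamp (ms) at which activation was last true,
 * or null if it has never been active.
 */
function shouldRuntimeStayAlive(
  isActive: boolean,
  lastActiveAt: number | null,
  now: number,
  lingerMs: number = HAND_TRACKING_LINGER_MS,
): boolean {
  if (isActive) return true;
  if (lastActiveAt === null) return false;
  // Still inside the linger window: keep the webcam and model loaded.
  return now - lastActiveAt < lingerMs;
}
```

With these assumptions, a player who leaves a trigger zone and re-enters it within two seconds never causes a teardown.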
|
||||||
|
|
||||||
|
## Provider Stability
|
||||||
|
|
||||||
|
`HandTrackingProvider` always renders the same JSX root (`HandTrackingRuntime`) and exposes `enabled` as a prop. Historically it returned two different element types (`<HandTrackingContext value=IDLE>` vs `<ActiveHandTrackingProvider>`), which was the root cause of WebGL context loss: every `enabled` toggle forced React to remount the entire subtree, including the `<Canvas>`, which destroyed the WebGL renderer.
|
||||||
|
|
||||||
|
The two source hooks are therefore permanently mounted and return early while their `enabled` flag is false. No webcam or MediaPipe resources are created while `enabled` is false.
|
||||||
|
|
||||||
|
## StrictMode Resilience
|
||||||
|
|
||||||
|
In development, `<StrictMode>` mounts → unmounts → remounts each effect to surface non-idempotent code. The two source hooks delay their actual `start()` call by `HAND_TRACKING_RUNTIME_START_DELAY_MS` (80 ms) and clear the timer on cleanup, so a StrictMode double-mount or a rapid `nearby` flicker never reaches `getUserMedia` twice.
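The delayed-start pattern can be sketched without React as a helper that schedules `start()` and returns a cleanup function, which is what an effect would return. The helper name and the injected `TimerApi` interface are hypothetical (the injection only makes the sketch testable); the 80 ms value comes from the text.

```ts
// Value from the text; the helper and the TimerApi interface are hypothetical.
const HAND_TRACKING_RUNTIME_START_DELAY_MS = 80;

// Minimal timer abstraction so the logic can run against fake timers.
interface TimerApi<H> {
  setTimeout(fn: () => void, ms: number): H;
  clearTimeout(handle: H): void;
}

/**
 * Schedules `start` after a short delay and returns a cleanup function.
 * If cleanup runs before the delay elapses (StrictMode unmount, rapid
 * flicker), `start` is never called for that mount.
 */
function scheduleDelayedStart<H>(
  start: () => void,
  timers: TimerApi<H>,
  delayMs: number = HAND_TRACKING_RUNTIME_START_DELAY_MS,
): () => void {
  const handle = timers.setTimeout(start, delayMs);
  return () => timers.clearTimeout(handle);
}
```

In a hook, the returned function would be handed back from the effect, so the mount, unmount, remount sequence cancels the first timer and only the second one fires.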
|
||||||
|
|
||||||
## Backend
|
## Backend
|
||||||
|
|
||||||
@@ -52,7 +68,27 @@ The Python process uses MediaPipe and the local model file:
|
|||||||
backend/hand_landmarker.task
|
backend/hand_landmarker.task
|
||||||
```
|
```
|
||||||
|
|
||||||
The backend sends normalized hand coordinates and landmarks. The frontend treats the values as screen-space inputs, then maps them into world space with the active Three.js camera.
|
The frontend sends JPEG frames at `HAND_TRACKING_FRAME_WIDTH × HAND_TRACKING_FRAME_HEIGHT` (320×240) to keep WebSocket bandwidth low. The backend sends normalized hand coordinates and landmarks.
|
||||||
|
|
||||||
|
## Browser MediaPipe
|
||||||
|
|
||||||
|
The browser path uses `hand_landmarker.task` (float16) downloaded from Google's MediaPipe model storage. The requested webcam resolution is **640×480** (`HAND_TRACKING_BROWSER_CAMERA_WIDTH/HEIGHT`), independent from the backend's 320×240. The float16 model is more sensitive than the backend Python model and needs the higher-resolution frame to detect hands reliably.
|
||||||
|
|
||||||
|
The MediaPipe delegate is currently `"GPU"`. CPU works too but is significantly slower; on a loaded scene the inference drops to ~5fps and the user feels noticeable lag during grab. MediaPipe creates its own WebGL context separate from Three.js, so there is no direct contention.
|
||||||
|
|
||||||
|
A singleton instance of `HandLandmarker` is cached in `src/lib/handTracking/browserHandTracking.ts`. `releaseBrowserHandLandmarker()` is called on cleanup and on WebGL context loss.
|
||||||
|
|
||||||
|
## Smoothing
|
||||||
|
|
||||||
|
MediaPipe at ~10 fps produces noticeable landmark jitter that, when fed raw into the scene, makes both the glove rig and any grabbed object tremble.
|
||||||
|
|
||||||
|
A simple exponential moving average is applied to every landmark before the snapshot is published:
|
||||||
|
|
||||||
|
```ts
smoothed.x = previous.x * (1 - factor) + next.x * factor;
```
|
||||||
|
|
||||||
|
The factor is `HAND_TRACKING_LANDMARK_SMOOTHING` (0.4). Hands are matched across frames by `handedness` so left/right don't bleed into each other.
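To make the smoothing pass concrete, here is a minimal sketch that applies the formula to every landmark and pairs hands across frames by `handedness`. The `Landmark` and `Hand` shapes and the `smoothHands` function are assumptions for illustration; only the formula, the 0.4 factor, and the matching rule come from the text.

```ts
// Hypothetical data shapes; the real snapshot type may differ.
interface Landmark { x: number; y: number; z: number; }
interface Hand { handedness: "Left" | "Right"; landmarks: Landmark[]; }

// Value from the text.
const HAND_TRACKING_LANDMARK_SMOOTHING = 0.4;

// Exponential moving average step for a single coordinate.
function ema(previous: number, next: number, factor: number): number {
  return previous * (1 - factor) + next * factor;
}

function smoothHands(
  previous: Hand[],
  next: Hand[],
  factor: number = HAND_TRACKING_LANDMARK_SMOOTHING,
): Hand[] {
  return next.map((hand) => {
    // Match by handedness so left and right never blend together.
    const prior = previous.find((p) => p.handedness === hand.handedness);
    // First sighting of this hand: nothing to average against yet.
    if (!prior) return hand;
    return {
      ...hand,
      landmarks: hand.landmarks.map((lm, i) => {
        // Fall back to the raw landmark if the prior frame has no counterpart.
        const p = prior.landmarks[i] ?? lm;
        return {
          x: ema(p.x, lm.x, factor),
          y: ema(p.y, lm.y, factor),
          z: ema(p.z, lm.z, factor),
        };
      }),
    };
  });
}
```

A factor of 0.4 means each published frame moves 40% of the way toward the new measurement, trading a little latency for less jitter.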
|
||||||
|
|
||||||
## Frontend Data Shape
|
## Frontend Data Shape
|
||||||
|
|
||||||
@@ -106,24 +142,36 @@ This is less expressive than true depth-aware hand movement, but it is more stab
|
|||||||
The current debug UI includes:
|
The current debug UI includes:
|
||||||
|
|
||||||
- `HandTrackingDebugPanel` inside `DebugOverlayLayout` for status, usage, loaded glove model, server state, hand count, and fist state
|
- `HandTrackingDebugPanel` inside `DebugOverlayLayout` for status, usage, loaded glove model, server state, hand count, and fist state
|
||||||
- `HandTrackingVisualizer` for the SVG landmark wireframe fallback
|
- `HandTrackingVisualizer` for the SVG landmark overlay
|
||||||
- `HandTrackingGlove` for the left-hand `gant_l` and right-hand `gant_r` models in the R3F scene
|
- `HandTrackingFallback` for the last-resort hand silhouette overlay
|
||||||
|
- `HandTrackingGlove` for the per-hand rigged glove models in the R3F scene
|
||||||
- `r3f-perf` for render performance
|
- `r3f-perf` for render performance
|
||||||
- `lil-gui` for scene, camera, lighting, interaction, and grab controls
|
- `lil-gui` for scene, camera, lighting, interaction, and grab controls
|
||||||
|
|
||||||
The hand tracking debug panel is a compact HTML grid outside the canvas. `Model loaded` displays the successfully loaded glove models. The SVG hand wireframe is only a fallback while models are loading or if a glove model fails to load.
|
The SVG visualizer uses a "blueish hand" style: white connection lines between landmarks, cyan circles with a dark blue outline. The outline gets thicker when the hand is detected as a fist, so the user gets a visual confirmation of the grab gesture without having to look at the debug panel.
|
||||||
|
|
||||||
|
The fallback overlay (`HandTrackingFallback`) draws a simple open-hand or fist silhouette positioned on the detected wrist landmark. It only renders for a hand whose matching glove is in the `"error"` state in `useHandTrackingGloveStatus`. This guarantees the user always sees something on their hand even when the 3D glove model fails to load.
|
||||||
|
|
||||||
## Glove Models
|
## Glove Models
|
||||||
|
|
||||||
The current glove MVP uses `public/models/gant_l/model.gltf` and `public/models/gant_r/model.gltf`, which contain GLTF skins and armatures. Each model is positioned, oriented, and scaled from palm landmarks, then each finger bone chain is rotated toward the matching MediaPipe landmark chain.
|
`HandTrackingGlove` loads `public/models/gant_l/model.gltf` for both hands. The right hand applies `scale.x = -1` at the group level to mirror the mesh, so the thumb ends up on the correct side. Both hands therefore share the same rig and the same material.
|
||||||
|
|
||||||
The glove models are intentionally smaller than the raw SVG overlay so they do not dominate the camera view.
|
The historical `public/models/gant_r/model.gltf` is kept as legacy but is not loaded by the frontend — its GLB embeds three skeletons (`Hand_l`, `Hand_l_pad`, `Hand_r`) plus a `galet` mesh, which made the finger rig unreliable.
|
||||||
|
|
||||||
|
The `gant_l` material is set to `alphaMode: OPAQUE` with `doubleSided: true`. The opaque mode prevents transparency sorting issues that made folded fingers disappear behind the palm; the double-sided flag covers the back faces revealed by the mirror scale on the right hand.
|
||||||
|
|
||||||
|
Two additional glove variants exist on disk:
|
||||||
|
|
||||||
|
- `public/models/gant_l_pad/model.gltf`
|
||||||
|
- `public/models/gant_r_pad/model.gltf`
|
||||||
|
|
||||||
|
They are intended for future swap-by-state usage but are **not yet rigged**. They cannot be animated by MediaPipe landmarks in their current form — re-exporting them from Blender with the same armature structure as `gant_l` is a prerequisite.
|
||||||
|
|
||||||
## Known Limitations
|
## Known Limitations
|
||||||
|
|
||||||
- Production usage is currently limited to repair mission steps that explicitly need hands.
|
- Production usage is currently limited to repair mission steps that explicitly need hands.
|
||||||
- MediaPipe depth is relative and currently not used for stable object depth control.
|
- MediaPipe depth is relative and currently not used for stable object depth control.
|
||||||
- The virtual hit zone is an approximation based on multiple raycasts, not a real 3D collider.
|
- The virtual hit zone is an approximation based on multiple raycasts, not a real 3D collider.
|
||||||
- There is no smoothing layer for hand position or depth yet.
|
- The right glove is a mirrored copy of `gant_l` rather than its own mesh; a dedicated right-hand model would look better.
|
||||||
- The SVG hand visualization is a fallback, not the primary display when glove models load correctly.
|
- The `_pad` glove variants are not rigged yet, so swap-by-state (normal ↔ pad) is not wired in.
|
||||||
- Finger bone animation is an approximate landmark-to-bone mapping; it still needs calibration for per-model twist, offsets, and smoothing.
|
- Finger bone animation is an approximate landmark-to-bone mapping; it still needs calibration for per-model twist, offsets, and smoothing.
|
||||||
|
|||||||
@@ -44,30 +44,39 @@ through opaque geometry.
|
|||||||
## Shadow rendering intermittence
|
## Shadow rendering intermittence
|
||||||
|
|
||||||
Shadows occasionally failed to render on initial load and could disappear
|
Shadows occasionally failed to render on initial load and could disappear
|
||||||
mid-session even though the `Lighting` configuration ran to completion.
|
mid-session even though the `Lighting` configuration ran to completion. The
|
||||||
|
fix has two layers:
|
||||||
|
|
||||||
Root cause: the sun follows the camera (its world matrix is dirty every frame
|
### Per-frame refresh (steady state)
|
||||||
via `updateMatrixWorld()` inside `Lighting.useFrame`). With `shadow.autoUpdate`
|
|
||||||
alone, three.js can skip the shadow map re-render on a frame where the matrix
|
|
||||||
update has happened but the renderer's internal dirty tracking does not pick
|
|
||||||
it up, leaving the shadow map stale or unrendered.
|
|
||||||
|
|
||||||
Fix in `src/world/Lighting.tsx`: explicit `sun.shadow.needsUpdate = true` in
|
The sun follows the camera, so its world matrix is dirty every frame. With
|
||||||
two places, restoring the belt-and-suspenders pattern from `develop`:
|
`shadow.autoUpdate` alone, three.js can skip the shadow map re-render on a
|
||||||
|
frame where the matrix update has happened but the renderer's internal dirty
|
||||||
|
tracking does not pick it up. To prevent that, `Lighting.useFrame` sets
|
||||||
|
`sun.shadow.needsUpdate = true` after the per-frame matrix updates. Shadow
|
||||||
|
config is centralized in `src/data/world/lightingConfig.ts` (`bias=0`,
|
||||||
|
`normalBias=0`, `cameraSize=95`).
|
||||||
|
|
||||||
- After `configureSunShadow(...)` in the mount `useEffect`.
|
### Mount-time shadow map reallocation (`useShadowMapWarmup`)
|
||||||
- At the end of the `useFrame` block, right after `sun.updateMatrixWorld()`.
|
|
||||||
|
|
||||||
Mitigations also in place:
|
The merged static map and other GLTFs mount imperatively after `Lighting`,
|
||||||
|
so the shadow render target ends up linked to a renderer state that pre-dates
|
||||||
|
the final scene. Materials compiled at that point bake a "no shadow map"
|
||||||
|
permutation into their shader program and silently fail to render shadows
|
||||||
|
until a WebGL context-restore cycle (the kind triggered by Chrome DevTools
|
||||||
|
in `?debug` runs) reallocates everything.
|
||||||
|
|
||||||
- Shadow config centralized in `src/data/world/lightingConfig.ts`
|
`src/hooks/three/useShadowMapWarmup.ts` replays that cycle programmatically
|
||||||
(`bias=0`, `normalBias=0`, `cameraSize=95`).
|
without the cost of a full context loss. It runs a `useFrame` watchdog that
|
||||||
- Late-suspension Suspense boundaries in `World.tsx` to prevent global scene
|
samples the scene mesh count every 6 frames; once the count has been stable
|
||||||
remounts that would re-run shadow setup mid-load.
|
for ~1 s (or after a 5 s safety cap), it:
|
||||||
- `gl.shadowMap.needsUpdate = true` on `onCreated` and on
|
|
||||||
`webglcontextrestored` in `src/pages/page.tsx`.
|
|
||||||
|
|
||||||
If the issue reproduces, capture `[diag]`-style logs from `useOctreeGraphNode`,
|
1. Disposes the directional light shadow map and nulls it. three.js
|
||||||
`Lighting`, and `GameMapCollision` to confirm there is no extra configuration
|
reallocates the render target on the next render at the configured
|
||||||
pass (which would indicate a remaining suspending hook outside the existing
|
`mapSize`.
|
||||||
Suspense boundaries).
|
2. Marks every material's `needsUpdate = true`, forcing a shader recompile
|
||||||
|
that rebinds every program to the freshly created shadow sampler.
|
||||||
|
3. Forces a single shadow pass and invalidates the renderer.
|
||||||
|
|
||||||
|
The watchdog runs once per mount and adds a single traversal every 6 frames
|
||||||
|
during the warmup window, after which it self-terminates.
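The stability check that drives the warmup can be sketched independently of three.js. The class below is a hypothetical illustration, not the code in `useShadowMapWarmup.ts`: it samples a mesh count every 6 frames and reports, exactly once, when the count has been stable for about 1 s or the 5 s cap has elapsed. The three reallocation steps themselves are not shown.

```ts
// Thresholds taken from the text; the class and its API are hypothetical.
const SAMPLE_EVERY_FRAMES = 6;
const STABLE_MS = 1000;
const SAFETY_CAP_MS = 5000;

class MeshCountWatchdog {
  private frame = 0;
  private lastCount = -1;
  private stableSince: number;
  private readonly startedAt: number;
  private fired = false;

  constructor(startedAt: number) {
    this.startedAt = startedAt;
    this.stableSince = startedAt;
  }

  /** Call once per frame. Returns true exactly once, when warmup should run. */
  tick(now: number, countMeshes: () => number): boolean {
    if (this.fired) return false;
    this.frame += 1;
    // Only traverse the scene on every 6th frame to keep the cost low.
    if (this.frame % SAMPLE_EVERY_FRAMES === 0) {
      const count = countMeshes();
      if (count !== this.lastCount) {
        this.lastCount = count;
        this.stableSince = now; // count changed: restart the stability window
      }
    }
    const stable = this.lastCount >= 0 && now - this.stableSince >= STABLE_MS;
    const capped = now - this.startedAt >= SAFETY_CAP_MS;
    if (stable || capped) {
      this.fired = true; // self-terminate after the first trigger
      return true;
    }
    return false;
  }
}
```

In a `useFrame` callback, `countMeshes` would presumably traverse the scene, and a `true` result would trigger the dispose, recompile, and forced shadow pass described above.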
|
||||||
|
|||||||
BIN
Binary file not shown.
Some files were not shown because too many files have changed in this diff.