Kiro Building Note 6 (Problem Formulation)

Full Self Page Turning (Non-rigid Object Manipulation)
Problem: Automatically turn the pages of a physical book from the first page to the last, without human intervention.
Current Hardware Structure
- Overhead Camera
- Objective: Observe the entire page-turning process and detect the current page and arm positions.
- Behavior: Continuously records video at 640x480 resolution and 20 fps.
- LED
- Objective: Provide consistent lighting for OCR and state recognition.
- Behavior: Turns on or off depending on the ambient brightness.
- Lift Arm
- Objective: Lift only the next page.
- Behavior: Rotates around the -y axis, from position vector (1, 0, 0) to (0, 0, 1), anchoring on the book spine.
- Turn Arm
- Objective: Turn the lifted page across the book.
- Behavior: Rotates around the z axis, from position vector (1, 0, 0) to (-1, 0, 0).
- Hold Arm
- Objective: Prevent previous or turned pages from turning back.
- Behavior: Rotates around the -y axis, from position vector (0, 0, 1) to (-1, 0, 0), anchoring on the book spine.
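The three arm motions above can be sanity-checked numerically. The sketch below is not part of Kiro's code: it assumes a right-handed coordinate frame, the right-hand rule for rotation direction, and sweep angles of 90° (lift, hold) and 180° (turn) inferred from the stated start and end vectors. The helper name `rotate` is hypothetical.

```python
import math

def rotate(v, axis, angle_deg):
    """Rotate vector v about a unit axis by angle_deg (right-hand rule),
    using Rodrigues' rotation formula."""
    t = math.radians(angle_deg)
    c, s = math.cos(t), math.sin(t)
    kx, ky, kz = axis
    vx, vy, vz = v
    cross = (ky * vz - kz * vy, kz * vx - kx * vz, kx * vy - ky * vx)  # axis x v
    dot = kx * vx + ky * vy + kz * vz                                  # axis . v
    # Round away floating-point noise; "+ 0.0" turns -0.0 into 0.0.
    return tuple(
        round(vi * c + ci * s + ki * dot * (1 - c), 9) + 0.0
        for vi, ci, ki in zip(v, cross, axis)
    )

NEG_Y = (0, -1, 0)
POS_Z = (0, 0, 1)

lift_end = rotate((1, 0, 0), NEG_Y, 90)    # Lift Arm: (1, 0, 0) -> (0, 0, 1)
turn_end = rotate((1, 0, 0), POS_Z, 180)   # Turn Arm: (1, 0, 0) -> (-1, 0, 0)
hold_end = rotate((0, 0, 1), NEG_Y, 90)    # Hold Arm: (0, 0, 1) -> (-1, 0, 0)
print(lift_end, turn_end, hold_end)
```

Under these assumptions, each arm lands on the end vector given in the list, so the axis and endpoint descriptions are mutually consistent.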
System Definition
Current
- Observation: Overhead Camera video stream (640x480)
- State:
- Normal (4):
- idle: The system is waiting before turning a page.
- lifted: A single page has been successfully lifted and is attached to the lift arm.
- turned: The page is being turned from right to left by the turn arm.
- ready_to_hold: The page is ready to be held by the hold arm.
- Abnormal (6), each tagged with a severity: critical❗ (require_human_intervention), warning ⚠️ (need_to_monitor), check 🔄 (auto_retry)
- page_stuck_on_lift (critical❗): The page is strongly adhered to the lift arm and cannot be released. A human must manually detach the page to continue.
- arm_misaligned (critical❗): The arm's movement deviates from the intended trajectory. A human must correct the mechanical alignment.
- lack_of_adhesion (critical❗): The arm failed to lift the page after multiple attempts due to insufficient adhesion. A human must replace or adjust the gripping mechanism (e.g., arm tip or suction).
- multi_page_lifted (check 🔄): More than one page was lifted.
- page_slipped_during_turn (check 🔄): The page slipped away from the turn arm during motion.
- page_slipped_during_hold (check 🔄): The page was successfully turned but failed to stay held by the hold arm.
- Action: Control angle values for the 3 arms (lift, turn, hold) or request human intervention
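One possible way to encode the state taxonomy of the current system is a lookup table from state name to severity tier. This is only a sketch: the state names and tier labels come from this note, while `Severity`, `STATE_SEVERITY`, and `requires_human` are hypothetical names chosen for illustration.

```python
from enum import Enum

class Severity(Enum):
    NORMAL = "normal"
    CHECK = "auto_retry"                     # check 🔄
    WARNING = "need_to_monitor"              # warning ⚠️ (no state uses it yet)
    CRITICAL = "require_human_intervention"  # critical❗

# State names and severities as listed in this note.
STATE_SEVERITY = {
    "idle": Severity.NORMAL,
    "lifted": Severity.NORMAL,
    "turned": Severity.NORMAL,
    "ready_to_hold": Severity.NORMAL,
    "page_stuck_on_lift": Severity.CRITICAL,
    "arm_misaligned": Severity.CRITICAL,
    "lack_of_adhesion": Severity.CRITICAL,
    "multi_page_lifted": Severity.CHECK,
    "page_slipped_during_turn": Severity.CHECK,
    "page_slipped_during_hold": Severity.CHECK,
}

def requires_human(state: str) -> bool:
    """True when the state is in the critical tier."""
    return STATE_SEVERITY[state] is Severity.CRITICAL

abnormal = [s for s, sev in STATE_SEVERITY.items() if sev is not Severity.NORMAL]
print(len(abnormal), requires_human("arm_misaligned"))
```

Keeping severity separate from the state name means a state can later be moved between tiers (for example, from critical to check) without touching the perception output.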

Future
- Observation: Add secondary camera (front view), tactile sensor, IR distance sensor, etc.
- State: Finer-grained states
- Action: Fully autonomous; do not request human intervention
Feasible Solution
- Observation → State → Action
- Observation → State (Perception)
- Algorithm (Software 1.0): Edge Detection, Contour Matching, Motion Tracking, etc.
- ML (Software 2.0): Page State Classification
- Algorithm + ML (Software 1.0 + 2.0): Heuristics + ML
- Large Model (Software 3.0)
- State → Action (Policy)
- Algorithm: Rule-based actuator control depending on the state
- Observation → Action (End-to-end)
- Vision-based direct control with imitation learning or reinforcement learning
v1.0 Solution
Autonomy is more important than speed or cost.
Architecture
Observation → State: Use Large Vision Model
State → Action: Algorithmic Control
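As an illustration of what algorithmic control for State → Action could look like, here is a minimal rule-table sketch. Everything beyond the state names is an assumption: the `Action` type, the target angles (measured in degrees along each arm's sweep), the `RESET` recovery pose, and the retry budget `MAX_RETRIES` are all hypothetical and would need tuning on the real hardware.

```python
from dataclasses import dataclass
from typing import Optional

MAX_RETRIES = 3  # assumption: retry budget before escalating to a human

@dataclass(frozen=True)
class Action:
    # Target angle per arm in degrees; None means "leave this arm where it is".
    lift_deg: Optional[float] = None
    turn_deg: Optional[float] = None
    hold_deg: Optional[float] = None
    request_human: bool = False

REQUEST_HUMAN = Action(request_human=True)
# Assumed recovery pose: lift and turn arms at their start, hold arm pressing down.
RESET = Action(lift_deg=0.0, turn_deg=0.0, hold_deg=90.0)

NORMAL_RULES = {
    "idle": Action(lift_deg=90.0),                   # lift the next page
    "lifted": Action(turn_deg=180.0, hold_deg=0.0),  # raise hold arm, start the sweep
    "turned": Action(turn_deg=180.0),                # keep sweeping across the book
    "ready_to_hold": RESET,                          # press the page, reset other arms
}

AUTO_RETRY = {
    "multi_page_lifted",
    "page_slipped_during_turn",
    "page_slipped_during_hold",
}

def decide(state: str, retries: int = 0) -> Action:
    """Map a perceived state to an action using fixed rules."""
    if state in NORMAL_RULES:
        return NORMAL_RULES[state]
    if state in AUTO_RETRY and retries < MAX_RETRIES:
        return RESET
    # Critical states, exhausted retries, and unknown states all go to a human.
    return REQUEST_HUMAN

print(decide("idle"), decide("multi_page_lifted", retries=3))
```

Routing unknown states to a human is a conservative default for v1.0; a fully autonomous version would have to replace that branch with a recovery behavior.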
Plan after v1.0
- Observation → State
- Replace the large model with a lighter ML classification model for efficient, real-time classification
- Curate custom dataset from Kiro’s teleoperation videos
- Observation → Action (End-to-end)
- Imitation Learning or Reinforcement Learning
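For the dataset-curation step, one simple starting point is to expand time-segment annotations of a teleoperation video into per-frame state labels. The sketch below assumes a hypothetical annotation format of `(start_s, end_s, state)` tuples and reuses the 20 fps camera rate stated earlier in this note; the function name `frame_labels` is illustrative only.

```python
FPS = 20  # overhead camera frame rate stated in this note

def frame_labels(segments, fps=FPS):
    """Expand (start_s, end_s, state) segments into (frame_index, state) pairs.

    Each segment covers frames from round(start_s * fps) up to, but not
    including, round(end_s * fps).
    """
    labels = []
    for start_s, end_s, state in segments:
        first = round(start_s * fps)
        last = round(end_s * fps)
        labels.extend((i, state) for i in range(first, last))
    return labels

# Hypothetical annotation of a 2-second clip.
segments = [(0.0, 1.5, "idle"), (1.5, 2.0, "lifted")]
labels = frame_labels(segments)
print(len(labels), labels[0], labels[-1])
```

The resulting pairs can then be joined with extracted frames to form training examples for a lightweight state classifier.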