Post by Sachin P
Enthusiastic Learner | AIML Developer | Influencing Speaker
Just built a touchless AI workspace controller for Linux (Ubuntu/GNOME) for media control! Instead of relying on a physical keyboard or mouse, the system reads real-time hand geometry from a standard webcam and uses it to control volume, brightness, and playback.

How it works under the hood:

Right Hand (Volume): Tracks the Euclidean distance between the thumb and index finger and maps the pinch gap to system audio volume via pactl.
Left Hand (Brightness): Uses the same pinch mechanic to adjust the monitor backlight via brightnessctl.
Either Hand (Track Skip): A deliberate Thumbs Up gesture skips to the next song or YouTube track in browsers like Brave.

The Engineering Challenge: Modern Linux desktops running the Wayland display server do not let one application inject synthetic input into another, which prevents input hijacking. That breaks traditional input-faking tools like xdotool and pyautogui, which depend on X11. To get around this cleanly, I avoided UI automation entirely and connected the gesture engine to the GNOME desktop over the D-Bus session bus using gdbus calls. This lets the script talk to the desktop environment directly, with no simulated keystrokes.

Check out the video demo below to see it in action!
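For anyone curious how the pinch mapping could look in code, here is a minimal Python sketch. It is not the project's actual source. It assumes the hand tracker yields normalized (x, y) landmark pairs, and the calibration constants PINCH_MIN and PINCH_MAX as well as all function names are hypothetical. The pactl and brightnessctl invocations are standard usage of those tools. The track-skip command targets the MPRIS Next method over the session bus, which is only one plausible D-Bus route; the post does not say which interface is used, and the player's bus name has to be supplied by the caller.

```python
import math
import shlex

# Hypothetical calibration: normalized thumb-to-index distances treated as
# "fully pinched" and "fully open". Real values depend on camera and hand size.
PINCH_MIN = 0.03
PINCH_MAX = 0.25


def pinch_distance(thumb_tip, index_tip):
    """Euclidean distance between two (x, y) landmark points."""
    return math.hypot(index_tip[0] - thumb_tip[0], index_tip[1] - thumb_tip[1])


def pinch_to_percent(distance, lo=PINCH_MIN, hi=PINCH_MAX):
    """Linearly map a pinch distance to an integer 0-100, clamped at both ends."""
    ratio = (distance - lo) / (hi - lo)
    return round(100 * min(1.0, max(0.0, ratio)))


def volume_command(percent):
    """Argument list that sets the default PulseAudio sink volume via pactl."""
    return ["pactl", "set-sink-volume", "@DEFAULT_SINK@", f"{percent}%"]


def brightness_command(percent):
    """Argument list that sets the backlight level via brightnessctl."""
    return ["brightnessctl", "set", f"{percent}%"]


def next_track_command(player_bus_name):
    """Argument list for an MPRIS 'Next' call via gdbus (assumed route).

    player_bus_name is the media player's session-bus name; it is left as a
    parameter because the post does not specify it.
    """
    return [
        "gdbus", "call", "--session",
        "--dest", player_bus_name,
        "--object-path", "/org/mpris/MediaPlayer2",
        "--method", "org.mpris.MediaPlayer2.Player.Next",
    ]


if __name__ == "__main__":
    # Made-up landmark positions, for illustration only.
    thumb, index = (0.40, 0.50), (0.50, 0.50)
    level = pinch_to_percent(pinch_distance(thumb, index))
    # Print the commands instead of running them, so this is safe anywhere.
    print(shlex.join(volume_command(level)))
    print(shlex.join(brightness_command(level)))
```

To actually apply a setting, each argument list can be passed to subprocess.run(cmd, check=True). Building the command as a list rather than a shell string avoids quoting problems, and clamping the percentage keeps a noisy landmark reading from pushing the volume or brightness out of range.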
Video Content