Post by Sachin P

Enthusiastic Learner | AIML Developer | Influencing Speaker

Just built a touchless AI media controller for Linux (Ubuntu/GNOME)! Instead of relying on a physical keyboard or mouse, it reads real-time hand geometry from a standard webcam and uses it to control system settings.

How it works under the hood:

Right Hand (Volume): Tracks the Euclidean distance between the thumb tip and index fingertip and maps that pinch gap to system audio volume via pactl.

Left Hand (Brightness): Uses the same pinch mechanic to adjust the monitor backlight via brightnessctl.

Either Hand (Track Skip): A deliberate thumbs-up gesture skips to the next song or YouTube track in browsers like Brave.

The Engineering Challenge: On modern Linux desktops running the Wayland display server, applications are not allowed to inject input into other applications' windows, which prevents input hijacking. That breaks traditional UI-faking tools like xdotool and pyautogui, which were built around X11. To get around this cleanly, I avoided UI automation entirely and connected the gesture engine to the GNOME session bus, sending D-Bus messages with gdbus. This lets the script talk to the desktop environment directly and safely!

Check out the video demo below to see it in action!
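For anyone curious how the pinch-to-command mapping could look, here is a minimal sketch in Python. It is not the project's actual code. The function names, the normalized landmark coordinates, and the min_gap/max_gap calibration values are all assumptions for illustration. The track-skip helper assumes the player exposes the standard MPRIS interface on the session bus, and its bus name is a placeholder you would have to look up on your own system.

```python
import math


def pinch_to_percent(thumb, index, min_gap=0.02, max_gap=0.25):
    """Map the thumb-index distance to a 0-100 percentage.

    thumb and index are (x, y) points in normalized image coordinates.
    min_gap and max_gap are hypothetical calibration values: gaps at or
    below min_gap give 0, gaps at or above max_gap give 100.
    """
    gap = math.dist(thumb, index)  # Euclidean distance (Python 3.8+)
    ratio = (gap - min_gap) / (max_gap - min_gap)
    return round(100 * min(1.0, max(0.0, ratio)))


def volume_command(percent):
    """Build the pactl argv that sets the default sink volume."""
    return ["pactl", "set-sink-volume", "@DEFAULT_SINK@", f"{percent}%"]


def brightness_command(percent):
    """Build the brightnessctl argv that sets the backlight level."""
    return ["brightnessctl", "set", f"{percent}%"]


def next_track_command(player_bus_name):
    """Build a gdbus argv that calls the MPRIS Next method.

    player_bus_name is an assumption: it must be the MPRIS bus name
    the running player actually registers on the session bus.
    """
    return [
        "gdbus", "call", "--session",
        "--dest", player_bus_name,
        "--object-path", "/org/mpris/MediaPlayer2",
        "--method", "org.mpris.MediaPlayer2.Player.Next",
    ]


if __name__ == "__main__":
    # Example landmarks; a real script would get these from a hand tracker
    # and pass the resulting argv to subprocess.run.
    level = pinch_to_percent((0.40, 0.50), (0.52, 0.50))
    print(volume_command(level))
    print(brightness_command(level))
```

The helpers only build argument lists, so the mapping can be checked without touching real audio or backlight settings. In practice you would probably also keep brightness above some small floor so a closed pinch cannot black out the screen.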

[Video: demo of the gesture controller]