London Area, United Kingdom
30+ years at the intersection of creativity, editorial craft, and technology — from pioneering interactive television to hands-on work at the frontier of AI development. I've spent my career building things that require genuine breadth: writing and verifying content for the UK's most-watched quiz shows (The Weakest Link, Who Wants to Be a Millionaire, The Chase, Eggheads); growing a fine art business into the top 1% on Etsy with clients including Universal Music Group, HBO and CBS; and since 2024, contributing to AI training programmes for OpenAI and ElevenLabs — evaluating voice humanisation, writing analytical reports on regional accent characteristics, and working on language model quality as part of a specialist global team. My background is genuinely unusual: an MA from Chelsea College of Art & Design sits alongside 25 years of broadcast media work, self-taught VBA automation, a BAFTA nomination for interactive game design, and deep fluency in AI tools across text, image, and audio. I'm not someone who has recently discovered technology — I was designing interactive audience experiences in 1995. I'm currently looking for senior roles in content, editorial, creative strategy, or AI-adjacent fields where this combination of rigour, creativity, and technical curiosity is an asset rather than a puzzle.
Design and build evaluation tasks for Stellar's agent benchmarking platform, testing how frontier AI models (Claude, GPT, Gemini) handle complex instructions across multi-tool environments. Construct realistic scenarios with synthetic databases spanning 5-7 interconnected tools (HubSpot, Google Calendar, Google Drive, Gmail, Notes). Agents must cross-reference information across tools and complete multi-hop goals. Write Developer Instructions (behavioural rules), design evaluation rubrics, craft contextual hints, and run gold-standard trajectories through structured QA review rounds. Work requires logical scenario design using contextual indirection, meticulous data consistency checking across interconnected databases, and precise evaluation criteria that reviewers can verify objectively. Built a comprehensive methodology document and reference guide covering all 35 platform tools and 700+ functions. Received positive QA feedback on multi-hop reasoning design, timezone handling, and contextual complexity.
Contributing to the development of next-generation AI systems through structured evaluation and qualitative analysis, working across projects for OpenAI and ElevenLabs as part of a specialist global team of ~70–100.
• OpenAI voice capability: Evaluating and improving the humanness, naturalness, and emotional range of ChatGPT's voice output. Work requires deep empathy, linguistic expertise, and the ability to write detailed analytical reports assessing subtle qualities of tone, pacing, and affect.
• ElevenLabs accent characterisation: Analysing and defining the acoustic and cultural characteristics of regional accents to inform AI voice modelling, drawing on knowledge of language, regional identity, and cultural context.
• Text-based RLHF: Evaluating and ranking model outputs across complex reasoning, language modelling, and visual culture tasks, operating within structured professional rubrics to consistently deliver high-quality analytical output under deadline pressure.
• Founded Standard Designs in 2010, an art print and poster company that became a top 1% shop on Etsy and expanded to include a successful independent website.
• Also specialised in creating authentic fictional museum and gallery exhibition posters under the Art Poster Archive brand, drawing on my extensive knowledge of historical poster designs to produce artworks that resonate with customers.
• Developed a reputation for meticulous attention to detail and authenticity, offering customers stylish and affordable artistic artefacts.
• Provided art and design services to notable clients in music, film, and TV, including Universal Music Group, K-Scope Records, HBO, and CBS, showcasing versatility and industry appeal.
• Managed all aspects of business operations, from bookkeeping and liaising with accountants to coordinating with printing and shipping companies.
• Implemented effective online advertising strategies using Google Ads, Facebook Ads Manager, and Microsoft Advertising, coupled with sophisticated customer analysis and segmentation.
• Utilised SEO and copywriting skills to enhance product descriptions, advertising, and online presence, driving traffic and sales.
• Navigated complex legal landscapes involving copyright, trademark, and intellectual property, including drafting legal correspondence and enforcing copyright protections.
• Delivered exceptional customer service, building a loyal customer base and positive brand reputation.
• Engaged in social media content creation and audience development, leveraging analytics to refine strategies and enhance engagement.
• Served as a question writer for a wide array of TV quiz shows, including 'Eggheads', 'Popmaster', 'In It To Win It', 'Who Dares Wins', 'Perfection', 'Puzzling' & 'Celebrity Puzzling', 'Unbeatable', 'Pressure Pad', 'Don't Blow The Inheritance', 'Better Than Average', 'Gift Wrapped', 'Revenge Of The Egghead', and various development projects.
• As a question writer, crafted diverse, engaging, and intellectually stimulating questions, ensuring they resonated with each show's unique audience and format.
• As a question verifier, meticulously reviewed and subedited questions for accuracy and consistency, diving deep into various subjects to guarantee their integrity and alignment with the show's standards.
• Applied thorough research skills to ensure questions were factually watertight and met the high standards expected by both the production team and the audience.
• Collaborated closely with producers and other team members, providing detailed reports and feedback to refine and enhance question quality.
• Managed the delicate balance of maintaining each show's difficulty level, terminology, voice, and pacing, ensuring all questions were perfectly tailored to each quiz format.
Performed structured AI evaluation tasks, including prompt design, content verification, and failure-tracing across visual culture, language modelling, and reasoning accuracy.