Post by OECD.AI

If the internet contains hundreds of billions of webpages, why are AI developers facing a data shortage?

A new AI Wonk blog from the #Inria Centre of the GPAI Expert Community examines the growing AI data paradox and what could follow large-scale scraping practices.

Key insights from the VIADUCT initiative:

🔹 AI still relies heavily on scraped public web data, but legal risk and declining quality are changing the landscape
🔹 Data is not a single resource: it spans copyrighted content, personal data, trade secrets, public sector data and open data
🔹 Sustainable AI requires moving from extraction to structured data-sharing transactions
🔹 Three principles underpin responsible access to training data: legal compliance, trust and fairness
🔹 Practical solutions include opt-out systems, attribution tools, privacy-enhancing technologies and new economic models for sharing

Future AI capability will depend on the ethics, scalability, and mutual benefit of data ecosystems.

Read the full blog post at the link in comments below. 👇

Marie Langé Yann Dietrich Bertrand Monthubert Aurélie Simard, Ph.D. Jean Constantin Christian Reimsbach Kounatze Clarisse Girot Sergi Gálvez Duran

#TheAIWonk #AIData #DataGovernance #TrustworthyAI #GPAI #ArtificialIntelligence