Big News in the GitHub Copilot Lawsuit!
A judge has thrown out most claims in the high-profile lawsuit against GitHub, Microsoft, and OpenAI over the AI coding assistant GitHub Copilot.
Here's what happened:
Background: In 2022, developers filed a $1B class-action lawsuit alleging that Copilot violated copyright law by using code from GitHub repositories without proper attribution or adherence to licensing terms.
Recent Ruling: Of 22 initial claims, 20 have now been dismissed, including a crucial allegation under Section 1202(b) of the Digital Millennium Copyright Act (DMCA). The DMCA claim alleged that Copilot removed essential copyright information when suggesting code snippets.
What does the DMCA say: "No person shall, without the authority of the copyright owner or the law":
(1) intentionally remove or alter any copyright management information,
(2) distribute or import for distribution copyright management information ...
(3) distribute, import for distribution, or publicly perform works, copies of works ... (details in comments)
Turning Point: Judge Tigar found insufficient evidence of substantial code similarity and rejected claims of exact code reproduction. One possible reason is GitHub's adjustments to Copilot, which were designed to generate variations of training code rather than exact copies, thereby avoiding direct infringement accusations.
Remaining Claims: Only two claims survive: open-source license violation and breach of contract.
EU AI Act Perspective: While the EU AI Act doesn't directly address these specific issues, it does emphasize:
- Compliance with copyright laws when using data to train AI systems (Article 10)
- Transparency requirements for AI systems, including documentation of data sources (Article 13)
- Obligations for AI providers that could relate to licensing and contractual issues (Chapter 3)
Our Take: This ruling is critical as it could set precedents for how copyright law applies to AI-generated content. The dismissal of copyright claims may encourage AI companies to continue using publicly available code for training purposes, with slight adjustments, without the required consent or permissions. I strongly feel this type of judgment could lead to misuse of developers' work without proper credit or compensation. This case also highlights the need for techniques like DE-COP and Min-K% Prob to detect copyrighted content in LLM training data (link in comments).
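For readers curious how a detection technique like Min-K% Prob works in principle, here is a minimal sketch. It assumes you already have per-token log-probabilities for a text from the model under test (how you get them depends on the model's API and is not shown). The function names, the 20% default, the rounding choice, and the threshold are all hypothetical choices for illustration, not values from the published method. The idea: text a model saw during training tends to contain few very surprising tokens, so the average log-probability of its least likely tokens is relatively high.

```python
import math
from typing import Sequence


def min_k_prob_score(token_logprobs: Sequence[float], k: float = 0.2) -> float:
    """Average log-probability of the k fraction of least likely tokens.

    A higher (less negative) score suggests the text may have been seen
    during training. The default k is an illustrative assumption.
    """
    if not token_logprobs:
        raise ValueError("token_logprobs must not be empty")
    if not 0.0 < k <= 1.0:
        raise ValueError("k must be in (0, 1]")
    # Keep at least one token; rounding up is a choice made for this sketch.
    n = max(1, math.ceil(len(token_logprobs) * k))
    lowest = sorted(token_logprobs)[:n]
    return sum(lowest) / n


def looks_memorized(
    token_logprobs: Sequence[float], k: float = 0.2, threshold: float = -2.0
) -> bool:
    """Flag a text whose score exceeds a threshold.

    The threshold is hypothetical; in practice it would have to be
    calibrated on texts known to be inside and outside the training data.
    """
    return min_k_prob_score(token_logprobs, k) > threshold


if __name__ == "__main__":
    # Made-up log-probabilities, for illustration only.
    likely_seen = [-0.1, -0.3, -0.2, -0.5, -0.4, -0.1, -0.2, -0.6, -0.3, -0.2]
    likely_unseen = [-0.2, -4.1, -0.9, -3.5, -1.2, -0.4, -2.8, -0.7, -5.0, -1.1]
    print(min_k_prob_score(likely_seen), looks_memorized(likely_seen))
    print(min_k_prob_score(likely_unseen), looks_memorized(likely_unseen))
```

A score like this is only a statistical signal that a text may have been in the training data. It is not proof that a particular repository was used.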