
The OpenChain Project will hold a webinar on the 30th of May 2025 to discuss LLM-generated code and its potential risks from the perspective of open source license compliance.
2025-05-30 @ 07:00 UTC / 08:00 BST / 09:00 CEST / 15:00 CST / 16:00 KST + JST
Join at the start time using this link:
https://zoom-lfx.platform.linuxfoundation.org/meeting/91794322307?password=7d786333-1dcf-4693-8d6b-fbe2dd7d55aa
Abstract:
Oscar Goñi (Quique) has investigated source code similarity detection in Large Language Model (LLM) outputs using the SCANOSS platform. While recent research has identified concerns regarding LLMs generating code that closely resembles their training data, the full extent of this similarity across the broader open-source ecosystem remained unexplored. In this talk, Quique will describe his findings, which indicate that code similarity in LLM outputs may be more prevalent than previously indicated when evaluated against a broader open-source code base. He will also describe how this study contributes to the ongoing discussion of the originality of LLM-generated code and its implications for software licensing compliance, while validating the effectiveness of lightweight similarity detection algorithms as preliminary indicators for more comprehensive analysis. Finally, a Q&A session will give participants a chance to explore the implications of the study and give Quique input on the next steps in his research.
Link to the study: https://shorter.me/_XHcS
Our Speaker:

Oscar Enrique (Quique) Goñi, UNICEN, Professor – STF Head of academic program
Oscar Enrique Goñi is a systems engineer who graduated from the National University of the Center of the Province of Buenos Aires, Faculty of Exact Sciences (Argentina, 2009), and holds a Ph.D. in Computer Science from the National University of La Plata (Argentina, 2015). Since 2004, he has been engaged in teaching and research activities at the National University of the Center of the Province of Buenos Aires. Additionally, he has led the design and management of critical systems projects, as well as projects in data mining and high-performance systems.