
Oscar Goñi (Quique) discussed research on LLM-generated code and the potential risks it poses for open source license compliance. This event looked at source code similarity detection via open source tooling.
Watch the Webinar:
Abstract:
Oscar Goñi (Quique) has investigated source code similarity detection in Large Language Model (LLM) outputs using the SCANOSS platform. While recent research has raised concerns about LLMs generating code that closely resembles their training data, the full extent of this similarity across the broader open-source ecosystem has remained unexplored. During this talk, Quique will describe his findings, which indicate that code similarity in LLM outputs may be more prevalent than previously indicated when evaluated against a broader open-source code base. He will also describe how this study contributes to the ongoing discussion of LLM-generated code’s originality and its implications for software licensing compliance, while validating the effectiveness of lightweight similarity detection algorithms as preliminary indicators for more comprehensive analysis. Finally, a Q&A session will give participants a chance to explore the implications of the study and to ask Quique about the next steps in his research.
Link to the study: https://1598a6a9-df1a-48d5-891f-3e90e39b960e.usrfiles.com/ugd/1598a6_a32407fa87264fadb3646274c31f3fd8.pdf
Our Speaker:

Oscar Enrique (Quique) Goñi, UNICEN, Professor – STF Head of academic program
Oscar Enrique Goñi is a systems engineer who graduated from the National University of the Center of the Province of Buenos Aires, Faculty of Exact Sciences (Argentina, 2009), and holds a Ph.D. in Computer Science from the National University of La Plata (Argentina, 2015). Since 2004, he has been engaged in teaching and research activities at the National University of the Center of the Province of Buenos Aires. Additionally, he has led the design and management of critical systems projects, as well as projects in data mining and high-performance systems.
More About Our Webinars:
This event is part of the overarching OpenChain Project Webinar Series. Our series highlights knowledge from throughout the global OpenChain ecosystem. Participants discuss approaches, processes and activities from their experience, providing a free service to increase shared knowledge in the supply chain. Our goal, as always, is to increase trust and therefore efficiency. No registration or cost is involved. These are user companies producing informative content for their peers.
Check Out The Rest Of Our Webinars
This OpenChain Webinar will be broadcast on 2025-05-30.