
Webinar – How big is the risk of using LLM-generated code from the open source license compliance point of view?

Oscar Goñi (Quique) discussed research on LLM-generated code and the open source license compliance risks it may carry. The event looked at source code similarity detection using open source tooling.

Watch the Webinar:

 

Abstract:

Oscar Goñi (Quique) has investigated source code similarity detection in Large Language Model (LLM) outputs using the SCANOSS platform. While recent research has raised concerns about LLMs generating code that closely resembles their training data, the full extent of this similarity across the broader open source ecosystem remained unexplored. In this talk, Quique will describe his findings, which indicate that code similarity in LLM outputs may be more prevalent than previously indicated when evaluated against a broader open source code base. He will also describe how the study contributes to the ongoing discussion of the originality of LLM-generated code and its implications for software license compliance, while validating the effectiveness of lightweight similarity detection algorithms as preliminary indicators for more comprehensive analysis. Finally, a Q&A session will give participants a chance to explore the implications of the study and to give Quique input on the next steps in his research.

Link to the study: https://1598a6a9-df1a-48d5-891f-3e90e39b960e.usrfiles.com/ugd/1598a6_a32407fa87264fadb3646274c31f3fd8.pdf
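
For readers unfamiliar with the idea of a lightweight similarity check mentioned in the abstract, the sketch below shows one generic approach: comparing two code snippets by the overlap of their token k-grams (Jaccard similarity). It is an illustration only and is not the algorithm used in the study or by the SCANOSS platform. The function names, the tokenizer, and the window size `k = 4` are all assumptions chosen for this example.

```python
import re


def tokenize(source: str) -> list[str]:
    """Split source text into identifiers, numbers, and single punctuation marks.

    Whitespace is discarded, so formatting differences do not affect the score.
    """
    return re.findall(r"[A-Za-z_]\w*|\d+|[^\s\w]", source)


def shingles(tokens: list[str], k: int = 4) -> set[tuple[str, ...]]:
    """Return the set of all contiguous k-token windows (hypothetical helper)."""
    if len(tokens) < k:
        # Short inputs are treated as a single window.
        return {tuple(tokens)} if tokens else set()
    return {tuple(tokens[i:i + k]) for i in range(len(tokens) - k + 1)}


def jaccard_similarity(a: str, b: str, k: int = 4) -> float:
    """Share of k-token windows the two snippets have in common (0.0 to 1.0)."""
    sa, sb = shingles(tokenize(a), k), shingles(tokenize(b), k)
    if not sa and not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


if __name__ == "__main__":
    generated = "total = price * qty + tax"
    reference = "total = price * qty - tax"
    print(f"similarity: {jaccard_similarity(generated, reference):.2f}")
```

A score like this is only a preliminary indicator: a high value suggests that a snippet deserves closer review against its possible source and license, not that a compliance problem exists.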

 

Our Speaker:

Oscar Enrique (Quique) Goñi, UNICEN, Professor – STF Head of academic program

Oscar Enrique Goñi is a systems engineer who graduated from the National University of the Center of the Province of Buenos Aires, Faculty of Exact Sciences (Argentina, 2009), and holds a Ph.D. in Computer Science from the National University of La Plata (Argentina, 2015). Since 2004, he has been engaged in teaching and research activities at the National University of the Center of the Province of Buenos Aires. He has also led the design and management of projects in critical systems, data mining, and high-performance systems.

More About Our Webinars:

This event is part of the overarching OpenChain Project Webinar Series. Our series highlights knowledge from throughout the global OpenChain ecosystem. Participants discuss approaches, processes, and activities drawn from their own experience, providing a free service that increases shared knowledge in the supply chain. Our goal, as always, is to increase trust and therefore efficiency. There is no registration and no cost. This is user companies producing great, informative content for their peers.

 

Check Out The Rest Of Our Webinars

 

 

This OpenChain Webinar will be broadcast on 2025-05-30.