Workshop on Generative AI for Software Engineering (GenAI4SE)

Objective of Workshop

Software engineering in the large is an effort-intensive and time-consuming activity, whereas IT systems today need to change in the shortest possible time. Most complex, large-scale software systems of today derive their requirements from existing (legacy) software and partial (incomplete) descriptions. Software development is thus a complex combination of transformation, reverse and forward engineering, involving code, data, and specifications, wherein data is both structured and unstructured. Expertise from subject matter experts (SMEs) is essential at each phase, which brings in the important component of domain knowledge. While Model-Driven Engineering (MDE), Knowledge Engineering (KE), and Reverse Engineering (RE) have mitigated some of the challenges, the emergence of Generative AI techniques holds the potential for a substantial breakthrough, though it is yet to bring a consistent and substantial jump in productivity. There are challenges in understanding and resolving issues reported on GitHub. The absence of high-quality datasets that encompass a wide range of programming tasks, styles, and languages adds to the challenges. As enterprise businesses change dynamically and new technologies evolve rapidly, LLMs need to keep pace with the evolving code knowledge.

The use of GenAI for software development has matured steadily over the past year or more. Tools such as GitHub Copilot, Amazon Q Developer Agent (CodeWhisperer), AutoCodeRover + GPT-4o, Assistant GRU, and SWE-agent + Claude 3.5 Sonnet have enhanced, and continue to enhance, code development, code completion, test case generation, debugging, and issue fixing tasks. Some methods have exploited instruction tuning and reinforcement learning with feedback. Several more small and large models and tools exploit RACG techniques and claim to considerably enhance software development tasks.

The new paradigm of LLM-based AI agents has demonstrated effectiveness in a variety of Software Engineering (SE) tasks, such as program generation, software testing and debugging, and program improvement, as well as end-to-end software engineering. These agents can extend the capabilities of the backbone LLMs by utilizing external resources and tools and soliciting human interactions. From the SE perspective, there is a need to analyze how LLM-based agents can tackle different software development and maintenance tasks. From the agent perspective, there is a need to shed light on the basic agentic design components, including planning, memory, perception, and action, and on their roles and collaboration mechanisms in multi-agent settings.

The proposed workshop aims to provide a collaborative platform for researchers and practitioners to delve into the convergence of traditional MDE, KE, and RE methodologies with Generative AI technologies. By combining the strengths of GenAI, LLM agentic frameworks, modeling, and knowledge representation for the SDLC, our goal is to define a trajectory toward enhanced software engineering practices. We seek discussion on the following pivotal questions:

Architecting, designing, developing and maintaining industry-strength software is a multi-skill, long-drawn activity that cannot be effectively addressed by LLMs alone. What kinds of augmentation make it effective?
An LLM is a vast storehouse of general information, but the typical need during the SDLC is rather sharply focused. How best can local knowledge be brought to bear at scale to get the required focus?
How can Generative AI be effectively utilized for mining and constructing purpose-driven knowledge from software artifacts? Can Generative AI catalyze existing legacy modernization techniques to reduce cost and time and improve correctness?
How can Generative AI enhance the synthesis of tests and test data during SDLC?
Can bug fixes and change requests be analyzed and implemented expeditiously using Generative AI?
What are the best practices for maintaining the accuracy, relevance and reliability of Generative AI generated software artifacts?
What are the state-of-the-art techniques, experimental models, methodologies, benchmark datasets and evaluation metrics employed for the usage of LLMs and LLM-based agents in SE applications? What are the key differences in task performance between LLMs and LLM-based agents?

The workshop aims to foster interactive discussions, enabling participants to collectively shape the future of advanced software engineering. The inaugural edition will feature talks by invited speakers who are exploring one or more of the above questions, interspersed with short experience reports from researchers who are exploring specific challenges of software engineering using GenAI techniques. Time permitting, we will have short lightning talks by researchers to share interesting observations and anecdotes.

Invited Speakers
Dr. Diptikalyan Saha, IBM Research, India	Navigating the Landscape of AI Agents: Characteristics, Innovations, and Future Directions
Karthik Vaidhyanathan, IIIT-Hyderabad, India	Playing with Abstractions: At the crossroads of Software Architecture and Generative AI
Rajaswa Patil, Applied AI Consultant; Previously: Postman/Microsoft	From Code to Cognition: How Agentic AI Unites Software 1.0 and Software 2.0
Dr. Alexander Serebrenik, Eindhoven University of Technology	Exploring the Effect of Multiple Natural Languages on Code Suggestion Using GitHub Copilot

Important Dates
22 Dec 2024	Deadline for submission
5 Jan 2025	Extended date for submission
22 Jan 2025	Notification of acceptance
10 Feb 2025	Publishing list of accepted talks
20 Feb 2025	GenAI4SE workshop

Submission Link

Abstract Submission Link

Paper Submission Link

Call for Abstracts and Call for Papers

Software Engineering (SE) in the large is an effort-intensive and time-consuming activity, whereas IT systems today need to change in the shortest possible time. Software development is thus a complex combination of transformation, reverse and forward engineering, involving code, data, and specifications. While Model-Driven Engineering (MDE), Knowledge Engineering (KE), and Reverse Engineering (RE) have mitigated some of the challenges, the emergence of Generative AI techniques holds the potential for a substantial breakthrough and is an important area of study and exploration. The absence of high-quality datasets that encompass a wide range of programming tasks, styles, and languages adds to the challenges. Further, as the technology and business landscapes change, LLMs constantly need to keep pace.

The new paradigm of LLM-based AI agents has demonstrated effectiveness in a variety of Software Engineering (SE) tasks, such as program generation, software testing and debugging, and program improvement, as well as end-to-end software engineering. These agents can extend the capabilities of the backbone LLMs by utilizing external resources and tools and soliciting human interactions. From the SE perspective, there is a need to analyze how LLM-based agents can tackle tasks across the Software Development Lifecycle (SDLC) and how to design the basic agentic components, including planning, memory, perception, and action, along with their roles and collaboration mechanisms in multi-agent settings.

The Workshop on GenAI Based Software Engineering aims to provide a collaborative platform for researchers from academia and industry, and for practitioners, to delve into the convergence of traditional MDE, KE, and RE methodologies and software engineering approaches with Generative AI technologies. We solicit submissions in the form of a one-page abstract (maximum 500 words) OR a paper of at most 5 pages + 1 page of references in the standard ACM format, describing case studies, interesting experiments, techniques and best practices, and lessons learned while applying Generative AI and LLM agent frameworks to various SE areas, including but not limited to the following topics:

Agentic frameworks for SE
Intelligent Code Assistants
Neuro-Symbolic Approaches for SE
GenAI Frameworks for SE with Human in the Loop
Advanced Retrieval Augmentation for SE
Datasets for SE
LLMs for Knowledge Engineering
Tuning of SLMs (Small Language Models) for SE
Technical Risks associated with AI/ML implementations
Negative Results demonstrating failed application of GenAI for SE
Low-cost GenAI solutions for SE
Reliability of GenAI generated software artifacts
LLMs as a judge for evaluation of SE tasks

The areas of interest across SE include, but are not restricted to, Requirements Engineering; Software Architecture and Design; Software Development and Maintenance; Software Verification, Testing and Debugging; Legacy Modernization; Reverse Engineering from code and documents; Software Analysis; and repository-level coding tasks, including code development, code completion, test case generation, debugging, and issue fixing.

Submission Information: Abstracts (maximum 500 words) OR papers (maximum 5 pages + 1 page of references). Abstracts can be based on work already published at other conference venues and should include details of when and where the original work was published. Papers should be original work, not under consideration for publication elsewhere, written in text format, in English. Accepted papers may be considered for publication at an appropriate forum (either in ACM DL, or SE Notes, or a CEUR publication). Abstracts should be submitted via the Google Form link, and papers via the EasyChair link. In case of any questions, you can write an email to genai4se@googlegroups.com

Acceptance criteria: Abstracts and papers will be selected for presentation based on reviews by members of the workshop organizing committee. The tentative criteria are the clarity with which the problem being solved is articulated, the motivation for using Generative AI to solve it, and the novelty of the approach. Authors of accepted papers will receive further instructions for submitting camera-ready versions.

At least one of the authors of the selected abstract or paper MUST register for the ISEC conference to present their paper at the workshop.

Program

Schedule
Time	Talk	Speaker / Chair	Topic
09:50	Welcome	RD Naik	Opening remarks and Agenda
10:00 to 11:30	Session 1	Chair: Raveendra Kumar	GenAI for Software at Scale
10:00	Invited talk 1	Karthik Vaidhyanathan, IIIT Hyd	Playing with Abstractions: At the crossroads of Software Architecture and Generative AI
10:35	Paper 1	Mihir Shriniwas Arya, Aditya Ranjan and Ananmay Abhishek Lohia	ApexCodium: A Multi-Agent System for Automated Code Generation with Enhanced Self-Reflection and Testing
10:55	Abstract 1	Gayathri Ekambaram	Professional Code Engine
11:10	Discussion / Q&A	Moderated by Raveendra Kumar	Interaction with all speakers of the session
11:30	Tea Break
12:00 to 13:30	Session 2	Chair: Manasi Patwardhan	Agentic AI for Software Engineering
12:00	Invited talk 2	Diptikalyan Saha	Navigating the Landscape of AI Agents: Characteristics, Innovations, and Future Directions
12:35	Abstract 2	Dr Anjaneyulu Pasala	MultiFluxAI: Enhancing Platform Engineering with Advanced Agent-Orchestrated Retrieval Systems
12:50	Invited talk 3	Rajaswa Patil	From Code to Cognition: How Agentic AI Unites Software 1.0 and Software 2.0
13:25	Discussion / Q&A	Moderated by Manasi Patwardhan	Interaction with all speakers of the session
13:45	Lunch Break
15:00 to 16:15	Session 3	Chair: RD Naik
15:00	Abstract 3	Ponnampalam Pirapuraj	MuGNN: API Misuse Detection using Graph Neural Networks
15:15	Panel Discussion		GenAI for Large Scale Software Engineering - Hype to Reality
16:15	Tea Break
16:30 to 18:30	Session 4	Chair: Asha Rajbhoj	GenAI for Software Engineering
16:30	Invited talk 4	Prof Alexander Serebrenik	Exploring the Effect of Multiple Natural Languages on Code Suggestion using GitHub Copilot
17:05	Paper 2	Chandan Prakash, Balla Rathan Veer, Pavan Kumar Chittimalli and Ravindra Naik	Model-based Structuring for Enterprise Document Digitalization
17:25	Paper 3	Akanksha Somase, Piyush Kulkarni, Asha Rajbhoj and Vinay Kulkarni	Leveraging LLMs for Requirements Specification Generation
17:45	Paper 4	Tirupati Sahu, Neelamadhab Padhy, Rasmita Panigrahi, Lov Kumar and Sanjay Mishra	Federated Risk Register Using RAG model
18:05	Discussion / Q&A	Moderated by Asha Rajbhoj	Interaction with all speakers of the session
18:20	Closing Remarks	RD Naik

Organizers

Raveendra Medicherla

is a Principal Scientist at TCS Research. He has 27+ years of experience in software services delivery and related research. His research interests include the application of symbolic, Generative AI, and neuro-symbolic techniques to software system transformation and software testing.

Asha Rajbhoj

is a Principal Scientist at TCS Research. She has 28+ years of experience in industrial research and has 20+ publications and several patents to her credit. Her research interests are in GenAI, Requirement Engineering, Model-Driven Engineering, Meta-modelling, Natural Language Processing, and Business Process Modelling.

Manasi Patwardhan

is a Senior Scientist at TCS Research. She has 19+ years of experience in academic and industry research, with interests spanning LLM Agentic Frameworks, Generative AI, AI for Code and Program Synthesis, Natural Language Understanding and Reasoning, Neuro-Symbolic systems, and Multi-Modal Multi-Lingual Processing, areas in which she has over 30 publications and 5 patents.

Lalit Mohan

is a Chief Product Officer at Quick Heal. He has 25+ years of experience in Banking Technology and Cyber Security and has worked at Infosys, Wells Fargo, and IDRBT (an institute established by the Reserve Bank of India) building banking platforms. His research includes Software Engineering, Knowledge Management, Cloud, and Cyber Security. His recent interest is productizing AI/ML implementations and AI/ML risks, specifically GenAI.

Vibhu Saujanya Sharma

is a Technology Research Principal Director at Accenture Labs, with 20+ years of experience in industrial and academic research in areas like software metrics and process insights, cloud computing, and software performance engineering. He has published more than 50 peer-reviewed research papers in key journals and conferences and is an inventor on 75+ granted patents. His current research focuses on green software engineering and the use of GenAI in software engineering.

Ravindra Naik

is a Chief Scientist at Tata Consultancy Services Research (TCSR), with over 34 years of experience in industry research on IT system transformations and software development tools, specializing in code analysis, software modelling, code synthesis, NL analysis and reasoning, and the use of ML and GenAI for various software engineering tasks. He has 20+ publications and 10+ patents to his credit.

Venue

Please visit the ISEC Conference page for workshop location and registration.

Workshop on GenAI Based Software Engineering

Co-located with ISEC 2025, February 20th, 2025, NIT Kurukshetra, India