Retrieval Augmented Generation - MRIWA

The advent of LLMs such as ChatGPT has significantly changed users' perceptions of information retrieval, as a chain of in-chat prompts can elicit information, sources and summaries. However, organisations generally do not allow internal data to be uploaded to API-based language models, because that data cannot be placed in the public sphere. An alternative approach that is acceptable to organisations is to augment an LLM application with specific organisational data. A number of open source tools may do this, e.g. LlamaIndex, but their suitability needs to be assessed. Broadly, this area is called retrieval-augmented generation (RAG).

The industry client is the Minerals Research Institute of WA (MRIWA). MRIWA has a database of PDF reports and will make a selection available for this project. The industry mentor is Nicole Rooke, the CEO of MRIWA. Nicole will be supported by Prof. Melinda Hodkiewicz (Engineering) and Caitlin Woods (CSSE), both from the NLP-TLP group at UWA.

The goals of this project are to 1) select suitable open source tool(s), 2) develop a pipeline that answers competency questions over a set of PDFs stored on the MRIWA SharePoint site, returning a text-based answer to each question (as with ChatGPT), 3) provide references to the PDFs from which each response is drawn, 4) report on lessons learned and gaps, 5) provide concrete suggestions for next steps, and 6) provide access to the code in GitHub. MRIWA will provide a set of competency questions.
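
As an illustration only, goals 2 and 3 can be pictured with the minimal Python sketch below. It is not the project's chosen tooling: it assumes the PDF text has already been extracted into chunks, and it uses a simple bag-of-words cosine similarity in place of the embedding-based retrieval that tools such as LlamaIndex would normally provide. The file names, chunk texts and function names are all hypothetical, and the final call to a language model is left out.

```python
import math
import re
from collections import Counter

# Hypothetical text chunks, as if already extracted from PDF reports.
# The file names and contents are invented for illustration.
CHUNKS = [
    {"source": "report_a.pdf",
     "text": "Heap leaching improves copper recovery from low grade ore."},
    {"source": "report_b.pdf",
     "text": "Tailings dams require regular monitoring of seepage and stability."},
    {"source": "report_c.pdf",
     "text": "Drone surveys reduce the cost of mapping exploration sites."},
]


def tokenize(text):
    """Lower-case the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, chunks, k=2):
    """Return up to k chunks most similar to the question, best first."""
    query = Counter(tokenize(question))
    scored = [(cosine(query, Counter(tokenize(c["text"]))), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # Drop chunks that share no words with the question.
    return [chunk for score, chunk in scored[:k] if score > 0]


def build_prompt(question, retrieved):
    """Assemble the text that would be sent to a language model."""
    context = "\n".join(f"[{c['source']}] {c['text']}" for c in retrieved)
    return (
        "Answer using only the context below and cite the source files.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


def answer_with_sources(question, chunks, k=2):
    """Retrieve context and return the prompt plus the PDFs it draws on."""
    retrieved = retrieve(question, chunks, k)
    return {
        "prompt": build_prompt(question, retrieved),
        "sources": [c["source"] for c in retrieved],
    }
```

Calling `answer_with_sources("How is seepage at tailings dams monitored?", CHUNKS)` returns a prompt containing only the matching chunk, together with the list of source files to cite alongside the answer. In a real pipeline the prompt would be passed to a language model approved by the organisation, and the word-overlap scoring would likely be replaced by a vector index.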

This is a meaningful, timely and industry-relevant project, as many organisations are seeking to use the developments demonstrated by ChatGPT to change the way they interrogate their internal documents.

Client


Contact: Melinda Hodkiewicz
Phone: +61439512475
Email: [email protected]
Preferred contact: Email
Location: UWA System Health Lab 1.53 Eng

IP Exploitation Model


The IP exploitation model requested by the Client is: Creative Commons (open source) http://creativecommons.org.au/



Department of Computer Science & Software Engineering
The University of Western Australia
Last modified: 12 July 2023
Modified By: Michael Wise