We would like to set up a simple experimental platform (probably in Python), similar to those used in cognitive psychology, to first test multimodal LLMs on some standard road safety stimuli (e.g., this existing dataset). Following this, we would introduce manipulations to the stimuli that might interfere with how the multimodal LLM perceives the image and, subsequently, how it makes inferences about the scene. Examples include simple image degradations consistent with rain or darkness, or perhaps an explicit manipulation designed specifically to interfere with an LLM.
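As a rough illustration of the kind of stimulus manipulation described above, the following is a minimal sketch of two degradations, darkening and a crude rain overlay. It is only one possible approach, not the project's actual method. The function names `darken` and `add_rain`, their parameters, and the representation of an image as rows of `(r, g, b)` tuples are all assumptions made to keep the example dependency-free; a real platform would more likely use an image library such as Pillow or NumPy.

```python
import random

# Assumption: an image is modelled as a list of rows, each row a list of
# (r, g, b) tuples with channel values in 0-255. Both function names and
# all parameters below are hypothetical, chosen for illustration only.


def darken(image, factor=0.4):
    """Scale every channel by `factor` (0 = black, 1 = unchanged)."""
    if not 0.0 <= factor <= 1.0:
        raise ValueError("factor must be in [0, 1]")
    return [[tuple(int(c * factor) for c in px) for px in row] for row in image]


def add_rain(image, n_streaks=50, length=5, brightness=200, seed=0):
    """Overlay short vertical light-grey streaks as a crude stand-in for rain."""
    rng = random.Random(seed)  # seeded so each stimulus is reproducible
    height, width = len(image), len(image[0])
    out = [list(row) for row in image]  # copy rows; the input is not modified
    for _ in range(n_streaks):
        x = rng.randrange(width)
        y0 = rng.randrange(height)
        # Clip the streak at the bottom edge of the image.
        for y in range(y0, min(y0 + length, height)):
            out[y][x] = (brightness, brightness, brightness)
    return out


# Build one stimulus per experimental condition from a uniform placeholder image.
clean = [[(120, 130, 140)] * 32 for _ in range(32)]
conditions = {
    "clean": clean,
    "dark": darken(clean, factor=0.4),
    "rain": add_rain(clean, seed=1),
}
```

Keeping each manipulation as a pure function with a fixed seed means the same degraded stimulus can be regenerated for every model under test, much as a cognitive psychology experiment presents identical stimuli across participants.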
Department of Computer Science & Software Engineering, The University of Western Australia. Last modified: 16 July 2024. Modified by: Michael Wise