Please allow us to attach images to the chat so that we can use the vision capabilities of multimodal models.