Amazon Takes the Lead in Chatbot Advancement with Multimodal-CoT

Feb. 13, 2023.
Amazon’s latest language models are making waves in the world of chatbot technology. In a recent study, the company’s new models outperformed GPT-3.5 on the ScienceQA benchmark by a whopping 16 percentage points. ScienceQA is a large annotated benchmark of over 21,000 multimodal multiple-choice science questions.

Critical to Amazon’s success is Multimodal-CoT, a two-stage framework that separates rationale generation from answer inference and combines visual and language representations to elicit more effective reasoning. By feeding both vision and language inputs into each of the two stages, the technique outperforms the previous state-of-the-art GPT-3.5 model.
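
To make the two-stage idea concrete, here is a minimal Python sketch of how such a pipeline could be wired together. It is an illustration under stated assumptions, not Amazon's implementation: the names `Example`, `multimodal_cot`, `toy_rationale_model`, and `toy_answer_model` are hypothetical, the "models" are trivial stand-in functions rather than trained networks, and the choice to append the stage-one rationale to the text given to stage two is an assumption not spelled out above.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple

# Hypothetical container for one multimodal multiple-choice question.
@dataclass
class Example:
    question: str
    options: Sequence[str]
    image_features: Sequence[float]  # placeholder for real vision features

# A "model" here is any callable mapping (text, image features) to text.
Model = Callable[[str, Sequence[float]], str]

def multimodal_cot(example: Example,
                   rationale_model: Model,
                   answer_model: Model) -> Tuple[str, str]:
    """Run the two stages in order and return (rationale, answer)."""
    text = f"Question: {example.question}\nOptions: " + " | ".join(example.options)

    # Stage 1: generate a rationale from the language and vision inputs.
    rationale = rationale_model(text, example.image_features)

    # Stage 2: infer the answer from the text plus the generated rationale
    # (assumed wiring), again together with the vision input.
    answer = answer_model(f"{text}\nRationale: {rationale}", example.image_features)
    return rationale, answer

# Toy stand-ins so the sketch runs end to end; real models would be trained.
def toy_rationale_model(text: str, feats: Sequence[float]) -> str:
    if sum(feats) > 0:
        return "The image shows a magnet attracting iron."
    return "No visual evidence."

def toy_answer_model(text: str, feats: Sequence[float]) -> str:
    # Picks option A only when the rationale supplies the visual cue.
    return "A" if "attracting" in text else "B"

if __name__ == "__main__":
    ex = Example("Which object will the magnet attract?",
                 ["iron nail", "plastic cup"],
                 [0.2, 0.5])
    rationale, answer = multimodal_cot(ex, toy_rationale_model, toy_answer_model)
    print(rationale)
    print(answer)
```

In this toy setup, the answer stage picks the right option only when the rationale stage has picked up something from the image features, which mirrors the article's point that visual input matters in both stages.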

Finally, the study by Amazon researchers emphasizes the significance of visual features: they help the model develop more effective rationales, which in turn contribute to more accurate answer inference. Amazon has clearly taken the lead in the race for the best chatbot solution, as other companies scramble to keep up.


