Research
My research is in Conversational Agents, Large Language Models, and Dialogue Systems.
SMART: Self-Aware Agent for Tool Overuse Mitigation
Cheng Qian*, Emre Can Acikgoz*, Hongru Wang, Xiusi Chen, Avirup Sil, Dilek Hakkani-Tür, Gokhan Tur, Heng Ji
arXiv, 2025
arxiv
/ code
/ huggingface
Inspired by human metacognition, SMART enhances LLMs' self-awareness to reduce tool overuse while boosting performance. Our experiments show that SMARTAgent reduces tool use by 24% while improving performance by 37%.
Can a Single Model Master Both Multi-turn Conversations and Tool Use? CoALM: A Unified Conversational Agentic Language Model
Emre Can Acikgoz, Jeremiah Greer, Akul Datta, Ze Yang, William Zeng, Oussama Elachqar, Emmanouil Koukoumidis, Gokhan Tur, Dilek Hakkani-Tür
arXiv, 2025
arxiv
/ website
/ code
CoALM unifies multi-turn dialogue management and complex API usage in a single model. Trained on the CoALM-IT multi-task dataset, the CoALM models (8B, 70B, 405B) outperform domain-specific models as well as GPT-4o on the MultiWOZ 2.4, BFCL V3, and API-Bank benchmarks.
ReSpAct: Harmonizing Reasoning, Speaking, and Acting
Vardhan Dongre, Xiaocheng Yang, Emre Can Acikgoz, Suvodip Dey, Gokhan Tur, Dilek Hakkani-Tür
arXiv, 2024
arxiv
/ website
/ code
ReSpAct is a framework that enables LLM agents to engage in interactive, user-aligned task-solving, enhancing their ability to clarify ambiguities, adapt to user input, and act on feedback.
Hippocrates: An Open-Source Framework for Advancing Large Language Models in Healthcare
Emre Can Acikgoz, Osman Batur İnce, Rayene Bech, Arda Anıl Boz, Ilker Kesen, Aykut Erdem, Erkut Erdem
NeurIPS Workshop (Oral), 2024
arxiv
/ website
/ poster
We present Hippocrates, an open-source LLM framework developed specifically for the medical domain. We also introduce Hippo, a family of 7B medical models fine-tuned from Mistral and LLaMA2 through continual pre-training, instruction tuning, and reinforcement learning from human and AI feedback.
ViLMA: A Zero-Shot Benchmark for Linguistic and Temporal Grounding in Video-Language Models
Ilker Kesen, Andrea Pedrotti, Mustafa Dogan, Michele Cafagna, Emre Can Acikgoz, Letitia Parcalabescu, Iacer Calixto, Anette Frank, Albert Gatt, Aykut Erdem, Erkut Erdem
ICLR, 2024
arxiv
/ website
/ code
ViLMA (Video Language Model Assessment) is a comprehensive benchmark for Video-Language Models that pairs a fundamental comprehension test with a more advanced evaluation of temporal reasoning skills.
Talks
Huawei NLP/ML Community Seminar Series: Morphological Analysis with Large Language Models (2022, Virtual)
EMNLP MRL: Winning Paper Presentation (2022, Abu Dhabi)
Academic Service
Neural Information Processing Systems (NeurIPS) 2024, Reviewer
Empirical Methods in Natural Language Processing (EMNLP) 2022, Reviewer