Dingjie Song

I am a Ph.D. student in Computer Science and Engineering at Lehigh University, advised by Prof. Lichao Sun, and currently a summer research intern at Salesforce AI Research in Palo Alto. Previously, I was a research assistant in the CUHK-Shenzhen NLP group with Dr. Benyou Wang, and I received my M.E. from Nanjing University.

My research is in Large Language Models and Natural Language Processing, currently focused on LLM agents 🤖 and AI for scientific discovery 🔬 — building agents that learn and self-evolve in open-ended environments, and AI systems that automate and accelerate the research process end-to-end. This builds on my earlier work making multimodal and medical LLMs efficient, robust, and trustworthy. My work spans:

LLM Agents: [OpenSkill], [Agentic Robot]
AI Scientist & Scientific Discovery: [Dr. Claw 🦞], [AutoResearch AI], [Medical AI Scientist]
Multimodal LLM: [COLM 2024], [COLING 2025], [NAACL 2025], [ACL 2025], [EMNLP Findings 2025, ICML 2025 DIG-BUG Workshop Oral], [EMNLP Findings 2025], [ScratchMath]
Medical LLM: [NAACL 2024], [COLM 2024], [MICCAI 2025], [KDD 2026]

Earlier directions

Multilingual LLM: [NAACL 2024]
Task-oriented dialogue systems: [NLPCC 2023 Oral], [JCST 2023]

🦞 Featured — Dr. Claw: the best open-source IDE for AI-powered research. Carry a project from idea to publication with structured pipelines, multi-agent support (Claude Code, Gemini CLI, Codex, OpenRouter), and 100+ built-in research skills.

🌐 Website · ⭐ Star on GitHub · 📰 Featured by Lehigh Engineering

Updates

June 2026: 🌴 Started a summer research internship at Salesforce AI Research in Palo Alto!

June 2026: 🦾 OpenSkill released — open-world self-evolution for LLM agents! 📃 paper, code.

May 2026: 🎉🎉 LiveMedBench was accepted to KDD’26 (Datasets and Benchmarks Track)!

May 2026: 🦞 Dr. Claw open-sourced — a full-stack AI research workspace with multi-agent support and 100+ research skills; featured by Lehigh Engineering! 🌐 Website

May 2026: 📃 AutoResearch AI preprint released — toward AI-powered research automation for scientific discovery.

Mar 2026: 📃 New preprints released: Towards a Medical AI Scientist and ScratchMath.

Earlier updates

Aug 2025: 🎉🎉 MM-Detect and LongLLaVA were accepted to EMNLP Findings'25!

July 2025: 🎉🎉 SAMed-2 was accepted to MICCAI'25!

June 2025: 🎉🎉 MM-Detect was accepted to the ICML'25 DIG-BUG Workshop!

May 2025: 🎉🎉 Med-MAT was accepted to ACL'25 main conference!

Jan 2025: 🎉🎉 MLLM-Bench was accepted to NAACL'25 main conference!

Dec 2024: 🎉🎉 TRIM was accepted to COLING'25 main conference!

Nov 2024: MM-Detect 🕵️ released! MM-Detect is the first data contamination detection framework for MLLMs! More information can be found in the 📃 paper and on GitHub.

Sep 2024: TRIM ✂️ released! TRIM is a simple yet effective image token reduction method for efficient MLLMs! More information can be found in the 📃 paper, on 🤗 HuggingFace, and on GitHub.

Sep 2024: LongLLaVA 🐍🦙 released! LongLLaVA is the first MLLM with a hybrid architecture that can handle up to 1000 images! More information can be found in the 📃 paper, on 🤗 HuggingFace, and on GitHub. 🥈 #2 Paper of the Day on Hugging Face Daily Papers.

July 2024: 🎉🎉 Two papers, MileBench and HuatuoGPT2, were accepted to the COLM'24 main conference!

April 2024: MileBench 🛣️ released! MileBench is a pioneering benchmark designed to rigorously test the MultImodal Long-contExt capabilities of MLLMs. More information can be found on the 🌐 website, 📃 paper, 🤗 HuggingFace and the GitHub.

March 2024: 🎉🎉 Two papers, CMB and AceGPT, were accepted to the NAACL'24 main conference!

Before 2024

Nov 2023: HuatuoGPT2 released! Try it out on the 🌐 demo! HuatuoGPT2 employs an innovative domain adaptation method to significantly boost its medical knowledge and dialogue proficiency, and it achieves SOTA performance on several medical benchmarks, notably surpassing GPT-4 in expert evaluations and on recent medical licensing exams. More info can be found in the 📃 paper and on 🤗 HuggingFace.

Sep 2023: We released AceGPT, which achieved top performance among open-source Arabic language models in benchmark tests. More info can be found in the 📃 paper and on 🤗 HuggingFace.

Aug 2023: Check out our 📃 new paper, which benchmarks prevalent medical LLMs on their medical knowledge and clinical diagnostic capabilities. More information can be found on the 🌐 website and on 🤗 HuggingFace.

Jul 2023: Started my journey at CUHK-Shenzhen as a research assistant under the guidance of Benyou Wang.

Jun 2023: I defended my master's thesis and received my master's degree in software engineering. Thanks to all those who have supported me.

Aug 2022 - Apr 2023: Completed an internship with Jiaxing Zhang, working on LLM supervised fine-tuning (SFT).

Selected Publications

Google Scholar · See all publications →

LLM Agents

OpenSkill: Open-World Self-Evolution for LLM Agents
Zhiling Yan*, Dingjie Song*, Hanrong Zhang, Wei Liang, Yuxuan Zhang, Yutong Dai, Lifang He, Philip S. Yu, Ran Xu, Xiang Li, Lichao Sun
arXiv, Under Review, project page, code

AI Scientist & Scientific Discovery

Dr. Claw 🦞: A Full-Stack AI Research Workspace
“A Super AI Lab with massive AI Doctors as Assistants. Best IDE for Research via AI Power.” — carries a project from idea to publication, with structured pipelines, multi-agent support (Claude Code, Gemini CLI, Codex, OpenRouter), and 100+ built-in research skills.
🌐 Website · GitHub · 📰 Featured by Lehigh Engineering
Towards a Medical AI Scientist
Hongtao Wu*, Boyun Zheng*, Dingjie Song*, Yu Jiang, Jianfeng Gao, Lei Xing, Lichao Sun, Yixuan Yuan
arXiv, Under Review

Multimodal LLM

Both Text and Images Leaked! A Systematic Analysis of Multimodal LLM Data Contamination
Dingjie Song*, Sicheng Lai*, Mingxuan Wang, Shunian Chen, Lichao Sun, Benyou Wang
EMNLP Findings 2025, ICML 2025 DIG-BUG Workshop Oral, project page
LongLLaVA: Scaling Multi-modal LLMs to 1000 Images Efficiently via Hybrid Architecture
Xidong Wang*, Dingjie Song*, Shunian Chen, Chen Zhang, Benyou Wang
EMNLP Findings 2025, project page, code and data
Less is More: A Simple yet Effective Token Reduction Method for Efficient Multi-modal LLMs
Dingjie Song, Wenjun Wang, Shunian Chen, Xidong Wang, Michael Guan, Benyou Wang
COLING 2025, code and model
MileBench: Benchmarking MLLMs in Long Context
Dingjie Song, Shunian Chen, Guiming Hardy Chen, Fei Yu, Xiang Wan, Benyou Wang
COLM 2024, project page, code and data

→ Browse the full list of publications

Awards

Outstanding Graduate Student, Nanjing University, 2022
Yingcai Scholarship, Nanjing University, 2022
Renmin Scholarship (People’s Scholarship), Nanjing University, 2018-2021

More awards

Third Runner-Up, 15th Citi Cup Financial Innovation Application Competition, Citigroup, 2019
Second Runner-Up, 2019 "Chain to Future" University Blockchain Technology Application Competition, CCF, 2019
Outstanding Student Leader of the Communist Youth League, Nanjing University, 2018-2019

Services

Conference reviewer: EMNLP, ACL Rolling Review