Publications
Developed an end-to-end multimodal reasoning framework with three core contributions: (1) built a scalable data synthesis pipeline (CoRe) that constructs fine-grained process-level supervision by generating targeted multimodal error/hallucination cases for verifier training; (2) updated the RL optimization recipe to incorporate dynamic verifier-guided process supervision during rollouts, improving credit assignment and training stability; and (3) introduced an inference-time collaborative reasoning algorithm where the guided verifier adaptively intervenes to provide targeted corrections and trajectory refinements, boosting robustness and efficiency beyond post-hoc verification.
Developed Canvas-of-Thought (Canvas-CoT), a stateful multimodal reasoning framework that transforms linear text generation into mutable state manipulation. The framework's core contributions include: (1) introducing a structured DOM-based reasoning substrate that supports atomic CRUD-style operations (Insert, Replace, Modify, Delete) for in-place state revisions, reducing the serialization tax of context regeneration; (2) integrating a rendering-based critique loop that provides explicit visual feedback to resolve spatial and geometric hallucinations that are difficult to articulate through text alone; and (3) implementing recurrent context optimization, which discards verbose thought traces to maintain a Markov-like persistent state, improving both reasoning precision and token efficiency in high-dimensional tasks.
Proposed Semantic-Condition Tuning (SCT), a framework that synthesizes context-aware semantic data from graph structures to enhance LLM reasoning. Designed a knowledge-enhanced data selection mechanism that utilizes LLM-generated descriptions to filter noisy topological context, enabling deep feature-level fusion of structural priors into textual representations.
Conducted research on semantic-aware Knowledge Graph (KG) completion and KG representation learning, focusing on structure-aware data synthesis and dynamic semantics modeling. Proposed Flow-Modulated Scoring (FMS), a framework that uses an energy-based Top-K data selection mechanism to filter noisy context and conditional flow matching to synthesize dynamic entity state representations, achieving state-of-the-art performance.
Led research on post-training data preparation, systematically surveying the full lifecycle of data pipelines for Supervised Fine-Tuning (SFT) and RLHF. Analyzed and categorized state-of-the-art methodologies for instruction synthesis, quality filtering, and preference alignment, establishing a rigorous taxonomy to guide the construction of high-quality alignment corpora.
Proposed Semantic-Aware Relational Message Passing (SARMP), a framework that addresses noise and over-smoothing in Knowledge Graph Completion. Designed a semantic-aware Top-K neighbor selection strategy to filter irrelevant edges based on latent similarity, enabling the precise aggregation of contextual cues via multi-head attention mechanisms.
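The Top-K selection and attention-based aggregation described for SARMP can be illustrated with a minimal sketch. This is not the published implementation: it assumes cosine similarity as the latent similarity measure, uses a single attention head rather than multiple heads, and all function names (`top_k_neighbors`, `attention_aggregate`) are hypothetical.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    norm = math.sqrt(dot(u, u)) * math.sqrt(dot(v, v))
    return dot(u, v) / norm if norm else 0.0


def top_k_neighbors(query, neighbors, k):
    """Indices of the k neighbors most similar to the query embedding.

    Assumption: cosine similarity stands in for the latent similarity.
    """
    ranked = sorted(
        range(len(neighbors)),
        key=lambda i: cosine(query, neighbors[i]),
        reverse=True,
    )
    return ranked[:k]


def attention_aggregate(query, neighbors, k):
    """Aggregate the selected neighbors with scaled dot-product attention.

    Single-head simplification of the multi-head mechanism.
    Returns (kept indices, attention weights, aggregated vector).
    """
    dim = len(query)
    keep = top_k_neighbors(query, neighbors, k)
    if not keep:
        return [], [], [0.0] * dim

    # Scaled dot-product scores over the retained neighbors only.
    scores = [dot(query, neighbors[i]) / math.sqrt(dim) for i in keep]

    # Numerically stable softmax.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Weighted sum of the retained neighbor embeddings.
    aggregated = [
        sum(w * neighbors[i][d] for w, i in zip(weights, keep))
        for d in range(dim)
    ]
    return keep, weights, aggregated
```

Filtering before attention means that dissimilar neighbors receive no weight at all, rather than a small softmax weight, which is one way to limit the noise that accumulates over repeated message-passing rounds.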
Contributed to the data visualization of benchmark statistics and experimental results. Assisted in the data cleaning pipeline by verifying that generated SQL queries execute, removing queries with syntax errors to maintain dataset reliability.
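An executability check of this kind can be sketched as follows. This is an illustration only: it assumes SQLite as the target engine and a known schema for each query, neither of which is stated above, and the helper names (`is_executable`, `filter_executable`) are hypothetical.

```python
import sqlite3


def is_executable(query, schema_sql):
    """Return True if the query runs without error against the schema.

    Each check uses a fresh in-memory database, so a query that
    modifies data cannot affect later checks.
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(schema_sql)   # build the empty tables
        conn.execute(query).fetchall()   # raises on syntax or schema errors
        return True
    except sqlite3.Error:
        return False
    finally:
        conn.close()


def filter_executable(queries, schema_sql):
    """Keep only the queries that execute successfully."""
    return [q for q in queries if is_executable(q, schema_sql)]
```

Running against empty tables catches syntax errors and references to missing tables or columns, but it does not confirm that a query returns the intended rows.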
Oversaw the overall design and scheduling of the research project.
