Posts by Collection

portfolio

publications

Follow My Lead: Logical Fallacy Classification with Knowledge-Augmented LLMs

Published in The Twelfth Annual Conference on Advances in Cognitive Systems, 2025

This paper applied the ideologies from decision procedure by breaking down the complex logical fallacy description into atomic questions that can be answered with “yes” or “no”. This paper intends to explore whether we could enforce LLMs to perform system 2 thinking in this way. This paper was accepted by The Twelfth Annual Conference on Advances in Cognitive Systems, and Bay Area Machine Learning Symposium for poster presentation

Recommended citation: Wang, O. P., Bansal, T., Bai, R., Chui, E. M., & Gilpin, L. H. (2025). Follow My Lead: Logical Fallacy Classification with Knowledge-Augmented LLMs. arXiv preprint arXiv:2510.09970.
Download Paper

Bridging Legal Interpretation and Formal Logic: Faithfulness, Assumption, and the Future of AI Legal Reasoning

Published in Bloomberg LSLLAI 2026, 2026

This proposal presents a neuro-symbolic approach to legal AI that combines the expressive power of large language models with the rigor of formal verification, aiming to make AI-assisted legal reasoning both capable and trustworthy, thus reducing the burden of manual verification without sacrificing the accountability that legal practice demands.

Recommended citation: Wang, O. P., & Gilpin, L. H. (2026). Bridging Legal Interpretation and Formal Logic: Faithfulness, Assumption, and the Future of AI Legal Reasoning. ArXiv. https://arxiv.org/abs/2605.14049
Download Paper

Know Your Limits : On the Faithfulness of LLMs as Solvers and Autoformalizers in Legal Reasoning

Published in AI4Law Workshop at ICML 2026; AI4Math Workshop at ICML 2026, 2026

Large language models are increasingly tested as legal reasoning tools, formalizing contract clauses into strict logic and running them through automated solvers to verify whether one statement follows from another. Our new study re-annotating real non-disclosure agreements finds this added rigor is partly an illusion. Comparing five leading models across plain LLM judgment, LLM reasoning over formal logic, and an actual SMT solver (Z3), we found that formal structure boosted accuracy, but often because models reported solver-like conclusions without actually running the solver — a failure we call scope laundering, found in every model tested and in some cases over half the time (compounded by models missing logical constraints in their own formalizations and frequently generating faulty solver code), underscoring that looking logically grounded and being logically grounded aren’t the same thing as these tools edge closer to legal practice.

Recommended citation: Wang, O. P., Wong-Toropainen, S., Amrollahi, D., Bai, R., Bansal, T., Garg, A., & Gilpin, L. H. (2026). Know your limits: On the faithfulness of LLMs as solvers and autoformalizers in legal reasoning. arXiv. https://arxiv.org/abs/2606.16118
Download Paper