  • CodeActAgent: Executable Code Actions Elicit Better LLM Agents

    We propose to use executable Python code to consolidate LLM agents’ actions into a unified action space (CodeAct). Integrated with a Python interpreter, CodeActAgent is an LLM agent that can execute code actions and dynamically revise prior actions or emit new actions upon new observations (e.g., code execution results) through multi-turn interactions (check out this example!).

    Citation
    • Xingyao Wang, Yangyi Chen, Lifan Yuan, Yizhe Zhang, Yunzhu Li, Hao Peng, Heng Ji. Executable Code Actions Elicit Better LLM Agents. Arxiv preprint, 2024.
    Link software, model, data, demo, paper
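    The act-observe-revise loop described above can be sketched as follows. This is only a minimal illustration, not the CodeActAgent implementation: `scripted_policy` is a hypothetical stand-in for the LLM, and `run_code_action` and `interact` are names invented for this sketch. Bare `exec` is used for brevity; a real agent would run code actions in a sandboxed interpreter.

```python
import contextlib
import io
import traceback


def run_code_action(code, namespace):
    """Execute one code action and return its observation (stdout or error)."""
    buffer = io.StringIO()
    try:
        with contextlib.redirect_stdout(buffer):
            exec(code, namespace)  # unsafe outside a sandbox; illustration only
    except Exception:
        # Use the last traceback line (e.g. "NameError: ...") as the observation.
        return buffer.getvalue() + traceback.format_exc().strip().splitlines()[-1]
    return buffer.getvalue().strip()


def scripted_policy(history):
    """Hypothetical stand-in for the LLM: chooses the next code action."""
    if not history:
        return "print(total / count)"  # first attempt uses undefined names
    last_observation = history[-1][1]
    if "NameError" in last_observation:
        # Revise the earlier action after observing the error.
        return "values = [3, 5, 10]\nprint(sum(values) / len(values))"
    return None  # stop: the last action succeeded


def interact(policy, max_turns=5):
    """Alternate between code actions and observations for up to max_turns."""
    namespace = {}  # shared across turns, so state persists between actions
    history = []
    for _ in range(max_turns):
        code = policy(history)
        if code is None:
            break
        observation = run_code_action(code, namespace)
        history.append((code, observation))
    return history


history = interact(scripted_policy)
for code, observation in history:
    print(repr(code), "->", observation)
```

    In this sketch the first action fails with a `NameError`, the error text is fed back as the observation, and the second action is a revised version that succeeds.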
  • MINT: Evaluating LLMs in Multi-turn Interaction with Tools and Language Feedback

    The MINT benchmark evaluates LLMs' ability to solve tasks through multi-turn interactions by (1) using tools and (2) leveraging natural language feedback.

    Citation
    • Xingyao Wang*, Zihan Wang*, Jiateng Liu, Yangyi Chen, Lifan Yuan, Hao Peng, Heng Ji. MINT: Evaluating LLMs in Multi-turn Interaction with Tools and Language Feedback. ICLR, 2024.
    Link software, website, paper
  • SmartBook: AI-Assisted Situation Report Generation

    Produces structured reports from large volumes of news data, with multiple hypotheses (claims) summarized and grounded to sources with factual evidence.

    Citation
    • Revanth Reddy, Yi Fung, Vicki Zeng, Manling Li, Ziqi Wang, Heng Ji. 2023. AI-Assisted Situation Report Generation. In submission.
    Link software, manual
  • MolT5: Translation between Molecules and Natural Language

    We present MolT5, a self-supervised learning framework for pretraining models on a vast amount of unlabeled natural language text and molecule strings.

    Citation
    • Carl Edwards, Tuan Lai, Kevin Ros, Garrett Honke, Kyunghyun Cho, Heng Ji. Translation between Molecules and Natural Language. EMNLP, 2022.
    Link software, manual
  • OneIE: A Joint Neural Model for Information Extraction with Global Features

    End-to-end Neural Information Extraction model for English, Spanish, and Chinese.

    Citation
    • Ying Lin, Heng Ji, Fei Huang, Lingfei Wu. 2020. A Joint Neural Model for Information Extraction with Global Features. Proceedings of The 58th Annual Meeting of the Association for Computational Linguistics.
    Link software, manual
  • Cross-lingual Entity Discovery and Linking

    Entity extraction and coreference for English, Spanish, and Chinese.

    Citation
    • Manling Li, Alireza Zareian, Ying Lin, Xiaoman Pan, Spencer Whitehead, Brian Chen, Bo Wu, Heng Ji, Shih-Fu Chang, Clare Voss, Daniel Napierski, Marjorie Freedman. 2020. GAIA: A fine-grained multimedia knowledge extraction system. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: System Demonstrations, pp. 77-86.
    • Xiaoman Pan, Boliang Zhang, Jonathan May, Joel Nothman, Kevin Knight and Heng Ji. 2017. Cross-lingual Name Tagging and Linking for 282 Languages. In Proc. ACL 2017.
    Link software
  • GAIA: A Fine-grained Multimedia Knowledge Extraction System

    Entity, relation, and event extraction with entity and event coreference for multilingual (English, Spanish, and Chinese) and multimedia corpora.

    Citation
    • Manling Li, Alireza Zareian, Ying Lin, Xiaoman Pan, Spencer Whitehead, Brian Chen, Bo Wu, Heng Ji, Shih-Fu Chang, Clare Voss, Daniel Napierski, Marjorie Freedman. 2020. GAIA: A fine-grained multimedia knowledge extraction system. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: System Demonstrations, pp. 77-86.
    Link software
  • SARAL Cross Lingual Resources

    We provide several cross-lingual resources such as multilingual embeddings and parallel corpora.

    Citation
    • Xiaoman Pan, Boliang Zhang, Jonathan May, Joel Nothman, Kevin Knight and Heng Ji. 2017. Cross-lingual Name Tagging and Linking for 282 Languages. In Proc. ACL 2017.
    • Xiaoman Pan, Thamme Gowda, Heng Ji, Jonathan May, and Scott Miller. 2019. Cross-lingual joint entity and word embedding to improve entity linking and parallel sentence mining. In Proc. DeepLo 2019.
    Link page
  • RESIN: Schema-Guided Cross-document Cross-lingual Cross-media Information Extraction and Event Tracking System

    Schema-Guided Cross-document Cross-lingual Cross-media Information Extraction and Event Tracking System.

    Citation
    • Haoyang Wen, Ying Lin, Tuan Lai, Xiaoman Pan, Sha Li, Xudong Lin, Ben Zhou, Manling Li, Haoyu Wang, Hongming Zhang, Xiaodong Yu, Alexander Dong, Zhenhailong Wang, Yi Fung, Piyush Mishra, Qing Lyu, Dídac Surís, Brian Chen, Susan Windisch Brown, Martha Palmer, Chris Callison-Burch, Carl Vondrick, Jiawei Han, Dan Roth, Shih-Fu Chang, Heng Ji. 2021. RESIN: A Dockerized Schema-Guided Cross-document Cross-lingual Cross-media Information Extraction and Event Tracking System. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Demonstrations, pp. 133-143.
    Link software
  • Multilingual Entity, Relation, Event and Human Value Extraction

    An end-to-end multilingual (English, Russian, and Ukrainian) knowledge extraction system that performs entity discovery and linking, relation extraction, event extraction, and coreference.

    Citation
    • Manling Li, Ying Lin, Joseph Hoover, Spencer Whitehead, Clare R. Voss, Morteza Dehghani, Heng Ji. 2019. Multilingual Entity, Relation, Event and Human Value Extraction. In NAACL-HLT 2019.
    • Tongtao Zhang, Ananya Subburathinam, Ge Shi, Lifu Huang, Di Lu, Xiaoman Pan, Manling Li, Boliang Zhang, Qingyun Wang, Spencer Whitehead, Heng Ji, Alireza Zareian, Hassan Akbari, Brian Chen, Ruiqi Zhong, Steven Shao, Emily Allaway, Shih-Fu Chang, Kathleen McKeown, Dongyu Li, Xin Huang, Kexuan Sun, Xujun Peng, Ryan Gabbard, Marjorie Freedman, Mayank Kejriwal, Ram Nevatia, Pedro Szekely, T.K. Satish Kumar, Ali Sadeghian, Giacomo Bergami, Sourav Dutta, Miguel Rodriguez, and Daisy Zhe Wang. 2018. GAIA - a multi-media multi-lingual knowledge extraction and hypothesis generation system. In Proc. TAC-KBP 2018.
    • Xiaoman Pan, Boliang Zhang, Jonathan May, Joel Nothman, Kevin Knight and Heng Ji. 2017. Cross-lingual Name Tagging and Linking for 282 Languages. In Proc. ACL 2017.
    • Ying Lin, Shengqi Yang, Veselin Stoyanov, and Heng Ji. 2018. A multi-lingual multi-task architecture for low-resource sequence labeling. In Proc. ACL 2018.
    • Ge Shi, Chong Feng, Lifu Huang, Boliang Zhang, Heng Ji, Lejian Liao, and Heyan Huang. 2018. Genre separation network with adversarial training for cross-genre relation extraction. In Proc. EMNLP 2018.
    • Boliang Zhang, Di Lu, Xiaoman Pan, Ying Lin, Halidanmu Abudukelimu, Heng Ji, and Kevin Knight. 2017. Embracing non-traditional linguistic resources for low-resource language name tagging. In Proc. IJCNLP 2017.
    • Tongtao Zhang, Heng Ji, and Avirup Sil. 2019. Joint Entity and Event Extraction with Generative Adversarial Imitation Learning. In Data Intelligence.
    • Tongtao Zhang, Hongzhi Li, Heng Ji, and Shih-Fu Chang. 2015. Cross-document event coreference resolution based on cross-media features. In Proc. EMNLP 2015.
    Link docker, knowledge extraction, event timeline, event heatmap, entity-relation graph, event recommendation
  • ELISA Information Extraction System for Low-resource Languages

    Information Extraction system for low-resource languages (282 languages as of September 2017, growing fast).

    Citation
    • Xiaoman Pan, Boliang Zhang, Jonathan May, Joel Nothman, Kevin Knight and Heng Ji. 2017. Cross-lingual Name Tagging and Linking for 282 Languages. In Proc. ACL 2017.
    • Ulf Hermjakob, Qiang Li, Jonathan May, Sebastian Mielke, Nima Pourdamghani, Michael Pust, Xing Shi, Kevin Knight, Daniel Marcu, Nikolaos Malandrakis, Anil Ramakrishna, Victor Martinez, Elisabeth Staruk, Tanner Sorensen, Dogan Can, Shrikanth Narayanan, Tomer Levinboim, Kenton Murray, David Chiang, Boliang Zhang, Xiaoman Pan, Di Lu, Lifu Huang, Xiaocheng Feng and Heng Ji. 2016. ELISA System Description for LoReHLT 2016. In Proc. NIST LoReHLT 2016 Workshop.
    • Boliang Zhang, Xiaoman Pan, Tianlu Wang, Ashish Vaswani, Heng Ji, Kevin Knight and Daniel Marcu. 2016. Name Tagging for Low-resource Incident Languages based on Expectation-driven Learning. In NAACL-HLT 2016.
    • Di Lu, Xiaoman Pan, Nima Pourdamghani, Shih-Fu Chang, Heng Ji and Kevin Knight. 2016. A Multi-media Approach to Cross-lingual Entity Knowledge Transfer. In Proc. ACL 2016.
    • Ying Lin, Xiaoman Pan, Aliya Deri, Heng Ji and Kevin Knight. 2016. Leveraging Entity Linking and Related Language Projection to Improve Name Transliteration. In Proc. ACL 2016 Workshop on Named Entities.
    Link resources, system API
  • RPI Joint Information Extraction System

    The English joint IE system can extract ACE types of entity mentions, relations and events jointly.

    Citation
    • Qi Li and Heng Ji. 2014. Incremental Joint Extraction of Entity Mentions and Relations. In Proc. ACL.
    • Qi Li, Heng Ji, Yu Hong and Sujian Li. 2014. Constructing Information Networks Using One Single Model. In Proc. EMNLP.
    • Qi Li, Heng Ji and Liang Huang. 2013. Joint Event Extraction via Structured Prediction with Global Features. In Proc. ACL.
    Link download, demo
  • RPI Entity Discovery and Linking System

    Biomedical entity linking to 300+ biological ontologies.

    Citation
    • Han Wang, Jin Guang Zheng, Xiaogang Ma, Peter Fox and Heng Ji. 2015. Language and Domain Independent Entity Linking with Quantified Collective Validation. In Proc. EMNLP.
    • Jin Guang Zheng, Daniel Howsmon, Boliang Zhang, Juergen Hahn, Deborah McGuinness, James Hendler and Heng Ji. 2014. Entity Linking for Biomedical Literature. BMC Medical Informatics and Decision Making.
    Link bio linker