About Me

Thanks for visiting Zhengzhong Liu’s page (people often refer to me by my nickname, Hector). I am currently the Head of Engineering at Petuum.

Prior to Petuum, I was a PhD researcher in Natural Language Processing and Computational Linguistics at Carnegie Mellon University, working with Professor Teruko Mitamura and Professor Eduard Hovy. I also worked closely with Professor Eric Xing on the CASL project. I obtained my bachelor’s degree from the Department of Computing at the Hong Kong Polytechnic University, where I started to learn about NLP by working with Professor Qin Lu.

News

  • Dec 2023: Two models, Amber and CrystalCoder, were announced as the first models in the LLM360 open-source initiative.
  • Dec 2023: We announced LLM360, a project that open-sources our LLM training efforts. Our goal is to make it a release framework with all the details (that we know).
  • Feb 23-24, 2023: The CASL workshop returned! The second CASL workshop featured in-depth discussions on AI at scale in industry. Stay tuned for the videos!
  • Oct 13-15, 2022: We organized the first CASL workshop (News), with an amazing list of speakers!
  • Recently, we combined several OSS projects (Petuum and ASYML) and launched the CASL (Composable, Automatic, and Scalable ML) open-source effort, led by Professor Eric Xing, unifying our efforts on building the Production and Industrial AI Platform.

Research

Many people in the NLP/computational linguistics field have had to change their mindset due to the power of LLMs, myself included. I am fortunate to lead the LLM training team at MBZUAI. I am now very interested in how knowledge and abilities can be learned by LLMs via the next-token prediction task. This is also one direction the LLM360 project is pursuing. I am working with a few talented researchers on this problem; code and results will be made available in the Analysis360 repository.

Prior to the LLM era, I believed that understanding linguistic problems would allow one to apply the proper computational methods to solve them. During my PhD, I focused on modeling event semantics, which includes event detection (TAC’17, TAC’15), anaphora resolution (NAACL’16, NAACL’16, LREC’14), script modeling (COLING’18), and salience modeling (EMNLP’18).

I am broadly interested in solving problems by combining machine learning techniques with linguistic insights and human knowledge. In the early days, I worked on semantic web problems such as entity disambiguation (WWW’12, entity linking) and slot filling. In particular, I love incorporating knowledge into NLP problems, such as in information retrieval (SIGIR’18, CIKM’17) or core NLP tasks (ACL’16).

Open Source

I am also a fan of developing high-quality, open-source NLP toolkits. I have recently worked on the following projects:

  1. LLM360: A framework for open-sourcing the full construction process of LLMs to foster transparency, trust, and collaborative research.
  2. Texar: A modularized toolkit for neural-network-based text generation and more. Texar was nominated for the best demo paper at ACL 2019.
  3. Forte: A flexible and highly composable NLP toolkit for a wide range of text applications (including IR, NLU, and NLG). Check out our AAAI 2020 and GTC 2020 tutorials for the design philosophy.
  4. Stave: A general-purpose, modern annotation toolkit (under development).

The ASYML project is now part of the CASL project!

Community

Over the years, I have served on the Program Committee (as a reviewer) for ACL, NAACL, EMNLP, NLPCC, and CoNLL, and was named an outstanding reviewer just once, at ACL 2019.