The Humanistic Buddhism Corpus v1.0 is a high-quality dataset containing 81,000 Chinese-English parallel phrases. The corpus is drawn from 9 sutras, 46 books, 34 booklets, and 7 DVDs. It covers modern and classical Chinese text on Humanistic Buddhism, including proverbs and poems from over 100 Buddhist scholars, as well as publications authored by Venerable Master Hsing Yun, the founder of the Fo Guang Shan (FGS) International Buddhist Order. Throughout his life, he advocated Humanistic Buddhism, which focuses on applying complicated Buddhist teachings in everyday life.
The Humanistic Buddhism Corpus (HBC) is split into a training set of 77k phrases, a validation set of 2k phrases, and a test set of 2k phrases. For details about the HBC, please read the LREC-COLING 2024 conference paper titled "Humanistic Buddhism Corpus: A Challenging Domain-Specific Dataset of English Translations for Classical and Modern Chinese".
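As a quick sanity check on the split, the short Python sketch below adds up the stated sizes and reports each split's share of the corpus. The figures are the rounded values given above (77k, 2k, 2k), so the exact counts in the released files may differ slightly. The names `SPLIT_SIZES` and `split_proportions` are illustrative only and are not part of the corpus distribution.

```python
# Rounded split sizes (in phrases) as stated in the corpus description.
# Assumption: the exact counts in the released files may differ slightly.
SPLIT_SIZES = {"train": 77_000, "validation": 2_000, "test": 2_000}


def split_proportions(sizes):
    """Return each split's share of the total as a fraction between 0 and 1."""
    total = sum(sizes.values())
    return {name: count / total for name, count in sizes.items()}


if __name__ == "__main__":
    print(f"total: {sum(SPLIT_SIZES.values())}")
    for name, share in split_proportions(SPLIT_SIZES).items():
        print(f"{name}: {share:.1%}")
```

With these rounded figures, the three splits add up to the 81,000 phrases stated for the corpus, and the training set accounts for roughly 95% of the data.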
The Humanistic Buddhism Corpus is intended for use in academic and research settings; it is not for use in for-profit industries. Researchers may use the corpus under the Creative Commons Attribution-NonCommercial-ShareAlike (CC BY-NC-SA) license. Go to Download HBC to fill out the agreement form. Information on how to download the corpus will be sent to you within 7 days of submitting the form.