Advanced Translation Technology Laboratory
Software and Data
- Atsushi Fujita. MTPEdocs-MQM: Span-based issue annotations based on MQM-like manual quality assessment of Japanese-to-English machine translation, October 22, 2024.
- Raphael Rubino. TexTra-MTQE: Software for machine translation quality estimation (MTQE), September 9, 2024.
- Hiroyuki Deguchi. semsis: A library for semantic similarity search, October 2023.
- Shohei Higashiyama. Japanese Entity Linking Corpus, under construction.
- Atsushi Fujita. MultiEnJa: Sample English documents in various content domains that are often dealt with by translation service providers, their translations in Japanese, and markups for machine translation quality estimation, February 24, 2023 - October 21, 2024.
- Atsushi Fujita. Staged PE Dataset (staged-PE), August 18, 2021.
- Benjamin Marie, Atsushi Fujita, and Raphael Rubino. Annotations for the Meta-Evaluation of Machine Translation Research (meta_evaluation_mt), July 13, 2021.
- Raj Dabre. YANMTT (yanmtt), June 2021.
- Shohei Higashiyama. JLexNorm (jlexnorm), May 6, 2021.
- Asian Language Treebank (ALT) Project
- Masao Utiyama. ParaNatCom --- Parallel English-Japanese abstract corpus made from Nature Communications articles, Nov 27, 2020.
- Aizhan Imankulova, Atsushi Fujita, and Kenji Imamura, JaRuNC: Japanese-Russian-English News Commentary Parallel Data, May 2019.
- Masao Utiyama and Jun Kawai. Nikkaji Parallel Corpus, May 7, 2018.
- Yusuke Oda, primitiv: A neural network toolkit, October 2017.
- Yusuke Oda, NMTKit, October 2017. / Neural Machine Translation via Binary Code Prediction, ACL 2017. / An Empirical Study of Mini-Batch Creation Strategies for Neural Machine Translation, 1st Workshop on Neural Machine Translation.
- Atsushi Fujita and Eiichiro Sumita, NICT QE/APE Dataset, October 2017.
- Lemao Liu and Atsushi Fujita, qenn: a word-level translation quality estimation system based on feedforward neural networks, July 2017.
- Atsushi Fujita, Lexpanded-PPDB: Lexically-Expanded Paraphrase Database, July 2016.
- Lemao Liu, JANUS: a Joint Agreement Neural Transduction System for sequence2sequence learning, November 2015.
- Taro Watanabe, trance: a transition-based neural network constituent parser, April 2014.
- Graham Neubig, lader: latent derivation reorder for pre-reordering of MT input, July 2012.
- Tatsuya Ishisaka, Masao Utiyama, Eiichiro Sumita, and Kazuhide Yamamoto. Japanese-English Software Manual Parallel Corpus, 2009.
- Masao Utiyama and Mayumi Takahashi. English-Japanese Translation Alignment Data (in Japanese), 2003.
- Masao Utiyama and Hitoshi Isahara. Alignment of Reuters Corpora, 2003.
- Masao Utiyama and Hitoshi Isahara, Textseg: text segmentation tool, 2000. / A Statistical Model for Domain-Independent Text Segmentation, ACL/EACL-2001, pp. 491--498. / README
- English Basic Travel Expression Corpus (BTEC), 20k sentences, License: CC BY 4.0. (README)