Development of a Japanese-English Software Manual Parallel Corpus
This page lists the alignment data presented in:
- Tatsuya Ishisaka, Masao Utiyama, Eiichiro Sumita, and Kazuhide Yamamoto. (2009) Development of a Japanese-English Software Manual Parallel Corpus. MT Summit.
 
These alignment data are licensed under Creative Commons Attribution-Share Alike 3.0 Unported. Note, however, that you must also comply with the licenses of the original Japanese and English texts.
Each entry on this page has the following format:
Name of Software
- English site: URL / license
- Japanese site: URL / license
- Alignment data: je.tgz
 
je/
    align/: alignment files
        The format of each file is
            SCORE ||| NM ||| JA ||| EN
            ===============================================
            name        meaning
            -----------------------------------------------
            SCORE       Score of the alignment
            NM          Numbers of Japanese (N) and English (M) sentences
            JA          Japanese sentences
            EN          English sentences
            ===============================================
    para.txt: 1-1, 1-2, and 2-1 sentence alignments taken from align/*
        The format of this file is
            SCORE ||| JA ||| EN
    Japanese text is encoded in EUC.
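
As a rough illustration of the two record formats above, the following Python sketch splits one record into its fields. It rests on several assumptions that this page does not state: one record per line, a numeric SCORE, and "EUC" meaning Python's euc_jp codec. The names Alignment, ParaPair, parse_align_line, parse_para_line, and read_para are hypothetical helpers, not part of the distributed data. NM is kept as a raw string because its exact notation is not specified here.

```python
from typing import Iterator, List, NamedTuple


class Alignment(NamedTuple):
    """One record from a file under align/ (SCORE ||| NM ||| JA ||| EN)."""
    score: float
    nm: str   # sentence counts, kept verbatim (notation not specified)
    ja: str
    en: str


class ParaPair(NamedTuple):
    """One record from para.txt (SCORE ||| JA ||| EN)."""
    score: float
    ja: str
    en: str


def _split_fields(line: str, expected: int) -> List[str]:
    # Fields are separated by "|||"; surrounding whitespace is dropped.
    fields = [field.strip() for field in line.rstrip("\r\n").split("|||")]
    if len(fields) != expected:
        raise ValueError(
            f"expected {expected} fields, found {len(fields)}: {line!r}"
        )
    return fields


def parse_align_line(line: str) -> Alignment:
    # Assumption: SCORE parses as a floating-point number.
    score, nm, ja, en = _split_fields(line, 4)
    return Alignment(float(score), nm, ja, en)


def parse_para_line(line: str) -> ParaPair:
    score, ja, en = _split_fields(line, 3)
    return ParaPair(float(score), ja, en)


def read_para(path: str, encoding: str = "euc_jp") -> Iterator[ParaPair]:
    # Assumption: "EUC" on this page corresponds to the euc_jp codec,
    # and the file holds one record per line.
    with open(path, encoding=encoding) as handle:
        for line in handle:
            if line.strip():
                yield parse_para_line(line)
```

After unpacking je.tgz, something like `for pair in read_para("je/para.txt"): ...` would iterate over the sentence pairs. If decoding fails, the encoding assumption is the first thing to check.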
Alignment Data
FreeBSD
Gentoo_Linux
JM
JF
NetBeans
PEAR
PHP
PostgreSQL
Python
XFree86
Last updated: Wed Jul  1 14:20:58 JST 2009