Difference between revisions of "MGIZA++"

Latest revision as of 11:51, 22 September 2010

Introduction

MGIZA++ is a word-alignment tool based on the widely used GIZA++. Because GIZA++ runs as a single process and its training is time-consuming, MGIZA++ restructures GIZA++ to support a multi-threaded architecture.

Supported Word Alignment Models

IBM Model 1
IBM Model 2
IBM Model 3
IBM Model 4
IBM Model 5
IBM Model 6
Hidden Markov Model

Install

For more information about how to install MGIZA++, please go to Install MGIZA++.

Input File Format

The input format of MGIZA++ is almost the same as that of GIZA++; see GIZA++#Input_Format for details.

The only difference between the two tools is the cooccurrence file: MGIZA++ requires one for processing. To generate the cooccurrence file, run the command:

giza-pp/snt2cooc.out vocFile1 vocFile2 snt12
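
The command above can be wrapped in a small helper, sketched below. It assumes that `snt2cooc.out` writes the cooccurrence data to standard output, as the GIZA++ builds we are aware of do, so the output is redirected into a file. The function name `make_cooc`, the `SNT2COOC` override variable, and the output file name are hypothetical and not part of MGIZA++ or GIZA++.

```shell
#!/bin/sh
# Sketch only: make_cooc and SNT2COOC are illustrative names, not part of GIZA++.
# Assumption: snt2cooc.out prints the cooccurrence data on standard output.
make_cooc() {
    # $1 = first vocabulary file, $2 = second vocabulary file,
    # $3 = sentence file, $4 = cooccurrence file to create
    snt2cooc="${SNT2COOC:-giza-pp/snt2cooc.out}"
    "$snt2cooc" "$1" "$2" "$3" > "$4"
}
```

A call might look like `make_cooc vocFile1 vocFile2 snt12 snt12.cooc`, where `snt12.cooc` is a placeholder name. Check that the resulting file is not empty before starting training, since the redirection creates the file even if the tool fails.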

Parameters

The parameters for MGIZA++ are also almost the same as those of GIZA++; see GIZA++#Parameters.

Because MGIZA++ is meant for parallel GIZA++ training, the only additional parameter is the number of CPUs to use during processing. Specify it with -ncpu NUM.

Run

The configuration file is the same as for GIZA++. Please refer to GIZA++#Run.

mgiza-pp/mgiza configure.gizacfg -ncpu NUM
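
As a rough sketch of how NUM might be chosen, the helper below builds the command line shown above and, when no thread count is given, falls back to the number of online processors reported by `getconf`. The function name `mgiza_command` and the `MGIZA` override variable are hypothetical; the function only prints the command so that it can be inspected before it is run.

```shell
#!/bin/sh
# Sketch only: mgiza_command and MGIZA are illustrative names, not part of MGIZA++.
mgiza_command() {
    # $1 = GIZA++-style configuration file, $2 = optional thread count.
    # Without $2, use the online processor count, or 1 if it cannot be determined.
    ncpu="${2:-$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)}"
    printf '%s %s -ncpu %s\n' "${MGIZA:-mgiza-pp/mgiza}" "$1" "$ncpu"
}
```

For example, `mgiza_command configure.gizacfg 4` prints `mgiza-pp/mgiza configure.gizacfg -ncpu 4`. On a shared machine it may be preferable to pass an explicit count rather than claim every processor.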

Path Information on Verbs

MGIZA++
- /home/verbs/shared/stages/tools/mgiza-pp/

Reference

http://geek.kyloo.net/software/doku.php/mgiza:overview

