Difference between revisions of "MGIZA++"

From CompSemWiki

Revision as of 11:25, 22 September 2010

Introduction

MGIZA++ is a word-alignment tool based on the well-known word-alignment software GIZA++. Because GIZA++ is single-threaded and its processing is time-consuming, MGIZA++ modifies the structure of GIZA++ to support a multi-threaded architecture.

Supported Word Alignment Models

  • IBM Model 1
  • IBM Model 2
  • IBM Model 3
  • IBM Model 4
  • IBM Model 5
  • IBM Model 6
  • Hidden Markov Model

Install

For more information about how to install MGIZA++, please go to Install MGIZA++.

Input File Format

The input format of MGIZA++ is almost the same as that of GIZA++; see GIZA++#Input_Format for details.

The only difference between the two programs concerns the cooccurrence file: MGIZA++ needs a cooccurrence file for processing. To generate the cooccurrence file, run the command

giza-pp/snt2cooc.out vocFile1 vocFile2 snt12
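
To make clearer what a cooccurrence file conceptually records, here is a minimal Python sketch that collects the word-ID pairs occurring together in the same sentence pair. It assumes the GIZA++ sentence-file layout of three lines per sentence pair (occurrence count, source word IDs, target word IDs). The function name `cooccurrences` is hypothetical and not part of GIZA++ or MGIZA++. The real snt2cooc.out may handle the NULL word, ordering, and output layout differently, so treat this as an illustration, not a replacement for the tool.

```python
from typing import Iterable, Set, Tuple


def cooccurrences(snt_lines: Iterable[str]) -> Set[Tuple[int, int]]:
    """Collect (source_id, target_id) pairs seen in the same sentence pair.

    Assumption: the input is a sequence of lines grouped in threes:
    occurrence count, source word IDs, target word IDs.
    """
    lines = list(snt_lines)
    pairs: Set[Tuple[int, int]] = set()
    # Walk the input one sentence pair (three lines) at a time.
    for i in range(0, len(lines) - 2, 3):
        source_ids = [int(tok) for tok in lines[i + 1].split()]
        target_ids = [int(tok) for tok in lines[i + 2].split()]
        # Every source word cooccurs with every target word of the pair.
        for s in source_ids:
            for t in target_ids:
                pairs.add((s, t))
    return pairs


# Two toy sentence pairs: source "2 3" with target "4 5",
# and source "2" with target "6".
example = ["1", "2 3", "4 5", "1", "2", "6"]
for s, t in sorted(cooccurrences(example)):
    print(s, t)
```

Restricting the alignment models to pairs that actually cooccur keeps the translation table far smaller than the full product of the two vocabularies, which is presumably why the file is computed once up front.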

Parameters

Run

Reference

http://geek.kyloo.net/software/doku.php/mgiza:overview