
Sentence Alignment

The biggest issue I am facing while collecting the parallel corpus is sentence alignment.


  1. The sentences are unordered: the first sentence in one language might correspond to the third sentence in the other language.
  2. second one

Possible Solutions

  1. Use a set of pre-translated word pairs to match sentences across the two languages.
  2. Use numbers that appear in both sentences to match them.
  3. Crawl the entire Purnachandra Bhashakosh and build a dictionary from it.
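
The first two ideas can be combined into a simple scoring scheme. The following is a minimal sketch, not a tested pipeline: the function names (`numbers_in`, `score`, `align`), the `min_score` threshold, and the toy transliterated sentences and word pairs are all hypothetical and chosen only for illustration. It assumes whitespace-separated words and a small source-to-target word-pair dictionary (which could later come from a crawled dictionary). Each candidate pair is scored by the number of known word pairs and shared numbers, and the best-scoring pairs are taken greedily.

```python
import re

# Punctuation stripped from word edges; includes the danda used in Indic scripts.
PUNCT = ".,!?;:\"'()।"


def numbers_in(sentence):
    """Return the set of integers appearing in a sentence.

    In Python 3, \\d matches Unicode decimal digits and int() converts them,
    so digits written in different scripts compare equal.
    """
    return {int(tok) for tok in re.findall(r"\d+", sentence)}


def words_in(sentence):
    """Lowercased whitespace tokens with edge punctuation removed."""
    tokens = (tok.strip(PUNCT).lower() for tok in sentence.split())
    return {tok for tok in tokens if tok}


def score(src, tgt, lexicon):
    """Count known word-pair hits plus numbers shared by both sentences."""
    tgt_words = words_in(tgt)
    word_hits = sum(1 for w in words_in(src) if lexicon.get(w) in tgt_words)
    number_hits = len(numbers_in(src) & numbers_in(tgt))
    return word_hits + number_hits


def align(src_sentences, tgt_sentences, lexicon, min_score=1):
    """Greedy one-to-one alignment: highest-scoring pairs are fixed first."""
    candidates = []
    for i, s in enumerate(src_sentences):
        for j, t in enumerate(tgt_sentences):
            sc = score(s, t, lexicon)
            if sc >= min_score:
                candidates.append((sc, i, j))
    # Best score first; ties broken by position for determinism.
    candidates.sort(key=lambda c: (-c[0], c[1], c[2]))

    used_src, used_tgt, pairs = set(), set(), []
    for _, i, j in candidates:
        if i not in used_src and j not in used_tgt:
            pairs.append((i, j))
            used_src.add(i)
            used_tgt.add(j)
    return sorted(pairs)


if __name__ == "__main__":
    # Hypothetical toy data: the target side is in a different order.
    src = ["The school opened in 1995.", "I drink water.", "The book is red."]
    tgt = ["bahi nali", "skula 1995 re kholila", "mun pani pie"]
    lexicon = {"water": "pani", "book": "bahi", "red": "nali", "school": "skula"}
    print(align(src, tgt, lexicon))  # [(0, 1), (1, 2), (2, 0)]
```

Sentences that share no known word pair and no number are left unaligned rather than forced into a pair, so the result is only as good as the coverage of the word-pair dictionary.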

Last update: 2023-03-27