Sentence Alignment
The biggest issue I am facing in collecting the parallel corpus is sentence alignment.
Issues
- The sentences are unordered: the first sentence in one language might correspond to the third sentence in the other language.
- second one
Possible Solutions
- Use a set of pre-translated word pairs to match sentences across the two languages.
- Use numbers that appear in both sentences as anchors for matching.
- Crawl the entire Purnachandra Bhashakosh and build a dictionary from it.
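The first two ideas can be combined into a simple scoring scheme. The following is only a rough sketch, not a tested pipeline: the function names (`numbers_in`, `score`, `align`), the `min_score` threshold, whitespace tokenization, and the greedy one-to-one matching are all assumptions made for illustration, and the English/French sample data is a stand-in for the real corpus.

```python
import re

def numbers_in(sentence):
    # \d matches any Unicode decimal digit, and int() normalizes
    # digits from different scripts to the same integer value.
    return {int(tok) for tok in re.findall(r"\d+", sentence)}

def score(src, tgt, word_pairs):
    # Count dictionary pairs present on both sides, plus shared numbers.
    src_words = set(src.lower().split())
    tgt_words = set(tgt.lower().split())
    dict_hits = sum(1 for s, t in word_pairs if s in src_words and t in tgt_words)
    number_hits = len(numbers_in(src) & numbers_in(tgt))
    return dict_hits + number_hits

def align(src_sentences, tgt_sentences, word_pairs, min_score=1):
    # Score every (source, target) combination, then greedily keep the
    # best-scoring pairs so that each sentence is used at most once.
    candidates = [
        (score(s, t, word_pairs), i, j)
        for i, s in enumerate(src_sentences)
        for j, t in enumerate(tgt_sentences)
    ]
    candidates.sort(key=lambda c: (-c[0], c[1], c[2]))
    used_src, used_tgt, pairs = set(), set(), []
    for sc, i, j in candidates:
        if sc < min_score:
            break  # remaining candidates have no supporting evidence
        if i in used_src or j in used_tgt:
            continue
        used_src.add(i)
        used_tgt.add(j)
        pairs.append((i, j))
    return sorted(pairs)

# Hypothetical sample data: target sentences are out of order.
src = ["the cat sleeps", "he bought 3 books", "water is cold"]
tgt = ["il a acheté 3 livres", "l'eau est froide", "le chat dort"]
word_pairs = [("cat", "chat"), ("cold", "froide"), ("books", "livres")]

print(align(src, tgt, word_pairs))  # → [(0, 2), (1, 0), (2, 1)]
```

Scoring every pair is quadratic in the number of sentences, so this would likely need to be applied per document or per paragraph rather than across the whole corpus. Sentences with no dictionary or number overlap stay unaligned, which is where a larger crawled dictionary would help.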
Last update: 2023-03-27