Language Identifiers
Thread poster: Erwin_Franz
Erwin_Franz
Latvia
Local time: 19:21
Russian to Latvian
Dec 10, 2014

Dear colleagues,

Do you know any language identifying tools for processing tmx files?


 
Jack Doughty  Identity Verified
United Kingdom
Local time: 17:21
Russian to English
In memoriam
Polyglot Dec 11, 2014

I don't know anything about .tmx files, but I use a language identifier called Polyglot 3000.
http://www.polyglot3000.com/

[Edited at 2014-12-11 08:48 GMT]

[Edited at 2014-12-11 08:49 GMT]


 
FarkasAndras  Identity Verified
Local time: 18:21
English to Hungarian
Interesting Dec 11, 2014

I didn't know there were GUI language ID tools.
I've played a little bit with a Perl module that does this:
http://search.cpan.org/~ambs/Lingua-Identify-0.56/lib/Lingua/Identify.pm
It seems to work pretty well.

If you need to do this automatically on a large number of files, I may be able to write a Perl script to do it.
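
To give a rough idea of what identifying a language from the text itself involves, here is a minimal Python sketch that counts common short words per language. It is only an illustration under stated assumptions: the tiny word lists and the `guess_language` function are made up for this example, and it is not claimed to be what Lingua::Identify or Polyglot 3000 do internally.

```python
import re

# Hypothetical, deliberately tiny stopword lists (assumption for illustration).
STOPWORDS = {
    "en": {"the", "and", "of", "to", "in", "is", "that", "it"},
    "de": {"der", "die", "und", "das", "ist", "nicht", "ein", "zu"},
    "lv": {"un", "ir", "ar", "par", "no", "uz", "kas", "tas"},
}


def guess_language(text):
    """Return the code whose stopwords occur most often, or None if none match."""
    words = re.findall(r"\w+", text.lower())
    scores = {
        lang: sum(word in stopwords for word in words)
        for lang, stopwords in STOPWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


print(guess_language("The cat is in the house"))
print(guess_language("Der Hund ist nicht in dem Haus"))
```

A real tool would need far larger word lists or character n-gram statistics, and very short segments will often be ambiguous.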

[Edited at 2014-12-11 11:55 GMT]


 
Rolf Keller
Germany
Local time: 18:21
English to German
The question is ?? Dec 11, 2014

In principle, any .tmx file includes language identifiers. So, what is the question?
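
For readers who have not looked inside a .tmx file, the following Python sketch shows where those identifiers usually sit: a `srclang` attribute on the `<header>` element and an `xml:lang` attribute on each `<tuv>` element (some older files use a plain `lang` attribute instead). The sample document and the `tmx_languages` function are invented for illustration.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the predefined XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Made-up sample file content (assumption for illustration).
SAMPLE_TMX = """<tmx version="1.4">
  <header srclang="en-US" datatype="plaintext" segtype="sentence"
          adminlang="en-US" o-tmf="example" creationtool="example"
          creationtoolversion="1.0"/>
  <body>
    <tu>
      <tuv xml:lang="en-US"><seg>Hello</seg></tuv>
      <tuv xml:lang="fr-FR"><seg>Bonjour</seg></tuv>
    </tu>
  </body>
</tmx>"""


def tmx_languages(tmx_text):
    """Return (header srclang, sorted list of language codes found on <tuv>)."""
    root = ET.fromstring(tmx_text)
    header = root.find("header")
    srclang = header.get("srclang") if header is not None else None
    codes = set()
    for tuv in root.iter("tuv"):
        # Fall back to a plain "lang" attribute, as seen in some older files.
        code = tuv.get(XML_LANG) or tuv.get("lang")
        if code:
            codes.add(code)
    return srclang, sorted(codes)


print(tmx_languages(SAMPLE_TMX))
```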

 
FarkasAndras  Identity Verified
Local time: 18:21
English to Hungarian
Mislabeled Dec 12, 2014

I assumed that the tmx files in question have incorrect or missing language identifiers.
On second thought, it may well be a case of having a bunch of tmx files (potentially hundreds or even thousands) and needing to sort them, e.g. find all the en-fr files among the lot based on the language codes. That could also be automated with software. It would be easier than recognizing them based on the text itself.
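
As a rough sketch of that kind of sorting, the Python below groups the .tmx files in a folder by the language codes found on their `<tuv>` elements, and sets aside files with no codes or with broken XML. The function names and the "unlabeled" / "unparseable" labels are assumptions for this example. Codes are reduced to their primary subtag (en-US becomes en) and joined alphabetically, so the key does not say which language is the source.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict
from pathlib import Path

# ElementTree exposes xml:lang under the predefined XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def language_pair(tmx_path):
    """Return a key such as 'en-fr', or 'unlabeled' if no codes are found."""
    root = ET.parse(tmx_path).getroot()
    codes = set()
    for tuv in root.iter("tuv"):
        code = tuv.get(XML_LANG) or tuv.get("lang")
        if code:
            # Keep only the primary subtag: "en-US" -> "en".
            codes.add(code.split("-")[0].lower())
    return "-".join(sorted(codes)) if codes else "unlabeled"


def group_by_pair(folder):
    """Map each language-pair key to the names of the .tmx files in folder."""
    groups = defaultdict(list)
    for path in sorted(Path(folder).glob("*.tmx")):
        try:
            key = language_pair(path)
        except ET.ParseError:
            key = "unparseable"
        groups[key].append(path.name)
    return dict(groups)
```

Calling `group_by_pair("some_folder")` returns a dictionary, so all the en-fr files are then available under the key "en-fr". Files that land in "unlabeled" would be the ones needing text-based identification.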


 

