Description
Hello all. I'm looking for the best method for replacing incomplete words in a sentence.
One idea: use SequenceMatcher from Python's difflib, which reports a similarity ratio between 0 and 1.
Example in French: intelligen (the final 't' of intelligent is not pronounced, so it is not heard)
    import difflib
    ratio = difflib.SequenceMatcher(None, 'intelligen', 'intelligent').ratio()
    # ratio -> 0.9523809523809523
If the ratio == 1.0, pass (the word is good). If the ratio >= a defined threshold, replace the word with the dictionary word. If it is below the threshold, pass, report an error, or return nothing.
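The rule above could be sketched roughly like this. The function name `correct_word`, the `threshold` default of 0.9 and the tiny `vocabulary` list are assumptions made up for illustration, not part of any existing library:

```python
import difflib

def correct_word(word, vocabulary, threshold=0.9):
    """Return (word_to_use, best_ratio, status) for one heard word.

    Hypothetical helper: the name, threshold and status strings are assumptions.
    """
    best_word, best_ratio = None, 0.0
    # Compare the heard word with every dictionary word and keep the best score.
    for candidate in vocabulary:
        ratio = difflib.SequenceMatcher(None, word, candidate).ratio()
        if ratio > best_ratio:
            best_word, best_ratio = candidate, ratio
    if best_ratio == 1.0:
        return word, best_ratio, "ok"            # exact dictionary word
    if best_ratio >= threshold:
        return best_word, best_ratio, "replaced"  # close enough: use the dictionary word
    return word, best_ratio, "unknown"            # too far: leave it and let the caller decide

vocabulary = ["tu", "es", "intelligent"]  # stand-in for the real dictionary
print(correct_word("intelligen", vocabulary))
print(correct_word("intailigen", vocabulary))
```

With this toy vocabulary, "intelligen" is replaced by "intelligent", while "intailigen" stays as it is and gets the "unknown" status.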
Or, in my case, add a marker to the result:
    src = tu es intelligent
    res = tu es intailigen
    (ratio = 0.7619047619047619 for intailigen, below the required ratio)
    corrected_result = u'tu es intailigen,bad'
My bot reads the sentence, says that it heard a bad question, and saves the sentence to a log for later model corrections.
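A minimal sketch of that sentence-level tagging, assuming a whitespace split is good enough as tokenization. `correct_sentence`, the logger name and the 0.9 threshold are hypothetical; `difflib.get_close_matches` is the standard-library call that returns the best dictionary words at or above a cutoff:

```python
import difflib
import logging

logger = logging.getLogger("asr_corrections")  # hypothetical logger name

def correct_sentence(sentence, vocabulary, threshold=0.9):
    """Fix near-miss words; append ',bad' and log the sentence if any word is too far off."""
    corrected = []
    has_bad_word = False
    for word in sentence.split():
        # n=1 keeps only the best candidate; an empty list means nothing reached the cutoff.
        matches = difflib.get_close_matches(word, vocabulary, n=1, cutoff=threshold)
        if matches:
            corrected.append(matches[0])
        else:
            corrected.append(word)
            has_bad_word = True
    result = " ".join(corrected)
    if has_bad_word:
        logger.warning("unrecognized word(s) in: %s", sentence)  # kept for model corrections
        result += ",bad"
    return result

vocabulary = ["tu", "es", "intelligent"]  # stand-in for the real dictionary
print(correct_sentence("tu es intelligen", vocabulary))
print(correct_sentence("tu es intailigen", vocabulary))
```

The bot can then check whether the result ends with ",bad" before deciding what to say.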
I could also compare the complete sentence: the ratio would be very different, and I could completely replace the whole sentence with the dictionary one. Example:
    src = alfred diriges toi vers le salon
    res = dirige toi ver le salo alfre
    ratio = 0.7333333333333333
Of course, this is only feasible for a limited corpus, with all possible sentences written in words.txt.
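For the whole-sentence variant, something like the following might work. It is only a sketch: `match_sentence`, the 0.7 threshold and the second corpus sentence are invented for illustration, and the list stands in for the contents of words.txt:

```python
import difflib

def match_sentence(heard, known_sentences, threshold=0.7):
    """Return the closest known sentence, or None if nothing reaches the threshold."""
    matches = difflib.get_close_matches(heard, known_sentences, n=1, cutoff=threshold)
    return matches[0] if matches else None

# In practice these lines would be read from words.txt (one sentence per line).
known_sentences = [
    "alfred diriges toi vers le salon",
    "alfred allume la lumiere",  # hypothetical extra entry
]

print(match_sentence("dirige toi ver le salo alfre", known_sentences))
print(match_sentence("bonjour", known_sentences))
```

The threshold has to be lower here than for single words, because a moved word ("alfre" at the end) costs a lot in a character-level comparison.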
It seems a bit heavy, with loops over every word, but Python handles this well...
Is there a better method (a deep learning one?) that I could learn? (In Python!)
Thanks all