فهرست منبع

use specified costs when one of the strings is empty

If one of the strings is empty it's not correct to assume that the cost is the length of the other string since we the insertion and deletion costs are configurable and not necessarily 1.
Motti Lanzkron 5 سال پیش
والد
کامیت
106489c6b5
1فایلهای تغییر یافته به همراه3 افزوده شده و 2 حذف شده
  1. 3 2
      strsimpy/weighted_levenshtein.py

+ 3 - 2
strsimpy/weighted_levenshtein.py

@@ -18,6 +18,7 @@
 # OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
 # SOFTWARE.
 
+from functools import reduce
 from .string_distance import StringDistance
 
 
@@ -52,9 +53,9 @@ class WeightedLevenshtein(StringDistance):
         if s0 == s1:
             return 0.0
         if len(s0) == 0:
-            return len(s1)
+            return reduce(lambda cost, char: cost + self.insertion_cost_fn(char), s1, 0)
         if len(s1) == 0:
-            return len(s0)
+            return reduce(lambda cost, char: cost + self.deletion_cost_fn(char), s0, 0)
 
         v0, v1 = [0.0] * (len(s1) + 1), [0.0] * (len(s1) + 1)