
WIP #266 Fix regression reading multi byte utf8

Fixes a bug reading utf8 files, where the size of the field was being
passed to a function instead of the number of utf8 characters.
Probably a regression post 702.

Signed-off-by: Gavin Halliday <gavin.halliday@lexisnexis.com>
Gavin Halliday 13 years ago
parent
commit
ed041f2ea8
1 changed file with 11 additions and 0 deletions

+ 11 - 0
ecl/hqlcpp/hqltcppc.cpp

@@ -2466,6 +2466,17 @@ IHqlExpression * CCsvColumnInfo::getColumnExpr(HqlCppTranslator & translator, Bu
 
     type.setown(makeReferenceModifier(type.getClear()));
 
+    if (isUnicodeType(type))
+    {
+        //This is an ugly fix to change the size to the number of utf8-characters.
+        //Better would be to either perform the mapping (and validation) in the engines, or
+        //give it a string of encoding utf8 and extend the code generator to correctly handle those
+        //string/unicode conversions by using the codepage to codepage mapping function.
+        StringBuffer temp;
+        temp.appendf("rtlUtf8Length(%s,%s)", lenText.str(), dataText.str());
+        lenText.swapWith(temp);
+    }
+
     OwnedHqlExpr length = createQuoted(lenText.str(), LINK(sizetType));
     OwnedHqlExpr data = createQuoted(dataText.str(), type.getClear());
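
Note: the sketch below illustrates why the byte size and the utf8 character count differ, and what the generated length expression looks like after the patch. It is not code from the repository. `utf8Length` is a hypothetical stand-in for the runtime's `rtlUtf8Length` (only its argument order, size then data, is taken from the call in the patch), and `wrapLength` is a hypothetical helper that mirrors the `appendf` call using `std::string` instead of `StringBuffer`. It assumes well-formed utf8 and does no validation.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for rtlUtf8Length: counts utf8 characters in the
// first `size` bytes by counting every byte that is NOT a continuation
// byte (continuation bytes have the bit pattern 10xxxxxx).
// Assumption: the input is well-formed utf8; nothing is validated here.
static std::size_t utf8Length(std::size_t size, const char * data)
{
    std::size_t count = 0;
    for (std::size_t i = 0; i < size; i++)
    {
        if ((static_cast<unsigned char>(data[i]) & 0xC0) != 0x80)
            count++;
    }
    return count;
}

// Hypothetical helper mirroring the patch: wrap the quoted size expression
// so the generated code passes a character count rather than a byte size.
static std::string wrapLength(const std::string & lenText, const std::string & dataText)
{
    return "rtlUtf8Length(" + lenText + "," + dataText + ")";
}

// Small self-check: a field holding "naive" with a two-byte i-diaeresis,
// a space, and a three-byte euro sign occupies 10 bytes but 7 characters.
static void example()
{
    const char field[] = "na" "\xC3\xAF" "ve " "\xE2\x82\xAC";
    const std::size_t sizeInBytes = sizeof(field) - 1; // exclude the trailing NUL
    assert(sizeInBytes == 10);
    assert(utf8Length(sizeInBytes, field) == 7);
    assert(wrapLength("len", "data") == "rtlUtf8Length(len,data)");
}
```

Calling `example()` runs the checks. For pure ASCII fields the two values coincide, so a bug of this kind would only show up once a file contains multi byte characters.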