WIP #266 Fix regression reading multi byte utf8

Fixes a bug reading UTF-8 files, where the size of the field in bytes was
being passed to a function instead of the number of UTF-8 characters.
Probably a regression introduced after 702.

Signed-off-by: Gavin Halliday <gavin.halliday@lexisnexis.com>
Gavin Halliday, 14 years ago
parent
commit
ed041f2ea8
1 file changed, with 11 additions and 0 deletions

+ 11 - 0
ecl/hqlcpp/hqltcppc.cpp

@@ -2466,6 +2466,17 @@ IHqlExpression * CCsvColumnInfo::getColumnExpr(HqlCppTranslator & translator, Bu
 
     type.setown(makeReferenceModifier(type.getClear()));
 
+    if (isUnicodeType(type))
+    {
+        //This is an ugly fix to change the size to the number of utf8-characters.
+        //Better would be to either perform the mapping (and validation) in the engines, or
+        //give it a string of encoding utf8 and extend the code generator to correctly handle those
+        //string/unicode conversions by using the codepage to codepage mapping function.
+        StringBuffer temp;
+        temp.appendf("rtlUtf8Length(%s,%s)", lenText.str(), dataText.str());
+        lenText.swapWith(temp);
+    }
+
     OwnedHqlExpr length = createQuoted(lenText.str(), LINK(sizetType));
     OwnedHqlExpr data = createQuoted(dataText.str(), type.getClear());
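
The distinction the fix relies on, byte size versus UTF-8 character count, can be illustrated with a minimal sketch. The helper below is hypothetical and is not the runtime's rtlUtf8Length; it only assumes that the generated call takes a byte length and a data pointer, as the appendf format string above suggests, and that the input is well-formed UTF-8.

```cpp
#include <cstddef>

// Hypothetical helper for illustration only (not the HPCC runtime function).
// Counts UTF-8 characters (code points) in the first `size` bytes of `data`
// by counting every byte that is not a continuation byte (10xxxxxx).
// Assumes well-formed UTF-8; no validation is performed.
static std::size_t countUtf8Chars(std::size_t size, const char * data)
{
    std::size_t count = 0;
    for (std::size_t i = 0; i < size; i++)
    {
        unsigned char c = static_cast<unsigned char>(data[i]);
        if ((c & 0xC0) != 0x80)   // lead byte or ASCII byte starts a new character
            count++;
    }
    return count;
}
```

For a field holding "café" encoded as UTF-8, the byte size is 5 but the character count is 4, so passing the byte size where a character count is expected overstates the length for any multi-byte data.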