Added small section on Unicode

Swaroop C H 12 years ago
parent
commit
82849ee9e8
1 changed file with 56 additions and 2 deletions

+ 56 - 2
14-io.md

@@ -201,9 +201,63 @@ module. This process is called *pickling*.
 Next, we retrieve the object using the `load` function of the `pickle`
 module which returns the object. This process is called *unpickling*.
 
+## Unicode ##
+
+So far, when we have been writing and using strings, or reading and
+writing to a file, we have used simple English characters only. If we
+want to read and write text in other languages as well, we need to
+use the `unicode` type, whose literals start with the character `u`:
+
+~~~
+>>> "hello world"
+'hello world'
+
+>>> type("hello world")
+<type 'str'>
+
+>>> u"hello world"
+u'hello world'
+
+>>> type(u"hello world")
+<type 'unicode'>
+~~~
+
+We use the `unicode` type instead of `str` to make sure that our
+programs can handle non-English languages. However, when we read or
+write to a file or when we talk to other computers on the Internet, we
+need to encode our Unicode strings into bytes that can be sent and
+received, and the encoding we use for that is called "UTF-8". We can
+read and write in that encoding by passing a simple keyword argument
+to the `open` function:
+
+~~~python
+# encoding=utf-8
+import io
+
+f = io.open("abc.txt", "wt", encoding="utf-8")
+f.write(u"नमस्ते दुनिया")
+f.close()
+text = io.open("abc.txt", encoding="utf-8").read()
+~~~
+
+How It Works:
+
+Whenever we write a program that uses Unicode literals like we have
+used above, we have to make sure that Python itself is told that our
+program uses UTF-8, so we have to put a `# encoding=utf-8` comment at
+the top of our program.
+
+Whenever we read from or write to a file, we specify `encoding="utf-8"`
+and then Python knows how to read or write the Unicode strings.
+
+You can learn more about this topic by reading the
+[Unicode howto](http://docs.python.org/3/howto/unicode.html) and
+watching
+[Ned Batchelder's Pragmatic Unicode talk](http://nedbatchelder.com/text/unipain.html).
+
 ## Summary ##
 
-We have discussed various types of input/output and also file handling
-and using the pickle module.
+We have discussed various types of input/output, file handling,
+the pickle module and Unicode.
 
 Next, we will explore the concept of exceptions.
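
The Unicode section added in this diff describes converting Unicode strings to UTF-8 before they are written to a file. The sketch below is an illustration of that idea and is not part of the commit. It shows the encode and decode steps explicitly, then the same round trip through a file. It assumes `io.open` is used so that the code behaves the same on Python 2 and Python 3, and it writes to a scratch directory created with `tempfile` rather than the `abc.txt` in the current directory; both choices are assumptions made for the example.

```python
# -*- coding: utf-8 -*-
import io
import os
import tempfile

# A Unicode string; the u prefix is required on Python 2 and accepted on Python 3.3+.
text = u"नमस्ते दुनिया"

# Encoding turns Unicode text into UTF-8 bytes that can be stored or sent.
data = text.encode("utf-8")

# Decoding turns those bytes back into the original Unicode text.
decoded = data.decode("utf-8")

# Scratch location for the example (an assumption, not from the chapter).
path = os.path.join(tempfile.mkdtemp(), "abc.txt")

# With encoding="utf-8", io.open performs the encode step on write...
with io.open(path, "wt", encoding="utf-8") as f:
    f.write(text)

# ...and the decode step on read.
with io.open(path, encoding="utf-8") as f:
    round_trip = f.read()

print(decoded == text and round_trip == text)
```

Passing `encoding="utf-8"` on both the write and the read is what keeps the round trip lossless; relying on the platform default encoding for either side may corrupt non-English text.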