Learning 2000 Chinese characters is not enough for reading newspapers, and the number of characters is not the problem
A frequently quoted statistic says that in modern standard Chinese, such as newspaper articles, the 2000 most frequently used characters account for roughly 95-98% of all character occurrences. The usual conclusion is that students who learn these 2000 characters will be able to read Chinese newspapers.
Well, that really depends on how we interpret the word “read”. If we mean students are able to go through newspapers and recognise almost all of the Chinese characters there, then “yes”, they can read newspapers. If we mean students are able to comprehend what the articles are saying, then “no”, they still can’t read Chinese newspapers.
What the statistic says and what it doesn’t say
What the statistic tells us is normally like this:
Researchers typically identify 4000-5000 Chinese characters in print or online. The breakdown of Chinese character frequency is normally like the following:
On these numbers, 2000 Chinese characters look quite sufficient. Then why can students still not understand what the newspapers are saying, even after going through a deck of flashcards containing 2000 characters? The reason is that the statistic does not tell the whole truth.
These statistics only count characters. They do not count words/combinations. It is as if we fed English texts into the same statistical machine and got the result that all we need to learn is 26 letters! Unfortunately, nobody can read an English newspaper knowing only the 26 letters. They would recognise each and every letter, yet completely fail to understand the text.
Therefore, what we need to know is how many words/combinations these 2000 Chinese characters can create, since it is those words that enable us to understand newspaper articles. So far I have not seen any such statistics. However, we have something on a much smaller scale, as it is easy to count all the Chinese characters and words/combinations in the Chinese Reading and Writing series.
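The gap between counting characters and counting words can be made concrete with a small sketch. Everything in it is an assumption made for illustration: the sample sentence, its hand-made segmentation into words, and the learner's lists of known characters and known words are invented here and do not come from any published frequency study.

```python
def char_coverage(text, known_chars):
    """Fraction of character occurrences in `text` that are in `known_chars`."""
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return 0.0
    return sum(c in known_chars for c in chars) / len(chars)


def word_coverage(words, known_words):
    """Fraction of word occurrences in `words` that are in `known_words`."""
    if not words:
        return 0.0
    return sum(w in known_words for w in words) / len(words)


# Hypothetical sentence, already segmented into words by hand:
# 明天 (tomorrow) / 公司 (company) / 开会 (hold a meeting)
segmented = ["明天", "公司", "开会"]
text = "".join(segmented)

# Assumed learner: recognises all six characters, but has only met one
# of the three words as a word.
known_chars = set("明天公司开会")
known_words = {"明天"}

char_cov = char_coverage(text, known_chars)
word_cov = word_coverage(segmented, known_words)
print(f"character coverage: {char_cov:.0%}")
print(f"word coverage: {word_cov:.0%}")
```

For this made-up learner the character coverage is 100% while the word coverage is only about a third, which is the same situation as recognising every letter of an English sentence without knowing the words.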
Let’s take a look at the breakdown of each book in the series:
From one book to the next, the increase in Chinese characters stays the same, while the increase in words/combinations is huge. 320 Chinese characters have created 1,299 words/combinations, which are used to write all the sentences, conversations, and narratives in the series. There are also a lot of words/combinations which these 320 characters can make up but which I did not include in the series, because I wanted the books to stay level-appropriate for beginner students who are just starting to learn how to read and write Chinese.
Let's get back to the 2000-character statistic. Since the number of words/combinations grows far faster than the number of characters, we can deduce that the 2000 most frequently used characters could possibly make up tens of thousands of words/combinations. And people who write for Chinese newspapers are free to use all of them. No wonder students still cannot read Chinese newspapers even when they can recognise almost all of the characters.
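The kind of count described above can be sketched in a few lines. This is only an illustration under stated assumptions: the five characters, their learning order, and the 15-entry word list are hand-picked here and are not taken from the Chinese Reading and Writing series, and the helper name `spellable_words` is hypothetical.

```python
def spellable_words(word_list, known_chars):
    """Return the words that use only characters from `known_chars`."""
    return [w for w in word_list if all(c in known_chars for c in w)]


# Assumed learning order: one new character per step.
learning_order = ["大", "人", "小", "学", "生"]

# Tiny hand-picked word list (single characters count as words too).
word_list = [
    "大", "人", "小", "学", "生",
    "大人", "小人", "大小", "人人", "大学",
    "小学", "学生", "人生", "大学生", "小学生",
]

known = set()
counts = []
for ch in learning_order:
    known.add(ch)
    counts.append(len(spellable_words(word_list, known)))
    print(f"{len(known)} characters -> {counts[-1]} words")
```

Each step adds exactly one character, yet the number of words that can be written grows by three or more at a time, because every new character combines with the ones already learned. A real count would need a full dictionary or a segmented corpus in place of this toy list.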
Learn more words, not just characters
Chinese texts are written with words/combinations, not individual characters. Therefore, accumulating 2000 individual characters won’t lead to comprehending Chinese newspapers.
Also, when accumulating individual characters becomes the only goal of learning Chinese, students are likely to spend a huge amount of time and energy for tiny progress. They will probably spend too much time studying radicals (a separate article explains why learning radicals can be a waste of time). Or they will resort to flash cards in the hope of learning Chinese characters quickly (another article covers the problems of using flash cards).
Therefore, if your goal is to be able to read Chinese newspapers or novels, you need to focus on words and combinations, as the key to reading any Chinese text is the ability to deconstruct the text into words.
© 2022 MSL Master. All Rights Reserved