Learning 2000 Chinese characters is not enough for reading newspapers, and the number of characters is not the problem

The statistic goes like this: in modern standard Chinese, such as newspaper articles, the 2000 most frequently used characters account for roughly 95-98% of all character occurrences. The usual conclusion is that students who learn these 2000 characters will be able to read Chinese newspapers.

Well, that really depends on how we interpret the word “read”. If we mean students are able to go through newspapers and recognise almost all of the Chinese characters there, then “yes”, they can read newspapers. If we mean students are able to comprehend what the articles are saying, then “no”, they still can’t read Chinese newspapers. 

What the statistic says and what it doesn’t say

The statistic usually looks like this:

Researchers typically identify 4000-5000 distinct Chinese characters in print or online. The cumulative frequency breakdown is usually as follows:

  • Top 250 characters 57.1% - 64.4%
  • Top 500 characters 72.1% - 79.2%
  • Top 1000 characters 86.2% - 91.1%
  • Top 1500 characters 92.4% - 95.7%
  • Top 2000 characters 95.6% - 97.9%
  • Top 3000 characters 98.3% - 99.4%
  • The remaining characters 0.6% - 1.7%

It seems that 2000 Chinese characters are quite sufficient. Why, then, do students still fail to understand what the newspapers are saying, even after working through a deck of flashcards containing 2000 characters? The reason is that the statistic does not tell the whole truth.

These statistics only count characters. They do not count words/combinations. It is as if we fed English texts into the same statistical machine and were told that all we need to learn is 26 letters! Unfortunately, nobody can read an English newspaper knowing only the 26 letters. They would be able to recognise each and every letter, but would completely fail to understand the text.
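The gap between the two kinds of counting can be shown with a small sketch. Everything in it is an assumption made for illustration: the toy sentence is segmented by hand, and the learner's known characters and known words are hypothetical.

```python
def char_coverage(words, known_chars):
    """Fraction of character occurrences in the text that the reader knows."""
    chars = [ch for word in words for ch in word]
    return sum(ch in known_chars for ch in chars) / len(chars)


def word_coverage(words, known_words):
    """Fraction of word occurrences in the text that the reader knows."""
    return sum(word in known_words for word in words) / len(words)


# Hand-segmented toy sentence: "I will go and buy things right away."
sentence = ["我", "马上", "去", "买", "东西"]

# Hypothetical learner: knows every character individually...
known_chars = set("我马上去买东西")
# ...but has not yet learned 马上 ("right away") or 东西 ("things") as words.
known_words = {"我", "去", "买"}

print(f"character coverage: {char_coverage(sentence, known_chars):.0%}")
print(f"word coverage: {word_coverage(sentence, known_words):.0%}")
```

In this toy case the learner recognises all seven characters but only three of the five words. Knowing 马 (horse) and 上 (up) does not tell you that 马上 means "right away", and knowing 东 (east) and 西 (west) does not tell you that 东西 means "things".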

Therefore, what we need to know is how many words/combinations these 2000 Chinese characters can create, since it is the words that enable us to understand newspaper articles. So far I have not seen any such statistics. However, we do have something on a much smaller scale, as it is easy to count all the Chinese characters and words/combinations in the Chinese Reading and Writing series.

Let’s take a look at the breakdown of each book in the series:

  • Chinese Reading and Writing 1, there are 70 characters, 158 words/combinations
  • Chinese Reading and Writing 2, there are 50 new characters, 166 new words/combinations
  • Chinese Reading and Writing 3, there are 50 new characters, 179 new words/combinations
  • Chinese Reading and Writing 4, there are 50 new characters, 221 new words/combinations
  • Chinese Reading and Writing 5, there are 50 new characters, 250 new words/combinations
  • Chinese Reading and Writing 6, there are 50 new characters, 325 new words/combinations

From one book to the next, the number of new Chinese characters stays the same, yet the number of new words/combinations keeps growing. In total, 320 Chinese characters have created 1,299 words/combinations, which are used to write all the sentences, conversations, and narratives in the series. There are also a lot of words/combinations that these 320 characters can make up but that I did not include in the series, because I wanted the books to be level-appropriate for beginner students who are just starting to learn how to read and write Chinese.
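The totals can be checked with a short tally. This is only a sketch: the figures are copied from the list above, and the "words per character" ratio is my own illustrative measure, not something defined in the books.

```python
# (new characters, new words/combinations) per book, copied from the list above
books = [(70, 158), (50, 166), (50, 179), (50, 221), (50, 250), (50, 325)]


def running_totals(books):
    """Return a list of (book number, total characters, total words) tuples."""
    totals = []
    chars = words = 0
    for number, (new_chars, new_words) in enumerate(books, start=1):
        chars += new_chars
        words += new_words
        totals.append((number, chars, words))
    return totals


for number, chars, words in running_totals(books):
    # Illustrative ratio: how many words each known character supports so far
    print(f"Book {number}: {chars} characters, {words} words, "
          f"{words / chars:.2f} words per character")
```

The running totals end at 320 characters and 1,299 words/combinations, and the ratio of words to characters rises from about 2.3 after the first book to about 4.1 after the sixth.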

Let’s get back to the 2000 Chinese character statistic. From the trend of an accelerating increase in words/combinations for a modest increase in Chinese characters, we can deduce that the 2000 most frequently used characters could possibly make up tens of thousands of words/combinations. And people who write for Chinese newspapers are free to use all of them. No wonder students still cannot read Chinese newspapers even when they can recognise almost all of the characters.

Learn more words, not just characters

Chinese texts are written with words/combinations, not individual characters. Therefore, accumulating 2000 individual characters won’t lead to comprehending Chinese newspapers. 

Also, when accumulating individual characters becomes the only goal of learning Chinese, students are likely to spend a huge amount of time and energy for tiny progress. They will probably spend too much time studying radicals, which is not necessary (read why learning radicals can be a waste of time here). Or they will resort to flash cards in the hope of learning Chinese characters quickly (also read about the problems of using flash cards to learn Chinese).

Therefore, if your goal is to be able to read Chinese newspapers or novels, you need to focus on words and combinations, as the key to reading any Chinese text is the ability to deconstruct the text into words. (Read more here: Deconstructing Chinese texts is the key to learn how to read Chinese)