Estimating Known Characters/Words (Measuring Chinese Reading Proficiency)

In short, there are a few nice websites:

  1. https://hanzitest.herokuapp.com/
    27 July 2023 update: Now at hanzitest.ericjiang.com
  2. https://www.arealme.com/how-many-chinese-characters-do-i-know/cn/
  3. http://hanzishan.com/
  4. https://www.hsklevel.com/
  5. https://wordswing.com/how-many-characters-do-you-know/lets_see

So I’ve been working hard at improving my Chinese while I’m at home and my parents have the patience and skills to help me out on a daily basis. I remember going to visit my relatives in China and not being able to express more than that I was still in grad school and also cared about the economy (or something vague like that). It was really embarrassing, as it forced me to realize how much I relied on the English part of Chinglish with my parents, and made me really wish I could express myself better.
(For context, I’m an ABC – American Born Chinese – and learned Chinese through Saturday school, which was organized by local Chinese parents at a nearby Church that had lent us some space. OMG and they had an ice cream vending machine so tasty).

Anyway, depressingly, I probably only know about 1000-1500 characters :'( I guess a nice goal would be 5,000 characters… I wonder how many I knew when I took my HSK test like a decade ago. Note that only one of these tests (HSKLevel) attempted to estimate words known (vs characters).

Current Status

Anyway for my reference (aka I will redo these estimators again in a year and hopefully see that I get more correct)

https://hanzitest.herokuapp.com/
https://www.arealme.com/how-many-chinese-characters-do-i-know/cn/ — This one was impossible. Clearly aimed at native speakers (well, the UI being in Chinese was a strong sign lol).

The Chinese at the bottom says “汉语局外人 秩序白银” (Pinyin: “Hànyǔ júwàirén zhìxù báiyín”) = “Chinese Outsider, Order Silver”. Taking this test was definitely An Experience, highly recommend… they have devilish multiple choice questions like, given four (handwritten) characters, “which of these is NOT a real character” 0:

https://www.hsklevel.com/ I guess I was a little too kind to myself grading. But note that they are saying words here, not characters.
https://wordswing.com/how-many-characters-do-you-know/

Speaking of Wordswing, besides this character test, it has a fun new way to learn Chinese: Interactive Fiction! Love the idea. Specifically, there’s WordSwing Chinese (Website). It’s just a normal interactive fiction interface, but with a built-in popover Chinese dictionary. Very cool!

Back to WordSwing’s character test, if you click on “stats”:

https://wordswing.com/how-many-characters-do-you-know/stats

Apps

If you use “Dong Chinese”, it will also track your progress on characters read / written.

Dong Chinese app

Reading Practice

Pleco Screen Reader + NYTimes Chinese is a great combination!!
(Note: I did pay for Pleco so I’m not sure if this “Screen Reader” tool is a default free feature)

1. Open the NYTimes Chinese App (I selected Dual English/Chinese mode).
2. Select the “Screen Reader” Pleco tool.
3. It pops up all the detected words, as “selectable” text.
4. Press a word and get the Pleco dictionary result!

Anki

Anki has been eating like an hour a day to go over my cards (this is because I’m learning writing, so I have to physically write out the word!), + 3-4 hours once a week to pick out my misspelled words, add pinyin, and create cards… This is unsustainable – I want closer to 20 mins a day (so 15 mins by the estimated time). Finally today I got around to adjusting settings.

New Anki settings to take less time:
New cards: 3 (instead of 20)
Max reviews per day: 20 (instead of 200)
Interval modifier: 150% (instead of 100%)
Lapses > New Interval: 70% (instead of 0%)
Tools > Preferences > Scheduling: New cards after Reviews (not sure what the default is)

Compare to today, where I spent 45 minutes, studied 200 cards (out of 366 total cards). Of those 200, I reviewed 100 cards, and learned 60 cards (? I guess this is different than new cards, which was limited to 20), and relearned 39 cards.

So to cut the time to 1/3 (45 -> 15 mins), I need closer to 30 review cards. I’ll be conservative for now and put it to 20 reviews. That should save back some of the time I lost on all this over the next few days (I’ll probably use it all on making cards lol, but hopefully this will be more manageable long-term and I can spend more time on stuff I enjoy like reading).
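For reference, here is the back-of-the-envelope math as a small Python sketch. It assumes every card takes about the same time regardless of type (which is surely not quite true), and the function name `scale_to_budget` is just something I made up for illustration. The card counts are the ones from today's session above (they add up to 199, not quite the 200 Anki reported).

```python
# Numbers from the 45-minute session described above.
session_minutes = 45
cards_by_type = {"review": 100, "learn": 60, "relearn": 39}


def scale_to_budget(cards_by_type, session_minutes, target_minutes):
    """Scale each card count by target/session time.

    Assumption: time per card is constant, so card counts shrink
    in proportion to the time budget.
    """
    ratio = target_minutes / session_minutes
    return {kind: round(count * ratio) for kind, count in cards_by_type.items()}


# Average time spent per card in the original session.
seconds_per_card = session_minutes * 60 / sum(cards_by_type.values())

# Card counts that would fit in a 15-minute session at the same pace.
budget = scale_to_budget(cards_by_type, session_minutes, 15)

print(f"{seconds_per_card:.1f} s per card")
print(budget)
```

Under that assumption, a 15-minute session fits roughly 33 reviews, which is where the "closer to 30" figure comes from; 20 leaves some slack.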

Diary

Was supposed to be daily, but really lost it over the last week (due to elections / volunteering for the elections). I have not been putting them online because some of it might be kind of private / boring.

Pic for sense of scale. Sorry for the loading circle, screen-capped this from a video. Purposefully blurry so my embarrassingly misspelled diary isn’t there. That’s my dad’s hand as he helps fix my diary.

Reading a Novel

It has been my dream to read a full Chinese novel. Although this seems unlikely anytime soon, I did find a copy of MDZS which is the novel version of a TV drama I watched (and also read the wiki for). (Note: It’s immensely popular ‘cos the guys are cute, and there’s a lot of behind-the-scenes content to fan over, but to be honest the plot is questionable). Having watched the drama makes it SO much easier to follow what is going on, since I’ve had 50 hours to have the character names drilled into my head (each of the 30 characters has 3 different names or something…).

Other useful bits:

  • Pleco has specific “Reader” settings: Turn on “Paginate Text” – otherwise it felt like floating in a sea of characters.
  • Also, you can set the (Day) background color to a much less offensive light yellow. And a better font was INCREDIBLY important!!
  • I spent $3 on a quality Chinese font for my Samsung phone (buying fonts is part of the built-in Samsung store) called “方正萤雪体”. Somehow, it makes reading large amounts of text SO much more bearable than the default squareish font.
  • Then I used it in Pleco reader (had to set Fonts > Built in Fonts) to get Pleco to use the new font.
Compare the $3 font vs the default font… The expensive font is so much more readable for me! Definitely worth it.

I also got it to work in Anki, though I won’t go through how I got it to work.

Immersion

I also switched fonts on Ubuntu to use the UKai font. And also installed SunPinYin IME which is soooo much better than whatever Ubuntu IME I was using before (sudo apt install ibus-sunpinyin, I think). It has a much better “autocomplete” which is critical for typing Chinese — before, the IME was so bad, that it was faster for me to type Chinese on my phone!!

I’ve started occasionally toggling my cellphone to be in full Chinese for the settings. I also set my Switch (which I got recently — story for another time) game to be in Chinese settings.

Switch: Ring Fit Adventure

Google Lens helps a lot with that. I can point my cellphone at the screen, take a picture, and copy the text out using Google Lens. Then I put it into Pleco. Much faster than writing it all out.

Podcast?

I’ve still not found a good Chinese podcast that hits my interest points. So far I just occasionally listen to a radio broadcast of news from some city in New Zealand, Waikato Chinese Voices.

Conclusion

I definitely have more patience for reading Chinese – I can get through a paragraph or two of a newspaper article by myself, and the whole article with Pleco helping. But it would be nice to be able to quantify that, or have a goal to work towards (like … learning 5000 characters, but I am apparently so far from that, that I don’t know that I will be motivated to just brute force learn characters). I just can’t help but feel some more targeted learning would help me progress faster. But for now, I’m just happy I’ve been able to be fairly consistent about actively learning Chinese for over a month now.

Whew this blog post turned out WAY longer and took way more time than I anticipated. Happy 2021 everyone!


Appendix Anki

Imagine that you notice you’re hitting Easy all the time. You’re seeing cards again too soon.
Now you could just have patience and know that if you keep on hitting Easy, the intervals will grow, and eventually you’ll end up with appropriate intervals. But a better option is to increase the interval modifier from 100% to maybe 150%.

https://eshapard.github.io/anki/target-an-80-90-percent-success-rate-in-anki.html

https://andrewzah.com/posts/2019/better-anki-usage-guide/
https://faqs.ankiweb.net/the-anki-2.1-scheduler.html

IIRC the Anki defaults seem to be geared more for short term, dense studying (like for tests). In those situations you would want a lapse to reset the interval to 0.
For longer term learning, you ideally want an 80-90% recall rate.

https://news.ycombinator.com/item?id=22386479
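
To get a feel for what these settings do, here is a rough Python sketch. The first function uses the formula from the Anki manual for picking an interval modifier, log(desired retention) / log(current retention). The other two are my own heavily simplified model of the scheduler: they assume a fixed ease of 2.5, always answering “Good”, and a 1-day minimum interval, and they ignore fuzz, ease changes, and the other answer buttons. All function names here are made up for illustration.

```python
import math


def interval_modifier(current_retention, desired_retention, current_modifier=1.0):
    """Interval modifier that moves retention toward the desired rate.

    Formula as given in the Anki manual:
    log(desired retention) / log(current retention), times the current modifier.
    """
    return current_modifier * math.log(desired_retention) / math.log(current_retention)


def reviews_until(target_days, modifier, ease=2.5, first_interval=1.0):
    """Count successful reviews needed before the interval reaches target_days.

    Simplified assumption: each review multiplies the interval by ease * modifier.
    """
    interval, reviews = first_interval, 0
    while interval < target_days:
        interval *= ease * modifier
        reviews += 1
    return reviews


def interval_after_lapse(interval_days, new_interval=0.7, minimum_days=1.0):
    """Interval after forgetting a card, given the Lapses > New Interval fraction."""
    return max(minimum_days, interval_days * new_interval)


# Remembering 95% of cards but aiming for 85% suggests a much larger modifier.
print(round(interval_modifier(0.95, 0.85), 2))

# Reviews needed to push a card's interval past one year, at 100% vs 150%.
print(reviews_until(365, 1.0), reviews_until(365, 1.5))

# A forgotten card with a 30-day interval: 70% setting vs the 0% default.
print(interval_after_lapse(30, 0.7), interval_after_lapse(30, 0.0))
```

In this toy model, going from 100% to 150% drops the number of reviews needed to reach a one-year interval from 7 to 5, and a lapse at 70% keeps a 30-day card at 21 days instead of sending it back to 1 day.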

https://massimmersionapproach.com/table-of-contents/anki/low-key-anki/low-key-anki-summary-and-installation/

4 thoughts on “Estimating Known Characters/Words (Measuring Chinese Reading Proficiency)”

  1. Hi there, I’m the creator of HSKlevel. Thanks for the reviews of the test 🙂 I’ve added the characters count in the results section of the test, and made significant improvements on the estimation algorithm, maybe you’ll want to take it again! Hope you’ve progressed since you wrote this article and good luck with your Chinese learning 🙂

    1. Wow, neat re: changes to HSKlevel! And thanks for the reminder, I’ll take the tests again sometime soon.

  2. Hey, thanks for the link to my hanzi test! FYI because Heroku ended their free tier I updated the URL to hanzitest.ericjiang.com – the old URL will likely stop working at some point so it’d be great if you could update your link. Hope you made some good progress on your Chinese since then!

    1. Thanks for the update! Sorry that I just saw this so many months later, I’ll update the post. And yes, I’ve been diligently “studying” Chinese by watching rom-coms and playing tears of the kingdom in Chinese.
