Estimating Known Characters/Words (Measuring Chinese Reading Proficiency)

In short, there are a few nice websites:

  1. https://hanzitest.herokuapp.com/
    27 July 2023 update: Now at hanzitest.ericjiang.com
  2. https://www.arealme.com/how-many-chinese-characters-do-i-know/cn/
  3. http://hanzishan.com/
  4. https://www.hsklevel.com/
  5. https://wordswing.com/how-many-characters-do-you-know/lets_see

So I’ve been working hard at improving my Chinese while I’m at home and my parents have the patience and skills to help me out on a daily basis. I remember going to visit my relatives in China and not being able to express more than that I was still in grad school and also cared about the economy (or something vague like that). It was really embarrassing, as it forced me to realize how much I relied on the English part of Chinglish with my parents, and made me really wish I could express myself better.
(For context, I’m an ABC – American Born Chinese – and learned Chinese through Saturday school, which was organized by local Chinese parents at a nearby Church that had lent us some space. OMG and they had an ice cream vending machine so tasty).

Anyway, depressingly, I probably only know about 1000-1500 characters :'( I guess the nice goal point would be 5,000 characters… I wonder how many I knew when I took my HSK test like a decade ago. Only one of them attempted to estimate words known (vs characters).

Current Status

Anyway for my reference (aka I will redo these estimators again in a year and hopefully see that I get more correct)

https://hanzitest.herokuapp.com/
https://www.arealme.com/how-many-chinese-characters-do-i-know/cn/ — This one was impossible. Clearly aimed at native speakers (well, the UI being in Chinese was a strong sign lol).

The Chinese at the bottom says “汉语局外人 秩序白银”, or in Pinyin “Hànyǔ júwàirén zhìxù báiyín” = “Chinese Outsider, Order Silver”. Taking this test was definitely An Experience, highly recommend… they have devilish multiple choice questions like, given four (handwritten) characters, “which of these is NOT a real character” 0:

https://www.hsklevel.com/ I guess I was a little too kind to myself grading. But note that they are saying words here, not characters.
https://wordswing.com/how-many-characters-do-you-know/

Speaking of Wordswing, besides this character test, it has a fun new way to learn Chinese: Interactive Fiction! Love the idea. Specifically, there’s WordSwing Chinese (Website). It’s just a normal interactive fiction interface, but with a built-in popover Chinese dictionary. Very cool!

Back to WordSwing’s character test, if you click on “stats”:

https://wordswing.com/how-many-characters-do-you-know/stats

Apps

If you use “Dong Chinese”, it will also track your progress on reading and writing characters.

Dong Chinese app

Reading Practice

Pleco Screen Reader + NYTimes Chinese is a great combination!!
(Note: I did pay for Pleco so I’m not sure if this “Screen Reader” tool is a default free feature)

1. Open the NYTimes Chinese App (I selected Dual English/Chinese mode).
2. Select the “Screen Reader” Pleco tool.
3. It pops up all the detected words as “selectable” text.
4. Press a word and get the Pleco dictionary result!

Anki

Anki has been eating like an hour a day to go over my cards (this is because I’m learning writing, so I have to physically write out the word!), + 3-4 hours once a week to pick out my misspelled words, add pinyin, and create cards… Finally today I got around to adjusting settings. This is unsustainable – I want closer to 20 mins a day (so 15 mins by the estimated time).

New Anki settings to take less time:
New cards: 3 (instead of 20)
Max reviews per day: 20 (instead of 200)
Interval modifier: 150% (instead of 100%)
Lapses > New Interval: 70% (instead of 0%)
Tools > Preferences > Scheduling: New cards after Reviews (not sure default)
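
For a rough sense of what the interval modifier and the lapse setting above actually do, here is a minimal Python sketch. It assumes a simplified model of Anki's scheduler: answering "Good" multiplies the interval by the card's ease and by the interval modifier, and a lapse keeps the "New Interval" percentage of the old interval. Fuzz, learning steps, and ease changes are ignored, and the function names and the fixed 2.5 ease are illustrative assumptions, not Anki's real code.

```python
def next_interval_good(interval_days, interval_modifier_pct=100, ease=2.5):
    """Simplified next interval (in days) after answering 'Good'."""
    scaled = int(interval_days * ease * interval_modifier_pct / 100)
    # Assumption: the interval always grows by at least one day.
    return max(interval_days + 1, scaled)


def interval_after_lapse(interval_days, new_interval_pct=0, minimum_days=1):
    """Simplified interval (in days) after forgetting a card."""
    return max(minimum_days, interval_days * new_interval_pct // 100)


def good_schedule(start_days, reviews, interval_modifier_pct):
    """Intervals seen if every review is answered 'Good'."""
    intervals = [start_days]
    for _ in range(reviews):
        intervals.append(next_interval_good(intervals[-1], interval_modifier_pct))
    return intervals


if __name__ == "__main__":
    print(good_schedule(1, 4, 100))      # [1, 2, 5, 12, 30]
    print(good_schedule(1, 4, 150))      # [1, 3, 11, 41, 153]
    print(interval_after_lapse(30, 0))   # 1  (lapse resets the card)
    print(interval_after_lapse(30, 70))  # 21 (lapse keeps 70% of the interval)
```

Under these assumptions, a 150% modifier pushes the same card out much further after a few successful reviews, and a 70% lapse setting keeps a forgotten card from starting over at one day. Both mean fewer reviews per day, at the cost of forgetting a bit more.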

Compare to today, where I spent 45 minutes, studied 200 cards (out of 366 total cards). Of those 200, I reviewed 100 cards, and learned 60 cards (? I guess this is different than new cards, which was limited to 20), and relearned 39 cards.

So to cut the time to 1/3 (45 -> 15 mins), I need closer to 30 review cards. I’ll be conservative for now and put it to 20 reviews. Save some time that I lost from all this over the next few days (probably use it all on making cards lol, but hopefully this will be more long-term manageable and I can spend more time on stuff I enjoy like reading).
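
The back-of-envelope math behind that "closer to 30" can be sketched in Python. This is only a rough estimate: it assumes every card takes about the same amount of time and that reviews stay the same share of a session, neither of which is really true. The function names are made up for illustration.

```python
def cards_for_budget(minutes_spent, cards_studied, target_minutes):
    """How many cards fit in target_minutes at the same pace (rounded down)."""
    return target_minutes * cards_studied // minutes_spent


def reviews_for_budget(total_cards_in_budget, reviews_done, cards_studied):
    """Scale the review count by the same share reviews had in the session."""
    return total_cards_in_budget * reviews_done // cards_studied


if __name__ == "__main__":
    # Numbers from the session above: 45 minutes, 200 cards, 100 of them reviews.
    total = cards_for_budget(minutes_spent=45, cards_studied=200, target_minutes=15)
    reviews = reviews_for_budget(total, reviews_done=100, cards_studied=200)
    print(total)    # 66 cards in 15 minutes
    print(reviews)  # 33 of them reviews
```

So a 15-minute session at the same pace is about 66 cards, roughly 33 of them reviews, which is why a cap of 20 reviews is on the conservative side.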

Diary

Was supposed to be daily, but really lost it over the last week (due to elections / volunteering for the elections). I have not been putting them online because some of it might be kind of private / boring.

Pic for sense of scale. Sorry for the loading circle, screen-capped this from a video. Purposefully blurry so my embarrassingly misspelled diary isn’t there. That’s my dad’s hand as he helps fix my diary.

Reading a Novel

It has been my dream to read a full Chinese novel. Although this seems unlikely anytime soon, I did find a copy of MDZS which is the novel version of a TV drama I watched (and also read the wiki for). (Note: It’s immensely popular ‘cos the guys are cute, and there’s a lot of behind-the-scenes content to fan over, but to be honest the plot is questionable). Having watched the drama makes it SO much easier to follow what is going on, since I’ve had 50 hours to have the character names drilled into my head (each of the 30 characters has 3 different names or something…).

Other useful bits:

  • Pleco has specific “Reader” settings: Turn on “Paginate Text” – otherwise it felt like floating in a sea of characters.
  • Also, you can set (Day) background color to be a lot less offensive light yellow color. And a better font was INCREDIBLY important!!
  • I spent $3 on a quality Chinese font for my Samsung phone (buying fonts is part of the built-in Samsung store) called “方正萤雪体”. Somehow, it makes reading large amounts of text SO much more bearable than the default squareish font.
  • Then I used it in Pleco reader (had to set Fonts > Built in Fonts) to get Pleco to use the new font.
Compare the $3 font vs the default font… The expensive font is so much more readable for me! Definitely worth it.

I also got it to work in Anki, though I won’t go through how I got it to work.

Immersion

I also switched fonts on Ubuntu to use the UKai font. And also installed SunPinYin IME which is soooo much better than whatever Ubuntu IME I was using before (sudo apt install ibus-sunpinyin, I think). It has a much better “autocomplete” which is critical for typing Chinese — before, the IME was so bad, that it was faster for me to type Chinese on my phone!!

I’ve started occasionally toggling my cellphone to be in full Chinese for the settings. I also set my Switch (which I got recently — story for another time) game to be in Chinese settings.

Switch: Ring Fit Adventure

Google Lens helps a lot with that. I can point my cellphone at the screen, take a picture, and copy the text out using Google Lens. Then put it into Pleco. Much faster than writing it all out.

Podcast?

I’ve still not found a good Chinese podcast that hits my interest points. So far I just occasionally listen to a radio broadcast of news from some city in New Zealand, Waikato Chinese Voices.

Conclusion

I definitely have more patience for reading Chinese – I can get through a paragraph or two of a newspaper article by myself, and the whole article with Pleco helping. But it would be nice to be able to quantify that, or have a goal to work towards (like … learning 5000 characters, but I am apparently so far from that, that I don’t know that I will be motivated to just brute force learn characters). I just can’t help but feel some more targeted learning would help me progress faster. But for now, I’m just happy I’ve been able to be fairly consistent about actively learning Chinese for over a month now.

Whew this blog post turned out WAY longer and took way more time than I anticipated. Happy 2021 everyone!


Appendix Anki

Imagine that you notice you’re hitting Easy all the time. You’re seeing cards again too soon.
Now you could just have patience and know that if you keep on hitting Easy, the intervals will grow, and eventually you’ll end up with appropriate intervals. But a better option is to increase the interval modifier from 100% to maybe 150%.

https://eshapard.github.io/anki/target-an-80-90-percent-success-rate-in-anki.html

https://andrewzah.com/posts/2019/better-anki-usage-guide/
https://faqs.ankiweb.net/the-anki-2.1-scheduler.html

IIRC the Anki defaults seem to be geared more for short term, dense studying (like for tests). In those situations you would want a lapse to reset the interval to 0.
For longer term learning, you ideally want an 80-90% recall rate.
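
As far as I understand it, the usual way to pick an interval modifier for a target recall rate (the formula given in the Anki manual, which the eshapard post above builds on) is log(desired retention) / log(current retention), multiplied by whatever modifier is currently set. A minimal Python sketch, with a made-up function name and an assumed 85% target:

```python
import math


def suggested_interval_modifier(current_retention, desired_retention=0.85,
                                current_modifier=1.0):
    """Suggested interval modifier as a multiplier (1.0 means 100%).

    Retentions are fractions strictly between 0 and 1, e.g. 0.95 for 95% of
    mature cards answered correctly.
    """
    if not (0 < current_retention < 1 and 0 < desired_retention < 1):
        raise ValueError("retention values must be strictly between 0 and 1")
    return current_modifier * math.log(desired_retention) / math.log(current_retention)


if __name__ == "__main__":
    # Hypothetical example: getting 95% right now, aiming for 85%.
    print(round(suggested_interval_modifier(0.95, 0.85), 2))
```

If the current recall rate is higher than the target, the result is above 1.0 (stretch the intervals); if it is lower, the result drops below 1.0. The current retention number would come from Anki's statistics screen.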

https://news.ycombinator.com/item?id=22386479

https://massimmersionapproach.com/table-of-contents/anki/low-key-anki/low-key-anki-summary-and-installation/

Well so much for productivity, nearly 4,000 Americans died yesterday (Pandemic Diary #31)

Elections: Hot potato we fricking did it!!! Turned Georgia blue, swept BOTH seats! Canvassed 49 doors (for no good reason TBH but I figured better than sitting at home, got some exercise and got to see more parts of GA) on Tuesday. Not much mention of AAPI in the press but that’s okay. Watched results on WSB-TV / C-SPAN.

Met someone who took leave like me – but for an analytics position lol. Ain’t no one asked me to deal with the pandemic, elections like this. Hardly making any progress on trafficking research. Honestly not too proud of myself right now.

Still think a lot more can be done in a decentralized manner, but hard to argue with results. (e.g. competition to have voter registration drives in every high school – though I think that’s moot since you’re registered by default when you get your driver’s license now in GA! Which is pretty darn cool.) (But another goal would be more like “adopt a neighborhood” rather than this turf-cutting. Just have people you talk to every year. Instead of once an election. Feel cut-off from other volunteers. Maybe exacerbated by pandemic. No watch party to see first Wasserman, then AP, etc. call the election for Warnock and Ossoff!!)

Capitol riots insurrection: Frickin surreal to see. Heard rumors but dismissed them. Members of congress in plastic gas masks crouched under low balconies.

Todo: Write letters

  • support 25th or impeachment moveon.org/removetrump
  • ask that all insurrectionists be arrested, at the very least have this on their record in some form. there should be thousands of arrests… cannot allow people to think they have this amount of entitlement and privilege!!
  • thank secretary of state of GA Raff. and Sterling
  • thank local election officials / staff
  • congrats to bordeaux, ossoff, warnock!!

Do research?! Write cat book? T__T I really need to get off news sites.

So glad not to have to listen to Trump’s ranting on Twitter while elector confirmation finished. Watching on https://www.pbs.org/newshour/ . Real world consequences of disinformation…

Too much reddit, 538, twitter. Got to get my own work done. Finish projects so I feel productive again. 2021 off to a shaky start in productivity.

Thoughts: on police: and privilege:

As my friend put it, going to sleep with lack of shooting-ness: Good
Waking up to lack of arrests still: Bad
https://fivethirtyeight.com/features/the-polices-tepid-response-to-the-capitol-breach-wasnt-an-aberration/ 

Was hoping lack of instant arrests vs BLM protests was 1. Trump denying the use of National Guard 2. Pentagon or whomever learning from BLM protests what it means to de-escalate.

But lack of arrests… lack of people on no-fly lists… And then;

“Protesters in Kansas entered the state Capitol building, said Tom Day, the director of legislative administrative services, but they were allowed to stay and remained peaceful as of the late afternoon.”


https://www.nbcnews.com/news/us-news/protesters-gather-outside-state-capitols-nationwide-chaos-sweeps-congress-n1253125

What in the everloving f*ksticks? I’m ashamed. Real ashamed. Ain’t no pretending it’s “bad apples” now hopefully. The contrast is just too severe. I am glad to come to my senses on BLM this year.

(Hope to do some primary research on China, HK and Uighur this weekend. Form some opinions finally, not just feelings)

Those who are peddling lies:

https://www.nytimes.com/interactive/2021/01/07/us/elections/electoral-college-biden-objectors.html

Looks pretty white.

What draws people to power and to try to keep power like this? Why did they decide to become politicians? It’s certainly not a pleasant job. Bunch of 80 year olds up until 4AM holed up with Covid. (maybe they got vaccines already)

In the meantime, COVID, COVID.
Me: Nearly 4,000 Americans died yesterday
Parents: What? No, the world would be in uproar over that, we would’ve heard! No it was four people.
Me: … of COVID
*crickets*

Deadliest day of the pandemic so far.

GEORGIA. SENATE. ELECTIONS. Finally!!! The end (of 2020 election season) is nigh! (Pandemic Diary #30)

FRICKIN FINALLY

I CAN GET SOME WORK DONE

Okay okay looking okay so far oh god so nervous

Are we going to be able to take action on climate change? COVID? Stimulus checks? So hopeful

Went canvassing today which was 100% useless but at least I got some exercise vs. lying in bed checking twitter, I guess

ahhhh GA elections ahhhhhhhhhhhhhhhhhh

Okay I might go check out some zoom watch parties now? for a little bit? Then hopefully get some work done

22:16

Ugh 538 last 2 or 3 days trending upward for dems, widening lead from 0.5 to 1.5 pts or something like that. But hearing of split Perdue/Warnock voters made me nervous. And after 2016 trust no polls. And not sure if Trump recorded call would distract people from GOTV. And not sure if Trump’s base would actually not vote despite all the speculation. But right now looking promising on nytimes needle??? and half an hour ago someone already called the race for Warnock

https://www.nytimes.com/interactive/2021/01/05/us/elections/forecast-georgia-senate-runoff.html

22:30 Warnock +1.8, Ossoff +1.0

From twitter, Cohn: “A lot of GOP vote trickling in slowly over the last half hour but it’s mostly been a hair better for Democrats than expected. Warnock win probability now over 90%, and do remember this is accounting for the possibility of some unlikely kinds of errors”

–> omg remember at the end of general (well like after we knew there would be runoffs) people were like eh it’s probably republican, and washpost had those shaded red for forever

Lol so much canvassing and I forgot to sign up for the virtual watch parties beforehand and now signups are closed and I have no zoom link T__T cannot meet other volunteers oh well I had dreams

Also I was madly trying to figure out what was wrong with my computer (nojs?) turns out wsb-tv coverage is just offline until 11pm (https://www.wsbtv.com/) (https://www.c-span.org/video/?507707-1/wsb-tv-atlanta-georgia-senate-election-night-coverage)

“If the Dems exceed Biden’s margins in blue counties, that seems to support Stacey Abrams’s argument that the key is not persuading swing voters, it’s getting left-leaning voters to the polls.” Shaila Dewan 4m ago

“Dave Wasserman @Redistrict
Although a lot of TV chatter right now is focused on the big prizes (DeKalb, Cobb, Gwinnett, etc.), lot of it is overlooking what tipped the Warnock/Loeffler race in favor of a call, in my view: Rockdale Co. #GASEN 10:06 PM · Jan 5, 2021 “

“Greg Bluestein @bluestein 28m
Rockdale is among the Democratic strongholds where the #gasen candidates are improving upon Biden’s November margins. #gapol”

Washpost:
9:35PM Democrats make gains among Hispanic and Black voters, early exit polls find
9:32PM In Fulton County, more people voted in person Tuesday than on Election Day in November
9:11PM What historic early voting totals tell us about Georgia runoffs

https://www.ajc.com/politics/live-what-the-ajc-is-watching-during-todays-georgia-runoffs/TJFSEFBCDRCXJA4RWRIHRQKOBU/:

“Republicans currently have the lead in the tabulated vote, but the outstanding votes are mainly from Democratic strongholds. Results are pouring in much faster than they were in November because of the shorter ballot and a new requirement that county election officials process absentee ballots ahead of time. Counting will still stretch into Wednesday for some counties”

Nate Cohn @Nate_Cohn · Ossoff up to a 92% chance to win, according to our estimates. Warnock is on track for victory with a greater than 95 percent chance to win, according to our estimates. The fundamental GOP problem at this point: the Republican vote is basically exhausted. Ossoff favored to win what’s left by 36 points (he needs to win by 26 to win)

23:38: WSBTV back on at 11pm, watched for 10 mins. Was good to catch some actual live speeches from the candidates. I think RBG was more motivating to me than specific campaign promises to be honest.

omg > Decision Desk HQ Projects @ReverendWarnock (D) has won the Georgia Special Senate Runoff Election Race Called: 11:13PM EST 01.05.21 All Results: https://results.decisiondeskhq.com

23:44 WSBTV: gabriel sterling: knew it would be faster (smaller pages). Preparing for tomorrow. More than expected (over 1 million votes today!!!! wow I thought expectation was half a million)

23:46 be kind to those on both sides, there will be another election in 2 years. also something about fulton county, vehicles blocking absentee ballot delivery o__o wat

Now the question is margin… and I actually have no idea, vs the general where I had some idea for some reason. Don’t remember. I expected 6k margin, back when GA was still under, and ended up 12k.

“Burton said DeKalb had more people voting in person today than the number of people who voted in person in November. Even so, he stressed “this is working seamlessly and it’s working the way it’s intended to work.”” word! https://www.ajc.com/politics/live-what-the-ajc-is-watching-during-todays-georgia-runoffs/TJFSEFBCDRCXJA4RWRIHRQKOBU/

00:01 AWW YEA at 11:53PM:

Dave Wasserman@Redistrict·17m

Fact: Whitfield Co., where Trump held his pre-election rally, turned out at just 86.1% of November levels. The state as a whole is on track to exceed 89% of November levels.

Dave Wasserman@Redistrict·17m