Towards a System for Learning to Read Japanese

Learning Japanese through reading practice can be very frustrating (probably a familiar theme by now). Instead of simply learning new vocabulary or new characters, you’re often trying to learn all of these at once: new characters; the radicals and sub-component characters that make up the character you’re studying (to help you understand how its components combine to give it its meaning); the 4 or 5 or 10 pronunciations the character might have depending on context; new vocabulary and compounds that use the character; and maybe even some unfamiliar grammar points or nuances in the phrase you’re trying to study. I often feel that attempting to learn vocabulary and characters by reading through Japanese text is an exercise in sheer force of will: pushing through and tackling all these items simultaneously just to learn enough about one character in one word, then starting all over again for the next tiny fraction of a phrase I’m trying both to comprehend and to learn for future benefit.

I’ve used a variety of techniques to help me expand my repertoire of kanji, and none of them is a panacea. I’ll spend some time trying to read through grade-school books. This has the advantage of a much lower density of kanji to learn at one time, but the disadvantage that kanji-derived vocabulary is actually much quicker to learn if I understand it from its component kanji, which gives me insight into why a word is pronounced the way it is and helps me associate those sounds with the word’s meaning. So I’ll get into some heavier reading, in order to focus more on learning kanji than on learning vocabulary, but of course I run into the problems I described in the paragraph above. Then I’ll turn to a Japanese reader for learners, which provides the vocab and character readings right there (but of course only the readings for what I happen to be reading at that moment). Then I’ll try learning kanji through means other than reading them in a sentence; for instance, going through a list of kanji in order of frequency of use. That way, each new character I learn will be applicable in a maximum of reading situations: I’ll get the biggest bang for my learning buck. Or I’ll spend some time learning more about characters that are also components of other, more complex characters; they may not be especially frequently-used characters, but knowing them will enable me to better understand and learn the more frequently-used characters.

But I get to feeling like I’m pushing through each of these methods by sheer force of will, not learning fluidly, easily, or naturally: fighting tooth-and-nail for every inch of gained territory, banging my head against the same wall until I get a headache, then pounding on it with my fists until the skin of my hands is too raw. The problem with all these methods is that I’ll learn what I need to know for that specific situation, and maybe do some related studying for my future understanding, but I won’t get to practice that knowledge, so I stand a pretty solid chance of losing it.

What I really liked about Miller’s A Japanese Reader (see my previous article) is that it starts from zero, and builds up to a fairly complete repertoire of characters. This is great, because I’m not so much jumping into a sea of characters, trying to grab each fish with my hands (while struggling to keep the fish I already had), as sitting with a net by a stream, nabbing each fish as it comes past me one by one, and tossing it in with the others. However, the text moves too fast, and doesn’t require enough repeat encounters with the same information to keep me from losing what I’ve learned so far (if I stick to the net-and-stream analogy, then there’s a gaping tear in my net). Also, it’s all old texts so it’s significantly out-of-date (some of the fish aren’t particularly edible?).

The Reading Japanese book by Eleanor Jorden (again, see the article) does a much better job of letting you become familiar with each new character before you have to deal with the next. It frequently seems to bring back a previous character at just the right time. Even with the kana (syllabic) characters, it introduces only as many as are necessary to start forming words. It’s a pretty ideal way to learn; unfortunately, it doesn’t teach as many characters as it might (425), and it too is fairly out-of-date in some of its information. Both Reading Japanese and A Japanese Reader also depend on other books to teach the grammar they use, and are designed to be used in lock-step with other materials that are no longer in print (even though these two books themselves still are).

There are some things I really wish had been explained to me earlier on in my Japanese learning career. One of the earliest things I learned was that the verb “miru” can be used for any of “to see, to watch, to look at”. When I started to learn kanji, I learned that “miru” is written 見る. No one bothered to explain to me that it might be more appropriate to write it as 観る in some situations where “to watch” is meant (such as “I forgot to watch The Daily Show last night”), or as 視る if I mean “to view”. Every time I come across a new character for a verb or an adjective, I have to do some legwork to find out whether there are other characters used for the same (or nearly the same, and homophonic) word in subtly different contexts. But none of the textbooks or learning resources I had taught me to do this; I’ve learned to do so through experience. Worse (much, much worse!), none of the Japanese-to-English dictionaries I’ve encountered even bother to distinguish properly which one to use and when! Usually all the variant writings are just listed alongside each other for the same definition. If there are examples, they may illustrate some of the uses properly, but not in a wide enough range to give the learner a clue. This can be a major problem! It would have been much, much, much better if this had been explained to me from the start, and if every time a new character was introduced for a word that also uses other characters in different situations, some context had been given to make it quite clear when I should and should not choose this particular character to express the word.

For about the last year or so, ideas have been percolating in my head about what an effective system for learning to read Japanese might be. I would love a system that combined the strengths of some of my favorite systems, while avoiding as many of their weaknesses as possible. Learning shouldn’t have to be an impressive display of concentration and determination. Such a system should introduce characters a few at a time, building only on the previous repertoire (unlike typical “reader/collection” texts, which tend to dump all the characters of a normal block of text on you without building on previous studies). It should reiterate previously studied material often enough to keep it fresh in your mind, without moving too quickly. And by the end it should provide a large enough repertoire of characters that the student can recognize and understand most of the characters they encounter in other, more natural Japanese texts.

A lot of students are under the impression that they must learn all 1,945 (soon to be more) of the Jōyō Kanji (those required to be taught to children at school from 1st grade through high school), but I’m not convinced that’s the case. Not all kanji are created equal. According to the Kodansha Kanji Learner’s Dictionary, an analysis of data collected from a year’s worth of issues of the Japanese newspaper Asahi Shin’bun shows that understanding just the 500 most frequently-used kanji provides eighty percent coverage of the papers’ text! In other words, roughly one quarter of the nearly two thousand characters in the Jōyō Kanji set accounts for four out of every five kanji in the text! And knowing the top thousand kanji gives you 95% coverage!

Of course, 80% coverage still means that you’re having to struggle with one out of every five characters, which is still a bit too frequent to read comfortably. 95% is a great deal better, but you’re still looking up one out of twenty. Still, it at least provides enough of a foundation that you may only have to look up one or two in each sentence (more on some, maybe none on others—it’s an average, after all). Enough that studying the remaining kanji required to get truly comfortable won’t be such an incredibly daunting task.
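
To make the coverage arithmetic concrete, here is a minimal Python sketch of how figures like these can be computed from a table of kanji counts. The counts are made-up toy numbers, not the Asahi data, and the function names (`count_kanji`, `coverage_of_top`, `kanji_needed`) are my own, chosen for illustration. `count_kanji` also assumes that "kanji" means the basic CJK Unified Ideographs block (U+4E00 to U+9FFF), which ignores the rarer extension blocks.

```python
from collections import Counter


def count_kanji(text):
    """Count kanji in text, assuming 'kanji' means U+4E00..U+9FFF only."""
    return Counter(ch for ch in text if "\u4e00" <= ch <= "\u9fff")


def coverage_of_top(counts, n):
    """Fraction of all kanji occurrences covered by the n most frequent kanji."""
    total = sum(counts.values())
    return sum(c for _, c in counts.most_common(n)) / total


def kanji_needed(counts, target):
    """Smallest number of top-ranked kanji whose cumulative share reaches target."""
    total = sum(counts.values())
    running = 0
    for rank, (_, c) in enumerate(counts.most_common(), start=1):
        running += c
        if running / total >= target:
            return rank
    return len(counts)


# Toy counts (invented for this example): 100 kanji occurrences in total.
toy_counts = Counter({"日": 50, "人": 30, "見": 15, "観": 4, "視": 1})

print(coverage_of_top(toy_counts, 2))  # 0.8: the top two cover 80 of 100
print(kanji_needed(toy_counts, 0.9))   # 3: the third kanji brings us to 95
```

The fraction you still have to look up is simply one minus the coverage, which is where "one out of every five" (80%) and "one out of twenty" (95%) come from.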

Also, keep in mind that any sort of frequency analysis from just one source is liable to be biased. Looking over frequency lists taken from analysis of newspapers, I see characters related to politics, business and government at much higher positions than I imagine they’d appear in, say, literature, or manga. Still, quite a lot of it should correspond to the most important characters from a wider variety of sources, and pretty much the same principles of coverage should apply.

Ideally, characters that are component parts of more complex characters should be introduced before those characters (even if they occur less frequently). We may be wasting time on “lesser” characters where the more complex character will be handier in reading real-world texts, but the less frequently-encountered character could make it easier to understand and remember the more frequently-encountered character, which could make the detour worthwhile.

Of course, if we introduce too many obscure component characters just to provide study keys for the more complex characters, we may be spending too much energy learning them. And the value of learning these component characters may depend a great deal on whether we want to focus on simply recognizing characters on sight, or be able to actually write them out correctly. It takes much greater familiarity with the character’s form and components to write than it does to recognize.

I took a look at a few sort-by-usage-frequency lists, and made some gut decisions about what might be the minimum number of characters I’d need to know in order to feel fairly comfortable reading a book with a dictionary in hand. The first 500 characters from newspaper-based lists include many characters that I recognize to be essential, but there were still quite a lot of essential characters outside that group. I started looking through the list, noting how often I’d see characters that I was familiar with and knew to be pretty crucial. Around the thousand-character mark, the density of important characters seemed to have dropped to a tolerably low level, but I was still seeing a lot of important characters beyond there. I made up a list based on the first thousand most frequently-used characters as identified by Monash University’s KANJIDIC, which used data collected by Alexandre Girardi from four years of Mainichi Shin’bun issues; and hand-picked about eighty additional characters from the next 700 frequently-used characters (positions 1001-1700 in the list) that I knew I’d personally encountered more than once during reading (yeah, I know: not very scientific).
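
The list-building step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `ranked` stands for any list of kanji already sorted from most to least frequent (loading it from KANJIDIC is not shown), `build_study_list` is a name I made up, and the tiny sample data is invented so the example runs on its own. With the real list, the call would use a core of 1000 and a window of 700.

```python
def build_study_list(ranked, core_size, extra_window, hand_picked):
    """Take the first core_size kanji, plus hand-picked ones from the next window.

    ranked       -- kanji sorted from most to least frequent
    core_size    -- how many top-ranked kanji to take unconditionally
    extra_window -- how many ranks past the core to search for extras
    hand_picked  -- set of kanji chosen by hand; those outside the window are ignored
    """
    core = ranked[:core_size]
    window = ranked[core_size:core_size + extra_window]
    # Extras stay in frequency order because we walk the window, not the set.
    extras = [k for k in window if k in hand_picked]
    return core + extras


# Invented sample: a six-kanji "frequency list" with a core of 3 and a window of 2.
sample_ranked = ["日", "人", "見", "観", "視", "聞"]
sample_list = build_study_list(sample_ranked, 3, 2, {"視", "聞"})
print(sample_list)  # ['日', '人', '見', '視']  (聞 falls outside the window)
```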

Alright, so given all this, here are my thoughts on what I currently consider to be an ideal system for improving Japanese reading abilities.

  • Reading exercises should be fun, and not drone on mindlessly (perhaps unlike this article; sorry). They should be interesting, gripping, or at least humorous… anything to make them fun and provide enough motivation to keep learning.
  • It should be as natural as possible. Instead of requiring you to try to come up with sufficiently clever mnemonics and visualizations that will help you finally remember the character (but ultimately fail to provide you with keys for the various pronunciations in various contexts), it will rely on repetition and reinforcement to keep the information fresh in your mind. New concepts should be hammered in at first, and only gradually drift into the general background, being sure to make frequent reappearances.
  • The complete array of a given character’s common pronunciations and uses should be studied together. It doesn’t help much to learn just one portion of a character’s typical uses, if that’s not the use you happen to encounter when you’re reading “real” Japanese texts.
  • Anything that gets in the way of actual reading (where reading includes comprehension, of course) should be avoided as much as possible. Studying vocabulary lists (or flipping back and forth between them and the text) puts a halt to the reading; new words and kanji should be introduced in the text itself, accompanied by an annotation providing the translation for this context… it should become apparent through varied use what the more detailed definition is.
  • The most frequently-encountered kanji should appear before less frequent ones.
  • It should introduce a sufficient set of characters that reading “real” Japanese texts is not hampered by lack of kanji knowledge.
  • Ideally, it would also familiarize the student with characters as they are commonly used for names of people and places.
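
As a rough illustration of the "hammered in at first, then drifting into the background" idea in the list above, here is one possible way to decide in which lessons a newly introduced character should reappear. The doubling gaps and the cap are my own assumptions for the sake of the example, not a tested schedule, and `reappearance_lessons` is a hypothetical name.

```python
def reappearance_lessons(introduced_at, last_lesson, first_gap=1, growth=2, max_gap=10):
    """Lessons in which a character introduced in lesson introduced_at reappears.

    Gaps between appearances start at first_gap and grow by the factor growth,
    but never exceed max_gap, so the character keeps coming back.
    """
    lessons = []
    gap = first_gap
    lesson = introduced_at + gap
    while lesson <= last_lesson:
        lessons.append(lesson)
        gap = min(gap * growth, max_gap)
        lesson += gap
    return lessons


# A character introduced in lesson 1 of a 40-lesson course:
print(reappearance_lessons(1, 40))  # [2, 4, 8, 16, 26, 36]
```

The cap matters: without it the gaps would keep doubling, and older characters would effectively vanish from the later lessons.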

Unfortunately, as far as I know, no system matching these criteria is available. If I want to use a system like this, I’ll have to create it. It would be much, much better if someone with a much, much greater knowledge of the Japanese language would create this, as I’m prone to make plenty of mistakes, teach some things that aren’t true, and especially, write Japanese that would look a little strange to actual Japanese readers. But no one else has done it yet, and I (and I think others) need this, even if it’s far from perfect, so I believe I’ll make an attempt. If I make it as open a process as possible, perhaps I can find collaborators who would be willing to help me edit and polish, and to correct my mistakes and my poor Japanese.

But in order to create such a work in a remotely feasible amount of time, I’ll have to make certain compromises:

  • It will focus almost exclusively on learning new kanji. Kana-only vocabulary, and enhanced knowledge of grammar, will not be priorities, though they will receive light treatment.
  • The focus will be on reading, not writing. Information about stroke order, and differentiating handwritten forms from printed forms, will not be included. Since these days Japanese is more often typed than written by hand, this seems a reasonable priority decision. However, the ability to choose the right character to express a given word is just as important when typing as when writing, so reading exercises will attempt to give students the tools they need in order to make the right choices.
  • Characters that form components of other characters won’t be added to the list of ~1,080 characters I plan to include. However, component characters that are already among the high-frequency characters slated for learning will be moved toward the front, so that they occur before the characters that incorporate them.
  • A certain level of understanding of Japanese grammar and vocabulary will be presumed. This is not for beginning students, but for intermediate students that wish to come to an advanced-intermediate level of kanji knowledge.
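
The reordering rule in the compromises above (components already on the list move ahead of the characters that contain them, and nothing new gets added) can be sketched like this. The `components` mapping is a hand-written stand-in for whatever decomposition data one might use, and `order_with_components_first` is a name invented for this example.

```python
def order_with_components_first(by_frequency, components):
    """Reorder a frequency-sorted kanji list so components precede their compounds.

    by_frequency -- kanji sorted from most to least frequent
    components   -- dict mapping a kanji to the kanji it is built from
    Components that are not in by_frequency are skipped, never added.
    """
    in_list = set(by_frequency)
    placed = set()
    result = []

    def place(kanji):
        if kanji in placed:
            return
        placed.add(kanji)  # mark first so a cyclic mapping cannot recurse forever
        for part in components.get(kanji, ()):
            if part in in_list:
                place(part)
        result.append(kanji)

    for kanji in by_frequency:
        place(kanji)
    return result


# Small hand-made example: 明 is built from 日 and 月; 語 from 言 and 吾.
sample_order = ["明", "日", "語", "月", "言"]
sample_parts = {"明": ["日", "月"], "語": ["言", "吾"]}
print(order_with_components_first(sample_order, sample_parts))
# ['日', '月', '明', '言', '語']  (吾 is not on the list, so it is not added)
```

Characters otherwise keep their frequency order; a component only jumps the queue when something more frequent depends on it.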

I guess we’ll find out what comes of this. I’ll be doing this for my own benefit primarily, because I really, really need for this to exist. But I’ll need someone else to come along and fix it.