Category Archives: Japanese Language

Japanese Mail Archive Character-Set Transcoder

I have no idea if this will be helpful to anyone else, but I’ll throw it out there for the search engines to pick up, just in case. I wrote a Perl script that takes a Sendmail mbox archive of email messages and transcodes the text of all the message bodies, plus the Subject and From headers, to UTF-8. This script may be had here. It reads the archive on standard input, and writes the transcoded archive to standard output.

I’m subscribed to a few Japanese-language mailing lists (well, more accurately, one Japanese-language mailing list, a daily mail of the Slashdot Japan headlines, and Google Alerts for wget and tmux). The idea for this was for me to get Japanese practice by reading regular Japanese content on subjects I’m interested in.

The problem is, I just don’t read it when it comes in. The best time for me to practice Japanese is on the train, during my work commute. Which is why I have a Kindle 3, so I can browse Japanese websites on the train. So, I need these mails web-accessible. No problem, I can use a mail web archive tool, like hypermail.

But hypermail doesn’t like dealing with mbox files whose messages are in various incompatible encodings; some of my mail arrives in UTF-8 (Unicode), and some in ISO-2022-JP (a popular encoding for Japanese-language text). Hypermail doesn’t deal with encoded characters in the mail headers, and I also didn’t know what character encoding to configure Apache to tell the browser, because it differed from one mail to the next.

So I wrote this transcoder tool in Perl. It just scans through all the mails, decodes the Subject and From headers (currently leaves the others), and transcodes all ISO-2022-JP (or anything that’s not UTF-8), so all the messages use the same encoding, and I just configure Apache to use UTF-8 for all of them. The best part is that, now that the actual content and the server-specified character encoding agree consistently, I can use online tools like Hiragana Megane to process this web mail archive, and automatically provide pronunciations for words I’m not very familiar with.

The script requires the following Perl modules: MIME::Tools (for parsing email format), Mail::Mbox::MessageParser (for parsing mbox), and Text::Iconv (to handle the transcoding).
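For anyone who wants to experiment with the same idea, here is a minimal sketch in Python. It is not the Perl script described above and does not use the modules listed; it relies only on Python's standard `email` package. The function names, the sample message, and the choice to fall back to UTF-8 when no charset is declared are all assumptions made for illustration. It handles a single, non-multipart message, and an unrecognized charset name would raise an error.

```python
import email
from email.header import Header, decode_header, make_header


def decode_mime_header(raw):
    """Decode RFC 2047 encoded-words (e.g. =?ISO-2022-JP?B?...?=) into a plain str."""
    return str(make_header(decode_header(raw)))


def message_to_utf8(raw_message: bytes) -> dict:
    """Decode Subject/From and transcode a single-part body to UTF-8.

    Multipart messages are not handled here; their body comes back empty.
    """
    msg = email.message_from_bytes(raw_message)
    payload = msg.get_payload(decode=True) or b""   # undoes base64/quoted-printable
    charset = msg.get_content_charset() or "utf-8"  # assumption: default to UTF-8
    text = payload.decode(charset, errors="replace")
    return {
        "subject": decode_mime_header(msg.get("Subject", "")),
        "from": decode_mime_header(msg.get("From", "")),
        "body": text,
        "body_utf8": text.encode("utf-8"),
    }


# Hypothetical sample message, built here only to exercise the functions above.
sample = (
    "From: " + Header("山田", "iso-2022-jp").encode() + " <yamada@example.com>\n"
    "Subject: " + Header("日本語", "iso-2022-jp").encode() + "\n"
    "Content-Type: text/plain; charset=ISO-2022-JP\n"
    "\n"
).encode("ascii") + "こんにちは\n".encode("iso-2022-jp")

result = message_to_utf8(sample)
print(result["subject"])
print(result["from"])
print(result["body"])
```

Looping this over every message in an mbox file (for instance with Python's `mailbox.mbox`) would give the same overall effect: every message ends up in one encoding, so the web server can declare UTF-8 for the whole archive.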

New Japanese Blog

I’ve split off a new blog, www.JapaneseReader.com, from the handful of articles on this blog about learning Japanese; and also added a new post examining readings of the kanji character 「見」 within compounds, and a number of different kanji which can be used to write the verb 「みる」 and when to use them.

The plan is to focus on learning tips, groups of kanji that are commonly used to write essentially the same word (such as in the new article), and learning kanji through reading practice.

Towards a System for Learning to Read Japanese

This article has been moved to www.JapaneseReader.com; post any new comments there.

Learning Japanese through reading practice can be very frustrating (probably a familiar theme by now). Instead of simply learning just new vocabulary or new characters, you’re often trying to learn new characters, new radical/sub-component characters that make up the character you’re studying (in order to help you understand how its components combine to provide the meaning it has), 4 or 5 or 10 possible pronunciations this character might have depending on a variety of contexts, new vocabulary and compounds that use this character, and maybe even some new, previously unfamiliar grammar points or nuances to the phrase you’re trying to study. I often feel that attempting to learn vocabulary and characters by reading through Japanese text is an exercise in sheer force of will: just pushing through and tackling all these items simultaneously, just to learn sufficiently about one character in one word, and having to start it all over again for the next tiny fraction of a phrase I’m trying to both comprehend and learn for future benefit.

I’ve used a variety of techniques to help me expand my repertoire of kanji, and none of them are a panacea. I’ll spend some time trying to read through grade-school books. This has the advantage of having a much lower density of kanji to learn at one time, but a disadvantage in that kanji-derived vocabulary is actually much quicker to learn if I understand them from their component kanji, giving me insight to why it’s pronounced that way, and associating those sounds to the word’s meaning. So I’ll get into some heavier reading, in order to focus more on learning kanji than on learning vocabulary, but of course I run into the problems I described in the paragraph above. Then I’ll turn to a Japanese reader for learners, where it provides the vocab and character readings for me right there (but of course only the readings for what I happen to be reading right then and there). Then I’ll try learning kanji through means other than reading them in a sentence; for instance, going through a list of kanji by order of frequency-of-use. That way, each new character I learn will be applicable in a maximum of reading situations: I’ll get the biggest bang for my learning buck. Or I’ll spend some time learning more about characters that are also components of other, more complex characters; they may not be especially frequently-used characters, but knowing them will enable me to better understand and learn about the more frequently-used characters.

But I get to feeling like I’m studying each of these methods through sheer force of will, and not at all feeling like I’m learning fluidly, easily, or naturally, fighting my way tooth-and-nail for every inch of gained territory, banging my head against the same wall until I get a headache, switching to pounding against it with my fist until the skin of my hands is too raw. The problem with all these methods is, I’ll learn all I need to know about what I need for that specific situation, and maybe some related studying for my future understanding—but I won’t get to practice that knowledge, so I stand a pretty solid chance of losing it.

What I really liked about Miller’s A Japanese Reader (see my previous article) is that it starts from zero, and builds up to a fairly complete repertoire of characters. This is great, because I’m not so much jumping into a sea of characters, trying to grab each fish with my hands (while struggling to keep the fish I already had), as sitting with a net by a stream, nabbing each fish as it comes past me one by one, and tossing it in with the others. However, the text moves too fast, and doesn’t require enough repeat encounters with the same information to keep me from losing what I’ve learned so far (if I stick to the net-and-stream analogy, then there’s a gaping tear in my net). Also, it’s all old texts so it’s significantly out-of-date (some of the fish aren’t particularly edible?).

The Reading Japanese book by Eleanor Jorden (again, see the article) does a much better job of letting you become familiar with each new character before you have to deal with the next. It frequently seems like it brings up some previous character at just the right time. Even with the kana (syllabic) characters, it introduces only as many as are necessary to start using them to form words. It’s a pretty ideal way to learn; unfortunately, it doesn’t teach as many characters as it might (425) and it too is fairly out-of-date in some of its information. Both Reading Japanese and A Japanese Reader suffer from the fact that they depend on other books to teach the grammar they use, and are designed to be used in lock-step with other materials that are no longer in print today (despite the fact that these books themselves are).

There are some things I really wish had been explained to me earlier on in my Japanese learning career. One of the earliest things I learned was that the verb “miru” can be used for any of “to see, to watch, to look at”. When I started to learn kanji, I learned that “miru” is written 見る. No one bothered to explain to me that it might be more appropriate to write as 観る in some situations where “to watch” is meant (such as “I forgot to watch The Daily Show last night”), or as 視る if I mean “to view”. Every time I come across a new character for a verb or an adjective, I have to do some legwork to find out whether there are other characters used for the same (or nearly the same, and homophonic) word for subtly different contexts. But none of the textbooks or learning resources I had taught me to do this; I’ve learned to do so through experience. Worse (much, much worse!) there are no Japanese-to-English dictionaries I’ve ever encountered that even bother to differentiate properly about which one to use and when! Usually all the variant writings are just listed alongside each other for the same definition. If there are examples, they may illustrate some of the uses properly, but not in wide enough range to give the learner a clue. This can be a major problem! It would be much, much, much better if this had been explained to me from the start, and if every time a new character was introduced for words that also used other characters in different situations, some context was given to make it quite clear in which contexts I should and should not choose this particular character to express a word.

For about the last year or so, ideas have been percolating in my head about what an effective system for learning to read Japanese might be. I would love a system that combined the strengths of some of my favorite systems, while avoiding as many of their weaknesses as possible. Learning shouldn’t have to be an impressive display of abilities in concentration and determination. It should introduce characters a few at a time, building only on previous repertoire (unlike typical “reader/collection” texts, which tend to dump the characters of a normal block of text on you, without building on previous studies), and reiterate previously studied material often enough to keep it fresh in your mind, without moving too quickly, and providing a sufficient repertoire of characters before the end to allow the student to be able to recognize and understand most of the characters they encounter in other, more natural Japanese texts.

A lot of students are under the impression that they must learn all 1,945 (soon to be more) of the Jōyō Kanji (those required to be taught to children at school from 1st grade through high school), but I’m not convinced that’s the case. All kanji characters are not created equal. According to the Kodansha Kanji Learner’s Dictionary, from an analysis of data collected from a year’s worth of issues of the Japanese newspaper Asahi Shin’bun, understanding just the 500 most frequently-used kanji provides eighty percent coverage of the papers’ text! One quarter of the nearly two thousand characters in the Jōyō Kanji set provide coverage for four out of five of every kanji in the text! And knowing the top thousand kanji gives you 95% coverage!

Of course, 80% coverage still means that you’re having to struggle with one out of every five characters, which is still a bit too frequent to read comfortably. 95% is a great deal better, but you’re still looking up one out of twenty. Still, it at least provides enough of a foundation that you may only have to look up one or two in each sentence (more on some, maybe none on others—it’s an average, after all). Enough that studying the remaining kanji required to get truly comfortable won’t be such an incredibly daunting task.
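The arithmetic behind those figures can be sketched in a few lines of Python. This is only an illustration: the function names are made up, and the toy counts are invented rather than taken from any newspaper data. Coverage is the share of all character occurrences accounted for by the N most frequent characters, and a coverage of c means roughly one unknown character in every 1/(1 - c).

```python
from collections import Counter


def coverage(counts: Counter, top_n: int) -> float:
    """Fraction of all kanji occurrences accounted for by the top_n most frequent kanji."""
    total = sum(counts.values())
    covered = sum(n for _, n in counts.most_common(top_n))
    return covered / total


def one_unknown_in(cov: float) -> int:
    """Turn a coverage fraction below 1.0 into 'one unknown character in every N'."""
    return round(1 / (1 - cov))


# Made-up counts for illustration only; not real newspaper data.
toy_counts = Counter({"日": 50, "本": 30, "語": 15, "見": 5})

print(coverage(toy_counts, 2))   # share covered by the two most frequent characters
print(one_unknown_in(0.80))      # the 500-kanji coverage figure
print(one_unknown_in(0.95))      # the 1,000-kanji coverage figure
```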

Also, keep in mind that any sort of frequency analysis from just one source is liable to be biased. Looking over frequency lists taken from analysis of newspapers, I see characters related to politics, business and government at much higher positions than I imagine they’d appear in, say, literature, or manga. Still, quite a lot of it should correspond to the most important characters from a wider variety of sources, and pretty much the same principles of coverage should apply.

Ideally, characters that are component parts of more complex characters should be introduced before those characters (even if they occur less frequently). We may be wasting time on “lesser” characters where the more complex character will be handier in reading real-world texts, but the less frequently-encountered character could make it easier to understand and remember the more frequently-encountered character, which could make the detour worthwhile.

Of course, if we introduce too many obscure component characters just to provide study keys for the more complex characters, we may be spending too much energy learning them. And the value of learning these component characters may depend a great deal on whether we want to focus on simply recognizing characters on sight, or be able to actually write them out correctly. It takes much greater familiarity with the character’s form and components to write than it does to recognize.

I took a look at a few sort-by-usage-frequency lists, and made some gut decisions about what might be the minimum number of characters I’d need to know in order to feel fairly comfortable in reading a book with a dictionary in hand. The first 500 characters from newspaper-based lists include a great many characters that I recognize to be essential, but there were still quite a lot of essential characters outside of that group. I started looking through the list, noting how often I’d see characters that I was familiar with and knew to be pretty crucial. Around the thousand-character mark it started feeling like a tolerably low level of important characters, but I was still seeing a lot of important characters beyond there. I made up a list based on the first thousand most frequently-used characters as identified by Monash University’s KANJIDIC, which used data collected by Alexandre Girardi from four years of Mainichi Shin’bun issues; and hand-picked about eighty additional characters from the next 700 frequently-used characters (positions 1001-1700 in the list) that I knew I’d personally encountered more than once during reading (yeah, I know: not very scientific).
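The selection just described (a core of the most frequent characters, plus hand-picked extras from the next band of the ranking) can be sketched in Python as follows. The function name, parameter names, and the tiny ranking are hypothetical; a real run would use a frequency-ranked list of kanji with the sizes 1000 and 1700.

```python
def build_study_list(ranked, core_size, window_end, hand_picked):
    """Take the top core_size characters, then add hand-picked ones from the next band.

    ranked      -- characters sorted from most to least frequent
    core_size   -- how many top characters to include unconditionally
    window_end  -- last rank (1-based, inclusive) from which extras may be picked
    hand_picked -- characters chosen by hand; those outside the band are ignored
    """
    picked = set(hand_picked)
    core = list(ranked[:core_size])
    extras = [ch for ch in ranked[core_size:window_end] if ch in picked]
    return core + extras


# Toy ranking in an invented order, for illustration only.
toy_ranked = list("日一国人年大十二本中")

# Core of 3, extras allowed from ranks 4-7; "中" (rank 10) falls outside the band.
study_list = build_study_list(toy_ranked, 3, 7, {"大", "年", "中"})
print(study_list)
```

Extras keep their frequency order, so the combined list stays sorted from most to least frequent.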

Alright, so given all this, here are my thoughts on what I currently consider to be an ideal system for improving Japanese reading abilities.

Reading exercises should be fun, and not drone on mindlessly (perhaps unlike this article—sorry). They should be interesting, gripping, or at least humorous… anything to make them fun, providing enough motivation to keep learning.
It should be as natural as possible. Instead of requiring you to try to come up with sufficiently clever mnemonics and visualizations that will help you finally remember the character (but ultimately fail to provide you with keys for the various pronunciations in various contexts), it will rely on repetition and reinforcement to keep the information fresh in your mind. New concepts should be hammered in at first, and only gradually drift into the general background, being sure to make frequent reappearances.
The complete array of a given character’s common pronunciations and uses should be studied together. It doesn’t help much to learn just one portion of a character’s typical uses, if that’s not the use you happen to encounter when you’re reading “real” Japanese texts.
Anything that gets in the way of actual reading (where reading includes comprehension, of course) should be avoided as much as possible. Studying vocabulary lists (or flipping back and forth between them and the text) puts a halt to the reading; new words and kanji should be introduced in the text itself, accompanied by an annotation providing the translation for this context… it should become apparent through varied use what the more detailed definition is.
The most frequently-encountered kanji should appear before less frequent ones.
It should introduce a sufficient set of characters that reading “real” Japanese texts is not hampered by lack of kanji knowledge.
Ideally, it would also familiarize the student with characters as they are commonly used for names of people and places.

Unfortunately, as far as I know, no system matching these criteria is available. If I want to use a system like this, I’ll have to create it. It would be much, much better if someone with a much, much greater knowledge of the Japanese language would create this, as I’m prone to make plenty of mistakes, teach some things that aren’t true, and especially, write Japanese that would look a little strange to actual Japanese readers. But no one else has done it yet, and I (and I think others) need this, even if it’s far from perfect, so I believe I’ll make an attempt. If I make it as open a process as possible, perhaps I can find collaborators who would be willing to help me edit and polish, and to correct my mistakes and my poor Japanese.

But in order to create such a work in a remotely feasible amount of time, I’ll have to make certain compromises:

It will focus almost exclusively on learning new kanji. Kana-only vocabulary, and enhanced knowledge of grammar, will not be priorities, though they will receive light treatment.
The focus will be on reading, not writing. Information about stroke order, and differentiating handwritten forms from printed forms, will not be included. Since these days Japanese is more often typed than written by hand, this seems a reasonable priority decision. However, the ability to choose the right character to express a given word is just as important when typing as when writing, so reading exercises will attempt to give students the tools they need in order to make the right choices.
Characters that form components of other characters won’t be added to the list of ~1,080 characters I plan to include. However, such characters that are already among the highest-frequency characters that were slated for learning, will be moved toward the front so that they occur before the characters that incorporate them.
A certain level of understanding of Japanese grammar and vocabulary will be presumed. This is not for beginning students, but for intermediate students that wish to come to an advanced-intermediate level of kanji knowledge.
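The reordering compromise above (no new component characters are added, but components already on the list are moved ahead of the characters that incorporate them) can be sketched in Python as follows. This is only one possible way to do it; the function name and the component map are assumptions, and the decompositions in the toy data are simplified for illustration.

```python
def order_components_first(ranked, components):
    """Reorder a frequency-ranked list so listed components precede their compounds.

    ranked     -- characters sorted from most to least frequent
    components -- maps a character to the characters it is built from
    Components that are not already in ranked are never added.
    """
    in_list = set(ranked)
    placed = set()
    result = []

    def place(ch):
        if ch in placed:
            return
        placed.add(ch)  # mark before recursing so a cyclic map cannot loop forever
        for part in components.get(ch, ()):
            if part in in_list:
                place(part)  # pull the component forward, ahead of ch
        result.append(ch)

    for ch in ranked:
        place(ch)
    return result


# Toy data for illustration; the ranking is invented and decompositions simplified.
toy_ranked = ["語", "言", "見", "口", "親"]
toy_components = {"語": ["言", "口"], "言": ["口"], "親": ["見", "木"]}

ordered = order_components_first(toy_ranked, toy_components)
print(ordered)
```

Characters with no listed components keep their relative frequency order, and a component such as 木 that is not on the list stays off it.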

I guess we’ll find out what comes of this. I’ll be doing this for my own benefit primarily, because I really, really need for this to exist. But I’ll need someone else to come along and fix it.

Resources That Gave Me A Leg Up In Japanese

This article has been moved to www.JapaneseReader.com; post any new comments there.

As I mentioned before (here and here), learning Japanese can be very frustrating. Here are some resources that I found to be better-than-average at helping me find the next footing. Many of them have unique qualities that made them enormously helpful in some way; but nearly all of them also have major shortcomings.

Essential Japanese by Samuel E. Martin. I can’t even begin to describe how enormously helpful this book is. It taught me more in one, literally pocket-sized, 460-page volume than I had previously understood from 18 years of studying Japanese on-and-off through classes at a Japanese Buddhist church, university courses, and various resources I used for self-teaching.

Unfortunately, it’s also incredibly outdated, and much of the vocabulary is specialized for post-World War II military basemen and Christian missionaries, who were the primary target market for this book (probably making up the majority of Americans living in Japan). It’s also out of print; all these things make it somewhat difficult to recommend for general use. It teaches some extremely formal language that has nothing to do with “Essential Japanese” any longer—it’s pretty much only spoken around royalty, and most people aren’t dining with the Emperor’s family. It’s great for understanding super-formal speech in a Samurai movie, maybe, not much else. But on the other hand, you won’t easily find explanations of these speech modes anywhere else, and in addition it explains, in a very straightforward manner, language that really is crucial to understand and that I hadn’t seen addressed in other textbooks I’d used. It is at the same time the most complete and most concise book on Japanese grammar I’ve had the pleasure to know.

It’s worth noting that it does not use any sort of Japanese writing system other than romaji (Latin letters).

Here’s my review of it on Amazon.

Nintendo DS + Dictionary software. The Nintendo DS is, of course, a video game system, but Nintendo and other publishers have actually made available plenty of great software for Japanese language studies on the DS. The dictionary software I use is this one: 漢字そのまま・楽引き辞典 (kanji sono mama/rakubikijiten), which means something like “Kanji As-It’s-Written” Easy-Lookup Dictionary. It’s a Japanese-Japanese and Japanese-English dictionary, intended for Japanese users (the interface is in Japanese), but the great thing about it is that you can just draw in the characters you want to look up. This makes it easy to look up words, even if they’re made up of characters you’ve never seen before! Without a tool like this, it used to take me several minutes just to look up one character; if I was trying to look up a word made up of several characters that weren’t familiar to me, I could spend around 10 minutes on each character (finding the primary radical, and counting the total strokes, perhaps getting the stroke-count wrong, checking the other stroke-count entries…), and then spend a few minutes on the dictionary while I tried each of the several pronunciations that character might have to see which one was being used in this case. Looking a word up by writing it is just loads easier!

The character-recognition is very forgiving, and frequently recognizes the character I’m trying for, even when I’m writing on a bumpy train or shuttle, and can’t really even recognize the character I wrote myself! If it guesses wrong, it gives you the chance to choose from close alternatives. Occasionally I have trouble making it understand the character I wanted; this is most often due to getting the ordering of strokes wrong (most of the time it forgives this as well, but there are times it doesn’t). In general, though, I spend seconds performing a lookup that could’ve taken me twenty minutes!

Unfortunately, the quality of the Japanese-English dictionary entries themselves leaves much to be desired; often, I’ll use the Nintendo DS to identify the word, and then look up the word in a different dictionary in order to actually gain understanding of it. The Japanese-Japanese entries tend to be better, but of course that can have the tendency of sending me looking up other words from the definition… not ideal. Still, it’s well worth the money I spent; I might leave my paper dictionary at home, but I always have my DS with me if I plan to do any Japanese reading.

Reading Japanese by Eleanor Jorden.

Here’s the text of my review on Amazon:

This is a truly excellent resource for learning written Japanese. Great pains were taken to introduce the characters in such a way that they can be used immediately and repeatedly from that point forward. For instance, when beginning with the Katakana characters, rather than teaching the characters in canonical order, it starts with just the two characters “su” and “mi”, and from those teaches you to write “Sue”, “Smith”, “Miss Sue Smith”, etc. It then quickly builds on these, ensuring at each step that the next small set of characters introduces a large array of new things you can immediately learn to write.

Accusations that the material is out-of-date are not wrong (this is the reason I must give the book four stars rather than five). The book was published in 1976! Much of the kanji is used a little differently, or has been replaced in certain uses by other characters. Of course, most of it is still applicable, and when no newer resources come even close to being as effective, you learn you must make do with information that may be out-of-date: better to have slightly-dated but solid knowledge of the most common uses of several hundred kanji than to continue to struggle to learn your first hundred or so.

Note that the author has written a more recently-published set of books, Japanese: The Written Language: Part 1, Volume 1 (Workbook) (Yale Language Series); I have not examined these but I suspect they may correspond to much of the same material, but perhaps more recently-updated. It might be worthwhile to look into those.

This book, Reading Japanese, is intended to be used in conjunction with a companion grammar book, Beginning Japanese: Part 1 (Yale Language Series) (Pt. 1). However, if you are already familiar with basic Japanese grammar, you will probably find that you can do without the companion.

A note on romanization: you should not be scared off by the fact that it uses “si” instead of “shi”, or “hu” instead of “fu”. Many Japanese will romanize similarly, and a serious student of the language will need to become comfortable with systems such as Kun’rei-shiki in addition to the more popular (at least among English speakers) Hepburn romanization system. Recognizing “si” and “shi” as the same phoneme with the same pronunciation will help the student become stronger in the language.

Weighing in at only 425 kanji, this book will clearly not be enough on its own to give you command of the written language; but it provides a very excellent start. Follow it up with something like A Japanese Reader: Graded Lessons for Mastering the Written Language (Tuttle Language Library) (another somewhat-dated but excellent book), which covers a much fuller set.

Reading Japanese With A Smile

Here’s the text from my review on Amazon:

What I really love about this book is it provides a lot of meat in a very small package. The book is both small enough and complete enough, that you can simply grab it on your way out the door, to work on when you’re standing in line, or on the bus, etc. Each story is less than two small pages, so you get your sense of accomplishment quicker. 🙂

On the pages facing the Japanese stories are the English translations; but I don’t find the translations as useful as the sentence-by-sentence deconstruction (which repeats the Japanese, but with furigana) and commentary that follows each story. Each sentence is further decomposed almost word-by-word, and the commentary includes such things as explanations of common idioms, and even points out puns and wordplay.

Since the decompositions provide all the readings for the kanji and explanations of the vocabulary, the book is really all you need to read the passages. You don’t need to grab your kanji and wa-ei dictionaries (though I tend to anyway, in case I want to gain a little more insight).

Because of the furigana, I don’t think strong knowledge of kanji is necessary to enjoy this book (though of course it will make it easier: you may not need to flip to the commentary as often). A working, intermediate knowledge of Japanese grammar, however, is important, as you’re generally assumed to understand various verb forms and sentence patterns.

A Japanese Reader by Roy A. Miller.

You can find this book here on Amazon.

The main drawing feature of this book is that it starts from zero understanding of Japanese writing systems (but not of grammar), and builds up to bring you to a complete repertoire of not only the kana syllabaries, but the full set of the 1,850 tōyō kanji characters (the precursor to the current set of 1,945 jōyō kanji that are required to be considered “literate” in Japanese, which didn’t exist yet when this was originally written).

As you might guess from that last sentence, this book is old. It was first written in 1962, and some of the earlier lessons are intended to be used alongside Martin’s Essential Japanese book at the top of the list; in fact, I originally bought my copy of Essential Japanese for exactly that reason (I had the Reader first).

I admire the aspirations, and the book has been very useful to me, but I’ll admit I’ve never gotten much beyond lesson 30 (of 75). It is very fast-paced, very demanding, and doesn’t really reiterate often enough in my opinion, so there’s a good chance of losing what you’ve gained.

Kanji ABC

Here’s the text of my review on Amazon; hopefully it will explain what I appreciate about it.

This book is a great help for finding tricks to learn kanji characters more quickly. However, I don’t think it’s sufficient on its own to really complete learning these characters.

The book is divided into two parts; Part I is a repertoire of around 250 “graphemes”, kanji “pieces” that are used to build up actual kanji characters, but may not necessarily form characters of their own. If that sounds like the definition of a “radical”, well good: they’re closely related. However, there are various graphemes that are not officially considered radicals, so you might consider the graphemes to be a superset of the radicals.

Each grapheme is associated with an English word or phrase. The book is fairly careful to use different words for very similar meanings, so that you can manage to keep them separate.

Part II is a list of two thousand kanji characters, ordered in such a way as to make full use of the graphemes learned. The kanji are ordered so that the characters only use those graphemes that have already been introduced in the associated group from Part I. Each character is listed along with only its very most common readings (kun and on), and a list of the English words representing the graphemes from which it has been built (which appear in an index at the back of the book).

The book is intended to be used in one of two ways: one way (the way I’ve chosen to use it) is to learn all of the graphemes in Part I (or at least a large number), and then use Part II to look up characters you wish to learn, and see which graphemes it is made up of. Of course, in reality, you wouldn’t normally need to look them up to begin with if you know all the graphemes: you’ll recognize them in the characters themselves.

The other way this book is intended to be used is to systematically learn all the characters of Part II, by learning one group of graphemes, and then studying all the characters from Part II for that same group (which will be ordered appropriately). According to the preface, this is the “ideal” way to use the book. However, I don’t really see that as practically possible without the use of a more detailed kanji dictionary (such as The Kodansha Kanji Learner’s Dictionary). For on readings, you can’t really get a feel for a character without seeing in what compounds it appears, and how; and learning kun readings can be very misleading, since often a single adjective (atatakai) or verb (hajimeru) may be written using multiple alternative kanji, depending on the context and subtle differences in meaning that are intended. Thus, Kanji ABC might be adequate by itself to learn to _read_ the most common cases where these characters appear, but is quite inadequate for learning when to _write_ them.

The nice thing about this book is that it provides just the tools you need to help grasp the components of a given kanji character, and little else. It doesn’t bog you down with _why_ these components have been associated with a given meaning. In the end I think this helps you to learn them more quickly. Other books that may focus more on a character’s etymology (such as A Guide to Remembering Japanese Characters (Tuttle language library) (Japanese Edition)) can be very enlightening, but in the end they tend to just confuse, as the original etymology of the characters can often have little to do with the modern form and meaning. On the other hand, the trade-off is that you often don’t get the “true, original meaning” of a radical or grapheme, just the one that makes it easiest to combine it with other graphemes to learn a kanji.

A Guide to Remembering the Japanese Characters

Here’s the link on Amazon.

I mentioned this book in the review above for Kanji ABC. Basically, this is the book I used to use to accomplish about the same things for which I now use the tricks from Kanji ABC. That doesn’t mean that Kanji ABC is better; they both have their points. Kanji ABC assigns words/concepts to individual “graphemes” without really explaining them; the Guide will dissect a character into its original ideographic components, referring to historical forms and meanings. But often, it will end up by saying “but its current meaning is totally different”, and all that build-up for grasping its history may well turn out to be for nothing as far as actually finding the keys for remembering the character is concerned.

Kanji ABC is pretty much oriented around learning hundreds of “graphemes” and their associated concepts, which afterward can be recognized in actual characters, and using it in the reverse to look up an arbitrary character and then break it into its individual concepts doesn’t really work as effectively (in my opinion). The Guide works a bit better for that, since you don’t have to go looking up the other components it broke into (apart from components which are themselves other characters), and it provides you a workable mnemonic phrase right there in the entry.

Learning Japanese

(Amazon link.) This is a fairly academic, university student-oriented book. I don’t know how it compares on the whole with other Japanese textbooks, since these are the ones I happened to use in University and in my classes as a teen. However, I will say, the one thing I really, really have come to appreciate about it is that it focuses a lot of its energy and book-space on drills, lots and lots of drills. Reading drills, question/response drills, transformation drills (where you start with a word and build or modify a sentence as it introduces new words to add to your “sentence”), etc. It gives you a firm stepping-stool into that crucial skill of thinking in Japanese, rather than translating to and from English. Although I can’t claim to really be doing that, I’m much closer to it than I would otherwise be, thanks to this book (and I’d probably be much closer if I just went back and went through these drills again).

Japanese books for school-age children.

When I want to get into reading Japanese without straining myself over kanji, it helps to find some Japanese material intended for school-age children (who themselves are still learning kanji). I’ve picked up a few books from the Kinokuniya bookstore I’m fortunate to have nearby in San Jose from the series, イッキによめる！ (ikki ni yomeru! “Read in one go!” Amazon.co.jp link), which are collections of Japanese folk tales, with one book for each year of grade school. One nice thing about this is that these stories tend, more than most other books I’ve read, to use a lot of colloquialisms, which is a great way to learn some areas of Japanese you might not otherwise be familiar with.

Why Learning Japanese Can Be Frustrating, Part Two

This article has been moved to www.JapaneseReader.com; post any new comments there.

This is part two of a series on the frustrations of learning the Japanese language. What follows assumes you’ve already read part one (though you should be able to get along without it: it’ll just start somewhat abruptly).

Learning Through Reading Japanese

I believe that my voracious appetite for reading is directly responsible for an enhanced understanding of English; and when you want to master a language, it stands to reason that you should expose yourself to as much reading material written in it as possible. This makes it all the more frustrating that learning Japanese through exposure to its written language is extremely difficult, time-consuming, and tedious.

For instance, suppose you’re confronted with a Japanese sentence such as:

明日図書館へ行きましょうか?

Assuming that you’ve studied your roughly one hundred hiragana characters (half of which are just systematic modifications of the basic 46 forms), you’ll have no problems with the へ and きましょうか bits, and since hiragana is a syllabary, you instantly know how they sound, too, even though they don’t convey much meaning until you know the more complicated-looking characters that remain (the kanji).

But what can you do with the kanji (明日図書館 and 行) if you’ve never encountered them before? It’s not like you can look each up in a dictionary by its name, since you don’t know it. Literally all you know at this point is its shape, and that’s what you end up having to look it up by. Many characters can be broken into separate constituent parts that might be common to other characters, or are possibly complete characters in their own right: for instance, 明 can be broken into 日 and 月, and 館 can be broken into 食 and 官. So you pick one such component (called a “radical”), count the number of brush/pen strokes it would take to write it, find it in the index under that number, and then find the character you’re looking for under that radical (possibly by further looking under the total number of strokes). Once you find the character that matches, flip to the page that describes it.

So, we’ve found the page holding the description of the meaning for (say) 行, and its pronunciation. Except there’s not one pronunciation: there are several. Fortunately, in the case of 行, it’s followed by a string of hiragana which, after even a small amount of training, a student will quickly recognize as the final portion of a verb. The list of possible readings for 行 will include various verbs that start with that character, and an indication of what characters would be expected to follow it when performing its function as a verb. Verbs, whether written with initial kanji or not, will virtually always be followed by a string of hiragana which indicate how the verb is being conjugated, and usually indicate which reading the kanji itself has (if one is already familiar with the character, of course). You’ll determine that in this case it should be read simply as the “i” in “ikimashouka?”, which means “shall we go?”.

The 行 had hiragana to help indicate its reading (when filling this role, they are called “okurigana”). No such luck with the others. If I look up 明, I see 14 possible readings, and 10 different verbs/adjectives it could form part of (depending on the okurigana that follows it, and fortunately most of them have the same reading for 明). I can rule out the ones that require okurigana, because there isn’t any in this case. And the remaining readings can be divided into two groups, one group that is used mainly when the character appears as its own separate word, and another that is used mainly when the character appears in combination with other kanji to form a compound word (but there are no guarantees). Since it appears with other characters, it’s reasonable to assume they form some sort of compound, and fortunately the group of readings more commonly used for compounds comprises just 3 of the 14. Ideally, the compound I’m looking for will appear in the list of common compound words that will appear on that page of my kanji dictionary: then I will have identified not only the correct reading for this character, but for the remaining characters in the word, too. Otherwise, I’ll most likely need to look up the next character the same way I did the first, find its list of possible readings, and try looking up various possible reading combinations between the two characters in my Japanese dictionary (the one for looking up words, and not kanji).

Alas! In this particular case, none of the possible readings of 明 are correct here! 明日 is a relatively special case where two characters join together to form a word which is pronounced based on the characters’ combined meaning, rather than their individual readings. Unless it was listed as one of the example compounds (fortunately, this is likely), I would probably have a very, very hard time discovering that the word is “ashita”, which means “tomorrow”. Fortunately, even without knowing this, I probably would have settled on the alternative reading “myōnichi”, which is a legitimate reading of the characters 明日, and also means “tomorrow”, but combines the actual readings of the characters, making it possible to find using the dictionary approach. It wouldn’t be wrong, and I’d be able to continue reading from that point—even understanding the correct meaning—but “myōnichi” is a very formal word, and I’d sound pretty funny to any Japanese person that would hear me use that word in the middle of a relatively casual sentence. There’s even one more possible reading of 明日 (still with the same meaning): “asu”!

Perhaps you’ve noticed at this point that I’ve now said 明日 is a single word. Of course, it’s a word that appears in a sea of kanji: 明日図書館. How the hell are you supposed to figure out where one word ends and another begins? The answer: brute force. If, when trying to look up what word might be formed from 明 and 日, you see that there’s a word 明日, well then the next kanji might start a new word, so you react accordingly. (The remaining characters turn out to form the word for “library”; the meaning of the full sentence was: “Shall we go to the library tomorrow?”.)

Of course, it’ll suck if it turns out that there’s another word that continues with that kanji… the other day, someone on a Japanese-language IRC channel wrote “今日本語を勉強しています”, which means “I’m studying Japanese now”. But there’s a bit of ambiguity in the start of the sentence: 今 is the word for “now”, and 日本語 is the word for “Japanese language”. But 今日 is a word meaning “today”, and 本 by itself means “book”: “今日本を読みました” would mean “I read a book today.” Notice that the first three characters (kanji) of both sentences are the same, even though they have no words in common (not counting the particle “word” を). You don’t know whether the first couple of words are “today, book” or “now, Japanese” (obviously, the word order is completely different from that of English), until you reach the character 語. If you’d never seen any of these characters before, and were looking each of them up individually, it might take you quite a while to figure out what was written if you first started down the track of 今日 being the first word, and trying to figure out what word might be formed by 本語, only eventually discovering your mistake after an hour or so of banging your head against the table! A well-placed comma will go a long way in such a case, but by now I think you’ve got a very good picture as to why learning Japanese through reading it, has the potential to shorten your lifespan.
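To make the brute-force lookup concrete, here is a minimal Python sketch. It assumes a tiny, hypothetical word list containing only the words mentioned above (a real dictionary holds many thousands of entries and would produce far more candidate splits), and the name `segmentations` is made up for illustration. It simply tries every prefix that is a known word and recurses on whatever is left.

```python
def segmentations(text, lexicon):
    """Return every way to split `text` into words found in `lexicon`."""
    if not text:
        return [[]]  # one way to split nothing: no words at all
    results = []
    for end in range(1, len(text) + 1):
        head = text[:end]
        if head in lexicon:
            # Keep this word only if the remainder can also be split.
            for rest in segmentations(text[end:], lexicon):
                results.append([head] + rest)
    return results


# Toy word list (an assumption): just the words discussed in this article.
LEXICON = {"明日", "図書館", "今", "今日", "日本語", "本"}

print(segmentations("明日図書館", LEXICON))  # [['明日', '図書館']]
print(segmentations("今日本語", LEXICON))    # [['今', '日本語']]
print(segmentations("今日本", LEXICON))      # [['今日', '本']]
```

For 今日本語 the code does start down the 今日 + 本 path, and only gives up when the leftover 語 matches nothing in this toy list, which is the same dead end described above, minus the hour of head-banging.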

As you saw with 明日, “ashita/asu/myōnichi”, a given string of kanji can sometimes have multiple possible readings. This is not particularly uncommon. However, sometimes a string of kanji can not only have multiple readings, but those readings can actually have different meanings! Take 見物, which can be read either as “kenbutsu/to go sightseeing”, or as “mimono/thing worth seeing”. Only the surrounding context can make it clear which is meant. (In practice, “mimono” is the rarer reading, and a writer might choose to write 物 in kana instead, to avoid any ambiguity.)

When I sit down to make an attempt at reading a Japanese book, I’ll often end up with several other books nearby to help me do so. There’s the actual book I’m trying to read, the kanji dictionary for looking up unfamiliar characters, and the vocabulary dictionary. I’ll also often have a secondary kanji book that is less useful for discovering what a character is, but more helpful for providing advice and information on how to remember the character. And, I might even have another book or two to look up points of grammar that I’m having trouble grasping.

As I read a passage, I look up my characters, possibly look them up again for tips on what the individual components might mean, look up unfamiliar words, and if necessary look up an unusual verb or adjective use. I can easily spend twenty minutes, maybe even up to an hour, on a single sentence. These days I try to rely more on electronic dictionaries, which save both time and desk space, but the process remains largely tedious. As I learn more characters, vocabulary and grammar, each additional passage takes me less time, but it can still be an ordeal.

But, after mastering reading and gorging myself on Japanese literature, will I finally have gained an intimate familiarity with the language that I can take and apply directly in using the spoken language in conversation? Well… yes and no. Obviously, Japanese is Japanese, and a lot of phrases I get from books can go straight into conversations too. But, the usual style of written Japanese is still significantly different in tone, level of formality, etc, from spoken Japanese. Nobody speaks the way a typical Japanese book reads, except maybe radio announcers and other people who aren’t actually speaking to a particular person. If you spoke sentences you’d gotten from a book, in many cases you’d sound like you were quoting (which, of course, you would be). Fortunately, reading (modern) novels means that you may pick up some actual conversations in the form of dialog, and manga (Japanese graphic novels) are a popular way of learning the colloquial language—but note that both of these can often depict very casual conversation styles that would be considered quite rude if you employed them in conversation outside of family and close friends.

Wrapping things up

I was going to follow this section up with some ideas about learning Japanese vocabulary and kanji efficiently; but after working on that for a while, I’ve decided that the ideas are still too immature to put here yet. Meanwhile, I think I’ll write an article that talks about my favorite Japanese-learning resources instead, and where they still manage to fall frustratingly short, and what I think the perfect Japanese course might look like. Probably, some of the ideas I had for efficient learning will seep into that article.

Why Learning Japanese Can Be Frustrating

This article has been moved to www.JapaneseReader.com; post any new comments there.

I’ve always been fascinated by the study of foreign languages. This fascination probably sprang from a childhood love of codes and ciphers. The languages I was most fascinated with were those with beautiful and exotic writing systems. If I had encountered Devanāgarī before Japanese, perhaps that would have become the primary object of my dedicated language studies.

I was 11 years old, trying to learn to write in Russian, when my buddy Laban showed me some Japanese characters he’d been learning—the Hiragana (ひらがな). These characters have a simple, graceful beauty to them; from that moment I was hooked.

My buddy and I learned the two sound-based, syllabic writing systems, Hiragana and Katakana (カタカナ), and had good fun doing so. We even incorporated them into the ciphers we’d use to write letters to each other.

The funny thing is, we both learned both writing systems, without actually knowing a lick of Japanese—so we had little to write in it, apart from Japanized English. It didn’t take us long to decide that we really wanted to start learning the language to go with the writing. We both tried starting from library books, but quickly switched to some weekly classes that were being taught on Saturdays at a Japanese Buddhist church. I later took several classes at CSU Sacramento, and beyond that, both of us have been struggling off and on to continue our study of the language over the last twenty years.

Now, obviously, during the twenty years that have passed, there has not been twenty years’ worth of work put in. Nevertheless, I definitely would have hoped to be further along at this point than I am. In particular, it frustrates me that I can’t pick up a book, newspaper or magazine and just start reading along, stopping briefly every so often to look up a word. How that can still be the case after so long is the focus of part two of this article.

I’ve recently entered into another spurt of Japanese study, and I am happily finding that I am making significantly more progress this time than I have in the past. Partly because I’m motivated by the growing realization that I’ve spent so much time on this language, but am still as yet unable to put it to reasonable use: I would like to put enough additional investment into it to avoid all my past effort having been little more than wasted time. And also partly because I’m beginning to understand what sorts of techniques have the best effects for me, and am learning to avoid the sorts of practices that have in the past caused me to burn out, and set aside studying for large spaces of time.

Grammar and Pronunciation

Thankfully, neither Japanese grammar nor its pronunciation is particularly difficult. Japanese pronunciation has sometimes been compared to Spanish (though there are of course some important differences), and the structure of the language is controlled by a set of rules which, though very different from English, are at least consistently followed, so that you are not required to memorize a plethora of exceptions (as you might find necessary if you were learning English as a second language). In fact, if you do not intend to learn anything beyond actually speaking and comprehending spoken Japanese, you might find it to be much easier to learn than some Western languages.

For one thing, it lacks the concept of “gendered nouns” that many Western languages (though not English) have, which requires English-speaking learners to memorize which apparently random words must be prefaced with (e.g.) “la” and “las” instead of “el” and “los”. In fact, Japanese doesn’t even have a word for “the” or “a”.

Japanese also nearly lacks plural forms—that is, Japanese does have a couple different ways to express plurality, but they are generally only used when there is a reason to stress the fact that we’re talking about more than one. The Japanese word “hon” could mean any of “a book”, “the book”, “the books”, “some books”, or “books”.

Some Asian languages, such as Chinese and Thai, distinguish meaning via the use of variously pitched tones. A rising, falling, or stationary high or low pitch can seriously change the meanings of words. Fortunately, Japanese has no such thing—or almost none; pitch is very occasionally crucial for distinguishing one word from another, but it is rare for a textbook to explain these, and virtually unheard-of for a dictionary to include them.

Japanese’s Three Writing Systems

Japanese is written using three distinct writing systems. There are the two syllabaries I’ve already mentioned, and an additional form called “kanji”. The kanji are characters that come directly from Chinese (though there are a few characters that were invented in Japan), and in fact “kanji” means “Chinese characters”. Kanji are used primarily for two things: to write Japanese words that were taken from Chinese, and to represent the meaning behind many Japanese words. The “hiragana” syllabic system is used to fill in the gaps, serving the purpose of allowing Japanese grammar to flow around Chinese ideographs. The “katakana” syllabic system is mainly used to represent foreign words and non-word sounds, and also to provide emphasis.

As an example of how hiragana is used together with the kanji, take the Chinese character for “come”: 来. When used as a verb, especially as the primary verb in a sentence, it won’t usually express the verb “to come” by itself: it needs to be accompanied by some hiragana characters to indicate the tense of the verb, and even how you should pronounce the character 来. If you want to say, “I will come”, you might write “kimasu” 来ます; if you want to say, “I have come”, you might instead write “kimashita” 来ました. The character 来 provides the initial “ki” sound, and the rest must be written using hiragana characters. (Note: the pronoun “I” is omitted in the Japanese examples, and isn’t necessary for these complete sentences, which could also have been translated as “she will come”, and “it came”, etc, depending on the surrounding context.)

The two syllabaries, hiragana and katakana (together known as the “kana”), each have 46 basic forms used in modern Japanese, and modified versions of these forms are used to represent another fifty sounds or so. In contrast, there are several thousand kanji characters. Students in Japan are expected to know some two thousand kanji by the time they’ve finished high school.

A natural question might occur to you: why are kanji even necessary, given how daunting a task it is to learn them, and that there are two complete syllabic writing systems, either of which could be used by themselves to write any Japanese phrase without resorting to kanji? For instance, take the following sentence:

晩御飯を食べてから疲れていたので、映画を見に行かずに寝る事にしました。

ばんごはんをたべてからつかれていたので、えいがをみにいかずにねることにしました。

bangohan o tabete kara tsukarete ita no de, eiga o mi ni ikazu ni neru koto ni shimashita.

After eating dinner, I found I was tired, and so decided to go to bed instead of going to see a movie.

As you can see, the second line, which is written entirely in hiragana and is perfectly comprehensible to a Japanese reader (barring any mistakes I may have made in writing it), is a whopping five characters longer than the first line, which uses eleven kanji, most of which are quite a bit more complex than any hiragana. If I had written these using a pen (or a brush), the first line could possibly have taken nearly twice as long to write! So why bother with them at all?
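If you’d rather not count by hand, the figures above can be checked with a few lines of Python. This is only a rough sketch: the helper names are my own, and it assumes that “kanji” means anything in the basic CJK Unified Ideographs block (U+4E00 to U+9FFF), which covers every kanji in this sentence but not every kanji in existence.

```python
MIXED = "晩御飯を食べてから疲れていたので、映画を見に行かずに寝る事にしました。"
KANA_ONLY = "ばんごはんをたべてからつかれていたので、えいがをみにいかずにねることにしました。"


def is_kanji(ch):
    # Assumption: only the basic CJK Unified Ideographs block counts as kanji.
    return "\u4e00" <= ch <= "\u9fff"


def count_kanji(text):
    return sum(1 for ch in text if is_kanji(ch))


print(len(MIXED), len(KANA_ONLY))   # 35 40 (punctuation included in both)
print(len(KANA_ONLY) - len(MIXED))  # 5
print(count_kanji(MIXED))           # 11
```

Character count says nothing about writing effort, of course; the stroke counts are where the kanji line really loses.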

Well, probably the main justification is that this is simply how the written language happened to evolve. Of course, that’s hardly consolation, and it’s far from the whole story. After World War II, the government made some fairly extensive changes to the written language, simplifying the use of kana, and restricting the number of kanji required to be a literate reader of the language. If they could do these things, they could probably have eliminated the kanji altogether; but they did not. As you’ll see shortly, there are good reasons not to use kana alone, but surely something else would have been more expedient… though in losing the kanji altogether it would inescapably have lost much of its beauty as well.

But the kanji aren’t useless, either. The Japanese language is filled with more than its fair share of homonyms, which can easily lead to confusion. Remember “kimasu” 来ます? Well, “kimasu” can mean “I will wear” as well as “I will come”. In the vast majority of cases, context will make it quite clear which is meant; but when reading, the judicious use of kanji will eliminate all doubt: the one meaning “will wear” is written 着ます.

An extreme (and famous) example:

李も桃も桃の内。

すもももももももものうち。

sumomo mo momo mo momo no uchi.

Both the plum and the peach are members of the prunus family. (Literally: Both plum and peach are within “peach”.)

The second line, written only with hiragana, is unreadable. As you can see, written Japanese does not use spaces to separate words (though these may be used in romanized Japanese, as in the third line). The word “momo” appears twice, the word “sumomo” once at the beginning, and between these are a couple of small words “mo”, which are called particles, and serve a grammatical purpose (the “both … and …”). Someone speaking this phrase out loud could easily distinguish the words and particles from each other with the use of inflection and light pauses; but as a string of hiragana characters all you see is a sea of も (“mo”).

Kanji comes to save the day! The top line uses kanji very effectively to distinguish one word from another. “SUMOMOmoMOMOmoMOMOnoUCHI”, where each string of capital letters is a single word written in kanji. It’s very easy to pick them apart now!
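To put a number on how much the kanji help here, the following Python sketch counts how many ways each version of the sentence can be cut into words. It assumes a hypothetical five-word vocabulary consisting only of the words in this sentence, and the function name `count_segmentations` is invented for the example; a reader with a full vocabulary would face even more candidate splits in the kana version.

```python
def count_segmentations(text, lexicon):
    """Count the ways `text` can be split into words from `lexicon`."""
    # ways[i] = number of ways to split the first i characters.
    ways = [1] + [0] * len(text)
    for end in range(1, len(text) + 1):
        for start in range(end):
            if text[start:end] in lexicon:
                ways[end] += ways[start]
    return ways[-1]


# Toy vocabularies (an assumption): only the words in this one sentence.
KANA_WORDS = {"すもも", "もも", "も", "の", "うち"}
KANJI_WORDS = {"李", "桃", "も", "の", "内"}

kana_line = "".join(["すもも", "も", "もも", "も", "もも", "の", "うち"])
kanji_line = "李も桃も桃の内"

print(count_segmentations(kana_line, KANA_WORDS))    # 13
print(count_segmentations(kanji_line, KANJI_WORDS))  # 1
```

Even with this tiny vocabulary, the run of も after すもも can be carved into “mo” and “momo” in thirteen different ways, while the kanji version leaves exactly one.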

A Foreign Mode of Thought

Another obstacle to learning Japanese is the fact that not only is it not very similar to English and other Western languages, it isn’t even very similar to the way we tend to think about things.

Take for instance the phrase, “I like sports”. In Japanese this might be expressed as “watashi wa supootsu ga suki desu” (the “oo” in “supootsu” is not pronounced like “boot”, but as a long o sound “oh”, which makes “supootsu” a tolerable approximation of the English word “sports”, especially since both u’s are actually silent/whispered). “Mary likes Jon” could be “Meari wa Jon ga suki desu”. In the first few lessons of a typical Japanese textbook, a student will usually learn that “A likes B” is given as “A wa B ga suki desu” in Japanese. Unfortunately, this is a simplification of the truth.

The truth is, a Japanese sentence tends to rely very heavily on its context. “A wa B ga suki desu” doesn’t quite mean that A likes B… in some cases, it can actually mean B likes A. I was recently chatting on IRC with someone, and she said (in English) that she likes “Pocky” (a popular Japanese sweet). My response in Japanese was “Pocky wa nihon no minna-san ga suki desu”. In this case, the meaning is “Everyone in Japan (nihon no minna-san) likes Pocky”; not “Pocky likes everyone in Japan”!

Well, if “A wa B ga suki desu” can mean either one of “A likes B” or “B likes A”, then how can you know when it means one and when the other? The answer is that it doesn’t really mean either one, but something else entirely that doesn’t map all that cleanly onto English modes of thought. How to translate it into something that makes sense to an English speaker depends heavily on what else had been said before it!

“A wa” means something like “as for A”, or “speaking of A”; it marks “A” as “the thing I’m going to say something about now”, and implies that the actual interesting part of the sentence is the rest of it (whatever follows the “wa”). The particle “ga” marks the word or phrase before it as being the grammatical subject of the sentence, and “suki” means something similar to “likeable” except that it doesn’t entirely make clear whether the subject of the sentence is the thing liked (the typical situation), or the thing that does the liking (the case in my Pocky sentence). So, “Watashi wa supootsu ga suki desu” really means something like, “Speaking of myself, sports (are) likeable”. The “watashi wa” bit says, “I’m speaking of myself, but ‘myself’ is not the interesting thing I’m saying: the interesting part of the sentence is that I like sports.”

Without surrounding context, “Minna-san wa Pocky ga suki desu” can mean what most people are taught to think it means: “Everyone likes Pocky”. However, its implication is that the thing we’re talking about isn’t Pocky, it’s everyone. It makes a great answer to a question such as “What kind of thing is loved by everyone?”, because in that case, the interesting part is “what kind of thing”, and not “everyone”, which is only the topic of conversation (the interesting thing is what you have to say about the conversation’s topic). But in most cases where you want to say “everyone loves Pocky”, you’re not talking about everyone, you’re talking about Pocky. The interesting part of the sentence is what you have to say about Pocky, so Pocky gets the “wa” after it, and the “ga” goes after “everyone”: “Pocky wa minna-san ga suki desu.”

This is an extremely difficult concept for many Westerners to grasp about Japanese. Unfortunately, it’s also a very important concept: the Japanese language demands the use of “wa” and “ga”, and knowing when to use which, all the time. The concept is sufficiently complex that there are whole chapters, and even whole books, written on just the subject of how “wa” should (and shouldn’t) be used.

A somewhat similar example would be the Japanese phrase “mita hito” (見た人), which could mean either “the person who saw (it)”, or “the person who was seen/the person whom I (or someone else) saw”. In Japanese, to modify a noun by a phrase you simply drop the phrase right before the noun: “saw-person”. This makes it pretty clear what is being modified, but it doesn’t provide any clues as to whether the modified noun is the subject or the object of the modifying phrase. The astonishing thing, though, is that it generally doesn’t need to: since you never say either “the person who saw it” or “the person I saw” without having already been talking about somebody having seen something, in actual use you would never be without the context that explains which is meant. So there isn’t actually any need to distinguish between them: if the previous sentence was “I spied an old friend when I went to the store today”, then the meaning of “saw-person” in a following sentence, “The ‘saw-person’ was Tommy Jenkins” is obvious. Similarly, if the previous sentence had been “I think somebody saw you when you were chewing out Susan”, then “saw-person” in a following sentence would be in reference to whoever witnessed the scolding.

Of course, when context is insufficient or missing, then one can resolve ambiguity by rephrasing or clarifying. There’s nothing wrong with saying “sono koto o mita hito” (the person who saw that taking place) or “watashi no mita hito” (the person that I saw) when the need arises; it’s just that such phrases are fairly rare, precisely because they’re unnecessary when context provides enough information for us to know which was meant, without explicitly saying. Similarly, if by “Mary wa Jon ga suki desu” we mean that Jon likes Mary rather than the reverse, and the context doesn’t clarify this, we can explicitly say “Mary is liked by Jon” using the passive: “Mary wa Jon ni sukarete imasu”.

Note: so far, the best explanation of wa versus ga that I’ve read is in Making Sense of Japanese: What the Textbooks Don’t Tell You by Jay Rubin; but I also feel that he makes some fairly extreme statements. In particular, I think his explanation could tend to scare someone off from using “wa”, and that despite his attempts to say otherwise, the reader is left with the general feeling that “wa” should only be used in special circumstances, which is far from the case. In the end, the best way to understand wa is to be exposed to a lot of Japanese sentences where it is used (and practice making your own), to find good textbooks that give useful explanations of how it affects emphasis and meaning, and to avoid grossly oversimplified descriptions such as it being the equivalent to English “the”, or expressing the real grammatical subject in intransitive verbs.

The second part to this set of articles continues here.

micah.cowan.name/blog/

The random ramblings of Micah Cowan. Programmer, musician, typesetting enthusiast, gamer…
