Douglas Triggs (doubt72) wrote,

Apropos Of Nothing

I suppose some of you know about the programs I wrote for practicing Japanese -- I've rewritten them all in Ruby/Tk over the last year or so, and finally (maybe a few weeks ago) I finished all the kanji programs (which I never quite managed in Tcl/Tk).

Well, of course, that means I have to finish the vocabulary side: building a dictionary for it (which, let me tell you, is a mess, especially compared to the kanji dictionary) and writing a few programs (your basic flashcard kinda thing, plus more specialized number, adjective, and verb inflection practice programs).

Anyway, status on the dictionary:

209 i-adjectives
345 na-adjectives
218 adverbs
1 conjunction
59 particles
111 nouns
311 proper names
30 pronouns
412 numbers
431 type 1 (five-step) verbs
216 type 2 (one-step/-ru) verbs
323 type 3 "Irregular" (mostly suru) verbs [although I suppose you could say that the suru verbs are actually quite regular considered as a class]

2613 total words (by "level": 108, 261, 403, 546, 654, 641)

Of course, those numbers don't add up, because a number of words have definitions that fit into multiple parts of speech, and entries are keyed by kanji (which is sometimes just kana, for something like あなた). So "juubun" the na-adjective and adverb ("enough") and "juppun" ("ten minutes") are all under the same entry, since the kanji (十分) is the same. I might need to redo some things, though, because the current structure has a problem: readings that share the same kanji currently all get the same "level," even when they differ widely in how common they are, never mind in meaning.
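To make the double counting concrete (the per-category figures above sum to 2666 against 2613 entries), here is a minimal Ruby sketch. The struct names, the level values, and the choice to file "ten minutes" under numbers are all assumptions for illustration, not the layout the actual programs use.

```ruby
# Hypothetical layout: one entry per written form, each with one or more readings.
Reading = Struct.new(:kana, :parts_of_speech, :gloss)
Entry   = Struct.new(:written, :level, :readings)

# Sample data only; the level values are made up for illustration.
ENTRIES = [
  Entry.new("十分", 2, [
    Reading.new("じゅうぶん", [:na_adjective, :adverb], "enough"),
    Reading.new("じゅっぷん", [:number], "ten minutes")
  ]),
  Entry.new("あなた", 1, [
    Reading.new("あなた", [:pronoun], "you")
  ])
]

# Count each entry once for every distinct part of speech it carries.
def tally_by_part_of_speech(entries)
  counts = Hash.new(0)
  entries.each do |entry|
    entry.readings.flat_map(&:parts_of_speech).uniq.each { |pos| counts[pos] += 1 }
  end
  counts
end

counts = tally_by_part_of_speech(ENTRIES)
puts "entries: #{ENTRIES.size}"                            # entries: 2
puts "sum of per-category counts: #{counts.values.sum}"    # sum of per-category counts: 4
```

Note that `level` sits on the entry rather than on each reading, which is exactly the limitation described above; moving it onto `Reading` would be one way to let a rare reading carry a different level than a common one.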

You might also note a couple of holes in the dictionary -- 111 nouns isn't very many (currently all of the nouns are either time-related or double as counting words and the like), and 1 conjunction? Yeah. And why so many numbers, you might ask? That category includes all the counting words, because they're often irregular, so things like 一本 (ippon), 二本 (nihon), and 三本 (sanbon) are all in there.
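A small Ruby sketch of why the counter forms get stored rather than generated. It only uses the three 本 examples above; the constant and method names are hypothetical, and the "naive" rule is just a strawman that glues the number reading onto the counter.

```ruby
# Plain readings for the numbers themselves.
NUMBER_READINGS = { 1 => "ichi", 2 => "ni", 3 => "san" }

# Stored forms for the counter 本 (hon), taken from the examples in the text.
HON_FORMS = { 1 => "ippon", 2 => "nihon", 3 => "sanbon" }

# Naive rule: number reading + counter reading, with no sound changes.
def naive_counter(n, counter = "hon")
  NUMBER_READINGS.fetch(n) + counter
end

# Keep only the forms the naive rule gets wrong.
irregular = HON_FORMS.reject { |n, form| naive_counter(n) == form }

irregular.each do |n, form|
  puts "#{n}: naive #{naive_counter(n)}, actual #{form}"
end
# 1: naive ichihon, actual ippon
# 3: naive sanhon, actual sanbon
```

Two out of three already miss, so listing every number-plus-counter combination as its own dictionary word is the simple way out, at the cost of a large "numbers" category.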

The dictionary's never really going to be pretty, I'm afraid. The data going into it (frequency data and the like) isn't the best, and having a dictionary edited by someone who doesn't really know the language is always a recipe for success, isn't it? I suppose the same could be said for the kanji dictionary, except (1) kanji is a much, much simpler problem than vocabulary, and (2) the relevant data for kanji is more or less standardized and available, so there were a lot fewer "judgement" calls than I had to make for the vocabulary stuff.

So, things that need to be finished before I'm "done" with the whole drill program thing:

1. finish stroke diagrams for the kanji sketch program (namely jouyou 5 and 6; everything up to 4 is done).
2. finish building the dictionary (probably by grabbing vocab lists for JLPT 1-4 to fill the holes).
3. write the vocabulary flashcard program (partly dependent on #2 there, but I could start anyway).
4. write verb, adjective, and number drill programs -- I've got the data for those, so I could finish them at any time.
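For the verb drill in #4, here is a rough Ruby sketch of checking one inflection (the polite -ます form) across the three verb types in the dictionary. The method and constant names are hypothetical, and a real drill could just as easily look answers up in the data instead of applying rules like this.

```ruby
# Final kana of a type 1 (five-step) verb, mapped from the u-row to the i-row.
I_ROW = {
  "う" => "い", "く" => "き", "ぐ" => "ぎ", "す" => "し", "つ" => "ち",
  "ぬ" => "に", "ぶ" => "び", "む" => "み", "る" => "り"
}

# Type 3: the two irregular endings and their polite forms.
IRREGULAR_POLITE = { "する" => "します", "くる" => "きます" }

# Build the polite form from the dictionary form (in kana) and the verb type.
def polite_form(verb, type)
  case type
  when 1
    verb[0...-1] + I_ROW.fetch(verb[-1]) + "ます"
  when 2
    verb[0...-1] + "ます" # drop the final る
  when 3
    ending, polite = IRREGULAR_POLITE.find { |dict, _| verb.end_with?(dict) }
    raise ArgumentError, "not a known irregular verb: #{verb}" unless ending
    verb[0...-ending.length] + polite
  else
    raise ArgumentError, "unknown verb type: #{type}"
  end
end

puts polite_form("かく", 1)             # かきます
puts polite_form("たべる", 2)           # たべます
puts polite_form("べんきょうする", 3)   # べんきょうします
```

The type has to be passed in because it can't be read off the spelling: plenty of verbs ending in る are type 1, which is presumably why the dictionary records the type for each verb.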

Plus, of course, the time I need to spend studying the grammar book, and the language CDs. Lots to do, lots to do.

