Feb 04, 2026
It had been raining all week. I (Jonah) had been hard at work designing Yiddish-themed trading cards inspired by Leyzer Ran’s from 1963. The series would be called חרטומי אשכנז and would be bigger than the Yu-Gi-Oh! franchise. Talk about fleshpots of Egypt! We were going to be rich. Earlier that morning I had been on the phone with Yoram Globus discussing the anime adaptation, then I was compiling the stats for the latest addition to the deck, and I lost track of time. I ran out and purchased ingredients for a caprese salad. I ate quickly—there was important business to attend to. I received a text from Raphi: “Home?” He wanted to know where he should pick me up for our interview with Harry Bochner, who lives in Medford, MA. We stopped for gas and talked strategy.
Harry Bochner runs the Yiddish dictionary project verterbukh.org, the most trusted lexicographical Yiddish resource on the web. If you’ve ever emailed [email protected] to purchase 2,000 definitions or apply a student discount, you’ve emailed Harry Bochner. Bochner got a PhD in linguistics at Harvard, where he became interested in South Slavic languages and eventually wrote a dissertation on morphology. It was during this time that he began to study Yiddish formally, although he describes himself as a “native listener” of the language. During grad school he began to work for Harvard IT and, after graduating, stayed on with them for about eight years. He then worked as a programmer at a biotech company and most recently with a textbook publisher.
He is an active participant in the Boston Yiddish scene. That’s how Raphi knows him. This profile was the result of an encounter between the two a few weeks earlier, at a symposium on indigenous and endangered languages at Harvard. That same night I had accidentally helped a foreign dissident move. My back still hurt.
“What should we ask him?”
“I don’t know.”
About as far as we got. Raphi screwed in the gas cap and the two of us headed for Medford. We arrived and opened the screen door to knock. Bochner opened and invited us in. We were brought to a dark wooden table, in a room adjacent to a spacious kitchen. There was nothing on it but a small laptop running Ubuntu and a “pottery garlic,” a gift from very good friends on a no-gift birthday. It rattled. At first, the conversation revolved around the large quantities of garlic that Bochner and his wife produce every year. They grow it in their community garden plot and cure it in the basement. A cold sweat broke out: could he know of my vampiric past? I clutched tighter the amulet of the demon-god Echman-Azartoth, which I carry always. Then Bochner spoke a sentence that I knew would be the scoop of our career:
“Originally, its name was the Boston Dictionary Project.”
It felt as though all the air had been sucked out of the room and then forcibly reintroduced. I was ecstatic. I looked over to Raphi: his hair stood on end, like he had both hands on one of those static balls. A dangerous fire played in his eyes.
The story went like this: it all began a year after Niborski’s Yiddish-French dictionary was published. That would make the year 2003. Solon Beinfeld, who was deeply involved with the goings-on at the Medem Yiddish Center ever since his stay in Paris on a Fulbright in the 1950s, began to campaign for an English version. Beinfeld was the first person on the project. Bochner was the second. As soon as Bochner got involved, it became certain that the dictionary would have an online version as well. The third initial collaborator was Barry Goldstein, the prolific English-Yiddish translator of works like Lord of the Rings, Moby Dick, and The House at Pooh Corner.
It is important to note that the English dictionary is not a direct translation of the Yiddish-French dictionary. It officially considers itself an “adaptation.” Rather than translate the definitions exclusively from the French, the English version used the French version primarily as a lexicographical template: for each entry in the French dictionary English definitions were compiled from various sources by volunteers.
This was a labor-intensive process. Bochner instructed the volunteers as follows: “For each definition, check Harkavy, check Weinreich, check it against the French. If the French is the only source, translate the French. Make a note of it. If you’re stumped, write a comment as to what you found and what you’re puzzled about. We had two people do that for each letter independently.” These definitions were then reconciled by a third volunteer. If the proposed definitions agreed, or if there was a clear way to combine them, then this third volunteer had the authority to produce a final definition. If not, then the problem was raised and discussed by the editorial group. In this way, each definition was looked over at least three times. But in the end, only Bochner looked through every single entry. It was important to him, as a trained linguist, that none of the details got lost.
This was the process for producing the final English definitions, but the French definitions, of course, were not lost to history—nor, excitingly enough, were the non-final attempts at English definitions that the volunteers produced. From the text file of the French dictionary that Medem had sent him (cut-and-pasted from a PageMaker file—Middle East Edition—and encoded in some lost Macintosh demotic), Bochner created an XML document (eXtensible Markup Language, a customizable data format not so terribly different from HTML) with a structure that captured this history of attempts and revisions. In order to allow his revisers to interact with the data without having to deal with finicky XML, he designed a program that displayed the XML data visually, with the Yiddish lemma on top, followed by the French, followed by the attempts at definitions. The reviser could then choose one of the proposed definitions, or combine and rewrite them as they pleased. In this graphical form of the XML, we might read the Urform of verterbukh.org. It took a lot of effort to work out the XML schema and write the program that would allow the workers to interface with it, but it was worth it to produce well-structured XML that could be used to power the online version.
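For the curious, here is a minimal sketch of what such an entry and its reviser's view might look like. We never saw the real schema, so every tag and attribute name below (entry, lemma, french, attempt, by, sources) is our own invention, as is the sample entry; only the display order (lemma, then French, then attempts) comes from Bochner's description.

```python
import xml.etree.ElementTree as ET

# Hypothetical entry. The tag and attribute names are assumptions made for
# illustration, not the schema Bochner actually designed.
SAMPLE_ENTRY = """
<entry>
  <lemma>zingen</lemma>
  <french>chanter</french>
  <attempt by="volunteer-1" sources="Harkavy Weinreich">to sing</attempt>
  <attempt by="volunteer-2" sources="Weinreich French">sing</attempt>
</entry>
"""


def reviser_view(entry_xml):
    """Return display lines: the lemma on top, then the French, then each attempt."""
    entry = ET.fromstring(entry_xml.strip())
    lines = [
        entry.findtext("lemma", default=""),
        "French: " + entry.findtext("french", default=""),
    ]
    for number, attempt in enumerate(entry.findall("attempt"), start=1):
        lines.append(
            f"Attempt {number} ({attempt.get('by')}; "
            f"sources: {attempt.get('sources')}): {attempt.text}"
        )
    return lines


for line in reviser_view(SAMPLE_ENTRY):
    print(line)
```

Keeping every attempt as its own element, rather than overwriting one definition field, is what would let the history of revisions survive alongside the final text.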
This master XML document includes a lot of data that is never displayed on verterbukh.org. For example, the sources used for producing the English definitions have been preserved in the XML for each entry. When we heard this, we demanded that Bochner display this information on the website. Our demands were not heeded. We had no leverage. But Bochner assured us that all of this data would be preserved in the archive for future generations.
The first step in turning this big XML document into the verterbukh.org that we all know and love is taking out this extra information. It is then styled—think blue, red, beige; or Light Stargate (#c8d1dd), Snake Fruit (#d8261b), and, apparently, Straßenköterblond (#d0cda8)—as HTML and served to the website.
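A rough sketch of that two-step pipeline, under our own assumptions: the element names (lemma, french, attempt, final) and the HTML class names are hypothetical, and we leave the colors to a stylesheet rather than guess which shade goes where.

```python
import html
import xml.etree.ElementTree as ET

# Hypothetical master entry; tag names are assumptions for illustration.
MASTER_ENTRY = """
<entry>
  <lemma>zingen</lemma>
  <french>chanter</french>
  <attempt by="volunteer-1" sources="Harkavy Weinreich">to sing</attempt>
  <final>to sing</final>
</entry>
"""

# Only these fields reach the public page; everything else stays in the archive.
PUBLIC_FIELDS = ("lemma", "final")


def render_public_entry(entry_xml):
    """Drop the editorial history and render the remaining fields as HTML."""
    entry = ET.fromstring(entry_xml.strip())
    parts = []
    for field in PUBLIC_FIELDS:
        text = html.escape(entry.findtext(field, default=""))
        parts.append(f'<span class="{field}">{text}</span>')
    return '<div class="entry">' + " ".join(parts) + "</div>"


print(render_public_entry(MASTER_ENTRY))
```

Working from an allow-list of public fields, instead of deleting the private ones, means a field added to the master XML later stays hidden until someone decides otherwise.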
That takes care of the way each entry is displayed, but we were also curious to learn how parsing works on verterbukh.org. How are conjugated forms handled? When a user looks up “zingt” or “gezungen,” how does verterbukh.org redirect them to the lemma “zingen”? Bochner informed us that all conjugated forms for each lemma are generated in advance, based on information present in the entry. This means that for each lemma, there is a pre-generated list of conjugated forms, all of which redirect to that lemma. These are generated in both Yiddish script and transliteration.
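The idea can be sketched in a few lines. This is an illustration under our own assumptions, not Bochner's code: the forms are hard-coded here, whereas the real site derives them from grammatical information in each entry, and the function names are hypothetical.

```python
# Hypothetical data: each lemma with forms generated from it in advance.
# A real list would also include the Yiddish-script spelling of every form.
PREGENERATED_FORMS = {
    "zingen": ["zingt", "gezungen"],
    "faryentshen": ["faryentshet"],
}


def build_form_index(forms_by_lemma):
    """Map every known form (and each lemma itself) to the lemmas it belongs to."""
    index = {}
    for lemma, forms in forms_by_lemma.items():
        for form in [lemma, *forms]:
            index.setdefault(form, set()).add(lemma)
    return index


def lookup(index, query):
    """Return the lemmas a query redirects to, or an empty list if unindexed."""
    return sorted(index.get(query, ()))


index = build_form_index(PREGENERATED_FORMS)
print(lookup(index, "gezungen"))
print(lookup(index, "zingt"))
```

Because the index is built ahead of time, a search is a single dictionary lookup; no grammar needs to run while the user waits.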
Verterbukh.org also uses a GNU program called Aspell, a spell checker which helps redirect your Latin script search for “fishnoger” directly to “fisnoge” (which the dictionary defines as “dial. leg (of a ruminant)”), and your Yiddish script search for “פֿישנאָגער” to a list of options: ״פֿינגער״, ״פֿינגערן״, ״פֿיטש נאַס״, ״פֿיסנאָגע״ (“finger,” “to finger,” “sopping wet,” and finally “leg (of a ruminant)”). When combined with the pre-generated lists of conjugated forms, Aspell is able to fuzzily parse unseen forms. The word “faryentsheter,” for example, is not indexed anywhere in the dictionary, but Aspell is able to link it to “faryentshet,” which is indexed as a conjugated form of “faryentshen” (“to tire (s.o.) with one’s tales of woe”). Thus, thanks to Aspell, a search for “faryentsheter” will land on the relevant lemma (“faryentshen”). This functions up to an edit distance of two from a given indexed word, which means that “faryentsheters” will not return any results since it is at an edit distance of three (the appended “-ers”) from the closest indexed word, “faryentshet.” The “-er” suffix is in essence being treated as a spelling error.
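To check the arithmetic ourselves, here is a small sketch that swaps in plain Levenshtein distance for whatever Aspell does internally (Aspell's real scoring is more involved, so treat this as a stand-in). The index contents and function names are our own assumptions; the cutoff of two comes from Bochner.

```python
def edit_distance(a, b):
    """Levenshtein distance: fewest single-character insertions, deletions, substitutions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,                       # delete char_a
                current[j - 1] + 1,                    # insert char_b
                previous[j - 1] + (char_a != char_b),  # substitute or match
            ))
        previous = current
    return previous[-1]


# Hypothetical slice of the index: indexed form -> lemma.
FORM_INDEX = {
    "faryentshen": "faryentshen",
    "faryentshet": "faryentshen",
}


def fuzzy_lookup(index, query, max_distance=2):
    """Return the lemma of the closest indexed form within max_distance, else None."""
    if query in index:
        return index[query]
    best = None
    for form, lemma in index.items():
        distance = edit_distance(query, form)
        if distance <= max_distance and (best is None or distance < best[0]):
            best = (distance, lemma)
    return best[1] if best else None


print(edit_distance("faryentsheter", "faryentshet"))
print(fuzzy_lookup(FORM_INDEX, "faryentsheter"))
print(fuzzy_lookup(FORM_INDEX, "faryentsheters"))
```

Under these assumptions the first search lands on “faryentshen” and the second comes back empty, which matches what Bochner told us.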
Though this works rather well, Bochner is trying to move towards something a little more subtle. Hunspell, which is the spellchecker of LibreOffice and Firefox, was designed with Hungarian in mind, and is therefore particularly adept at parsing highly-suffixed forms. Once provided with the proper suffixes (e.g. -dik, -er), Hunspell should be able to redirect a search for “faryentsheter” to “faryentshen” programmatically, without the spookiness of edit-distance spellchecking. Bochner also intends to index the example sentences provided along with definitions, which at present do not figure into the search.
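The difference in approach can be sketched without touching Hunspell itself. What follows is a toy suffix-stripper of our own devising, not Hunspell's API or affix-file format: it peels off one known suffix and checks whether what remains is indexed. The suffix list uses the two examples above; the index and function name are hypothetical.

```python
# Suffixes mentioned in the interview; a real affix table would be far longer.
KNOWN_SUFFIXES = ("dik", "er")

# Hypothetical slice of the index: indexed form -> lemma.
FORM_INDEX = {
    "faryentshen": "faryentshen",
    "faryentshet": "faryentshen",
}


def affix_lookup(index, query, suffixes=KNOWN_SUFFIXES):
    """Resolve a query by exact match, or by stripping one known suffix."""
    if query in index:
        return index[query]
    for suffix in suffixes:
        if query.endswith(suffix):
            stem = query[: -len(suffix)]
            if stem in index:
                return index[stem]
    return None


print(affix_lookup(FORM_INDEX, "faryentsheter"))
print(affix_lookup(FORM_INDEX, "faryentshetz"))
```

Here “faryentsheter” resolves because “-er” is a recognized suffix, while “faryentshetz” does not, even though it sits only one edit away from an indexed form. A match now means something grammatical rather than merely typographical.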
We were also excited to hear about verterbukh.org’s long-standing collaboration with the Forverts. As our readers may know, if you double-click on a word on the Forverts website, a pop-up is triggered which provides a definition of the word from verterbukh.org (see illustration below). This means that there is an infrastructure in place that would allow verterbukh.org to interface with other Yiddish websites. There is, however, no payment plan in place for such a model. The Forverts receives this feature for free, as a thank-you for the large donation they contributed during the dictionary’s early fundraising days: about $50,000 of the total $70,000 raised.
“What’s up with the word of the day?” Raphi asked. Raphi had noticed certain, shall we say, patterns . . . The words displayed on verterbukh.org’s home page would sometimes correspond to events, holidays, that sort of thing. And then it seemed to stop. But Raphi could not let go. He began to believe that the words still corresponded to the days on which they were posted, but in a subtler, more esoteric way. If you could stand the odor of his room, a cursory examination of his writing surfaces would reveal countless scraps of paper, black with gematriac calculation.
Raphi continued, trembling: “But the word, am I wrong? . . . But the word of the day, it corresponds to the calendar in some way, sometimes . . .”
Bochner responded: “I haven’t actually done that in recent years. But sometimes for Jewish holidays or something like that, I preselect it. So if it’s not already in the database, it selects randomly.”
Raphi was shattered. “Yeah, I see” was the only response he could muster.
The word of the day, inspired by the Esperanto Facebook page, is at this point completely random, although it excludes any words labeled obscene or pejorative. These are, of course, labels included in the entry for each word. An idea for a new sort of Yiddish word of the day began to take shape . . .
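As we understood it, the selection logic amounts to something like the sketch below. The data layout, label strings, and function name are our assumptions; the two rules (a preselected word wins, otherwise a random word not labeled obscene or pejorative) are Bochner's.

```python
import random
from datetime import date

EXCLUDED_LABELS = {"obscene", "pejorative"}


def word_of_the_day(today, preselected, entries, rng=random):
    """Use a preselected word for today if one exists; otherwise pick at random."""
    if today in preselected:
        return preselected[today]
    candidates = [
        entry["lemma"]
        for entry in entries
        if not EXCLUDED_LABELS & set(entry["labels"])
    ]
    return rng.choice(candidates)


# Hypothetical entries; the second stands in for a word carrying an excluded label.
entries = [
    {"lemma": "hefker", "labels": []},
    {"lemma": "placeholder-for-a-rude-word", "labels": ["obscene"]},
]
print(word_of_the_day(date(2026, 2, 4), {}, entries))
```

No gematria required.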
Though, as we write this, the word of the day is hefker, verterbukh.org still requires daily attention, as does its sister site englishyiddishdictionary.com, the Comprehensive English–Yiddish Dictionary edited by Gitl Schaechter-Viswanath and Paul Glasser, which Bochner also manages. Every morning Bochner wakes up, prepares some non-caffeinated tea, opens his computer, and reviews a server summary of the last twenty-four hours. He investigates all minor and, should there be any, fatal errors. This takes about fifteen or twenty minutes and, usually, all is well in digital Yiddish dictionary land. He also checks his two dictionary emails ([email protected] and [email protected]) and fields questions from students anxious about their discount, users who are not getting any results because they have forgotten to click “From Yiddish,” and elderly הערי פּאָטער readers embarking on their maiden voyage, perplexed as to why “האָגװאַרץ” cannot be found.
Sometimes, however, Bochner encounters users whose behavior, malicious or otherwise, overburdens the site. This usually happens when users submit search queries faster than the search-rate limit, which is capped at around 3 queries in 2 seconds and 30 queries in 60 seconds. Should you exceed this, your screen will present you with a charitable message along the lines of, “You are going too fast. Please slow down”; or something more terrifying like, “You have exceeded your quota.” Sometimes it is not the query rate that is suspicious, but the connection request rate. This is the kind of request sent when you first enter the verterbukh.org URL. Once, “someone from New Jersey or something like that” was submitting up to 32 simultaneous connection requests within one second. These requests would come from two separate IP addresses and they were often blank, without an actual word query. Bochner deduced it was likely a commuter with about sixteen verterbukh.org tabs open, “taking his laptop between home and work,” opening it up at either location, prompting a refresh of all sixteen tabs. A special static page—“Hey, you’re going too fast.”—was designed for him and delivered as a response to his blank requests. Once two hundred new accounts with valid email addresses were made overnight, in a case of apparent “mail bombing.” And once, though Bochner seldom reaches out to users, he reached out to a user submitting queries at an industrial rate with a simple question: “Why? Why are you doing this?” The user apologized. They had not noticed that their textbook was leaning devilishly against the “Enter” key.
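A limit with two windows like this one can be sketched as a sliding-window counter. This is our guess at the shape of the thing, not the site's code: the class and method names are hypothetical, we key the history by a made-up connection identifier, and we take the "around 3 in 2 seconds, 30 in 60 seconds" figures at face value.

```python
from collections import deque


class SearchRateLimiter:
    """Sliding-window limiter enforcing several (max_queries, seconds) limits at once."""

    def __init__(self, limits=((3, 2.0), (30, 60.0))):
        self.limits = limits
        self.longest_window = max(window for _, window in limits)
        self.history = {}  # connection id -> timestamps of accepted queries

    def allow(self, connection_id, now):
        """Return True and record the query if it fits under every limit."""
        stamps = self.history.setdefault(connection_id, deque())
        # Forget queries older than the longest window; they no longer count.
        while stamps and now - stamps[0] >= self.longest_window:
            stamps.popleft()
        for max_queries, window in self.limits:
            recent = sum(1 for stamp in stamps if now - stamp < window)
            if recent >= max_queries:
                return False  # "You are going too fast. Please slow down."
        stamps.append(now)
        return True


limiter = SearchRateLimiter()
for moment in (0.0, 0.1, 0.2, 0.3, 2.5):
    print(moment, limiter.allow("textbook-on-enter-key", moment))
```

In this sketch rejected queries are not recorded, so a user who pauses for a couple of seconds is let back in rather than punished for having kept knocking.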
At this point, we decided to get down to brass tacks. We wanted to know about the pay-by-word subscription model. As it turned out, a big motivator for using this sort of model (instead of, for example, a pay-by-the-year model with unlimited searches) was that it hinders bad actors (read: us) from writing a script that would go through the entire dictionary and download each word. This is also the reason that there is no button to proceed to the next word alphabetically: such a button would make it very easy for someone with malicious intent to navigate through the entries and download them programmatically. Medem was particularly concerned with potential piracy of its French dictionary. Other anti-piracy measures include a per-connection rate limit. Though the English–Yiddish dictionary uses a monthly subscription model, and not pay-by-word, Bochner still checks for excessive usage that might indicate some kind of malicious automated process. It is important that this rate limit is per-connection, and not per-user, because institutional accounts (like the ones provided by universities) are, in essence, single user accounts shared by many people. If there were a per-user rate limit, then any university account being used by more than three people at any given time would likely be barred from entry. Institutional accounts also differ from single-user accounts in terms of search quotas. Like private individuals, universities purchase a certain number of words per month. The primary difference between the two is that an institution that exceeds its allotted number of searches per month is not barred from making more. Instead, Bochner waits to see if that institution continues to exceed its allotted amount, and, if so, he suggests that they purchase more searches in the future.
By this point, we were nearing the third hour of the interview. We began to reflect on the effect verterbukh.org has had on people interested in Yiddish. Is there any resource that has so shaped contemporary knowledge of the language? When we asked Bochner about the impact of the dictionary, he told us that the project would be his legacy. He is even training an apprentice to inherit the responsibility. Bochner is overjoyed with the project’s impact. “How else could I answer 10,000 questions a day?” he asked. Jonah asked him if he had ever considered using his powers for evil. Considering the reach of the dictionary, modifications to the definitions could have a significant, negative effect on many Yiddish learners. He told us, however, that evil was not his style and “besides, people would notice.”
“Slowly,” Raphi responded.