jamdict APIs

An overview of jamdict modules.

Warning

👉 ⚠️ THIS SECTION IS STILL UNDER CONSTRUCTION ⚠️ Help is much needed.

class jamdict.util.Jamdict(db_file=None, kd2_file=None, jmd_xml_file=None, kd2_xml_file=None, auto_config=True, auto_expand=True, reuse_ctx=True, jmnedict_file=None, jmnedict_xml_file=None, memory_mode=False, **kwargs)

Main entry point to access all available dictionaries in jamdict.

>>> from jamdict import Jamdict
>>> jam = Jamdict()
>>> result = jam.lookup('食べ%る')
>>> # print all word entries
>>> for entry in result.entries:
...     print(entry)
>>> # print all related characters
>>> for c in result.chars:
...     print(repr(c))

To filter results by part of speech (pos), for example to find every “かえる” that is a noun, use:

>>> result = jam.lookup("かえる", pos=["noun (common) (futsuumeishi)"])

To search for named entities by type, use the type string as the query. For example, to search for all “surname” entries, use:

>>> result = jam.lookup("surname")

To find out which parts of speech and named-entity types are available in the dictionary, use Jamdict.all_pos and Jamdict.all_ne_type.
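The two steps above (checking which part-of-speech strings exist, then filtering by one of them) can be combined in a small helper. This is only a sketch: lookup_with_pos is a hypothetical name, not part of jamdict, and it relies solely on the all_pos() and lookup(query, pos=...) calls documented on this page.

```python
def lookup_with_pos(jam, query, pos_name):
    """Look up `query`, keeping only words tagged with `pos_name`.

    `jam` is expected to be a Jamdict instance. This helper is hypothetical
    and only wraps the documented all_pos() and lookup() methods.
    """
    available = jam.all_pos()
    if pos_name not in available:
        # Fail early instead of silently returning an empty result.
        raise ValueError(f"unknown part of speech: {pos_name!r}")
    return jam.lookup(query, pos=[pos_name])
```

With a working installation, a call such as lookup_with_pos(Jamdict(), "かえる", "noun (common) (futsuumeishi)") should behave like the filtered lookup shown earlier.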

Jamdict >= 0.1a10 supports the memory_mode keyword argument, which reads the whole database into memory before querying to speed up searches. The database may take about a minute to load. For example:

>>> jam = Jamdict(memory_mode=True)

When no suitable database is available, Jamdict tries to use the database from the jamdict-data package by default. If a custom database is specified in the configuration file, Jamdict uses it in preference to the jamdict-data package.

all_ne_type(ctx=None) → List[str]

Find all available named-entity types

Returns

A list of named-entity types (a list of strings)

all_pos(ctx=None) → List[str]

Find all available parts of speech

Returns

A list of parts of speech (a list of strings)

lookup(query, strict_lookup=False, lookup_chars=True, ctx=None, lookup_ne=True, pos=None, **kwargs) → jamdict.util.LookupResult

Search for words, characters, and named entities.

Parameters
  • query – Text to query; may contain wildcard characters. Use ? to match exactly one character and % to match any number of characters.

  • strict_lookup (bool) – only look up the Kanji characters in query (i.e. discard characters from variants)

  • lookup_chars (bool) – set lookup_chars to False to disable character lookup

  • pos (list of strings) – filter words by part of speech

  • ctx – database access context, can be reused for better performance. Normally users do not have to touch this and database connections will be reused by default.

  • lookup_ne (bool) – set lookup_ne to False to disable named-entity lookup

Returns

Return a LookupResult object.

Return type

jamdict.util.LookupResult

>>> # match any word that starts with "食べ" and ends with "る" (anything in between is fine)
>>> jam = Jamdict()
>>> results = jam.lookup('食べ%る')
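
The wildcard rules described for query can be illustrated without a database by translating a pattern into a regular expression. This is only a sketch of the matching semantics, not how jamdict performs the search; wildcard_to_regex is a hypothetical helper, and it assumes that % may also match zero characters.

```python
import re


def wildcard_to_regex(query):
    """Translate a lookup-style pattern into a compiled regular expression.

    '?' stands for exactly one character and '%' for any number of
    characters (assumed here to include none); everything else is literal.
    Hypothetical helper for illustration only.
    """
    parts = []
    for ch in query:
        if ch == "%":
            parts.append(".*")
        elif ch == "?":
            parts.append(".")
        else:
            # Escape so that regex metacharacters in the query stay literal.
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.DOTALL)


pattern = wildcard_to_regex("食べ%る")
for word in ["食べる", "食べすぎる", "食べた"]:
    print(word, bool(pattern.fullmatch(word)))
```

Using fullmatch rather than search mirrors the idea that the pattern describes the whole word, not a substring of it.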

lookup_iter(query, strict_lookup=False, lookup_chars=True, lookup_ne=True, ctx=None, pos=None, **kwargs) → jamdict.util.LookupResult

Search for words, characters, and named entities iteratively.

An IterLookupResult object is returned instead of the normal LookupResult. res.entries, res.chars, and res.names are iterators rather than lists, and each can be looped through only once; callers who need the results again must store them themselves.

>>> res = jam.lookup_iter("花見")
>>> for word in res.entries:
...     print(word)  # do something with the word
>>> for c in res.chars:
...     print(c)
>>> for name in res.names:
...     print(name)

Parameters
  • query – Text to query; may contain wildcard characters. Use ? to match exactly one character and % to match any number of characters.

  • strict_lookup (bool) – only look up the Kanji characters in query (i.e. discard characters from variants)

  • lookup_chars (bool) – set lookup_chars to False to disable character lookup

  • pos (list of strings) – filter words by part of speech

  • ctx – database access context, can be reused for better performance. Normally users do not have to touch this and database connections will be reused by default.

  • lookup_ne (bool) – set lookup_ne to False to disable named-entity lookup

Returns

Return an IterLookupResult object.

Return type

jamdict.util.IterLookupResult
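
Because the iterators returned by lookup_iter can be consumed only once, results that are needed more than once have to be copied into lists. A minimal sketch of that step follows; materialise is a hypothetical helper and relies only on the entries, chars, and names attributes described above.

```python
def materialise(res):
    """Copy the one-shot iterators of an IterLookupResult into lists.

    Hypothetical helper: after this call the iterators on `res` are
    exhausted, so keep working with the returned lists instead.
    """
    entries = list(res.entries)
    chars = list(res.chars)
    names = list(res.names)
    return entries, chars, names
```

For example, entries, chars, names = materialise(jam.lookup_iter("花見")) would allow the three collections to be looped over repeatedly, at the cost of holding them all in memory.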

property krad

Break a kanji down into its writing components

>>> jam = Jamdict()
>>> print(jam.krad['雲'])
['一', '雨', '二', '厶']

property memory_mode

If memory_mode is True, the Jamdict database is loaded into RAM before querying for better performance

property radk

Find all kanji that contain a given writing component

>>> jam = Jamdict()
>>> print(jam.radk['鼎'])
{'鼏', '鼒', '鼐', '鼎', '鼑'}
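
Since radk maps one component to a set of kanji, a search by several components at once can be expressed as a set intersection. The helper below is a hypothetical sketch, not part of jamdict; it assumes only that the mapping supports radk[component] and returns an iterable of kanji, as in the example above.

```python
def kanji_with_all(radk, components):
    """Return the kanji that contain every component in `components`.

    `radk` is any mapping from a component to a collection of kanji, such
    as the one exposed by Jamdict.radk. Hypothetical helper.
    """
    sets = [set(radk[c]) for c in components]
    if not sets:
        # No components given: nothing to intersect.
        return set()
    return set.intersection(*sets)
```

With a loaded dictionary, a call such as kanji_with_all(jam.radk, ['雨', '二']) would be expected to include '雲', given the krad breakdown shown for that character.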

property ready: bool

Check if Jamdict database is available
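
One possible use of ready is to guard a lookup so that a missing database yields an explicit "no result" value. The following is a sketch under that assumption; lookup_if_ready is a hypothetical helper, and returning None is a design choice made here, not jamdict behaviour.

```python
def lookup_if_ready(jam, query):
    """Return jam.lookup(query), or None when no database is available.

    Hypothetical helper built on the documented `ready` property.
    """
    if not jam.ready:
        return None
    return jam.lookup(query)
```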

class jamdict.util.LookupResult(entries, chars, names=None)

Contain lookup results (words, Kanji characters, or named entities) from Jamdict.

A typical jamdict lookup looks like this:

>>> jam = Jamdict()
>>> result = jam.lookup('食べ%る')

The command above returns a LookupResult object which contains found words (entries), kanji characters (chars), and named entities (names).

text(compact=True, entry_sep='。', separator=' | ', no_id=False, with_chars=True) → str

Generate a text string that contains all found words, characters, and named entities.

Parameters
  • compact – Make the output string more compact (less information, less whitespace, etc.)

  • no_id – Do not include jamdict’s internal object IDs (for direct query via API)

  • entry_sep – The text to separate entries

  • with_chars – Include characters information

Returns

A formatted string ready for display

property chars: Sequence[jamdict.kanjidic2.Character]

A list of found kanji characters

Returns

a list of Character objects

Return type

Sequence[Character]

property entries: Sequence[jamdict.jmdict.JMDEntry]

A list of found word entries

Returns

a list of JMDEntry objects

Return type

Sequence[JMDEntry]

property names: Sequence[jamdict.jmdict.JMDEntry]

A list of found named entities

Returns

a list of JMDEntry objects

Return type

Sequence[JMDEntry]
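
The three properties above can be combined into a quick overview of a result. The sketch below is illustrative only: summarise is a hypothetical helper, and it treats a missing names value as empty because names defaults to None in the constructor signature.

```python
def summarise(result):
    """Return a one-line count of words, characters, and names in a result.

    Hypothetical helper; relies only on the entries, chars, and names
    properties of a LookupResult.
    """
    names = result.names or []  # names may be None when not looked up
    return (
        f"{len(result.entries)} entries, "
        f"{len(result.chars)} chars, "
        f"{len(names)} names"
    )
```

For full details rather than counts, the text() method described above returns a formatted string for display.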

class jamdict.util.IterLookupResult(entries, chars=None, names=None)

Contain lookup results (words, Kanji characters, or named entities) from Jamdict.

A typical jamdict lookup looks like this:

>>> res = jam.lookup_iter("花見")

res is an IterLookupResult object which contains iterators to scan through found words (entries), kanji characters (chars), and named entities (names) one by one.

>>> for word in res.entries:
...     print(word)  # do something with the word
>>> for c in res.chars:
...     print(c)
>>> for name in res.names:
...     print(name)

property chars

Iterator over all found kanji characters, one at a time; can only be used once

property entries

Iterator over all found entries, one at a time; can only be used once

property names

Iterator over all found named entities, one at a time; can only be used once

class jamdict.jmdict.JMDEntry(idseq='')

Represents a dictionary Word entry.

Entries consist of kanji elements, reading elements, general information and sense elements. Each entry must have at least one reading element and one sense element. Others are optional.

XML DTD <!ELEMENT entry (ent_seq, k_ele*, r_ele+, info?, sense+)>

class jamdict.kanjidic2.Character

Represents a kanji character.

<!ELEMENT character (literal,codepoint, radical, misc, dic_number?, query_code?, reading_meaning?)*>

property components

Kanji writing components that compose this character

meanings(english_only=False)

Accumulate all meanings as a list of strings. Each string is a meaning (i.e. a sense).
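
As an illustration, meanings() can be applied to every character of a lookup result to pair each character with its glosses. This is a sketch only; meanings_by_char is a hypothetical helper that relies on nothing beyond the meanings(english_only=...) signature shown above.

```python
def meanings_by_char(chars, english_only=True):
    """Pair each Character with the list returned by its meanings() method.

    `chars` is any iterable of Character objects, for example the chars
    property of a LookupResult. Hypothetical helper.
    """
    return [(c, c.meanings(english_only=english_only)) for c in chars]
```

A typical call would be meanings_by_char(result.chars), where result comes from Jamdict.lookup.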

jamdict.krad is a module for retrieving kanji components (i.e. radicals)