jamdict APIs
An overview of jamdict modules.
Warning
👉 ⚠️ THIS SECTION IS STILL UNDER CONSTRUCTION ⚠️ Help is much needed.
- class jamdict.util.Jamdict(db_file=None, kd2_file=None, jmd_xml_file=None, kd2_xml_file=None, auto_config=True, auto_expand=True, reuse_ctx=True, jmnedict_file=None, jmnedict_xml_file=None, memory_mode=False, **kwargs)
Main entry point to access all available dictionaries in jamdict.
>>> from jamdict import Jamdict
>>> jam = Jamdict()
>>> result = jam.lookup('食べ%る')
>>> # print all word entries
>>> for entry in result.entries:
...     print(entry)
>>> # print all related characters
>>> for c in result.chars:
...     print(repr(c))
To filter results by pos, for example to look for all “かえる” that are nouns, use:
>>> result = jam.lookup("かえる", pos=["noun (common) (futsuumeishi)"])
To search for named entities by type, use the type string as the query. For example, to search for all “surname” entries, use:
>>> result = jam.lookup("surname")
To find out which parts of speech or named-entity types are available in the dictionary, use Jamdict.all_pos and Jamdict.all_ne_type.
Jamdict >= 0.1a10 supports the memory_mode keyword argument, which reads the whole database into memory before querying to speed up searches. The database may take about a minute to load. Here is the sample code:
>>> jam = Jamdict(memory_mode=True)
When there is no suitable database available, Jamdict will try to use the database from the jamdict-data package by default. If a custom database is specified in the configuration file, Jamdict will prefer it over the jamdict-data package.
- all_ne_type(ctx=None) → List[str]
Find all available named-entity types
- Returns
A list of named-entity types (a list of strings)
- all_pos(ctx=None) → List[str]
Find all available parts of speech
- Returns
A list of parts of speech (a list of strings)
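The exact part-of-speech strings are long, so it can help to pick them out of all_pos() before filtering. The following is a minimal sketch, not part of jamdict: lookup_by_pos_keyword is a hypothetical helper, and it assumes that passing several strings in pos keeps words that match any of them.

```python
def lookup_by_pos_keyword(jam, query, keyword):
    """Look up `query`, keeping words whose part of speech mentions `keyword`.

    `jam` is expected to be a Jamdict instance (hypothetical helper,
    not part of the jamdict API).
    """
    # all_pos() returns the part-of-speech strings known to the dictionary.
    wanted = [p for p in jam.all_pos() if keyword in p]
    if not wanted:
        raise ValueError(f"no part of speech contains {keyword!r}")
    # pos takes a list of exact part-of-speech strings.
    return jam.lookup(query, pos=wanted)


# Intended usage, assuming jamdict and its database are installed:
#   from jamdict import Jamdict
#   result = lookup_by_pos_keyword(Jamdict(), "かえる", "noun (common)")
```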
- lookup(query, strict_lookup=False, lookup_chars=True, ctx=None, lookup_ne=True, pos=None, **kwargs) → jamdict.util.LookupResult
Search for words, kanji characters, and named entities.
Keyword arguments:
- Parameters
query – Text to query, may contain wildcard characters. Use ? to match exactly one character and % to match any number of characters.
strict_lookup (bool) – only look up the Kanji characters in query (i.e. discard characters from variants)
lookup_chars (bool) – set lookup_chars to False to disable character lookup
pos (list of strings) – Filter words by parts of speech
ctx – database access context, can be reused for better performance. Normally users do not have to touch this and database connections will be reused by default.
lookup_ne (bool) – set lookup_ne to False to disable named-entity lookup
- Returns
A LookupResult object.
- Return type
jamdict.util.LookupResult
>>> # match any word that starts with "食べ" and ends with "る" (anything in between is fine)
>>> jam = Jamdict()
>>> results = jam.lookup('食べ%る')
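To make the wildcard rules concrete, the sketch below translates a ?/% pattern into a regular expression and tests a few strings against it. It only illustrates the matching semantics described above and is not how jamdict evaluates queries. The helper names are hypothetical, and it assumes that % may also match zero characters, as in SQL LIKE.

```python
import re


def wildcard_to_regex(pattern):
    """Translate a ?/% wildcard pattern into a compiled regular expression."""
    parts = []
    for ch in pattern:
        if ch == "?":
            parts.append(".")    # exactly one character
        elif ch == "%":
            parts.append(".*")   # any number of characters (assumed: zero or more)
        else:
            parts.append(re.escape(ch))  # everything else matches literally
    return re.compile("".join(parts), re.DOTALL)


def wildcard_match(pattern, text):
    """Return True when the whole of `text` matches the wildcard pattern."""
    return wildcard_to_regex(pattern).fullmatch(text) is not None


print(wildcard_match("食べ%る", "食べられる"))
print(wildcard_match("食べ?", "食べる"))
print(wildcard_match("食べ?", "食べ"))
```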
- lookup_iter(query, strict_lookup=False, lookup_chars=True, lookup_ne=True, ctx=None, pos=None, **kwargs) → jamdict.util.LookupResult
Search for words, kanji characters, and named entities iteratively.
An IterLookupResult object will be returned instead of the normal LookupResult. res.entries, res.chars, and res.names are iterators instead of lists, and each of them can only be looped through once. Users have to store the results manually.
>>> res = jam.lookup_iter("花見")
>>> for word in res.entries:
...     print(word)  # do something with the word
>>> for c in res.chars:
...     print(c)
>>> for name in res.names:
...     print(name)
Keyword arguments:
- Parameters
query – Text to query, may contain wildcard characters. Use ? to match exactly one character and % to match any number of characters.
strict_lookup (bool) – only look up the Kanji characters in query (i.e. discard characters from variants)
lookup_chars (bool) – set lookup_chars to False to disable character lookup
pos (list of strings) – Filter words by parts of speech
ctx – database access context, can be reused for better performance. Normally users do not have to touch this and database connections will be reused by default.
lookup_ne (bool) – set lookup_ne to False to disable named-entity lookup
- Returns
Return an IterLookupResult object.
- Return type
jamdict.util.IterLookupResult
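Because each iterator can be consumed only once, results that are needed more than once have to be copied into lists. The following is a minimal sketch of one way to do that. StoredResult and store_iter_result are hypothetical names, not part of jamdict.

```python
from collections import namedtuple

# Plain container for results that have been copied out of the iterators.
StoredResult = namedtuple("StoredResult", ["entries", "chars", "names"])


def _to_list(iterable):
    """Consume an iterable into a list, treating None as empty."""
    return list(iterable) if iterable is not None else []


def store_iter_result(res):
    """Copy the one-shot iterators of an IterLookupResult-like object into lists."""
    return StoredResult(
        entries=_to_list(res.entries),
        chars=_to_list(res.chars),
        names=_to_list(res.names),
    )


# Intended usage, assuming jamdict and its database are installed:
#   stored = store_iter_result(jam.lookup_iter("花見"))
#   for word in stored.entries: ...   # can be looped over repeatedly
```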
- property krad
Break a kanji down into its writing components
>>> jam = Jamdict()
>>> print(jam.krad['雲'])
['一', '雨', '二', '厶']
- property memory_mode
If memory_mode is True, the Jamdict database is loaded into RAM before querying for better performance
- property radk
Find all kanji that contain a given writing component
>>> jam = Jamdict()
>>> print(jam.radk['鼎'])
{'鼏', '鼒', '鼐', '鼎', '鼑'}
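krad and radk can be read as two views of the same relation: one maps a kanji to its components, the other maps a component to the kanji that contain it. The sketch below shows that relationship on a toy dictionary. It is not jamdict's own implementation. invert_components is a hypothetical helper, and the 雪 entry is illustrative rather than taken from the dictionary.

```python
from collections import defaultdict


def invert_components(krad):
    """Build a component -> set-of-kanji index from a kanji -> components mapping."""
    radk = defaultdict(set)
    for kanji, components in krad.items():
        for component in components:
            radk[component].add(kanji)
    return dict(radk)


# Toy data: the 雲 entry is the krad example above, 雪 is illustrative only.
toy_krad = {
    "雲": ["一", "雨", "二", "厶"],
    "雪": ["雨", "ヨ"],
}
toy_radk = invert_components(toy_krad)
print(sorted(toy_radk["雨"]))
```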
- property ready: bool
Check whether the Jamdict database is available
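A script can check ready before running a query so that a missing database is handled explicitly. A minimal sketch follows. lookup_if_ready is a hypothetical helper, and returning None for a missing database is an assumption made for this example.

```python
def lookup_if_ready(jam, query):
    """Run a lookup only when the dictionary database is available.

    `jam` is expected to be a Jamdict instance. Returns None when
    jam.ready is False (hypothetical convention for this sketch).
    """
    if not jam.ready:
        return None
    return jam.lookup(query)


# Intended usage, assuming jamdict is installed:
#   from jamdict import Jamdict
#   result = lookup_if_ready(Jamdict(), "食べ%る")
#   if result is None:
#       print("No dictionary database found")
```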
- class jamdict.util.LookupResult(entries, chars, names=None)
Contains lookup results (words, Kanji characters, or named entities) from Jamdict.
A typical jamdict lookup is like this:
>>> jam = Jamdict()
>>> result = jam.lookup('食べ%る')
The command above returns a LookupResult object which contains found words (entries), kanji characters (chars), and named entities (names).
- text(compact=True, entry_sep='。', separator=' | ', no_id=False, with_chars=True) → str
Generate a text string that contains all found words, characters, and named entities.
- Parameters
compact – Make the output string more compact (less information, less whitespace, etc.)
no_id – Do not include jamdict’s internal object IDs (for direct query via API)
entry_sep – The text to separate entries
with_chars – Include characters information
- Returns
A formatted string ready for display
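The keyword arguments of text() can be combined to switch between a short and a detailed rendering. The following is a minimal sketch. render_result is a hypothetical wrapper, and it uses only the parameters listed above.

```python
def render_result(result, verbose=False):
    """Format a LookupResult-like object for display.

    verbose=True asks for the non-compact layout. Internal object IDs
    are always hidden (hypothetical wrapper, not part of jamdict).
    """
    return result.text(compact=not verbose, no_id=True, with_chars=True)


# Intended usage, assuming jamdict and its database are installed:
#   print(render_result(jam.lookup('食べ%る'), verbose=True))
```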
- property chars: Sequence[jamdict.kanjidic2.Character]
A list of found kanji characters
- property entries: Sequence[jamdict.jmdict.JMDEntry]
A list of found word entries
- class jamdict.util.IterLookupResult(entries, chars=None, names=None)
Contains lookup results (words, Kanji characters, or named entities) from Jamdict.
A typical jamdict lookup is like this:
>>> res = jam.lookup_iter("花見")
res is an IterLookupResult object which contains iterators to scan through found words (entries), kanji characters (chars), and named entities (names) one by one.
>>> for word in res.entries:
...     print(word)  # do something with the word
>>> for c in res.chars:
...     print(c)
>>> for name in res.names:
...     print(name)
- property chars
Iterator for looping one by one through all found kanji characters; can only be used once
- property entries
Iterator for looping one by one through all found entries; can only be used once
- property names
Iterator for looping one by one through all found named entities; can only be used once
- class jamdict.jmdict.JMDEntry(idseq='')
Represents a dictionary Word entry.
Entries consist of kanji elements, reading elements, general information and sense elements. Each entry must have at least one reading element and one sense element. Others are optional.
XML DTD <!ELEMENT entry (ent_seq, k_ele*, r_ele+, info?, sense+)>
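The content model above can be checked against a small XML fragment with the standard library. The sketch below is independent of jamdict. check_entry is a hypothetical helper that verifies only the required children (ent_seq, at least one r_ele, at least one sense), not element order, and the sample entry uses a placeholder sequence number.

```python
import xml.etree.ElementTree as ET

# Illustrative entry; the ent_seq value is a placeholder, not a real ID.
SAMPLE_ENTRY = """
<entry>
  <ent_seq>1000000</ent_seq>
  <k_ele><keb>食べる</keb></k_ele>
  <r_ele><reb>たべる</reb></r_ele>
  <sense><gloss>to eat</gloss></sense>
</entry>
"""


def check_entry(xml_text):
    """Return a list of problems found against the entry content model."""
    entry = ET.fromstring(xml_text)
    problems = []
    if entry.find("ent_seq") is None:
        problems.append("missing ent_seq")
    if not entry.findall("r_ele"):
        problems.append("needs at least one r_ele")
    if not entry.findall("sense"):
        problems.append("needs at least one sense")
    return problems


print(check_entry(SAMPLE_ENTRY))
print(check_entry("<entry><ent_seq>1</ent_seq></entry>"))
```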
- class jamdict.kanjidic2.Character
Represents a kanji character.
<!ELEMENT character (literal,codepoint, radical, misc, dic_number?, query_code?, reading_meaning?)*>
- property components
Kanji writing components that compose this character
jamdict.krad is a module for retrieving kanji components (i.e. radicals)