Principal Arabic Lexical Resources
كُلُّ عَام وَأَنْتُم بخَيْر (May you be well every year)
Assalaamu `Alaykum -- welcome to our page for Arabic lexical resources.
Arabic, one of the six official languages of the United Nations, is spoken by 246 million people worldwide -- not only in North Africa and the Middle East, but also in many other countries, since it is the language of the Koran.
Though Arabic is playing an increasingly important role in the world today, few lexical and linguistic resources are available for it. The CJK Dictionary Institute has been engaged in the development of comprehensive Arabic lexical databases, with a special focus on proper nouns. These resources, described below, are designed for machine translation (MT) and various natural language processing (NLP) applications such as named entity recognition (NER) and anti-money laundering (AML) programs.
- Database of Arab Names (DAN). A comprehensive database covering approximately 2.4 million Arab names and variants, including OFAC names, based on authoritative resources and extensively proofread by a team of native Arabic-speaking editors.
- NEW The CJKI Arabic Learner’s Dictionary (CALD). A new-concept dictionary that enables learners to gain a full understanding of core MSA vocabulary. An Arabic summary is available at القاموس العربي الإنجليزي للمتعلمين.
- UPDATED Database of Arab Names in Arabic (DANA). A one-of-a-kind resource of Arab personal names and variants, in the original Arabic script. This database covers several hundred thousand Arabic script variants, along with common spelling mistakes.
- NEW Database of Arabic Business Names (DABNA). A database of Arabic company and organization names, now under development.
- EXPANDED Expanded OFAC (XOFAC). To address the shortcomings of OFAC's SDN List, CJKI has developed a comprehensive "Expanded OFAC" database of OFAC full name variants, the vast majority of which are not listed in OFAC.
- NEW Database of Foreign Names in Arabic (DAFNA). A database of non-Arab names transcribed into Arabic, including Arabic orthographic variants and common orthographic errors.
- Dictionary of Arabic Place Name Variants (DAPNA). A database of Arabic-English place names, with systematic coverage of orthographic variants and common orthographic errors.
- The ARAN and NANA systems automatically transcribe CJK and Latin names to/from Arabic.
- Dictionary of Arabic Proper Nouns. A database of Arabic-English proper nouns covering surnames, given names, and place names in both vocalized and unvocalized Arabic with romanized transcriptions.
- Arabic Broken Plurals (.doc file, 95K). A comprehensive database of broken (unpredictable) plurals in Arabic, given in three versions -- voweled, unvoweled, and transcription -- with cross-references from plural to singular. Essential for morphological analysis and NLP applications.
- Arabic Transcription and Transliteration. An overview of some linguistic issues related to transliteration and transcription, with special focus on our Arabic transcription technology.
- Arabic Lexical Database (ALD). We are now developing a comprehensive Arabic monolingual lexical database containing detailed grammatical and phonological attributes such as POS codes, conjugation patterns, and verb transitivity, suitable for applications such as NLP, MT systems, and morphological analysis.