The Israel Innovation Authority and the National Digital Ministry of Israel will fund the establishment of the Association of Natural Language Processing (NLP) Technology Companies in Hebrew and Arabic
Program budget for the first three years: approximately NIS 7.5 million.
The Israel Innovation Authority, in collaboration with the National Digital Ministry of Israel, has approved the establishment of the Association of Natural Language Processing (NLP) Technology Companies, which will promote the comprehension of the Hebrew and Arabic languages in computerized systems. The companies in the Association include Rafael, Ginger Software, Melingo, AudioCodes, and others.
Asher Bitton, the Director General of the National Digital Ministry: “The public sector deals with information in Hebrew and Arabic on a daily basis, most of which is not structured. One of the major challenges in the digitization of public services is to enable operational efficiency, available to the public at no cost, along with high productivity.”
Dror Bin, CEO, Israel Innovation Authority: “The Association that we established this week should allow the industry to lead the definition of needs and help close technological gaps that will make it possible to use unstructured databases in Hebrew and provide insights based upon such, that will serve as leverage for products and services provided by Israeli companies.”
The Association of Natural Language Processing (NLP) Technology Companies, in which the Innovation Authority will invest approximately NIS 7.5 million over the next three years, has been established due to the poor and insufficient quality of Hebrew and Arabic speech recognition in various types of computerized systems, as compared to the recognition of speech in other languages.
The poor quality and the difficulty in development stem from the fact that Hebrew and Arabic are Semitic languages, which are more difficult to analyze. As a result, the quality of understanding and recognizing human speech in Hebrew and Arabic is lower, which constitutes a barrier to delivering advanced, high-quality services.
The objective of the Association is to produce an R&D infrastructure that provides an empirical basis not only for identifying the structural elements and models that make up the linguistic system, but also for mapping how these systems are used. This covers syntactic, semantic and morphological characteristics for R&D purposes in the field of natural language processing. To allow for improvements that are as diverse and broad as possible, the tagged Hebrew and Arabic corpora will be drawn from diverse fields and from various sectors of industry, including news, archives, movies, books, articles, customer service, and transcribed radio and television broadcasts.
In addition, the Association will examine the possibility of modifying third-party open-source tools and/or developing open-source tools for testing and improving how well various computerized systems understand Hebrew and Arabic speech. Through this infrastructure, it will be possible to improve the quality of the various solutions for recognizing human speech in Hebrew and Arabic. The infrastructure established by the Association will be set up in the cloud and will enable secure sharing of the corpora and the running of a management system and algorithms for all the partners in the Association.
The users of the Association’s products will be its members, who come from various fields in Israeli industry and who will use the infrastructure to develop services, applications and software that improve customer service, management, knowledge, decision-making and the implementation of advanced applications requiring natural language understanding in Hebrew and Arabic.
The participants in the Association include companies that develop infrastructure solutions (research and development in the field of speech comprehension), companies that develop algorithms used as building blocks for various applications in the field, and companies engaged in the development of services and products in the area of speech recognition. Potential consumers of products and services based on natural speech recognition technologies come from a wide range of sectors and services: hi-tech, banking, insurance, communications, health, education, tourism, job placement, Government Ministries, security and intelligence systems, etc.
The List of Companies and Organizations who are Members of the Association:
No. | Company Name | Description of the Activity |
1 | AudioCodes | AudioCodes Ltd. is a vendor of advanced voice networking and media processing solutions for the digital workplace. |
2 | Rafael | Rafael develops various weapons systems. The company also engages in R&D related to natural language processing and speech recognition. |
3 | Bank Hapoalim | The Bank Group operates in Israel in all domains of banking and in parallel activities in the capital market, through three main divisions: the Business Division, the Retail Division and the Financial Markets and International Banking Division. |
4 | Basis | Basis Technology Israel (BTI) was established in 2014 as a subsidiary of Basis Technology, which was established 22 years ago in the United States. BTI serves as the research and development organization for all Basis Technology products, and maintains a sales and support arm in Israel. Basis develops natural language analysis tools for a wide range of languages and for different levels of analysis, such as morphological analysis and entity extraction. BTI has extensive experience in developing language analysis tools using machine learning algorithms trained on tagged data. For example, as of this writing, BTI is successfully completing an extended project to tag 300 economic articles in Hebrew, for the purpose of extracting entities and training statistical models for the extraction task. The intellectual property generated as part of the Israeli company’s activities belongs to the global company, but approvals have been received to support the Authority’s project within the framework of KORIL, and the intellectual property of this project belongs to the Israeli company. |
5 | K-Dictionary | K-Dictionary specializes in developing linguistic content for a variety of languages, originally academic and multilingual dictionaries. It collaborates with industry and academia around the world, and participates in EU projects (the H2020 consortium and research networks as part of COST). Over the past decade, the company has expanded its activities by integrating natural language processing, and has shifted its focus to interoperability between linguistic databases and a variety of language technology and machine learning systems, for example by implementing Linked Data methodologies and Semantic Web technologies, with emphasis on empowering automated processes such as creating corpora and developing tools for analyzing them. |
6 | Infoneto | The company has been active since 2003 in the field of natural language processing in Hebrew. It analyzes the resumes of job applicants and provides a system for managing a database of job applicants with an accurate semantic search for candidates. The company’s customers include Israel Chemicals, Delek, Migdal Capital Markets, Renoir, Aman, Taldor and others. |
7 | TSG | TSG IT Advanced Systems Ltd. (TSG) is a global provider of C4ISTAR, Intelligence, HLS and Cyber Security solutions with a track record of over 50 years in the successful development, integration and delivery of mission-critical, turnkey solutions to various military forces, Government agencies and corporations worldwide. |
8 | Intel | The company deals, inter alia, with analyzing machine-based systems and speech recognition in Israel. |
9 | Walla | Walla is an Israeli news site operated by the “Walla! Communications” company, a part of the Bezeq Group. It is one of the most viewed websites in Israel. |
10 | Ynet | An Israeli news site and content portal, which is part of the Yedioth Ahronoth Group. As of July 2020, Ynet is the most viewed news site in Israel. |
11 | Ginger Software | Ginger Software is an Israeli start-up company that has developed language enhancement technology that uses statistical algorithms in conjunction with natural language processing, aiming to improve written communications. Ginger has over 10 years of experience in AI-powered grammar and spell-checking tools. It has gathered feedback that allows machine learning to fine-tune the results and to understand what people really want to write even in complex cases, such as writers with dyslexia or with very poor English. |
12 | Melingo | Melingo Ltd., a subsidiary of the Encyclopedia Britannica, is the leading company in Israel in online dictionaries and smart search products, and the world leader in computational linguistics in Semitic languages. Melingo specializes in the development and marketing of products and services based on natural language applications in the most complex languages in terms of computing – Hebrew and Arabic. The company also offers unique solutions in various languages, in the private and institutional markets. The company’s products include smart search engines in Hebrew, Arabic and Persian, unique solutions for retrieving information in an organizational environment, online dictionaries and systems for automatic reading of text in Hebrew. The products are designed for the Internet, for internal intranets in organizations, for telephony networks and for cellular communications. Melingo is the owner of the following websites: ‘Morfix’ – a free Hebrew-English-Hebrew dictionary, ‘Rav-Milim’ – the most comprehensive Hebrew dictionary on the web, ‘Nakdan Morfix’ – a website for the automatic addition of Hebrew vowels to text, ‘Colfix’ – text reading software, and bilingual dictionaries in other languages. |
Unstructured Content Providers:
- Walla
- Ynet
- The Kan Corporation – Television
- Galei Zahal – Radio
- The Knesset Archives
- Dikta
- The National Institute for Testing and Evaluation
- The Ministry of Health – the Timna Project
- The Maccabi Health Care Fund Research Center
- Ha’aretz Newspaper
- Bank Hapoalim
Academic Researchers:
- Prof. Reut Tzarfati, Bar Ilan University
- Prof. Alon Itai, the Technion – The Israel Institute of Technology
- Prof. Shuli Wintner, Haifa University
- Effi Levy, The Hebrew University of Jerusalem
- Shai Fein, IDC Herzliya