Major artificial intelligence (AI) news focusing on natural language processing (NLP) for the first half of 2022

2024-07-04

Here is a summary of major AI news from the first half of 2022, with a focus on natural language processing, the field most closely related to LETR. We have also included references for readers who want to learn more.

January

Meta develops data2vec, a self-supervised learning algorithm that works for speech, images, and text

Image: Meta AI

Meta AI developed data2vec, a single self-supervised learning algorithm that works for speech, text, and images. This departs from earlier research, in which self-supervised algorithms were designed differently for each of these modalities. Meta researchers expect it to serve as a cornerstone for developing general model architectures.

References

https://ai.facebook.com/blog/the-first-high-performance-self-supervised-algorithm-that-works-for-speech-vision-and-text/

https://github.com/facebookresearch/fairseq/tree/main/examples/data2vec

https://arxiv.org/abs/2202.03555

https://www.technologyreview.com/2022/01/20/1043885/meta-ai-facebook-learning-algorithm-nlp-vision-speech-agi/

https://byline.network/2022/01/21-168/

http://www.aitimes.com/news/articleView.html?idxno=142722

February

DeepMind develops AlphaCode, an AI for coding

Image: DeepMind

DeepMind has developed AlphaCode, an AI system that writes code. AlphaCode is reported to rank within the top 54% of participants in programming competitions, roughly the level of an average human competitor. This goes beyond the limitations of earlier large-scale language models, which could do little more than translate simple instructions into code.

References

https://www.deepmind.com/blog/competitive-programming-with-alphacode

https://alphacode.deepmind.com/

https://arxiv.org/abs/2203.07814

http://www.aitimes.com/news/articleView.html?idxno=142892

https://byline.network/2022/02/3-108/

OpenAI develops InstructGPT, a new version of GPT-3 that addresses its shortcomings

Image: OpenAI

References

https://openai.com/blog/instruction-following/#moon

https://github.com/openai/following-instructions-human-feedback

https://arxiv.org/abs/2203.02155

https://www.technologyreview.kr/new-gpt3-openai-chatbot-language-model-ai-toxic-misinformation/

https://littlefoxdiary.tistory.com/101

March

Stanford Institute for Human-Centered Artificial Intelligence releases its 2022 annual report

Image: Stanford University HAI

The Stanford Institute for Human-Centered Artificial Intelligence (HAI) released the “AI Index 2022” report. The theme of this year's report was “Industrialization of AI and Mounting Ethical Concerns.” Its key takeaways included:

▷ Private investment in AI surged while investment became more concentrated
▷ The US and China led cross-country collaboration on AI
▷ Language models are more capable than ever, but also more biased
▷ The rise of AI ethics everywhere
▷ AI is becoming cheaper and higher performing
▷ Data, data, data
▷ More global legislation on AI than ever before
▷ Robotic arms are getting cheaper

References

https://aiindex.stanford.edu/report/

https://hai.stanford.edu/news/state-ai-9-charts

https://hai.stanford.edu/news/2022-ai-index-ais-ethical-growing-pains

https://hai.stanford.edu/news/2022-ai-index-industrialization-ai-and-mounting-ethical-concerns

April

Google AI releases CVSS, a large-scale multilingual speech translation corpus covering 21 languages, as open source

Image: Google AI

Google AI has released CVSS, a large-scale multilingual speech-to-speech translation corpus, as open source. The release is intended to promote a new generation of speech-to-speech translation (S2ST) research and the development of AI speech translation applications. CVSS includes two S2ST datasets (1,872 hours and 1,937 hours of speech, respectively, including the source speech). In addition to the translated speech, it provides translated text in which numbers, currencies, and other such words are normalized to match the pronunciation of the translated speech.

References

https://ai.googleblog.com/2022/04/introducing-cvss-massively-multilingual.html

https://arxiv.org/abs/2201.03713

https://github.com/google-research-datasets/cvss

https://research.google/tools/datasets/speech-to-speech-translation-corpus/

https://www.marktechpost.com/2022/04/07/google-ai-introduces-a-common-voice-based-speech-to-speech-translation-corpus-cvss-that-can-be-directly-used-for-training-direct-s2st-models-without-any-extra-processing/

http://www.aitimes.kr/news/articleView.html?idxno=24706

Google unveils ultra-large language model PaLM

Image: Google AI

Google has unveiled a new language model, PaLM (Pathways Language Model). It is an ultra-large language model with 540 billion parameters, about three times as many as OpenAI's GPT-3. It is a single, powerful AI model that learns how to solve problems and can handle a variety of tasks, from natural language understanding and generation to arithmetic.

References

https://ai.googleblog.com/2022/04/pathways-language-model-palm-scaling-to.html

https://arxiv.org/abs/2204.02311

https://www.infoq.com/news/2022/04/google-palm-ai/

https://byline.network/2022/04/7-138/

http://www.aitimes.com/news/articleView.html?idxno=143840

OpenAI announces DALL·E 2, a new version of its image-generation AI

Image: OpenAI

OpenAI has released a new version of DALL·E, an AI that generates images from text. The new version, DALL·E 2, can create new high-resolution images and can also edit existing ones. It is expected to be used in various ways, such as giving designers and artists new ideas.

References

https://openai.com/dall-e-2/

https://arxiv.org/abs/2204.06125

https://towardsdatascience.com/dall-e-2-explained-the-promise-and-limitations-of-a-revolutionary-ai-3faf691be220

http://www.aitimes.com/news/articleView.html?idxno=143854&page=4&total=638

https://byline.network/2022/04/8-127/

May

Google unveils AI Test Kitchen to test AI language model LaMDA 2

Image: Google

At Google I/O 2022, Google unveiled AI Test Kitchen, an app for beta testing its AI language model LaMDA 2, for example to find errors. Three features were introduced: imagining ideas with AI, talking about a specific topic, and organizing to-do lists. It is a kind of crowdsourced test that is expected to help address issues with AI language models.

References

https://io.google/2022/intl/ko/

https://aitestkitchen.withgoogle.com

https://www.xda-developers.com/google-new-ai-test-kitchen-test-conversational-ai/

https://www.theverge.com/2022/5/11/23065072/google-ai-app-test-kitchen-future-io-2022

https://www.wired.kr/news/articleView.html?idxno=3929

http://www.aitimes.com/news/articleView.html?idxno=144546

Meta releases its own ultra-large AI language model as open source

Image: Meta AI

Meta AI, Meta's artificial intelligence research lab, has released Open Pre-trained Transformer (OPT-175B), an ultra-large AI language model with 175 billion parameters, as open source. The release reportedly includes both the pre-trained models and the code. This is a bold and welcome move that is expected to be particularly helpful in tackling problems such as AI bias.

References

https://ai.facebook.com/blog/democratizing-access-to-large-scale-language-models-with-opt-175b/

https://github.com/facebookresearch/metaseq

https://arxiv.org/abs/2205.01068

https://www.technologyreview.kr/메타-자체-개발한-대형언어모델-무료-공개/

http://www.aitimes.kr/news/articleView.html?idxno=25025

DeepMind unveils Gato, a new AI system that performs multiple tasks

Image: DeepMind

DeepMind has unveiled Gato, a general-purpose agent that performs multiple tasks by processing various forms of data with a single neural network model. DeepMind reported that Gato can perform 604 tasks and that on more than 450 of them it reaches over 50% of an expert's score. However, the claim that this is a step toward artificial general intelligence has drawn considerable criticism and controversy.

References

https://www.deepmind.com/publications/a-generalist-agent

https://arxiv.org/abs/2205.06175

https://www.independent.co.uk/tech/ai-deepmind-artificial-general-intelligence-b2080740.html

https://www.technologyreview.kr/deepmind-gato-ai-model-hype/

https://towardsdatascience.com/gato-the-latest-from-deepmind-towards-true-ai-1ac06e1d18cd

http://scimonitors.com/딥마인드-새로운-ai-gato는-agi인가/

Google unveils Imagen, an artificial intelligence for image creation

Image: Google

Google introduced Imagen, an AI system that generates images from text. Google reported that in benchmark evaluations it was preferred over competing models such as OpenAI's DALL·E 2. However, as with other such models, it was not released to the public because of concerns about misuse, bias, and the reproduction of discriminatory attitudes.

References

https://imagen.research.google

https://arxiv.org/abs/2205.11487

https://www.assemblyai.com/blog/how-imagen-actually-works/

https://www.technologyreview.kr/dark-secret-cute-ai-animal-images-dalle-openai-imagen-google/

http://www.aitimes.com/news/articleView.html?idxno=144897

June

GitHub officially launches its AI coding tool Copilot

Image: GitHub

GitHub has officially launched its AI coding tool Copilot. Copilot is built on OpenAI's Codex and GitHub's code database, and was first released in preview about a year earlier. Since then, Amazon Web Services and Google DeepMind have also released AI tools for coding, but so far all of them serve only as coding aids and reportedly do not write perfect code.