Here is a summary of major AI news from the first half of the past year. We focused on news from the field of natural language processing, which is closely related to LETR, and included references for those who want to learn more.
January
Meta develops data2vec, a self-supervised learning algorithm that handles speech, images, and text
Meta AI has developed data2vec, a self-supervised learning algorithm that works for speech, text, and images. This shifts the paradigm of earlier research, in which learning algorithms were designed differently for each of the speech, text, and image fields. Meta researchers expressed confidence that it will be a cornerstone for developing general model architectures.
reference
https://ai.facebook.com/blog/the-first-high-performance-self-supervised-algorithm-that-works-for-speech-vision-and-text/
https://github.com/facebookresearch/fairseq/tree/main/examples/data2vec
https://arxiv.org/abs/2202.03555
https://www.technologyreview.com/2022/01/20/1043885/meta-ai-facebook-learning-algorithm-nlp-vision-speech-agi/
https://byline.network/2022/01/21-168/
http://www.aitimes.com/news/articleView.html?idxno=142722
February
DeepMind develops AlphaCode, an AI for coding
DeepMind has developed AlphaCode, an AI that writes code. AlphaCode is reported to perform at roughly an average level, ranking within the top 54% of human developers who took part in programming competitions. This goes beyond the limitations of traditional large-scale language models, which can do little more than translate instructions into code.
reference
https://www.deepmind.com/blog/competitive-programming-with-alphacode
https://alphacode.deepmind.com/
https://arxiv.org/abs/2203.07814
http://www.aitimes.com/news/articleView.html?idxno=142892
https://byline.network/2022/02/3-108/
OpenAI develops InstructGPT, a new version of GPT-3 that addresses its issues
OpenAI has developed InstructGPT, a new version of GPT-3 that is fine-tuned with human feedback to follow user instructions more closely. According to OpenAI, InstructGPT is better at following instructions than GPT-3 and produces fewer untruthful and toxic outputs.
reference
https://openai.com/blog/instruction-following/#moon
https://github.com/openai/following-instructions-human-feedback
https://arxiv.org/abs/2203.02155
https://www.technologyreview.kr/new-gpt3-openai-chatbot-language-model-ai-toxic-misinformation/
https://littlefoxdiary.tistory.com/101
March
Stanford Institute for Human-Centered Artificial Intelligence releases its 2022 annual report
The Stanford Institute for Human-Centered Artificial Intelligence (HAI) announced the "AI Index 2022." The theme of this report was "Industrialization of AI and Mounting Ethical Concerns." In particular, it included the following eight key points:
▷ Private investment in AI is surging and becoming more concentrated
▷ The US and China lead cross-country collaboration on AI
▷ Language models are more capable than ever before, but also more biased
▷ The rise of AI ethics everywhere
▷ AI is getting cheaper and performing better
▷ Data, data, data
▷ More global legislation on AI than ever before
▷ Robotic arms are getting cheaper
reference
https://aiindex.stanford.edu/report/
https://hai.stanford.edu/news/state-ai-9-charts
https://hai.stanford.edu/news/2022-ai-index-ais-ethical-growing-pains
https://hai.stanford.edu/news/2022-ai-index-industrialization-ai-and-mounting-ethical-concerns
April
Google AI open-sources CVSS, a massively multilingual speech-to-speech translation corpus covering 21 languages
Google AI has released CVSS, a large-scale multilingual speech-to-speech translation corpus, as open source. The stated aim is to promote a new generation of S2ST (Speech-To-Speech Translation) research and the development of AI speech translation applications. CVSS includes two S2ST datasets (1,872 hours and 1,937 hours of speech, respectively) together with the source speech. In addition to the translated speech, it also provides normalized translation text that matches the pronunciation of the translated speech, for example for numbers, currencies, and acronyms.
reference
https://ai.googleblog.com/2022/04/introducing-cvss-massively-multilingual.html
https://arxiv.org/abs/2201.03713
https://github.com/google-research-datasets/cvss
https://research.google/tools/datasets/speech-to-speech-translation-corpus/
https://www.marktechpost.com/2022/04/07/google-ai-introduces-a-common-voice-based-speech-to-speech-translation-corpus-cvss-that-can-be-directly-used-for-training-direct-s2st-models-without-any-extra-processing/
http://www.aitimes.kr/news/articleView.html?idxno=24706
Google unveils ultra-large language model PaLM
Google has unveiled a new language model, PaLM (Pathways Language Model). It is an ultra-large language model with 540 billion parameters, about three times as many as OpenAI's GPT-3, which has 175 billion. It is a single high-performing AI model that can handle a variety of problems, such as natural language understanding and generation as well as arithmetic, by learning how to solve problems.
reference
https://ai.googleblog.com/2022/04/pathways-language-model-palm-scaling-to.html
https://arxiv.org/abs/2204.02311
https://www.infoq.com/news/2022/04/google-palm-ai/
https://byline.network/2022/04/7-138/
http://www.aitimes.com/news/articleView.html?idxno=143840
OpenAI announces a new version of DALL·E, an artificial intelligence for image generation
OpenAI has released DALL·E 2, a new version of DALL·E, its AI that generates images from text. DALL·E 2 can not only create new high-resolution images but also edit existing ones. It is expected to be used in various ways in the future, such as giving designers and artists new ideas.
reference
https://openai.com/dall-e-2/
https://arxiv.org/abs/2204.06125
https://towardsdatascience.com/dall-e-2-explained-the-promise-and-limitations-of-a-revolutionary-ai-3faf691be220
http://www.aitimes.com/news/articleView.html?idxno=143854&page=4&total=638
https://byline.network/2022/04/8-127/
May
Google unveils AI Test Kitchen to test AI language model LaMDA 2
At Google I/O 2022, Google unveiled AI Test Kitchen, an app for beta testing its AI language model LaMDA 2, for example to find errors. Three features were introduced: imagining ideas with the AI, talking about a specific topic, and organizing to-do lists. It is a kind of crowdsourced test, which is expected to help resolve issues with AI language models later on.
reference
https://io.google/2022/intl/ko/
https://aitestkitchen.withgoogle.com
https://www.xda-developers.com/google-new-ai-test-kitchen-test-conversational-ai/
https://www.theverge.com/2022/5/11/23065072/google-ai-app-test-kitchen-future-io-2022
https://www.wired.kr/news/articleView.html?idxno=3929
http://www.aitimes.com/news/articleView.html?idxno=144546
Meta unveils self-developed super-giant AI language model as open source
Meta's AI research lab (Meta AI) has released 'Open Pretrained Transformer' (hereafter, OPT-175B), a super-large AI language model with 175 billion parameters, as open source. The release reportedly includes both the pre-trained models and the code. This is a bold and welcome move, and it is expected to be particularly helpful in addressing problems such as AI bias.
reference
https://ai.facebook.com/blog/democratizing-access-to-large-scale-language-models-with-opt-175b/
https://github.com/facebookresearch/metaseq
https://arxiv.org/abs/2205.01068
https://www.technologyreview.kr/메타-자체-개발한-대형언어모델-무료-공개/
http://www.aitimes.kr/news/articleView.html?idxno=25025
DeepMind unveils Gato, a new AI system that performs multiple tasks
DeepMind has unveiled Gato, a general-purpose agent that can perform multiple tasks by processing various forms of data with a single neural network model. DeepMind reported that Gato can perform 604 tasks and that it scores above half of expert-level performance on more than 450 of them. However, the claim that this is a step towards artificial general intelligence has drawn considerable criticism and controversy.
reference
https://www.deepmind.com/publications/a-generalist-agent
https://arxiv.org/abs/2205.06175
https://www.independent.co.uk/tech/ai-deepmind-artificial-general-intelligence-b2080740.html
https://www.technologyreview.kr/deepmind-gato-ai-model-hype/
https://towardsdatascience.com/gato-the-latest-from-deepmind-towards-true-ai-1ac06e1d18cd
http://scimonitors.com/딥마인드-새로운-ai-gato는-agi인가/
Google unveils Imagen, an artificial intelligence for image creation
Google introduced Imagen, an AI system that generates images from text. Google announced that in benchmark evaluations it was preferred over competing models such as OpenAI's DALL·E. However, like other such models, it was not released to the public because of concerns about side effects such as abuse, bias, and the reflection of discriminatory attitudes.
reference
https://imagen.research.google
https://arxiv.org/abs/2205.11487
https://www.assemblyai.com/blog/how-imagen-actually-works/
https://www.technologyreview.kr/dark-secret-cute-ai-animal-images-dalle-openai-imagen-google/
http://www.aitimes.com/news/articleView.html?idxno=144897
June
GitHub officially launches its AI coding tool Copilot
GitHub has officially launched Copilot, its AI coding tool. Copilot is built on OpenAI's Codex and GitHub's code database, and was first released as a preview about a year earlier. Since then, Amazon Web Services and DeepMind have also released AI for coding, but so far all of these tools only act as coding aids and reportedly do not write perfect code.
reference
https://github.com/features/copilot
https://github.blog/2022-06-21-github-copilot-is-generally-available-to-all-developers/
https://www.techtarget.com/searchsoftwarequality/news/252521966/Code-completion-AI-bot-trend-continues-with-GitHub-Copilot
http://www.aitimes.com/news/articleView.html?idxno=145330
https://www.hani.co.kr/arti/economy/it/1049992.html