We have compiled the major artificial intelligence news from the first quarter. Generative AI, including ChatGPT, released late last year, is still all the rage. Alongside it, we will share other noteworthy AI technologies and industry trends.
January
Google Expressed Its Official Position on ChatGPT
Google has published an article titled 'Why we focus on AI (and to what end)' expressing its view on AI. In short, it argues that AI development requires prudence and responsibility. Given the timing of the announcement, however, it is widely believed that Google felt threatened by the newly released ChatGPT and was responding to it.
References
https://ai.google/our-focus/
https://korea.googleblog.com/2023/01/ai-our-perspective-focus-principle.html
DeepMind Has Announced an Adaptive Artificial Intelligence, AdA
DeepMind unveiled AdA, an adaptive artificial intelligence that solves problems as quickly and accurately as humans. Unlike previous reinforcement learning AIs, it is said to have learned how to learn through experimentation. Like humans and animals, it can improve its ability to perform tasks through play and quickly adapt to new tasks.
References
https://sites.google.com/view/adaptive-agent/?pli=1
https://arxiv.org/abs/2301.07608
https://www.techtimes.com/articles/287019/20230131/deepminds-ada-ai-system-solves-new-tasks-quickly-accurately-humans.htm
Google Has Revealed a New Image Creation AI, Muse
Muse creates high-quality images much faster than existing models such as DALL-E or Imagen, and the quality and accuracy of its images are superior to other models. Google explained that detailed language understanding allows Muse to grasp visual concepts such as objects, spatial relationships, and poses, and to apply masking using only text.
References
https://arxiv.org/abs/2301.00704
https://muse-model.github.io
http://www.newstheai.com/news/articleView.html?idxno=3696
Microsoft Has Announced a Voice Synthesis Artificial Intelligence, VALL-E
From just three seconds of voice samples, VALL-E imitates a human voice, including its emotional tone and recording environment. This means that if a phone-call voice sample is used, the synthesized voice will sound like a phone call. Microsoft calls the approach a neural codec language model: instead of the existing method of synthesizing speech by manipulating waveforms, it generates discrete audio codec codes from text and acoustic prompts.
References
https://arxiv.org/abs/2301.02111
https://valle-demo.github.io
https://www.thedailypost.kr/news/articleView.html?idxno=91008
February
Google Has Unveiled Bard
Google has unveiled Bard, an experimental conversational artificial intelligence based on LaMDA. Built on a lightweight version of LaMDA, Bard was seen as Google's answer to ChatGPT. However, an error in its demo caused a sharp drop in Google's stock price, and Bard was recently launched first in the U.S. and the U.K. as a standalone chat service rather than as part of search.
References
https://bard.google.com
https://blog.google/technology/ai/try-bard/
https://blog.google/technology/ai/bard-google-ai-search-updates/
https://www.technologyreview.com/2023/03/21/1070111/google-bard-chatgpt-openai-microsoft-bing-search/
Microsoft Has Revealed an Upgraded Bing with ChatGPT
MS has unveiled an upgraded Bing with a ChatGPT feature, introducing it as "reinventing search, your copilot for the web." In addition, Microsoft is known to be investing an additional $10 billion in OpenAI and plans to apply GPT technology across its product lines, including Office.
References
https://blogs.microsoft.com/blog/2023/02/07/reinventing-search-with-a-new-ai-powered-microsoft-bing-and-edge-your-copilot-for-the-web/
https://www.itworld.co.kr/news/276655#csidxa068bf634d0830c9b213c3120a547d9
Meta Has Opened an AI Language Model, Toolformer
Meta has released Toolformer, an AI language model that teaches itself how to use tools. Through API calls, it can use external software tools such as a search engine, calculator, calendar, and translator. This is an attempt to overcome a limitation of previous language models, which excel at natural language processing but struggle with basic tasks like arithmetic and fact-checking.
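To make the idea concrete: in the Toolformer paper, the model learns to emit inline API calls such as `[Calculator(400 / 1400)]` inside its generated text, and at inference time each call is executed and its result spliced back in before generation continues. The toy post-processor below illustrates only that splice-in step; the parser and the tiny tool registry are an illustrative sketch, not Meta's code.

```python
import re

def calculator(expr: str) -> str:
    # Evaluate a simple arithmetic expression (digits and + - * / ( ) . only),
    # rejecting anything else so eval() stays safe for this toy example.
    if not re.fullmatch(r"[\d\s+\-*/().]+", expr):
        raise ValueError(f"unsupported expression: {expr!r}")
    return f"{eval(expr):.2f}"

# Registry of available tools; the paper also uses search, calendar, etc.
TOOLS = {"Calculator": calculator}

# Matches inline calls of the form "[ToolName(arguments)]".
CALL = re.compile(r"\[(\w+)\((.*?)\)\]")

def execute_calls(text: str) -> str:
    # Replace every "[Tool(args)]" with "[Tool(args) -> result]",
    # mirroring how a tool result is inserted back into the text.
    def run(match: re.Match) -> str:
        tool, args = match.group(1), match.group(2)
        return f"[{tool}({args}) -> {TOOLS[tool](args)}]"
    return CALL.sub(run, text)

print(execute_calls(
    "Out of 1400 participants, 400 [Calculator(400 / 1400)] passed the test."
))
# -> Out of 1400 participants, 400 [Calculator(400 / 1400) -> 0.29] passed the test.
```

In the actual system, the harder part is training: the model labels its own training data with candidate API calls and keeps only those that reduce the loss on the following tokens.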
References
https://arxiv.org/abs/2302.04761
https://arstechnica.com/information-technology/2023/02/meta-develops-an-ai-language-bot-that-can-use-external-software-tools/
https://www.aitimes.com/news/articleView.html?idxno=149518
March
OpenAI Has Released a Next-Generation Large Language Model, GPT-4
GPT-4 is a large language model with multimodal features. It responds to both text and image inputs and is larger and more capable than ChatGPT. While its launch drew a rush of adoption and explosive interest, OpenAI, unlike before, withheld most of the technical details and received a lot of criticism.
References
https://openai.com/product/gpt-4
https://openai.com/research/gpt-4
https://arxiv.org/abs/2303.08774
https://www.technologyreview.com/2023/03/14/1069823/gpt-4-is-bigger-and-better-chatgpt-openai/
OpenAI Has Made ChatGPT Plugins Public
As is already known, ChatGPT was trained on data only up to 2021 and could not answer questions about anything more recent. The newly released ChatGPT plugins address this by letting the model call external APIs to add various capabilities. OpenAI says this both extends what ChatGPT can do and improves its reliability and accuracy.
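Concretely, per OpenAI's developer documentation, a plugin is described by a small `ai-plugin.json` manifest that points ChatGPT at the service's OpenAPI specification; the model reads the descriptions to decide when to call the API. The sketch below follows that documented manifest shape, but the "Todo" service and all of its URLs are hypothetical.

```json
{
  "schema_version": "v1",
  "name_for_human": "Todo Plugin",
  "name_for_model": "todo",
  "description_for_human": "Manage your to-do list.",
  "description_for_model": "Plugin for managing a user's to-do list: add, list, and delete items.",
  "auth": { "type": "none" },
  "api": {
    "type": "openapi",
    "url": "https://example.com/openapi.yaml"
  },
  "logo_url": "https://example.com/logo.png",
  "contact_email": "support@example.com",
  "legal_info_url": "https://example.com/legal"
}
```

Note that `description_for_model` is effectively a prompt: it is what ChatGPT uses to decide when and how to invoke the plugin.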
References
https://openai.com/blog/chatgpt-plugins
https://platform.openai.com/docs/plugins/introduction
https://github.com/openai/chatgpt-retrieval-plugin
https://www.zdnet.com/article/chatgpt-is-getting-access-to-the-internet-heres-what-that-means-for-you/
Google Has Opened a Multimodal Language Model, PaLM-E
Google has released PaLM-E, a multimodal language model that combines language and visual recognition. It adds a vision model and robot control to Google's previous large language model, PaLM. Language models are now expected to be used far more widely, going beyond understanding text to handling image, audio, and video information and even controlling robots.
References
https://ai.googleblog.com/2023/03/palm-e-embodied-multimodal-language.html
https://palm-e.github.io
https://palm-e.github.io/assets/palm-e.pdf
Microsoft Has Unveiled a Multimodal Large Language Model, Kosmos-1
MS has unveiled Kosmos-1, a multimodal large language model with visual capabilities as well as natural language processing. This shows that, even while actively adopting OpenAI's technologies, Microsoft continues development of its own. Kosmos-1 can answer questions by analyzing images, and it demonstrated a language model's potential for nonverbal reasoning by scoring 22-26% on Raven's Progressive Matrices, a test of visual IQ.
References
https://arxiv.org/pdf/2302.14045.pdf
https://github.com/microsoft/unilm
https://techrecipe.co.kr/posts/51346
Bill Gates, “AI Is the Second Most Revolutionary Technology in My Lifetime”
Bill Gates, in his blog, cited artificial intelligence, especially generative AI, as the most important innovation of our time. He called it the second revolutionary technology he has witnessed in his lifetime, after the graphical user interface (GUI) that shaped his founding of Microsoft. Saying that "artificial intelligence is as revolutionary as mobile phones and the Internet," he expressed his expectation that the world will fundamentally change, while emphasizing the need for rules so that everyone can share the benefits of AI equally.
References
https://www.gatesnotes.com/The-Age-of-AI-Has-Begun
https://www.bbc.com/news/technology-65032848
In conclusion
Not only Bill Gates but people around the world are expressing both expectations and concerns about how generative AI will change the world we live in.
Recently, Fei-Fei Li, known as the godmother of deep learning, also described generative AI as AI's great inflection point in a report titled 'Generative AI: Perspectives from Stanford HAI', published by the Stanford Institute for Human-Centered Artificial Intelligence (HAI).* Having contributed greatly to developing 'machines that can see what humans see', she expressed great expectations, saying that now is the time to think about creating 'AI that can see what humans cannot see.' At the same time, however, she raised concerns about AI's bias and malicious use, pointing out that special caution and risk assessment are essential to fully realize the new opportunities.
The recent flood of AI news has given me a lot to think about. With AI technology developing so rapidly, will humanity walk a brilliant, rosy path, or face a dark and miserable future? We now seem to be at a critical crossroads that could decide this.
* https://hai.stanford.edu/sites/default/files/2023-03/Generative_AI_HAI_Perspectives.pdf