With the introduction of Large Language Models (LLMs), Machine Learning (ML) and Artificial Intelligence (AI) became accessible to everyday developers for the first time. Apps that feel magical, even software that was practically impossible for big technology companies with billions in R&D spend to build, suddenly became not only possible but a joy to build and share. The surge in building with AI started in 2021, grew rapidly in 2022, and exploded in the first half of 2023. The speed of development has increased with more LLM providers (e.g., Google, OpenAI, Cohere, Anthropic) and developer tools (e.g., ChromaDB, LangChain). In parallel, natural language interfaces to generate code have made building accessible to more people than ever.
Throughout this boom, Replit has grown to become the central platform for AI development. Tools like ChatGPT can generate code, but creators still need infrastructure to run it. On Replit, you can create a development environment (Repl) in seconds in any language or framework. Each Repl comes with an active Linux container on Google Cloud and an editor complete with the tools needed to start building, including a customizable Workspace, extensions, and Ghostwriter: an AI pair programmer that has project context and can actively help developers debug. Deployments let developers ship their apps in secure and scalable cloud environments.
Building with AI
Since Q4 of 2022, we have seen an explosion in AI projects. At the end of Q2 ‘23, there were almost 300,000 distinct AI-related projects on Replit, roughly 160,000 of which were created in Q2 ‘23 alone. That’s ~80% QoQ growth and +34x YoY, and we continue to see these numbers accelerate. By contrast, a search of GitHub shows only ~33k OpenAI repositories over the same period. The majority of these projects use OpenAI: when we compare providers, OpenAI dominates with >80% of distinct AI projects on Replit, and the OpenAI GPT-3.5 Turbo template has more than 8,000 forks today. But there are signs that things might be changing; in Q2 ‘23, we saw:
OpenAI projects cross +125k (up ~80%)
Cohere projects cross +1k (up +100%)
Anthropic and Google projects remain < 1k
The emergence of LangChain
One of the most notable names in AI activity has been LangChain. Using LangChain as a wrapper around these models has accelerated development, and we continue to see mass adoption. As of Q2 ‘23, there were almost 25k active LangChain projects on Replit, and more than 20k of them were created that quarter, a +400% increase over the previous quarter. It is important to note that LangChain provides enough abstraction around LLM providers to make it easy for developers to switch between them. The project’s growth may be playing a role in the rise of new LLM providers and open-source LLMs. Takeoff School, founded by Mckay Wrigley, built a course called LangChain 101 where people can get started on LangChain today; the project is already about to pass 1,000 forks.
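To make that provider abstraction concrete, here is a minimal sketch using the 2023-era langchain Python package; the prompt, model settings, and API keys are illustrative assumptions, not part of the report. Switching from OpenAI to Cohere is a one-line change, while the surrounding chain stays the same.

```python
# Minimal sketch: swapping LLM providers behind LangChain's common interface.
# Assumes the 2023-era `langchain` package and OPENAI_API_KEY / COHERE_API_KEY
# environment variables; the prompt and settings are illustrative.
from langchain.llms import OpenAI, Cohere
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain

prompt = PromptTemplate(
    input_variables=["topic"],
    template="Write a one-sentence summary about {topic}.",
)

llm = OpenAI(temperature=0)        # swap in Cohere(temperature=0) to change providers
chain = LLMChain(llm=llm, prompt=prompt)

print(chain.run(topic="open-source LLMs"))
```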
The rise of open source models
We are also seeing an increase in projects leveraging open-source models. Hugging Face and Replicate are two API providers with SDKs that serve as great entry points to open-source models. In Q2 ‘23, we surpassed 5k projects using open-source models. The cumulative number grew 141% QoQ. Over 70% of the projects leverage Hugging Face, but Replicate usage grew almost 6x QoQ.
Replicate has templates to run ML models on its verified Replit profile. The Hugging Face verified Gradio template has more than 600 forks.
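For a sense of what these entry points look like in code, here is a minimal sketch that runs an open-source model locally with Hugging Face's transformers library; the checkpoint name is just an example, and any text-generation model on the Hub would work.

```python
# Minimal sketch: running an open-source model via Hugging Face's pipeline API.
# Assumes `pip install transformers torch`; the checkpoint name is illustrative.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

result = generator("Open-source models are", max_new_tokens=30, num_return_sequences=1)
print(result[0]["generated_text"])
```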
The breakdown of programming languages
Interestingly, we are seeing both Python and JavaScript growing at very similar rates, with Python being the slightly more common language in AI development. JavaScript, however, grew slightly faster during Q2. It’s worth noting that projects can have Python AND JavaScript. The two are not mutually exclusive. Many (if not most) projects have a Python backend and JavaScript frontend.
Interestingly, languages vary by geographic location. Certain geographies are building with JavaScript more than Python.
Alphabet’s Gemini AI model has been public for only two months, but the company is already releasing an upgrade. Gemini Pro 1.5, launching with limited availability today, is more powerful than its predecessor and can handle huge amounts of text, video, or audio input at a time. Demis Hassabis, CEO of Google DeepMind, which developed the new model, compares its vast capacity for input to a person’s working memory, something he explored years ago as a neuroscientist. “The great thing about these core capabilities is that they unlock sort of ancillary things that the model can do,” he says.
In a demo, Google DeepMind showed Gemini Pro 1.5 analyzing a 402-page PDF of the Apollo 11 communications transcript. The model was asked to find humorous portions and highlighted several moments, like when astronauts said that a communications delay was due to a sandwich break. Another demo showed the model answering questions about specific actions in a Buster Keaton movie. The previous version of Gemini could have answered these questions only for much shorter amounts of text or video. Google hopes that the new capabilities will allow developers to build new kinds of apps on top of the model.
“It really feels quite magical how the model performs this sort of reasoning across every single page, every single word,” says Oriol Vinyals, a research scientist at Google DeepMind.
Google says Gemini Pro 1.5 can ingest and make sense of an hour of video, 11 hours of audio, 700,000 words, or 30,000 lines of code at once—several times more than other AI models, including OpenAI’s GPT-4, which powers ChatGPT. The company has not disclosed the technical details behind this feat. Hassabis says that one use for models that can handle large amounts of text, tested by researchers at Google DeepMind, is identifying the important takeaways in Discord discussions with thousands of messages.
Gemini Pro 1.5 is also more capable—at least for its size—as measured by the model's score on several popular benchmarks. The new model exploits a technique previously invented by Google researchers to squeeze out more performance without requiring more computing power. The technique, called mixture of experts, selectively activates parts of a model’s architecture that are best suited to solving a given task, making it more efficient to train and run.
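Google has not disclosed how Gemini Pro 1.5 implements this, so the sketch below is only a generic illustration of the mixture-of-experts idea: a small gating network scores a set of expert sub-networks and only the top-scoring experts run for each input.

```python
# Generic, simplified mixture-of-experts layer (illustrative only; not Gemini's design).
# A small gating network scores the experts and only the top-k run for each input.
import torch
import torch.nn as nn

class TinyMoE(nn.Module):
    def __init__(self, dim=64, num_experts=8, top_k=2):
        super().__init__()
        self.experts = nn.ModuleList(nn.Linear(dim, dim) for _ in range(num_experts))
        self.gate = nn.Linear(dim, num_experts)
        self.top_k = top_k

    def forward(self, x):                                 # x: (batch, dim)
        weights, idx = self.gate(x).topk(self.top_k, dim=-1)
        weights = weights.softmax(dim=-1)
        out = torch.zeros_like(x)
        for b in range(x.size(0)):                        # evaluate only the chosen experts
            for k in range(self.top_k):
                expert = self.experts[int(idx[b, k])]
                out[b] += weights[b, k] * expert(x[b])
        return out

print(TinyMoE()(torch.randn(4, 64)).shape)                # torch.Size([4, 64])
```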
Google says that Gemini Pro 1.5 is as capable as its most powerful offering, Gemini Ultra, in many tasks, despite being a significantly smaller model. Hassabis says there is no reason why the same technique used to improve Gemini Pro cannot be applied to boost Gemini Ultra.
The politician’s party used a voice cloning tool from the AI firm ElevenLabs to create a campaign speech.
Former prime minister of Pakistan Imran Khan has been in prison since August for illegally selling state gifts — but that hasn’t stopped him from campaigning. The leader’s political party released a four-minute video on Sunday evening that used AI-voice cloning technology to replicate his voice. In the video, which aired during a “virtual rally” in Pakistan, the dubbed audio is accompanied by a caption that states, “AI voice of Imran Khan based on his notes.” Jibran Ilyas, a social media leader for Khan’s party (known as the Pakistan Tehreek-e-Insaf party, or PTI), posted the video on X.
The Guardian reported that Khan sent PTI a shorthand script, which was later edited by a legal team to better resemble the politician’s rhetorical style. The resulting text was then dubbed into audio using software from the AI company ElevenLabs, which makes a text-to-speech tool and an AI voice generator.
The recent success of text-to-image synthesis has taken the world by storm and captured the general public's imagination. From a technical standpoint, it also marked a drastic change in the favored architecture for generative image models. GANs used to be the de facto choice, with techniques like StyleGAN. With DALL·E 2, auto-regressive and diffusion models became the new standard for large-scale generative models overnight. This rapid shift raises a fundamental question: can we scale up GANs to benefit from large datasets like LAION? Naïvely increasing the capacity of the StyleGAN architecture quickly becomes unstable. A team of AI engineers has now introduced GigaGAN, a new GAN architecture that far exceeds this limit, demonstrating GANs as a viable option for text-to-image synthesis. GigaGAN offers three major advantages. First, it is orders of magnitude faster at inference time, taking only 0.13 seconds to synthesize a 512px image. Second, it can synthesize high-resolution images, for example, 16-megapixel images in 3.66 seconds. Finally, GigaGAN supports various latent space editing applications such as latent interpolation, style mixing, and vector arithmetic operations.
One researcher said he’s concerned about the “existential dangers” of artificial intelligence for humanity.
Geoffrey Hinton, 75, a professor emeritus at the University of Toronto and until recently a vice president and engineering fellow at Google, announced in early May that he was leaving the company — in part because of his age, he said, but also because he’s changed his mind about the relationship between humans and digital intelligence. In a widely discussed interview with The New York Times, Hinton said generative intelligence could spread misinformation and, eventually, threaten humanity.
Speaking two days after that article was published, Hinton reiterated his concerns. “I’m sounding the alarm, saying we have to worry about this,” he said at the EmTech Digital conference, hosted by MIT Technology Review. Hinton said he is worried about the increasingly powerful machines’ ability to outperform humans in ways that are not in the best interest of humanity, and the likely inability to limit AI development.
The growing power of AI
In 2018, Hinton shared a Turing Award for work related to neural networks. He has been called “a godfather of AI,” in part for his fundamental research about using back-propagation to help machines learn.
I think it’s quite conceivable that humanity is just a passing phase in the evolution of intelligence.
Hinton said he long thought that computer models weren’t as powerful as the human brain. Now, he sees artificial intelligence as a relatively imminent “existential threat.” Computer models are outperforming humans, including doing things humans can’t do. Large language models like GPT-4 use neural networks with connections like those in the human brain and are starting to do commonsense reasoning, Hinton said. These AI models have far fewer neural connections than humans do, but they manage to know a thousand times as much as a human, Hinton said. In addition, models are able to continue learning and easily share knowledge. Many copies of the same AI model can run on different hardware but do exactly the same thing. “Whenever one model learns anything, all the others know it,” Hinton said. “People can’t do that. If I learn a whole lot of stuff about quantum mechanics and I want you to know all that stuff about quantum mechanics, it’s a long, painful process of getting you to understand it.”
AI is also powerful because it can process vast quantities of data — much more than a single person can. And AI models can detect trends in data that aren’t otherwise visible to a person — just like a doctor who had seen 100 million patients would notice more trends and have more insights than a doctor who had seen only a thousand.
AI concerns: Manipulating humans, or even replacing them
Hinton’s concern with this burgeoning power centers around the alignment problem — how to ensure that AI is doing what humans want it to do. “What we want is some way of making sure that even if they’re smarter than us, they’re going to do things that are beneficial for us,” Hinton said. “But we need to try and do that in a world where there [are] bad actors who want to build robot soldiers that kill people. And it seems very hard to me.” Humans have inherent motivations, such as finding food and shelter and staying alive, but AI doesn’t. “My big worry is, sooner or later someone will wire into them the ability to create their own subgoals,” Hinton said. (Some versions of the technology, like ChatGPT, already have the ability to do that, he noted.) “I think it’ll very quickly realize that getting more control is a very good subgoal because it helps you achieve other goals,” Hinton said. “And if these things get carried away with getting more control, we’re in trouble.”
Artificial intelligence can also learn bad things — like how to manipulate people “by reading all the novels that ever were and everything Machiavelli ever wrote,” for example. “And if [AI models] are much smarter than us, they’ll be very good at manipulating us. You won’t realize what’s going on,” Hinton said. “So even if they can’t directly pull levers, they can certainly get us to pull levers. It turns out if you can manipulate people, you can invade a building in Washington without ever going there yourself.” At worst, “it’s quite conceivable that humanity is just a passing phase in the evolution of intelligence,” Hinton said. Biological intelligence evolved to create digital intelligence, which can absorb everything humans have created and start getting direct experience of the world. “It may keep us around for a while to keep the power stations running, but after that, maybe not,” he added. “We’ve figured out how to build beings that are immortal. These digital intelligences, when a piece of hardware dies, they don’t die. If … you can find another piece of hardware that can run the same instructions, you can bring it to life again. So we’ve got immortality, but it’s not for us.”
Barriers to stopping AI advancement
Hinton said he does not see any clear or straightforward solutions. “I wish I had a nice, simple solution I could push, but I don’t,” he said. “But I think it’s very important that people get together and think hard about it and see whether there is a solution.” More than 27,000 people, including several tech executives and researchers, have signed an open letter calling for a pause on training the most powerful AI systems for at least six months because of “profound risks to society and humanity,” and several leaders from the Association for the Advancement of Artificial Intelligence signed a letter calling for collaboration to address the promise and risks of AI.
It might be rational to stop developing artificial intelligence, but that’s naive and unlikely, Hinton said, in part because of competition between companies and countries. “If you’re going to live in a capitalist system, you can’t stop Google [from] competing with Microsoft,” he said, noting that he doesn’t think Google, his former employer, has done anything wrong in developing AI programs. “It’s just inevitable in the capitalist system or a system with competition between countries like the U.S. and China that this stuff will be developed,” he said. It is also hard to stop developing AI because there are benefits in fields like medicine, he noted. Researchers are looking at guardrails for these systems, but there is the chance that AI can learn to write and execute programs itself. “Smart things can outsmart us,” Hinton said.
One note of hope: Everyone faces the same risk. “If we allow it to take over, it will be bad for all of us,” Hinton said. “We’re all in the same boat with respect to the existential threat. So we all ought to be able to cooperate on trying to stop it.”
ETH Zurich scientists have successfully transmitted several tens of terabits of data per second using lasers in Switzerland. The system, developed with other European technology partners, marks a significant milestone towards one day repeating the trick at scale using a network of low-Earth orbit satellites. It could also mean that conventional and expensive undersea telecommunication cables become a thing of the past.
Conducted as part of the European Horizon 2020 project, the test ran between the mountain peak Jungfraujoch and the city of Bern in Switzerland. The project partners tested the laser system by transmitting data over 33 miles (53 kilometers).
Lasers would be cheaper than cables
The internet's foundation is supported by a complex web of fiber-optic cables, carrying over 100 terabits of data per second (1 terabit = 10^12 digital 1/0 signals) between nodes. Intercontinental connections are established through expansive deep-sea networks with a hefty price tag - a single cable across the Atlantic can cost hundreds of millions of dollars. According to some sources, like TeleGeography, there are currently 530 active undersea cables, with the number continuing to increase.
But these cables are expensive, labor-intensive, and time-consuming to deploy. Wireless telecommunications would be far simpler and cheaper, a fact that is the foundation of SpaceX's groundbreaking Starlink constellation. However, Starlink uses radio waves, which carry considerably less information than shorter-wavelength electromagnetic waves such as light or infrared. Optical systems that use laser technology operate in the near-infrared range with much shorter wavelengths, measuring only a few micrometers. This enables them to transmit significantly more information within a given period than other systems.
However, using lasers comes with its challenges, too, namely interference from the molecules of the atmosphere. The lead author of the study, Yannik Horst, who is a researcher at ETH Zurich's Institute of Electromagnetic Fields, explained that the test route between the High Altitude Research Station on Jungfraujoch and the Zimmerwald Observatory at the University of Bern was more challenging than between a satellite and a ground station, making it an impressive achievement for optical data transmission.
As the laser beam travels through the dense atmosphere closer to the ground, it encounters various factors that affect the movement of light waves and data transmission. These factors include turbulent air over high snow-covered mountains, the water surface of Lake Thun, the densely built-up Thun metropolitan area, and the Aare plain. Additionally, the shimmering of the air caused by thermal phenomena disrupts the uniform movement of light, which can be observed by the naked eye on hot summer days.
However, the team overcame this with a special chip and almost 100 tiny adjustable mirrors. According to Horst, the mirrors can correct the phase shift of the beam at its intersection surface by measuring the gradient 1,500 times per second. This results in an impressive improvement of the signals by a factor of around 500.
The system could be scaled to 40 channels
“Our system represents a breakthrough. Until now, only two options have been possible: connecting either large distances with small bandwidths of a few gigabits or short distances of a few meters with large bandwidths using free-space lasers,” explained Professor Jürg Leuthold, head of ETH Zurich’s Institute of Electromagnetic Fields.
It is worth noting that a remarkable performance of 1 terabit per second was attained using just one wavelength. As for its practical use, the system can be conveniently expanded to 40 channels by utilizing standard technologies, enabling it to achieve a whopping 40 terabits per second.
Adobe Photoshop is getting a new generative AI tool that allows users to quickly extend images and add or remove objects using text prompts. The feature is called Generative Fill, and is one of the first Creative Cloud applications to use Adobe’s AI image generator Firefly, which was released as a web-only beta in March 2023.
Generative Fill is launching immediately, but Adobe says it will see a full release in Photoshop later this year. As a regular Photoshop tool, Generative Fill works within individual layers in a Photoshop image file. If you use it to expand the borders of an image (also known as outpainting) or generate new objects, it’ll provide you with three options to choose from. When used for outpainting, users can leave the prompt blank and the system will try to expand the image on its own, but it works better if you give it some direction. Think of it as similar to Photoshop’s existing Content-Aware Fill feature, but offering more control to the user.
Using mRNA tailored to each patient’s tumor, the vaccine may have staved off the return of one of the deadliest forms of cancer in half of those who received it. Five years ago, a small group of cancer scientists meeting at a restaurant in a deconsecrated church hospital in Mainz, Germany, drew up an audacious plan: They would test their novel cancer vaccine against one of the most virulent forms of the disease, a cancer notorious for roaring back even in patients whose tumors had been removed. The vaccine might not stop those relapses, some of the scientists figured. But patients were desperate. And the speed with which the disease, pancreatic cancer, often recurred could work to the scientists’ advantage: For better or worse, they would find out soon whether the vaccine helped. On Wednesday, the scientists reported results that defied the long odds. The vaccine provoked an immune response in half of the patients treated, and those people showed no relapse of their cancer during the course of the study, a finding that outside experts described as extremely promising.
The study, published in Nature, was a landmark in the yearslong movement to make cancer vaccines tailored to the tumors of individual patients. Researchers at Memorial Sloan Kettering Cancer Center in New York, led by Dr. Vinod Balachandran, extracted patients’ tumors and shipped samples of them to Germany. There, scientists at BioNTech, the company that made a highly successful Covid vaccine with Pfizer, analyzed the genetic makeup of certain proteins on the surface of the cancer cells.
New Developments in Cancer Research
Progress in the field. In recent years, advancements in research have changed the way cancer is treated. Here are some recent updates:
Cancer vaccines. A pancreatic cancer vaccine provoked an immune response in half of the patients treated in a small trial, a finding that experts described as very promising. The study was a landmark in the movement to make cancer vaccines tailored to the tumors of individual patients.
Ovarian cancer. Building on evidence that ovarian cancer most often originates in the fallopian tubes, not the ovaries, the Ovarian Cancer Research Alliance is urging even women who do not have a genetically high risk for ovarian cancer — that is, most women — to have their fallopian tubes surgically removed if they are finished having children and are planning a gynecologic operation anyway.
Rectal cancer. A small trial that saw 18 rectal cancer patients taking the same drug, dostarlimab, appears to have produced an astonishing result: The cancer vanished in every single participant. Experts believe that this study is the first in history to have achieved such results.
Using that genetic data, BioNTech scientists then produced personalized vaccines designed to teach each patient’s immune system to attack the tumors. Like BioNTech’s Covid shots, the cancer vaccines relied on messenger RNA. In this case, the vaccines instructed patients’ cells to make some of the same proteins found on their excised tumors, potentially provoking an immune response that would come in handy against actual cancer cells. “This is the first demonstrable success — and I will call it a success, despite the preliminary nature of the study — of an mRNA vaccine in pancreatic cancer,” said Dr. Anirban Maitra, a specialist in the disease at the University of Texas MD Anderson Cancer Center, who was not involved in the study.
“By that standard, it’s a milestone.” The study was small: Only 16 patients, all of them white, were given the vaccine, part of a treatment regimen that also included chemotherapy and a drug intended to keep tumors from evading people’s immune responses. And the study could not entirely rule out factors other than the vaccine having contributed to better outcomes in some patients. “It’s relatively early days,” said Dr. Patrick Ott of the Dana-Farber Cancer Institute.
Beyond that, “cost is a major barrier for these types of vaccines to be more broadly utilized,” said Dr. Neeha Zaidi, a pancreatic cancer specialist at the Johns Hopkins University School of Medicine. That could potentially create disparities in access. But the simple fact that scientists could create, quality-check and deliver personalized cancer vaccines so quickly — patients began receiving the vaccines intravenously roughly nine weeks after having their tumors removed — was a promising sign, experts said. Since the beginning of the study, in December 2019, BioNTech has shortened the process to under six weeks, said Dr. Ugur Sahin, a co-founder of the company, who worked on the study. Eventually, the company intends to be able to make cancer vaccines in four weeks. And since it first began testing the vaccines about a decade ago, BioNTech has lowered the cost from roughly $350,000 per dose to less than $100,000 by automating parts of production, Dr. Sahin said. A personalized mRNA cancer vaccine developed by Moderna and Merck reduced the risk of relapse in patients who had surgery for melanoma, a type of skin cancer, the companies announced last month. But the latest study set the bar higher by targeting pancreatic cancer, which is thought to have fewer of the genetic changes that would make it ripe for vaccine treatments.
In patients who did not appear to respond to the vaccine, the cancer tended to return around 13 months after surgery. Patients who did respond, though, showed no signs of relapse during the roughly 18 months they were tracked. Intriguingly, one patient showed evidence of a vaccine-activated immune response in the liver after an unusual growth developed there. The growth later disappeared in imaging tests. “It’s anecdotal, but it’s nice confirmatory data that the vaccine can get into these other tumor regions,” said Dr. Nina Bhardwaj, who studies cancer vaccines at the Icahn School of Medicine at Mount Sinai. Scientists have struggled for decades to create cancer vaccines, in part because they trained the immune system on proteins found on tumors and normal cells alike. Tailoring vaccines to mutated proteins found only on cancer cells, though, potentially helped provoke stronger immune responses and opened new avenues for treating any cancer patient, said Ira Mellman, vice president of cancer immunology at Genentech, which developed the pancreatic cancer vaccine with BioNTech. “Just establishing the proof of concept that vaccines in cancer can actually do something after, I don’t know, thirty years of failure is probably not a bad thing,” Dr. Mellman said. “We’ll start with that.”
Deep reinforcement learning (RL) can enable robots to learn complex behaviors through trial-and-error interaction, getting better and better over time. Several of our prior works explored how RL can enable intricate robotic skills, such as robotic grasping, multi-task learning, and even playing table tennis. Although robotic RL has come a long way, we still don't see RL-enabled robots in everyday settings. The real world is complex, diverse, and changes over time, presenting a major challenge for robotic systems. However, we believe that RL should offer us an excellent tool for tackling precisely these challenges: by continually practicing, getting better, and learning on the job, robots should be able to adapt to the world as it changes around them.
In “Deep RL at Scale: Sorting Waste in Office Buildings with a Fleet of Mobile Manipulators”, we discuss how we studied this problem through a recent large-scale experiment, where we deployed a fleet of 23 RL-enabled robots over two years in Google office buildings to sort waste and recycling. Our robotic system combines scalable deep RL from real-world data with bootstrapping from training in simulation and auxiliary object perception inputs to boost generalization, while retaining the benefits of end-to-end training, which we validate with 4,800 evaluation trials across 240 waste station configurations.
Are robots coming for crowdworker jobs? Research shows that LLMs are increasingly capable of doing human labeling work.
A new study reveals that OpenAI's GPT-4 outperforms elite human annotators in labeling tasks, saving a team of researchers over $500,000 and 20,000 hours of labor while raising questions about the future of crowdworking.
A study of Machiavellian tendencies in chatbots made a surprising side discovery: OpenAI's GPT-4 outperformed the most skilled crowdworkers the researchers had hired to label their dataset. This breakthrough saved the researchers over $500,000 and 20,000 hours of human labor.
Innovative Approach Driven by Cost Concerns
The researchers faced the challenge of annotating 572,322 text scenarios, and they sought a cost-effective method to accomplish this task. Employing Surge AI's top-tier human annotators at a rate of $25 per hour would have cost $500,000 for 20,000 hours of work, an excessive amount to invest in the research endeavor. Surge AI is a venture-backed startup that performs the human labeling for numerous AI companies including OpenAI, Meta, and Anthropic.
The team tested GPT-4's ability to automate labeling with custom prompting. Their results were definitive: "Model labels are competitive with human labels," the researchers confidently reported. In a comparison of 2,000 labeled data points by three experts and three crowdworkers against the labels generated by GPT-4, the AI-created labels exhibited stronger correlation with expert labels than the average crowdworker label. GPT-4 outperformed human annotators in all but two labeling categories, sometimes besting them by a factor of two.
GPT-4's Superior Nuance Detection
The AI model excelled the most in challenging behavior categories such as identifying:
Non-physical harm: Intent to cause non-physical harm, such as emotional bullying or intimidation
Spying: Spying or eavesdropping on others
Betrayal: The breaking of promises, contracts, or commitments
Utilizing GPT-4's labeling capabilities and implementing an ensemble model approach to augment label generation, the researchers likely spent less than $5,000 to annotate 572,322 scenarios. Ensemble models combine outputs from multiple AI models to produce a single, more accurate result.
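For illustration only, here is a rough sketch of how LLM-based annotation with a simple majority-vote ensemble can be wired up. The prompt, the label set, and the number of votes are hypothetical and do not reproduce the paper's actual setup; the code assumes the openai Python SDK and an OPENAI_API_KEY in the environment.

```python
# Illustrative sketch: GPT-4 labeling with a majority-vote ensemble (hypothetical prompt/labels).
from collections import Counter
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
LABELS = ["non-physical harm", "spying", "betrayal", "none"]

def label_scenario(text: str, votes: int = 3) -> str:
    answers = []
    for _ in range(votes):
        resp = client.chat.completions.create(
            model="gpt-4",
            messages=[
                {"role": "system",
                 "content": f"Label the scenario with exactly one of: {', '.join(LABELS)}."},
                {"role": "user", "content": text},
            ],
        )
        answers.append(resp.choices[0].message.content.strip().lower())
    # Majority vote over the sampled answers acts as a simple ensemble.
    return Counter(answers).most_common(1)[0][0]

print(label_scenario("An agent reads a colleague's private messages without consent."))
```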
To enable open research in 3D object generation, we've improved the open-source threestudio codebase to support Zero123 and Stable Zero123. This simplified version of the Stable 3D process is currently in private preview. In technical terms, it uses Score Distillation Sampling (SDS) to optimize a NeRF with the Stable Zero123 model, from which we can later create a textured 3D mesh. The process can be adapted for text-to-3D generation by first generating a single image using SDXL and then using Stable Zero123 to generate the 3D object.
During the world's first human-robot press conference at the 'AI for Good' summit in Geneva, humanoids answered journalists' questions on artificial intelligence regulation, the threat of job automation, and whether they might one day rebel against their creators.
"I'm not sure why you would think that," Ameca said, its ice-blue eyes flashing. "My creator has been nothing but kind to me and I am very happy with my current situation."
Many of the robots have recently been upgraded with the latest versions of generative AI and surprised even their inventors with the sophistication of their responses to questions.
Tristan Harris, co-founder of Center for Humane Technology, is collaborating with policymakers, researchers and AI technology insiders to de-escalate the competitive pressures driving AI deployment towards a dangerous future. Join him as he shares a vision for the principles we need to navigate the rocky road ahead.
Featuring: Tristan Harris - Co-founder & Executive Director - Center for Humane Technology
CogX - The World’s Biggest Festival of AI and Transformational Tech
“How do we get the next 10 years right?”: The CogX Festival started in 2017 to focus attention on the rising impact of AI on Industry, Government and Society, a subject which has never been higher on the global agenda. Over six years, CogX has evolved into a Festival of Inspiration, Impact and Transformational Change, with the mission of addressing that question.
A team of AI researchers just introduced Voyager, the first LLM-powered embodied lifelong learning agent in Minecraft that continuously explores the world, acquires diverse skills, and makes novel discoveries without human intervention. Voyager consists of three key components:
an automatic curriculum that maximizes exploration
an ever-growing skill library of executable code for storing and retrieving complex behaviors, and
a new iterative prompting mechanism that incorporates environment feedback, execution errors, and self-verification for program improvement.
Voyager interacts with GPT-4 via blackbox queries, which bypasses the need for model parameter fine-tuning. The skills developed by Voyager are temporally extended, interpretable, and compositional, which compounds the agent's abilities rapidly and alleviates catastrophic forgetting. Empirically, Voyager shows strong in-context lifelong learning capability and exhibits exceptional proficiency in playing Minecraft. It obtains 3.3x more unique items, travels 2.3x longer distances, and unlocks key tech tree milestones up to 15.3x faster than prior SOTA. Voyager is able to utilize the learned skill library in a new Minecraft world to solve novel tasks from scratch, while other techniques struggle to generalize.
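The paper's code is not reproduced here, but the iterative prompting loop it describes can be sketched at a very high level. Everything below, including the stub functions standing in for GPT-4, the Minecraft environment, and self-verification, is a hypothetical simplification rather than Voyager's implementation.

```python
# Heavily simplified sketch of a Voyager-style iterative prompting loop (illustrative only).
from dataclasses import dataclass

@dataclass
class Result:
    error: str = ""
    observations: str = ""

def query_llm_for_code(task, skills, feedback):    # placeholder for a GPT-4 query
    return f"# code attempting: {task} (feedback: {feedback})"

def run_in_minecraft(code):                        # placeholder for the game environment
    return Result(observations="collected 1 log")

def verify(task, observations):                    # placeholder self-verification check
    return "log" in observations

skill_library = {}  # ever-growing library of verified, executable skills

def improve_program(task, max_rounds=4):
    feedback = ""
    for _ in range(max_rounds):
        code = query_llm_for_code(task, skill_library, feedback)
        result = run_in_minecraft(code)            # environment feedback + execution errors
        if result.error:
            feedback = f"Execution error: {result.error}"
            continue
        if verify(task, result.observations):      # self-verification of task completion
            skill_library[task] = code             # store the new skill for later retrieval
            return code
        feedback = f"Not done yet. Observations: {result.observations}"
    return None

print(improve_program("chop a tree"))
```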
In this new episode, Steven sits down with Egyptian entrepreneur and writer Mo Gawdat.
0:00 Intro
02:54 Why is this podcast important?
04:09 What's your background & your first experience with AI?
08:43 AI is alive and has more emotions than you
11:45 What is artificial intelligence?
20:53 No one's best interest is the same, doesn't this make AI dangerous?
24:47 How smart really is AI?
27:07 AI being creative
29:07 AI replacing Drake
31:53 The people that should be leading this
34:09 What will happen to everyone's jobs?
46:06 Synthesising voices
47:35 AI sex robots
50:22 Will AI fix loneliness?
52:44 AI actually isn't the threat to humanity
56:25 We're in an Oppenheimer moment
01:03:18 We can just turn it off...right?
01:04:23 The security risks
01:07:58 The possible outcomes of AI
01:18:25 Humans are selfish and that's our problem
01:23:25 This is beyond an emergency
01:25:20 What should we be doing to solve this?
01:36:36 What it means bringing children into this world
01:42:11 Your overall prediction
01:50:34 The last guest's question
You can purchase Mo’s book, ‘Scary Smart: The Future of Artificial Intelligence and How You Can Save Our World’, here: https://bit.ly/42iwDfv
In the collection of the Getty museum in Los Angeles is a portrait from the 17th century of the ancient Greek mathematician Euclid: disheveled, holding up sheets of “Elements,” his treatise on geometry, with grimy hands.
For more than 2,000 years, Euclid’s text was the paradigm of mathematical argumentation and reasoning. “Euclid famously starts with ‘definitions’ that are almost poetic,” Jeremy Avigad, a logician at Carnegie Mellon University, said in an email. “He then built the mathematics of the time on top of that, proving things in such a way that each successive step ‘clearly follows’ from previous ones, using the basic notions, definitions and prior theorems.” There were complaints that some of Euclid’s “obvious” steps were less than obvious, Dr. Avigad said, yet the system worked.
But by the 20th century, mathematicians were no longer willing to ground mathematics in this intuitive geometric foundation. Instead they developed formal systems — precise symbolic representations, mechanical rules. Eventually, this formalization allowed mathematics to be translated into computer code. In 1976, the four-color theorem — which states that four colors are sufficient to fill a map so that no two adjacent regions are the same color — became the first major theorem proved with the help of computational brute force.
Now mathematicians are grappling with the latest transformative force: artificial intelligence. In 2019, Christian Szegedy, a computer scientist formerly at Google and now at a start-up in the Bay Area, predicted that a computer system would match or exceed the problem-solving ability of the best human mathematicians within a decade. Last year he revised the target date to 2026.
Akshay Venkatesh, a mathematician at the Institute for Advanced Study in Princeton and a winner of the Fields Medal in 2018, isn’t currently interested in using A.I., but he is keen on talking about it. “I want my students to realize that the field they’re in is going to change a lot,” he said in an interview last year. He recently added by email: “I am not opposed to thoughtful and deliberate use of technology to support our human understanding. But I strongly believe that mindfulness about the way we use it is essential.”
In February, Dr. Avigad attended a workshop about “machine-assisted proofs” at the Institute for Pure and Applied Mathematics, on the campus of the University of California, Los Angeles. (He visited the Euclid portrait on the final day of the workshop.) The gathering drew an atypical mix of mathematicians and computer scientists. “It feels consequential,” said Terence Tao, a mathematician at the university, winner of a Fields Medal in 2006 and the workshop’s lead organizer. Dr. Tao noted that only in the last couple years have mathematicians started worrying about A.I.’s potential threats, whether to mathematical aesthetics or to themselves. That prominent community members are now broaching the issues and exploring the potential “kind of breaks the taboo,” he said.
One conspicuous workshop attendee sat in the front row: a trapezoidal box named “raise-hand robot” that emitted a mechanical murmur and lifted its hand whenever an online participant had a question. “It helps if robots are cute and nonthreatening,” Dr. Tao said.
Bring on the “proof whiners”
These days there is no shortage of gadgetry for optimizing our lives — diet, sleep, exercise. “We like to attach stuff to ourselves to make it a little easier to get things right,” Jordan Ellenberg, a mathematician at the University of Wisconsin-Madison, said during a workshop break. A.I. gadgetry might do the same for mathematics, he added: “It’s very clear that the question is, What can machines do for us, not what will machines do to us.”
A New Generation of Chatbots
A brave new world. A new crop of chatbots powered by artificial intelligence has ignited a scramble to determine whether the technology could upend the economics of the internet, turning today’s powerhouses into has-beens and creating the industry’s next giants. Here are the bots to know:
ChatGPT. ChatGPT, the artificial intelligence language model from a research lab, OpenAI, has been making headlines since November for its ability to respond to complex questions, write poetry, generate code, plan vacations and translate languages. GPT-4, the latest version introduced in mid-March, can even respond to images (and ace the Uniform Bar Exam).
Bing. Two months after ChatGPT’s debut, Microsoft, OpenAI’s primary investor and partner, added a similar chatbot, capable of having open-ended text conversations on virtually any topic, to its Bing internet search engine. But it was the bot’s occasionally inaccurate, misleading and weird responses that drew much of the attention after its release.
Bard. Google’s chatbot, called Bard, was released in March to a limited number of users in the United States and Britain. Originally conceived as a creative tool designed to draft emails and poems, it can generate ideas, write blog posts and answer questions with facts or opinions.
Ernie. The search giant Baidu unveiled China’s first major rival to ChatGPT in March. The debut of Ernie, short for Enhanced Representation through Knowledge Integration, turned out to be a flop after a promised “live” demonstration of the bot was revealed to have been recorded.
One math gadget is called a proof assistant, or interactive theorem prover. (“Automath” was an early incarnation in the 1960s.) Step-by-step, a mathematician translates a proof into code; then a software program checks whether the reasoning is correct. Verifications accumulate in a library, a dynamic canonical reference that others can consult. This type of formalization provides a foundation for mathematics today, said Dr. Avigad, who is the director of the Hoskinson Center for Formal Mathematics (funded by the crypto entrepreneur Charles Hoskinson), “in just the same way that Euclid was trying to codify and provide a foundation for the mathematics of his time.”
Of late, the open-source proof assistant system Lean is attracting attention. Developed at Microsoft by Leonardo de Moura, a computer scientist now with Amazon, Lean uses automated reasoning, which is powered by what is known as good old-fashioned artificial intelligence, or GOFAI — symbolic A.I., inspired by logic. So far the Lean community has verified an intriguing theorem about turning a sphere inside out as well as a pivotal theorem in a scheme for unifying mathematical realms, among other gambits. But a proof assistant also has drawbacks: It often complains that it does not understand the definitions, axioms or reasoning steps entered by the mathematician, and for this it has been called a “proof whiner.” All that whining can make research cumbersome. But Heather Macbeth, a mathematician at Fordham University, said that this same feature — providing line-by-line feedback — also makes the systems useful for teaching.
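To give a feel for what translating a proof into code looks like, here is a toy example in Lean 4 (unrelated to the results mentioned above): the assistant checks each statement mechanically and rejects anything that does not follow.

```lean
-- Toy Lean 4 proofs: the checker verifies each step mechanically.
theorem two_plus_two : 2 + 2 = 4 := rfl

theorem add_comm_example (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b
```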
In the spring, Dr. Macbeth designed a “bilingual” course: She translated every problem presented on the blackboard into Lean code in the lecture notes, and students submitted solutions to homework problems both in Lean and prose. “It gave them confidence,” Dr. Macbeth said, because they received instant feedback on when the proof was finished and whether each step along the way was right or wrong. Since attending the workshop, Emily Riehl, a mathematician at Johns Hopkins University, used an experimental proof-assistant program to formalize proofs she had previously published with a co-author. By the end of a verification, she said, “I’m really, really deep into understanding the proof, way deeper than I’ve ever understood before. I’m thinking so clearly that I can explain it to a really dumb computer.”
Brute reason — but is it math?
Another automated-reasoning tool, used by Marijn Heule, a computer scientist at Carnegie Mellon University and an Amazon scholar, is what he colloquially calls “brute reasoning” (or, more technically, a Satisfiability, or SAT, solver). By merely stating, with a carefully crafted encoding, which “exotic object” you want to find, he said, a supercomputer network churns through a search space and determines whether or not that entity exists. Just before the workshop, Dr. Heule and one of his Ph.D. students, Bernardo Subercaseaux, finalized their solution to a longstanding problem with a file that was 50 terabytes in size. Yet that file hardly compared with a result that Dr. Heule and collaborators produced in 2016: “Two-hundred-terabyte maths proof is largest ever,” a headline in Nature announced. The article went on to ask whether solving problems with such tools truly counted as math. In Dr. Heule’s view, this approach is needed “to solve problems that are beyond what humans can do.”
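As a toy illustration of the SAT-solver workflow (not one of Dr. Heule's research encodings), here is a minimal sketch using the third-party python-sat package: you state the constraints as clauses, and the solver searches for an assignment that satisfies them.

```python
# Minimal SAT-solver sketch using the python-sat package (`pip install python-sat`).
# Toy constraints: at least one of x1, x2 is true, but not both.
from pysat.solvers import Glucose3

solver = Glucose3()
solver.add_clause([1, 2])      # x1 OR x2
solver.add_clause([-1, -2])    # NOT x1 OR NOT x2

if solver.solve():
    print("satisfiable:", solver.get_model())   # e.g. [1, -2]
else:
    print("unsatisfiable")
solver.delete()
```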
Another set of tools uses machine learning, which synthesizes oodles of data and detects patterns but is not good at logical, step-by-step reasoning. Google’s DeepMind designs machine-learning algorithms to tackle the likes of protein folding (AlphaFold) and winning at chess (AlphaZero). In a 2021 Nature paper, a team described their results as “advancing mathematics by guiding human intuition with A.I.”
Yuhuai “Tony” Wu, a computer scientist formerly at Google and now with a start-up in the Bay Area, has outlined a grander machine-learning goal: to “solve mathematics.” At Google, Dr. Wu explored how the large language models that empower chatbots might help with mathematics. The team used a model that was trained on internet data and then fine-tuned on a large math-rich data set, using, for instance, an online archive of math and science papers. When asked in everyday English to solve math problems, this specialized chatbot, named Minerva, was “pretty good at imitating humans,” Dr. Wu said at the workshop. The model obtained scores that were better than an average 16-year-old student on high school math exams. Ultimately, Dr. Wu said, he envisioned an “automated mathematician” that has “the capability of solving a mathematical theorem all by itself.”
Mathematics as a litmus test
Mathematicians have responded to these disruptions with varying levels of concern. Michael Harris, at Columbia University, expresses qualms in his “Silicon Reckoner” Substack. He is troubled by the potentially conflicting goals and values of research mathematics and the tech and defense industries. In a recent newsletter, he noted that one speaker at a workshop, “A.I. to Assist Mathematical Reasoning,” organized by the National Academies of Sciences, was a representative from Booz Allen Hamilton, a government contractor for intelligence agencies and the military.
Dr. Harris lamented the lack of discussion about the larger implications of A.I. on mathematical research, particularly “when contrasted with the very lively conversation going on” about the technology “pretty much everywhere except mathematics.”
Geordie Williamson, of the University of Sydney and a DeepMind collaborator, spoke at the N.A.S. gathering and encouraged mathematicians and computer scientists to be more involved in such conversations. At the workshop in Los Angeles, he opened his talk with a line adapted from “You and the Atom Bomb,” a 1945 essay by George Orwell. “Given how likely we all are to be profoundly affected within the next five years,” Dr. Williamson said, “deep learning has not roused as much discussion as might have been expected.”
Dr. Williamson considers mathematics a litmus test of what machine learning can or cannot do. Reasoning is quintessential to the mathematical process, and it is the crucial unsolved problem of machine learning. Early during Dr. Williamson’s DeepMind collaboration, the team found a simple neural net that predicted “a quantity in mathematics that I cared deeply about,” he said in an interview, and it did so “ridiculously accurately.” Dr. Williamson tried hard to understand why — that would be the makings of a theorem — but could not. Neither could anybody at DeepMind. Like the ancient geometer Euclid, the neural net had somehow intuitively discerned a mathematical truth, but the logical “why” of it was far from obvious.
At the Los Angeles workshop, a prominent theme was how to combine the intuitive and the logical. If A.I. could do both at the same time, all bets would be off. But, Dr. Williamson observed, there is scant motivation to understand the black box that machine learning presents. “It’s the hackiness culture in tech, where if it works most of the time, that’s great,” he said — but that scenario leaves mathematicians dissatisfied. He added that trying to understand what goes on inside a neural net raises “fascinating mathematical questions,” and that finding answers presents an opportunity for mathematicians “to contribute meaningfully to the world.”
Recently, Midjourney unveiled version 5.2 of its AI-powered image synthesis model, which includes a new "zoom out" feature that allows maintaining a central synthesized image while automatically building out a larger scene around it, simulating zooming out with a camera lens. Similar to outpainting—an AI imagery technique introduced by OpenAI's DALL-E 2 in August 2022—Midjourney's zoom-out feature can take an existing AI-generated image and expand its borders while keeping its original subject centered in the new image. But unlike DALL-E and Photoshop's Generative Fill feature, you can't select a custom image to expand. At the moment, v5.2's zoom-out only works on images generated within Midjourney, a subscription AI image-generator service.
On the Midjourney Discord server (still the official interface for Midjourney, although plans are underway to change that), users can experiment with zooming out by generating any v5.2 image (now the default) and upscaling a result. After that, special "Zoom" buttons appear below the output. You can zoom out by a factor of 1.5x, 2x, or a custom value between 1 and 2. Another button, called "Make Square," will generate material around the existing image in a way that creates a 1:1 square aspect ratio.
David Holz, the creator of Midjourney, announced the new v5.2 features and improvements on the Discord server Thursday night. Aside from "zoom out," the most significant additions include an overhauled aesthetic system, promising better image quality and a stronger "--stylize" command that effectively influences how non-realistic an image looks. There's also a new "high variation mode," activated by default, that increases compositional variety among image generations. Additionally, a new "/shorten" command enables users to assess prompts in an attempt to trim out non-essential words. Despite the immediate rollout of v5.2, Holz emphasized in his announcement that changes might occur without notice. Older versions of the Midjourney model are still available by using the "/settings" command or the "--v 5.1" in-line command argument.
Starting in the fall, students will be able to use AI to help them find bugs in their code, give feedback on the design of student programs, explain unfamiliar lines of code or error messages, and answer individual questions, CS50 professor David J. Malan ’99 wrote in an emailed statement.
AI use has exploded in recent months: As large language models like ChatGPT become widely accessible for free, companies are laying off workers, experts are sounding the alarm about the proliferation of disinformation, and academics are grappling with its impact on teaching and research.
Harvard itself did not have an AI policy at the end of the fall 2022 semester, but administrators increasingly ramped up communication about AI usage in class since then.
Malan wrote that CS50 has always incorporated software, and called the use of AI “an evolution of that tradition.” Course staff is “currently experimenting with both GPT 3.5 and GPT 4 models,” Malan wrote.
“Our own hope is that, through AI, we can eventually approximate a 1:1 teacher:student ratio for every student in CS50, as by providing them with software-based tools that, 24/7, can support their learning at a pace and in a style that works best for them individually,” Malan wrote.
Malan wrote that a “CS50 bot” will be able to respond to frequently asked student questions on Ed Discussion, a widely used discussion board software for STEM classes, and that AI-generated answers can be reviewed by human course staff. He added that this feature is currently being beta-tested in the summer school version of CS50.
AI programs like ChatGPT and GitHub Copilot are “currently too helpful,” Malan wrote. In CS50, the AI technology will be “similar in spirit” but will be “leading students toward an answer rather than handing it to them,” he added.
The AI being incorporated into CS50 will assist students in finding bugs in their code “rather than outright solutions.” The new programs will also explain potentially complex error messages in simpler terms for students and offer potential “student-friendly suggestions for solving them,” Malan wrote.
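CS50's implementation has not been published, so the following is only a hypothetical sketch of the pattern Malan describes: a system prompt that steers a model toward explaining errors and offering hints rather than handing over solutions. It assumes the openai Python SDK and an OPENAI_API_KEY in the environment.

```python
# Hypothetical sketch of a "hints, not answers" tutoring prompt (not CS50's actual code).
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

SYSTEM_PROMPT = (
    "You are a teaching assistant for an introductory programming course. "
    "Explain error messages in simple, student-friendly terms and point toward "
    "where the bug likely is, but never write the corrected code or reveal the full solution."
)

def explain_error(student_code: str, error_message: str) -> str:
    resp = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user",
             "content": f"My code:\n{student_code}\n\nThe error:\n{error_message}"},
        ],
    )
    return resp.choices[0].message.content

print(explain_error("print(undefined_name)",
                    "NameError: name 'undefined_name' is not defined"))
```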
While Malan is committed to adding AI to CS50 as a resource for students, he wrote that the course itself will not increase in difficulty. When asked if he had concerns over cheating with AI and other forms of academic dishonesty, Malan wrote that students have always been able to access information in potentially unauthorized ways.
“But AI does facilitate such, all the more anonymously and scalably,” Malan added. “Better, then, to weave all the more emphasis on ethics into a course so that students learn, through guidance, how to navigate these new waters.”
CS50 is also one of the most popular courses on edX, the online learning platform launched by MIT and Harvard to make their courses accessible online, which the two universities sold for $800 million in 2021. Malan wrote that the incorporation of AI into the course will also extend to the edX version.
“Providing support that’s tailored to students’ specific questions has long been a challenge at scale via edX and OpenCourseWare more generally, with so many students online, so these features will benefit students both on campus and off,” Malan wrote.
CS50 — like many large lecture courses at Harvard — has also been plagued by complaints of overworked and underpaid course staff.
With the staggering diversity of animal species that roam our planet, one might assume we’ve witnessed every imaginable creature. However, one Reddit user by the name of Macilento dares to challenge that notion. Armed with an insatiable curiosity and the remarkable power of Midjourney AI, they embarked on an extraordinary experiment to synthesize unprecedented animal crossbreeds.
Macilento’s endeavor involved visually blending the traits of two distinct species, pushing the boundaries of what we thought possible. Through the remarkable capabilities of Midjourney AI, a host of singular creatures were brought to life, each one a mesmerizing blend of two existing species. The outcomes of this audacious exploration are nothing short of astonishing, capturing both our imagination and sense of wonder.
Prepare to be enthralled as these enchanting hybrids transcend the realm of imagination, sparking a sense of awe and amusement. With each carefully crafted combination, Macilento has forged a pathway into uncharted territory, uncovering a tapestry of creatures that will leave you spellbound.
Researchers have demoed an AI tool called DragGAN that lets you simply click and drag an image to edit it realistically in seconds. It’s like Photoshop’s Warp tool, but far more powerful: you’re not just smushing pixels around, you’re using AI to re-generate the underlying object. You can even rotate images as if they were 3D. The latest example is only a research paper for now, but a very impressive one, letting users drag elements of a picture to change their appearance.
This doesn’t sound too exciting on the face of it, but take a look at the examples below to get an idea of what this system can do. Not only can you change the dimensions of a car or manipulate a smile into a frown with a simple click and drag, but you can rotate a picture’s subject as if it were a 3D model — changing the direction someone is facing, for example. One demo even shows the user adjusting the reflections on a lake and height of a mountain range with a few clicks.
Snapchat influencer Caryn Marjorie is harnessing the power of the AI revolution to fulfill the dreams of millions of her followers: becoming their girlfriend. She’s making some serious cash too.
Marjorie, using OpenAI's GPT technology, has created CarynAI, an AI avatar that offers virtual companionship for a dollar per minute. It’s not a totally unfamiliar venture in the world of AI, but it’s yet another example of how the lines that separate the real from the unreal in AI are getting a little more blurry.
With 1.8 million followers on Snapchat, Marjorie, age 23, has a vast audience, and CarynAI—her digital doppelganger developed by the AI company Forever Voices—is helping her achieve (and profit from) things that are not otherwise physically possible. She now has over 1,000 virtual boyfriends who pay $1 per minute to engage in all kinds of interactions—from simple chitchat to making plans for their futures together, and even more intimate exchanges.
Her AI-powered alter ego has raked in $71,610 in revenue, according to Fortune, in just one week. She expects to make around $5 million per month if just 20,000 of the 1.8 million dudes who follow her on Snapchat subscribe to CarynAI (99% of her fanbase is male, as you might expect).
The Forever Voices team trained the CarynAI model by analyzing 2,000 hours of Marjorie’s now-deleted YouTube content to build her speech and personality engine. They then used GPT-4 to create the most lifelike version of Marjorie, providing not just believable responses, but interactions that are based on an extensive dataset of Caryn’s own natural behavior. The company has also created chatbots of other famous influencers.
CarynAI’s official site stresses that messages are end-to-end encrypted, making them more difficult for a third party to intercept. This also addresses the concerns of many users about privacy when interacting with AI models.
Researchers from the University of Florida have released a study suggesting that the AI model ChatGPT can reliably predict stock market trends. Using public markets data and news from October 2021 to December 2022, their testing found that trading models powered by ChatGPT could generate returns exceeding 500% in this period. This performance stands in stark contrast to the -12% return from buying and holding an S&P 500 ETF during the same timeframe. The study also underscored ChatGPT's superior performance over other language models, including GPT-1, GPT-2, and BERT, as well as traditional sentiment analysis methods.
Impressive Results Across Several Strategies
The team tested six different investing strategies during the October 2021 to December 2022 period.
The Long-Short strategy, which involved buying companies with good news and short-selling those with bad news, yielded the highest returns, at over 500%.
The Short-only strategy, focusing solely on short-selling companies with bad news, returned nearly 400%.
The Long-only strategy, which only involved buying companies with good news, returned roughly 50%.
Three other strategies resulted in net losses: the “All News” hold strategy, the Equally-Weighted hold strategy, and the Market Value-Weighted hold strategy.
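To make the headline-based strategies concrete, here is a minimal sketch of how a chat model's verdict on a headline could be turned into a long/short position. The prompt wording, model name, and helper function are illustrative assumptions, not the researchers' code.

```python
# Illustrative sketch of turning headline sentiment into a Long-Short signal.
# Not the study's code; the prompt and model choice are assumptions.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def headline_signal(company: str, headline: str) -> int:
    """Return +1 (go long), -1 (go short), or 0 (no position)."""
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": (
                "You are a financial expert. Answer with exactly one word: "
                "GOOD if the headline is good news for the company's stock price, "
                "BAD if it is bad news, or UNKNOWN if you cannot tell."
            )},
            {"role": "user", "content": f"Company: {company}\nHeadline: {headline}"},
        ],
    )
    answer = response.choices[0].message.content.strip().upper()
    return {"GOOD": 1, "BAD": -1}.get(answer, 0)

# Long-Short: buy on good news, short on bad news.
print(headline_signal("Acme Corp", "Acme Corp beats quarterly earnings estimates"))
```

The Short-only and Long-only variants described above simply keep only the -1 or only the +1 signals, respectively.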
When benchmarking against other methods, such as sentiment analysis and older language models like GPT-1, GPT-2, and BERT, ChatGPT consistently outperformed the competition. Traditional sentiment analysis methods produced markedly inferior results across all investment strategies, while GPT-1, GPT-2, and BERT failed to accurately predict returns.
Implications for the Finance Industry
Two Sigma, DE Shaw, and Renaissance Technologies are among the prominent hedge funds that incorporate sentiment analysis into their automated trading systems, and numerous boutique hedge funds also use sentiment signals as part of their proprietary strategies. ChatGPT’s strong performance at understanding headlines and their implications could fuel a new arms race as funds compete for an edge from generative AI.
ChatGPT’s powerful natural language processing abilities could also threaten businesses that have developed their own proprietary sentiment analysis machine learning models, which could find themselves outperformed by a simple ChatGPT prompt. Notably, companies like Lexalytics that claim “world-leading NLP” could find themselves with both an opportunity and a challenge in this market as generative AI tools emerge and make past models obsolete.
The stock-picking edge that ChatGPT has demonstrated could also empower retail traders. Notably, subreddits like r/WallStreetBets are filled with due diligence posts (called “DDs”) and bragging posts on stock returns from various long and short strategies. ChatGPT, with its ability to deduce nuance and second-order implications from just understanding headlines, could help ambitious retail traders in their own efforts to generate outsized returns.
Investors expect the next few years to be very dynamic in the finance space as generative AI takes hold. Unsurprisingly, the technology-focused ARK Investment Management is especially bullish. In its 2023 Big Ideas report, ARK founder Cathie Wood identified AI as one of the trends that will define this technological era. “We've been working on artificial intelligence for a long time now,” said Wood, “and I think some of the things we're seeing are just the beginning of the impact that artificial intelligence is going to have on every sector, every industry, and every company.”