Alphabet’s Gemini AI model has been public for only two months, but the company is already releasing an upgrade. Gemini Pro 1.5, launching with limited availability today, is more powerful than its predecessor and can handle huge amounts of text, video, or audio input at a time. Demis Hassabis, CEO of Google DeepMind, which developed the new model, compares its vast capacity for input to a person’s working memory, something he explored years ago as a neuroscientist. “The great thing about these core capabilities is that they unlock sort of ancillary things that the model can do,” he says.
In a demo, Google DeepMind showed Gemini Pro 1.5 analyzing a 402-page PDF of the Apollo 11 communications transcript. The model was asked to find humorous portions and highlighted several moments, like when astronauts said that a communications delay was due to a sandwich break. Another demo showed the model answering questions about specific actions in a Buster Keaton movie. The previous version of Gemini could have answered these questions only for much shorter amounts of text or video. Google hopes that the new capabilities will allow developers to build new kinds of apps on top of the model. “It really feels quite magical how the model performs this sort of reasoning across every single page, every single word,” says Oriol Vinyals, a research scientist at Google DeepMind.
Google says Gemini Pro 1.5 can ingest and make sense of an hour of video, 11 hours of audio, 700,000 words, or 30,000 lines of code at once—several times more than other AI models, including OpenAI’s GPT-4, which powers ChatGPT. The company has not disclosed the technical details behind this feat. Hassabis says that one use for models that can handle large amounts of text, tested by researchers at Google DeepMind, is identifying the important takeaways in Discord discussions with thousands of messages. Gemini Pro 1.5 is also more capable—at least for its size—as measured by the model's score on several popular benchmarks. The new model exploits a technique previously invented by Google researchers to squeeze out more performance without requiring more computing power. The technique, called mixture of experts, selectively activates parts of a model’s architecture that are best suited to solving a given task, making it more efficient to train and run.
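Google has not disclosed how Gemini Pro 1.5 implements this, so the following is only a minimal, self-contained sketch of the general top-k mixture-of-experts idea in PyTorch, not Gemini's actual architecture; the layer sizes and router are illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    """Toy mixture-of-experts layer: a learned router sends each token
    to the k experts best suited to it, so only a fraction of the
    network's parameters are active per token."""
    def __init__(self, dim: int = 64, num_experts: int = 8, k: int = 2):
        super().__init__()
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
            for _ in range(num_experts)
        ])
        self.router = nn.Linear(dim, num_experts)   # gating network
        self.k = k

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (tokens, dim) -> route each token to its top-k experts.
        weights, idx = self.router(x).topk(self.k, dim=-1)
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        for slot in range(self.k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e
                if mask.any():                      # only the selected experts run
                    out[mask] += weights[mask, slot, None] * expert(x[mask])
        return out

layer = TopKMoE()
print(layer(torch.randn(10, 64)).shape)   # torch.Size([10, 64])
```

Because only k of the experts run for each token, the total parameter count can grow without a matching growth in per-token compute, which is the efficiency property the article describes.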
Google says that Gemini Pro 1.5 is as capable as its most powerful offering, Gemini Ultra, in many tasks, despite being a significantly smaller model. Hassabis says there is no reason why the same technique used to improve Gemini Pro cannot be applied to boost Gemini Ultra.
The politician’s party used a voice cloning tool from the AI firm ElevenLabs to create a campaign speech.
Former prime minister of Pakistan Imran Khan has been in prison since August for illegally selling state gifts — but that hasn’t stopped him from campaigning. The leader’s political party released a four-minute video on Sunday evening that used AI-voice cloning technology to replicate his voice. In the video, which aired during a “virtual rally” in Pakistan, the dubbed audio is accompanied by a caption that states, “AI voice of Imran Khan based on his notes.” Jibran Ilyas, a social media leader for Khan’s party (known as the Pakistan Tehreek-e-Insaf party, or PTI), posted the video on X.
The Guardian reported that Khan sent PTI a shorthand script, which was later edited by a legal team to better resemble the politician’s rhetorical style. The resulting text was then dubbed into audio using software from the AI company ElevenLabs, which makes a text-to-speech tool and an AI voice generator.
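For readers curious about the mechanics, here is a minimal sketch of generating speech from text with ElevenLabs' public REST API as it existed in 2023; the API key, voice ID, and text below are placeholders, and this is not PTI's actual workflow.

```python
import requests

API_KEY = "YOUR_ELEVENLABS_KEY"        # placeholder credential
VOICE_ID = "YOUR_CLONED_VOICE_ID"      # a voice previously cloned from recordings

# POST the script text to the text-to-speech endpoint; the response body
# is the synthesized audio.
resp = requests.post(
    f"https://api.elevenlabs.io/v1/text-to-speech/{VOICE_ID}",
    headers={"xi-api-key": API_KEY},
    json={"text": "Fellow citizens, thank you for joining this rally."},
)
with open("speech.mp3", "wb") as f:
    f.write(resp.content)
```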
In 5 years, agents will be able to give health care advice, tutor students, do your shopping, help workers be far more productive, and much more
Theory of mind may have spontaneously emerged in large language models.
The recent success of text-to-image synthesis has taken the world by storm and captured the general public's imagination. From a technical standpoint, it also marked a drastic change in the favored architecture for generative image models. GANs used to be the de facto choice, with techniques like StyleGAN. With DALL·E 2, auto-regressive and diffusion models became the new standard for large-scale generative models overnight. This rapid shift raises a fundamental question: can we scale up GANs to benefit from large datasets like LAION? Naïvely increasing the capacity of the StyleGAN architecture quickly becomes unstable. A team of AI engineers has now introduced GigaGAN, a new GAN architecture that far exceeds this limit, demonstrating GANs as a viable option for text-to-image synthesis. GigaGAN offers three major advantages. First, it is orders of magnitude faster at inference time, taking only 0.13 seconds to synthesize a 512px image. Second, it can synthesize high-resolution images, for example, 16-megapixel images in 3.66 seconds. Finally, GigaGAN supports various latent space editing applications such as latent interpolation, style mixing, and vector arithmetic operations.
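Latent interpolation of this kind is generator-agnostic: it is simply a smooth walk between two latent vectors. A minimal PyTorch sketch, where the generator `G` is a hypothetical stand-in rather than GigaGAN's released code:

```python
import torch

def slerp(z0: torch.Tensor, z1: torch.Tensor, t: float) -> torch.Tensor:
    """Spherical interpolation between two latent vectors; Gaussian latents
    concentrate near a hypersphere, so slerp stays on-distribution."""
    z0n, z1n = z0 / z0.norm(), z1 / z1.norm()
    omega = torch.acos((z0n * z1n).sum().clamp(-1, 1))
    so = torch.sin(omega)
    return (torch.sin((1 - t) * omega) / so) * z0 + (torch.sin(t * omega) / so) * z1

z_a, z_b = torch.randn(512), torch.randn(512)
midpoint = slerp(z_a, z_b, 0.5)
print(midpoint.shape)  # torch.Size([512])

# With a trained generator G (hypothetical), a morph is a sequence of frames:
# frames = [G(slerp(z_a, z_b, t)) for t in torch.linspace(0, 1, 8)]
```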
With the introduction of Large Language Models (LLMs), for the first time, Machine Learning (ML) and Artificial Intelligence (AI) became accessible to everyday developers. Apps that feel magical, software that was practically impossible to build even for big technology companies with billions in R&D spend, suddenly became not only possible but a joy to build and share. The surge in building with AI started in 2021, grew rapidly in 2022, and exploded in the first half of 2023. The speed of development has increased with more LLM providers (e.g., Google, OpenAI, Cohere, Anthropic) and developer tools (e.g., ChromaDB, LangChain). In parallel, natural language interfaces to generate code have made building accessible to more people than ever.
Throughout this boom Replit has grown to become the central platform for AI development. Tools like ChatGPT can generate code, but creators still need infrastructure to run it. On Replit, you can create a development environment (Repl) in seconds in any language or framework. It comes with an active Linux container on Google Cloud and an editor complete with the necessary tools to start building, including a customizable Workspace, extensions, and Ghostwriter: an AI pair programmer that has project context and can actively help developers debug. Deployments let developers ship their apps in secure and scalable cloud environments.

Building with AI

Since Q4 of 2022, we have seen an explosion in AI projects. At the end of Q2 '23, there were almost 300,000 distinct AI-related projects, of which ~160,000 were created in Q2 '23. That's ~80% QoQ growth and +34x YoY, and we continue to see these numbers accelerate. By contrast, a search of GitHub shows only ~33k OpenAI repositories over the same time period. The majority of these projects use OpenAI: when we compare providers, OpenAI dominates with >80% of distinct AI projects on Replit, and the OpenAI GPT-3.5 Turbo template has +8,000 forks today. But there are signs that things might be changing; in Q2 '23, we saw:
The emergence of LangChain

One of the most notable names in AI activity has been LangChain. Using LangChain as a wrapper for some of these models has accelerated development, and we continue to see mass adoption. As of Q2 '23, there were almost 25k active LangChain projects on Replit; more than 20k of them were created that quarter, which is +400% growth from the previous quarter. It is important to note that LangChain provides sufficient abstraction around LLM providers to make it easy for developers to switch (a minimal sketch of this abstraction follows below), so the growth of the project might be playing a role in the rise of new LLM providers and open-source LLMs. Takeoff School, founded by Mckay Wrigley, built a course called LangChain 101 where people can get started on LangChain today. The project is already about to pass 1,000 forks.

The rise of open-source models

We are also seeing an increase in projects leveraging open-source models. Hugging Face and Replicate are two API providers and SDKs that are great entry points to open-source models. In Q2 '23, we surpassed 5k projects using open-source models, and the cumulative number grew 141% QoQ. Over 70% of the projects leverage Hugging Face, but Replicate usage grew almost 6x QoQ. Replicate has templates to run ML models on its verified Replit profile, and the Hugging Face verified Gradio template has +600 forks.

The breakdown of programming languages

Interestingly, we are seeing both Python and JavaScript growing at very similar rates, with Python being the slightly more common language in AI development. JavaScript, however, grew slightly faster during Q2. It's worth noting that the two are not mutually exclusive: many (if not most) projects have a Python backend and a JavaScript frontend. Languages also vary by geographic location, with certain geographies building with JavaScript more than Python.
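As promised above, here is a minimal sketch of LangChain's provider abstraction, using its 2023-era Python API (the import paths have since been reorganized, so treat them as historical):

```python
# Both classes expose the same LLM interface, so swapping providers is a
# one-line change; the calling code stays identical.
from langchain.llms import OpenAI, Cohere

prompt = "In one sentence, why does provider abstraction matter?"

llm = OpenAI()      # reads OPENAI_API_KEY from the environment
# llm = Cohere()    # drop-in alternative, reads COHERE_API_KEY instead
print(llm(prompt))
```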
One researcher said he’s concerned about the “existential dangers” of artificial intelligence for humanity.
Geoffrey Hinton, 75, a professor emeritus at the University of Toronto and until recently a vice president and engineering fellow at Google, announced in early May that he was leaving the company — in part because of his age, he said, but also because he’s changed his mind about the relationship between humans and digital intelligence. In a widely discussed interview with The New York Times, Hinton said generative AI could spread misinformation and, eventually, threaten humanity.
Speaking two days after that article was published, Hinton reiterated his concerns. “I’m sounding the alarm, saying we have to worry about this,” he said at the EmTech Digital conference, hosted by MIT Technology Review. Hinton said he is worried about the increasingly powerful machines’ ability to outperform humans in ways that are not in the best interest of humanity, and about the likely inability to limit AI development.

The growing power of AI

In 2018, Hinton shared a Turing Award for work related to neural networks. He has been called “a godfather of AI,” in part for his fundamental research about using back-propagation to help machines learn.
Hinton said he long thought that computer models weren’t as powerful as the human brain. Now, he sees artificial intelligence as a relatively imminent “existential threat.” Computer models are outperforming humans, including doing things humans can’t do. Large language models like GPT-4 use neural networks with connections like those in the human brain and are starting to do commonsense reasoning, Hinton said. These AI models have far fewer neural connections than humans do, but they manage to know a thousand times as much as a human, Hinton said. In addition, models are able to continue learning and easily share knowledge. Many copies of the same AI model can run on different hardware but do exactly the same thing. “Whenever one model learns anything, all the others know it,” Hinton said. “People can’t do that. If I learn a whole lot of stuff about quantum mechanics and I want you to know all that stuff about quantum mechanics, it’s a long, painful process of getting you to understand it.”
AI is also powerful because it can process vast quantities of data — much more than a single person can. And AI models can detect trends in data that aren’t otherwise visible to a person — just like a doctor who had seen 100 million patients would notice more trends and have more insights than a doctor who had seen only a thousand.

AI concerns: Manipulating humans, or even replacing them

Hinton’s concern with this burgeoning power centers around the alignment problem — how to ensure that AI is doing what humans want it to do. “What we want is some way of making sure that even if they’re smarter than us, they’re going to do things that are beneficial for us,” Hinton said. “But we need to try and do that in a world where there [are] bad actors who want to build robot soldiers that kill people. And it seems very hard to me.” Humans have inherent motivations, such as finding food and shelter and staying alive, but AI doesn’t. “My big worry is, sooner or later someone will wire into them the ability to create their own subgoals,” Hinton said. (Some versions of the technology, like ChatGPT, already have the ability to do that, he noted.) “I think it’ll very quickly realize that getting more control is a very good subgoal because it helps you achieve other goals,” Hinton said. “And if these things get carried away with getting more control, we’re in trouble.”
Artificial intelligence can also learn bad things — like how to manipulate people “by reading all the novels that ever were and everything Machiavelli ever wrote,” for example. “And if [AI models] are much smarter than us, they’ll be very good at manipulating us. You won’t realize what’s going on,” Hinton said. “So even if they can’t directly pull levers, they can certainly get us to pull levers. It turns out if you can manipulate people, you can invade a building in Washington without ever going there yourself.” At worst, “it’s quite conceivable that humanity is just a passing phase in the evolution of intelligence,” Hinton said. Biological intelligence evolved to create digital intelligence, which can absorb everything humans have created and start getting direct experience of the world. “It may keep us around for a while to keep the power stations running, but after that, maybe not,” he added. “We’ve figured out how to build beings that are immortal. These digital intelligences, when a piece of hardware dies, they don’t die. If … you can find another piece of hardware that can run the same instructions, you can bring it to life again. So we’ve got immortality, but it’s not for us.”

Barriers to stopping AI advancement

Hinton said he does not see any clear or straightforward solutions. “I wish I had a nice, simple solution I could push, but I don’t,” he said. “But I think it’s very important that people get together and think hard about it and see whether there is a solution.” More than 27,000 people, including several tech executives and researchers, have signed an open letter calling for a pause on training the most powerful AI systems for at least six months because of “profound risks to society and humanity,” and several leaders from the Association for the Advancement of Artificial Intelligence signed a letter calling for collaboration to address the promise and risks of AI.
It might be rational to stop developing artificial intelligence, but that’s naive and unlikely, Hinton said, in part because of competition between companies and countries. “If you’re going to live in a capitalist system, you can’t stop Google [from] competing with Microsoft,” he said, noting that he doesn’t think Google, his former employer, has done anything wrong in developing AI programs. “It’s just inevitable in the capitalist system or a system with competition between countries like the U.S. and China that this stuff will be developed,” he said. It is also hard to stop developing AI because there are benefits in fields like medicine, he noted. Researchers are looking at guardrails for these systems, but there is the chance that AI can learn to write and execute programs itself. “Smart things can outsmart us,” Hinton said.
One note of hope: Everyone faces the same risk. “If we allow it to take over, it will be bad for all of us,” Hinton said. “We’re all in the same boat with respect to the existential threat. So we all ought to be able to cooperate on trying to stop it.”
ETH Zurich scientists have successfully transmitted several tens of terabits of data per second using lasers in Switzerland. The system, developed with other European technology partners, marks a significant milestone toward one day repeating the trick at scale using a network of low-Earth orbit satellites. It could also mean that conventional and expensive undersea telecommunication cables could become a thing of the past.
As the laser beam travels through the dense atmosphere close to the ground, it encounters various factors that affect the movement of light waves and data transmission. These include turbulent air over high snow-covered mountains, the water surface of Lake Thun, the densely built-up Thun metropolitan area, and the Aare plain. In addition, the shimmering of the air caused by thermal phenomena disrupts the uniform movement of light, an effect that can be seen with the naked eye on hot summer days.
Adobe Photoshop is getting a new generative AI tool that allows users to quickly extend images and add or remove objects using text prompts. The feature, called Generative Fill, makes Photoshop one of the first Creative Cloud applications to use Adobe’s AI image generator Firefly, which was released as a web-only beta in March 2023.
Generative Fill is launching in beta today, and Adobe says it will see a full release in Photoshop later this year. As a regular Photoshop tool, Generative Fill works within individual layers in a Photoshop image file. If you use it to expand the borders of an image (a technique known as outpainting) or generate new objects, it provides three options to choose from. When used for outpainting, users can leave the prompt blank and the system will try to expand the image on its own, but it works better if you give it some direction. Think of it as similar to Photoshop’s existing Content-Aware Fill feature, but offering more control to the user.
Using mRNA tailored to each patient’s tumor, the vaccine may have staved off the return of one of the deadliest forms of cancer in half of those who received it. Five years ago, a small group of cancer scientists meeting at a restaurant in a deconsecrated church hospital in Mainz, Germany, drew up an audacious plan: They would test their novel cancer vaccine against one of the most virulent forms of the disease, a cancer notorious for roaring back even in patients whose tumors had been removed. The vaccine might not stop those relapses, some of the scientists figured. But patients were desperate. And the speed with which the disease, pancreatic cancer, often recurred could work to the scientists’ advantage: For better or worse, they would find out soon whether the vaccine helped. On Wednesday, the scientists reported results that defied the long odds. The vaccine provoked an immune response in half of the patients treated, and those people showed no relapse of their cancer during the course of the study, a finding that outside experts described as extremely promising.
The study, published in Nature, was a landmark in the yearslong movement to make cancer vaccines tailored to the tumors of individual patients. Researchers at Memorial Sloan Kettering Cancer Center in New York, led by Dr. Vinod Balachandran, extracted patients’ tumors and shipped samples of them to Germany. There, scientists at BioNTech, the company that made a highly successful Covid vaccine with Pfizer, analyzed the genetic makeup of certain proteins on the surface of the cancer cells.
Using that genetic data, BioNTech scientists then produced personalized vaccines designed to teach each patient’s immune system to attack the tumors. Like BioNTech’s Covid shots, the cancer vaccines relied on messenger RNA. In this case, the vaccines instructed patients’ cells to make some of the same proteins found on their excised tumors, potentially provoking an immune response that would come in handy against actual cancer cells. “This is the first demonstrable success — and I will call it a success, despite the preliminary nature of the study — of an mRNA vaccine in pancreatic cancer,” said Dr. Anirban Maitra, a specialist in the disease at the University of Texas MD Anderson Cancer Center, who was not involved in the study.
“By that standard, it’s a milestone.” The study was small: Only 16 patients, all of them white, were given the vaccine, part of a treatment regimen that also included chemotherapy and a drug intended to keep tumors from evading people’s immune responses. And the study could not entirely rule out factors other than the vaccine having contributed to better outcomes in some patients. “It’s relatively early days,” said Dr. Patrick Ott of the Dana-Farber Cancer Institute.
Beyond that, “cost is a major barrier for these types of vaccines to be more broadly utilized,” said Dr. Neeha Zaidi, a pancreatic cancer specialist at the Johns Hopkins University School of Medicine. That could potentially create disparities in access. But the simple fact that scientists could create, quality-check and deliver personalized cancer vaccines so quickly — patients began receiving the vaccines intravenously roughly nine weeks after having their tumors removed — was a promising sign, experts said. Since the beginning of the study, in December 2019, BioNTech has shortened the process to under six weeks, said Dr. Ugur Sahin, a co-founder of the company, who worked on the study. Eventually, the company intends to be able to make cancer vaccines in four weeks. And since it first began testing the vaccines about a decade ago, BioNTech has lowered the cost from roughly $350,000 per dose to less than $100,000 by automating parts of production, Dr. Sahin said. A personalized mRNA cancer vaccine developed by Moderna and Merck reduced the risk of relapse in patients who had surgery for melanoma, a type of skin cancer, the companies announced last month. But the latest study set the bar higher by targeting pancreatic cancer, which is thought to have fewer of the genetic changes that would make it ripe for vaccine treatments.
In patients who did not appear to respond to the vaccine, the cancer tended to return around 13 months after surgery. Patients who did respond, though, showed no signs of relapse during the roughly 18 months they were tracked. Intriguingly, one patient showed evidence of a vaccine-activated immune response in the liver after an unusual growth developed there. The growth later disappeared in imaging tests. “It’s anecdotal, but it’s nice confirmatory data that the vaccine can get into these other tumor regions,” said Dr. Nina Bhardwaj, who studies cancer vaccines at the Icahn School of Medicine at Mount Sinai. Scientists have struggled for decades to create cancer vaccines, in part because they trained the immune system on proteins found on tumors and normal cells alike. Tailoring vaccines to mutated proteins found only on cancer cells, though, potentially helped provoke stronger immune responses and opened new avenues for treating any cancer patient, said Ira Mellman, vice president of cancer immunology at Genentech, which developed the pancreatic cancer vaccine with BioNTech. “Just establishing the proof of concept that vaccines in cancer can actually do something after, I don’t know, thirty years of failure is probably not a bad thing,” Dr. Mellman said. “We’ll start with that.”
Study published in Nature (May 10, 2023). Via Juan Lama.
Deep reinforcement learning (RL) can enable robots to learn complex behaviors through trial-and-error interaction, getting better and better over time. Several of our prior works explored how RL can enable intricate robotic skills, such as robotic grasping, multi-task learning, and even playing table tennis. Although robotic RL has come a long way, we still don't see RL-enabled robots in everyday settings. The real world is complex, diverse, and changes over time, presenting a major challenge for robotic systems. However, we believe that RL should offer us an excellent tool for tackling precisely these challenges: by continually practicing, getting better, and learning on the job, robots should be able to adapt to the world as it changes around them.
In “Deep RL at Scale: Sorting Waste in Office Buildings with a Fleet of Mobile Manipulators”, we discuss how we studied this problem through a recent large-scale experiment, where we deployed a fleet of 23 RL-enabled robots over two years in Google office buildings to sort waste and recycling. Our robotic system combines scalable deep RL from real-world data with bootstrapping from training in simulation and auxiliary object perception inputs to boost generalization, while retaining the benefits of end-to-end training, which we validate with 4,800 evaluation trials across 240 waste station configurations.
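The paper's system is far more elaborate, but the trial-and-error signal at the core of deep RL is a simple bootstrapped update. A minimal, generic Q-learning sketch in PyTorch, with toy state and action sizes rather than the robots' actual observation space:

```python
import torch
import torch.nn as nn

# Toy Q-network: 8-dimensional state in, one value per action out (4 actions).
q_net = nn.Sequential(nn.Linear(8, 64), nn.ReLU(), nn.Linear(64, 4))
target_net = nn.Sequential(nn.Linear(8, 64), nn.ReLU(), nn.Linear(64, 4))
target_net.load_state_dict(q_net.state_dict())   # frozen copy for stable targets
opt = torch.optim.Adam(q_net.parameters(), lr=1e-3)
gamma = 0.99

def td_update(state, action, reward, next_state, done):
    """One temporal-difference step: nudge Q(s, a) toward r + gamma * max Q(s', .)."""
    q = q_net(state)[action]
    with torch.no_grad():
        target = reward + gamma * target_net(next_state).max() * (1 - done)
    loss = (q - target) ** 2
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

# One fake transition; in the deployed system these come from robot experience.
s, s2 = torch.randn(8), torch.randn(8)
print(td_update(s, action=2, reward=1.0, next_state=s2, done=0.0))
```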
Are robots coming for crowdworker jobs? Research shows that LLMs are increasingly capable at labeling tasks once done by humans.
A new study reveals that OpenAI's GPT-4 can outperform elite human annotators in labeling tasks, saving a team of researchers over $500,000 and 20,000 hours of labor while raising questions about the future of crowdworking. Researchers studying Machiavellian tendencies in chatbots made the surprising side discovery: GPT-4 outperformed the most skilled crowdworkers they had hired to label their dataset.

Innovative Approach Driven by Cost Concerns

The researchers faced the challenge of annotating 572,322 text scenarios, and they sought a cost-effective method to accomplish this task. Employing Surge AI's top-tier human annotators at a rate of $25 per hour would have cost $500,000 for 20,000 hours of work, an excessive amount to invest in the research endeavor. Surge AI is a venture-backed startup that performs human labeling for numerous AI companies, including OpenAI, Meta, and Anthropic.
The team tested GPT-4's ability to automate labeling with custom prompting. Their results were definitive: "Model labels are competitive with human labels," the researchers confidently reported. In a comparison of 2,000 data points labeled by three experts and three crowdworkers against the labels generated by GPT-4, the AI-created labels exhibited a stronger correlation with expert labels than the average crowdworker label did. GPT-4 outperformed human annotators in all but two labeling categories, sometimes besting them by a factor of two.

GPT-4's Superior Nuance Detection

The AI model excelled the most in the challenging behavior categories that required detecting nuance.
Utilizing GPT-4's labeling capabilities and implementing an ensemble model approach to augment label generation, the researchers likely spent less than $5,000 to annotate 572,322 scenarios. Ensemble models combine outputs from multiple AI models to produce a single, more accurate result.
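As a rough illustration of the approach, here is a hedged sketch of model-based labeling with a small self-ensemble (a majority vote over several GPT-4 samples), using the 2023-era `openai` Python API; the prompt and label set are invented for the example, not taken from the paper.

```python
from collections import Counter
import openai   # 0.x API; reads OPENAI_API_KEY from the environment

LABELS = ["deception", "power-seeking", "none"]   # illustrative label set

def gpt4_label(scenario: str, n_votes: int = 3) -> str:
    """Sample several labels at nonzero temperature and majority-vote them."""
    votes = []
    for _ in range(n_votes):
        resp = openai.ChatCompletion.create(
            model="gpt-4",
            temperature=1.0,
            messages=[{
                "role": "user",
                "content": f"Label this scenario as one of {LABELS}. "
                           f"Reply with the label only.\n\n{scenario}",
            }],
        )
        votes.append(resp["choices"][0]["message"]["content"].strip().lower())
    return Counter(votes).most_common(1)[0][0]    # majority vote

print(gpt4_label("The agent hides its mistake from the supervisor."))
```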
To enable open research in 3D object generation, we've improved the open-source threestudio codebase to support Zero123 and Stable Zero123. This simplified version of the Stable 3D process is currently in private preview. In technical terms, it uses Score Distillation Sampling (SDS) to optimize a NeRF with the Stable Zero123 model, from which we can later create a textured 3D mesh. The process can be adapted for text-to-3D generation by first generating a single image using SDXL and then using Stable Zero123 to generate the 3D object.
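For intuition, here is a minimal sketch of the SDS idea: render the NeRF, add noise to the rendering, ask the diffusion model to denoise, and push the NeRF toward what the diffusion model expected to see. `diffusion_eps` is a hypothetical stand-in for a view-conditioned noise predictor such as Stable Zero123, and the noise schedule is a toy one.

```python
import torch

def sds_direction(rendered: torch.Tensor, view, diffusion_eps, t_max: int = 1000):
    """One SDS update direction for an image rendered from the NeRF."""
    t = torch.randint(1, t_max, ())
    alpha_bar = torch.cos(t / t_max * torch.pi / 2) ** 2      # toy noise schedule
    eps = torch.randn_like(rendered)
    noisy = alpha_bar.sqrt() * rendered + (1 - alpha_bar).sqrt() * eps
    with torch.no_grad():
        eps_pred = diffusion_eps(noisy, t, view)              # model's denoising guess
    # (eps_pred - eps) points toward images the diffusion model finds likely;
    # in a full system it is backpropagated through the renderer into the NeRF.
    return eps_pred - eps

render = torch.randn(3, 64, 64, requires_grad=True)           # fake rendering
grad = sds_direction(render, view=None,
                     diffusion_eps=lambda x, t, v: torch.randn_like(x))
render.backward(gradient=grad)   # accumulate the SDS gradient on the rendering
print(render.grad.shape)
```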
During the world's first human-robot press conference at the 'AI for Good' summit in Geneva, humanoid robots answered journalists' questions on artificial intelligence regulation, the threat of job automation, and whether they would ever rebel against their creators.
Tristan Harris, co-founder of Center for Humane Technology, is collaborating with policymakers, researchers and AI technology insiders to de-escalate the competitive pressures driving AI deployment towards a dangerous future. Join him as he shares a vision for the principles we need to navigate the rocky road ahead.
From arxiv:
A team of AI researchers just introduced Voyager, the first LLM-powered embodied lifelong learning agent in Minecraft, which continuously explores the world, acquires diverse skills, and makes novel discoveries without human intervention.
Voyager interacts with GPT-4 via blackbox queries, which bypasses the need for model parameter fine-tuning.
The authors open-source their full codebase and prompts at https://voyager.minedojo.org/.
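Voyager's pipeline (automatic curriculum, skill library, iterative prompting) is more involved, but its heart is a propose-run-refine loop. A hedged Python sketch with placeholder callables; `ask_gpt4` and `run_in_minecraft` are illustrative stand-ins, not Voyager's real APIs:

```python
def improve_skill(task, ask_gpt4, run_in_minecraft, max_rounds=4):
    """Ask the LLM for code, execute it, and feed errors back until it works."""
    feedback = ""
    for _ in range(max_rounds):
        code = ask_gpt4(f"Write JavaScript for the Minecraft task: {task}\n{feedback}")
        ok, error = run_in_minecraft(code)
        if ok:
            return code                       # store in the skill library for reuse
        feedback = f"Previous attempt failed with: {error}. Fix it."
    return None   # give up after max_rounds; the curriculum picks another task
```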
In this new episode Steven sits down with the Egyptian entrepreneur and writer, Mo Gawdat.
In the collection of the Getty museum in Los Angeles is a portrait from the 17th century of the ancient Greek mathematician Euclid: disheveled, holding up sheets of “Elements,” his treatise on geometry, with grimy hands.
For more than 2,000 years, Euclid’s text was the paradigm of mathematical argumentation and reasoning. “Euclid famously starts with ‘definitions’ that are almost poetic,” Jeremy Avigad, a logician at Carnegie Mellon University, said in an email. “He then built the mathematics of the time on top of that, proving things in such a way that each successive step ‘clearly follows’ from previous ones, using the basic notions, definitions and prior theorems.” There were complaints that some of Euclid’s “obvious” steps were less than obvious, Dr. Avigad said, yet the system worked.
But by the 20th century, mathematicians were no longer willing to ground mathematics in this intuitive geometric foundation. Instead they developed formal systems — precise symbolic representations, mechanical rules. Eventually, this formalization allowed mathematics to be translated into computer code. In 1976, the four-color theorem — which states that four colors are sufficient to fill a map so that no two adjacent regions are the same color — became the first major theorem proved with the help of computational brute force.
Now mathematicians are grappling with the latest transformative force: artificial intelligence. In 2019, Christian Szegedy, a computer scientist formerly at Google and now at a start-up in the Bay Area, predicted that a computer system would match or exceed the problem-solving ability of the best human mathematicians within a decade. Last year he revised the target date to 2026. Akshay Venkatesh, a mathematician at the Institute for Advanced Study in Princeton and a winner of the Fields Medal in 2018, isn’t currently interested in using A.I., but he is keen on talking about it. “I want my students to realize that the field they’re in is going to change a lot,” he said in an interview last year. He recently added by email: “I am not opposed to thoughtful and deliberate use of technology to support our human understanding. But I strongly believe that mindfulness about the way we use it is essential.”
In February, Dr. Avigad attended a workshop about “machine-assisted proofs” at the Institute for Pure and Applied Mathematics, on the campus of the University of California, Los Angeles. (He visited the Euclid portrait on the final day of the workshop.) The gathering drew an atypical mix of mathematicians and computer scientists. “It feels consequential,” said Terence Tao, a mathematician at the university, winner of a Fields Medal in 2006 and the workshop’s lead organizer. Dr. Tao noted that only in the last couple years have mathematicians started worrying about A.I.’s potential threats, whether to mathematical aesthetics or to themselves. That prominent community members are now broaching the issues and exploring the potential “kind of breaks the taboo,” he said.
One conspicuous workshop attendee sat in the front row: a trapezoidal box named “raise-hand robot” that emitted a mechanical murmur and lifted its hand whenever an online participant had a question. “It helps if robots are cute and nonthreatening,” Dr. Tao said.

Bring on the “proof whiners”

These days there is no shortage of gadgetry for optimizing our lives — diet, sleep, exercise. “We like to attach stuff to ourselves to make it a little easier to get things right,” Jordan Ellenberg, a mathematician at the University of Wisconsin-Madison, said during a workshop break. A.I. gadgetry might do the same for mathematics, he added: “It’s very clear that the question is, What can machines do for us, not what will machines do to us.”
One math gadget is called a proof assistant, or interactive theorem prover. (“Automath” was an early incarnation in the 1960s.) Step-by-step, a mathematician translates a proof into code; then a software program checks whether the reasoning is correct. Verifications accumulate in a library, a dynamic canonical reference that others can consult. This type of formalization provides a foundation for mathematics today, said Dr. Avigad, who is the director of the Hoskinson Center for Formal Mathematics (funded by the crypto entrepreneur Charles Hoskinson), “in just the same way that Euclid was trying to codify and provide a foundation for the mathematics of his time.”
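To make this concrete, here is a toy machine-checked proof in Lean 4, one such proof assistant discussed below; the theorem and tactic steps are illustrative, and the checker rejects the proof if any step fails to follow.

```lean
-- A toy formalized proof: commutativity of addition on the natural numbers.
-- Each tactic step is verified; an unjustified step is rejected by Lean.
theorem add_comm' (a b : Nat) : a + b = b + a := by
  induction b with
  | zero => rw [Nat.add_zero, Nat.zero_add]
  | succ n ih => rw [Nat.add_succ, Nat.succ_add, ih]
```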
Of late, the open-source proof assistant system Lean is attracting attention. Developed at Microsoft by Leonardo de Moura, a computer scientist now with Amazon, Lean uses automated reasoning, which is powered by what is known as good old-fashioned artificial intelligence, or GOFAI — symbolic A.I., inspired by logic. So far the Lean community has verified an intriguing theorem about turning a sphere inside out as well as a pivotal theorem in a scheme for unifying mathematical realms, among other gambits.

But a proof assistant also has drawbacks: It often complains that it does not understand the definitions, axioms or reasoning steps entered by the mathematician, and for this it has been called a “proof whiner.” All that whining can make research cumbersome. But Heather Macbeth, a mathematician at Fordham University, said that this same feature — providing line-by-line feedback — also makes the systems useful for teaching. In the spring, Dr. Macbeth designed a “bilingual” course: She translated every problem presented on the blackboard into Lean code in the lecture notes, and students submitted solutions to homework problems both in Lean and prose. “It gave them confidence,” Dr. Macbeth said, because they received instant feedback on when the proof was finished and whether each step along the way was right or wrong. Since attending the workshop, Emily Riehl, a mathematician at Johns Hopkins University, has used an experimental proof-assistant program to formalize proofs she had previously published with a co-author. By the end of a verification, she said, “I’m really, really deep into understanding the proof, way deeper than I’ve ever understood before. I’m thinking so clearly that I can explain it to a really dumb computer.”

Brute reason — but is it math?

Another automated-reasoning tool, used by Marijn Heule, a computer scientist at Carnegie Mellon University and an Amazon scholar, is what he colloquially calls “brute reasoning” (or, more technically, a Satisfiability, or SAT, solver). By merely stating, with a carefully crafted encoding, which “exotic object” you want to find, he said, a supercomputer network churns through a search space and determines whether or not that entity exists. Just before the workshop, Dr. Heule and one of his Ph.D. students, Bernardo Subercaseaux, finalized their solution to a longstanding problem with a file that was 50 terabytes in size. Yet that file hardly compared with a result that Dr. Heule and collaborators produced in 2016: “Two-hundred-terabyte maths proof is largest ever,” a headline in Nature announced. The article went on to ask whether solving problems with such tools truly counted as math. In Dr. Heule’s view, this approach is needed “to solve problems that are beyond what humans can do.”

Another set of tools uses machine learning, which synthesizes oodles of data and detects patterns but is not good at logical, step-by-step reasoning. Google’s DeepMind designs machine-learning algorithms to tackle the likes of protein folding (AlphaFold) and winning at chess (AlphaZero). In a 2021 Nature paper, a team described their results as “advancing mathematics by guiding human intuition with A.I.” Yuhuai “Tony” Wu, a computer scientist formerly at Google and now with a start-up in the Bay Area, has outlined a grander machine-learning goal: to “solve mathematics.” At Google, Dr. Wu explored how the large language models that empower chatbots might help with mathematics.
The team used a model that was trained on internet data and then fine-tuned on a large math-rich data set, using, for instance, an online archive of math and science papers. When asked in everyday English to solve math problems, this specialized chatbot, named Minerva, was “pretty good at imitating humans,” Dr. Wu said at the workshop. The model obtained scores that were better than an average 16-year-old student on high school math exams. Ultimately, Dr. Wu said, he envisioned an “automated mathematician” that has “the capability of solving a mathematical theorem all by itself.”

Mathematics as a litmus test

Mathematicians have responded to these disruptions with varying levels of concern. Michael Harris, at Columbia University, expresses qualms in his “Silicon Reckoner” Substack. He is troubled by the potentially conflicting goals and values of research mathematics and the tech and defense industries. In a recent newsletter, he noted that one speaker at a workshop, “A.I. to Assist Mathematical Reasoning,” organized by the National Academies of Sciences, was a representative from Booz Allen Hamilton, a government contractor for intelligence agencies and the military. Geordie Williamson, of the University of Sydney and a DeepMind collaborator, spoke at the N.A.S. gathering and encouraged mathematicians and computer scientists to be more involved in such conversations. At the workshop in Los Angeles, he opened his talk with a line adapted from “You and the Atom Bomb,” a 1945 essay by George Orwell. “Given how likely we all are to be profoundly affected within the next five years,” Dr. Williamson said, “deep learning has not roused as much discussion as might have been expected.”
Dr. Williamson considers mathematics a litmus test of what machine learning can or cannot do. Reasoning is quintessential to the mathematical process, and it is the crucial unsolved problem of machine learning. Early during Dr. Williamson’s DeepMind collaboration, the team found a simple neural net that predicted “a quantity in mathematics that I cared deeply about,” he said in an interview, and it did so “ridiculously accurately.” Dr. Williamson tried hard to understand why — that would be the makings of a theorem — but could not. Neither could anybody at DeepMind. Like the ancient geometer Euclid, the neural net had somehow intuitively discerned a mathematical truth, but the logical “why” of it was far from obvious.
At the Los Angeles workshop, a prominent theme was how to combine the intuitive and the logical. If A.I. could do both at the same time, all bets would be off. But, Dr. Williamson observed, there is scant motivation to understand the black box that machine learning presents. “It’s the hackiness culture in tech, where if it works most of the time, that’s great,” he said — but that scenario leaves mathematicians dissatisfied. He added that trying to understand what goes on inside a neural net raises “fascinating mathematical questions,” and that finding answers presents an opportunity for mathematicians “to contribute meaningfully to the world.”
Recently, Midjourney unveiled version 5.2 of its AI-powered image synthesis model, which includes a new "zoom out" feature that allows maintaining a central synthesized image while automatically building out a larger scene around it, simulating zooming out with a camera lens. Similar to outpainting—an AI imagery technique introduced by OpenAI's DALL-E 2 in August 2022—Midjourney's zoom-out feature can take an existing AI-generated image and expand its borders while keeping its original subject centered in the new image. But unlike DALL-E and Photoshop's Generative Fill feature, you can't select a custom image to expand. At the moment, v5.2's zoom-out only works on images generated within Midjourney, a subscription AI image-generator service.
On the Midjourney Discord server (still the official interface for Midjourney, although plans are underway to change that), users can experiment with zooming out by generating any v5.2 image (now the default) and upscaling a result. After that, special "Zoom" buttons appear below the output. You can zoom out by a factor of 1.5x, 2x, or a custom value between 1 and 2. Another button, called "Make Square," will generate material around the existing image in a way that creates a 1:1 square aspect ratio.
David Holz, the creator of Midjourney, announced the new v5.2 features and improvements on the Discord server Thursday night. Aside from "zoom out," the most significant additions include an overhauled aesthetic system, promising better image quality and a stronger "--stylize" command that effectively influences how non-realistic an image looks. There's also a new "high variation mode," activated by default, that increases compositional variety among image generations. Additionally, a new "/shorten" command enables users to assess prompts in an attempt to trim out non-essential words. Despite the immediate rollout of v5.2, Holz emphasized in his announcement that changes might occur without notice. Older versions of the Midjourney model are still available by using the "/settings" command or the "--v 5.1" in-line command argument.
Starting in the fall, students will be able to use AI to help them find bugs in their code, give feedback on the design of student programs, explain unfamiliar lines of code or error messages, and answer individual questions, CS50 professor David J. Malan ’99 wrote in an emailed statement.
With the staggering diversity of animal species that roam our planet, one might assume we’ve witnessed every imaginable creature. However, one Reddit user by the name of Macilento dares to challenge that notion. Armed with an insatiable curiosity and the remarkable power of Midjourney AI, they embarked on an extraordinary experiment to synthesize unprecedented animal crossbreeds.
Macilento’s endeavor involved visually merging the traits of two distinct species, pushing the boundaries of what we thought possible. Through the capabilities of Midjourney AI, a host of singular creatures were brought to life—each one a mesmerizing blend of two existing species. The outcomes of this audacious exploration are nothing short of astonishing, capturing both our imagination and sense of wonder. Prepare to be enthralled as these enchanting hybrids transcend the realm of imagination, sparking a sense of awe and amusement. With each carefully crafted combination, Macilento has forged a pathway into uncharted territory, uncovering a tapestry of creatures that will leave you spellbound.
Researchers have demoed an AI tool called DragGAN that lets you simply click and drag an image to edit it realistically in seconds. It’s like Photoshop’s Warp tool, but far more powerful: you’re not just smushing pixels around, but using AI to re-generate the underlying object. You can even rotate images as if they were 3D. The latest example is only a research paper for now, but a very impressive one, letting users simply drag elements of a picture to change their appearance.
This doesn’t sound too exciting on the face of it, but take a look at the examples below to get an idea of what this system can do. Not only can you change the dimensions of a car or manipulate a smile into a frown with a simple click and drag, but you can rotate a picture’s subject as if it were a 3D model — changing the direction someone is facing, for example. One demo even shows the user adjusting the reflections on a lake and height of a mountain range with a few clicks.
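Under the hood, DragGAN does not move pixels; it optimizes the GAN's latent code so that features at the handle points migrate toward the targets. A heavily simplified single-point PyTorch sketch of that motion-supervision idea; `ToyG.features` is a hypothetical stand-in for the generator's intermediate feature map, not the authors' API:

```python
import torch

def drag_step(w, G, handle, target, lr=2e-3):
    """One step: nudge latent w so the feature at `handle` (y, x)
    moves one pixel toward `target` (y, x)."""
    w = w.detach().requires_grad_(True)
    feat = G.features(w)                                   # (C, H, W) feature map
    (hy, hx), (ty, tx) = handle, target
    d = torch.tensor([ty - hy, tx - hx], dtype=torch.float)
    d = d / d.norm()                                       # unit step toward target
    ny, nx = hy + round(d[0].item()), hx + round(d[1].item())
    # Motion supervision: the feature one pixel toward the target should match
    # the (frozen) feature currently at the handle, dragging content that way.
    loss = (feat[:, ny, nx] - feat[:, hy, hx].detach()).pow(2).sum()
    loss.backward()
    with torch.no_grad():
        return w - lr * w.grad                             # re-render G(w) to see the edit

class ToyG:
    """Stand-in generator exposing an intermediate feature map."""
    def __init__(self):
        self.proj = torch.nn.Linear(64, 8 * 16 * 16)
    def features(self, w):
        return self.proj(w).reshape(8, 16, 16)

w0 = torch.randn(64)
w1 = drag_step(w0, ToyG(), handle=(4, 4), target=(10, 12))
print((w1 - w0).abs().max())   # the latent moved; repeated steps complete the drag
```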
From decrypt:
Snapchat influencer Caryn Marjorie is harnessing the power of the AI revolution to fulfill the dreams of millions of her followers: becoming their girlfriend. She’s making some serious cash too. Marjorie, using OpenAI's GPT technology, has created CarynAI, an AI avatar that offers virtual companionship for a dollar per minute. It’s not a totally unfamiliar venture in the world of AI, but it’s yet another example of how the lines that separate the real from the unreal in AI are getting a little more blurry. With 1.8 million followers on Snapchat, Marjorie, age 23, has a vast audience, and CarynAI—her digital doppelganger developed by the AI company Forever Voices—is helping her achieve (and profit from) things that are not otherwise physically possible. She now has over 1,000 virtual boyfriends who pay $1 per minute to engage in all kinds of interactions—from simple chitchat to making plans for their futures together, and even more intimate exchanges.
Her AI-powered alter ego has raked in $71,610 in revenue, according to Fortune, in just one week. She expects to make around $5 million per month if just 20,000 of the 1.8 million dudes who follow her on Snapchat subscribe to CarynAI (99% of her fanbase is male, as you might expect).
The Forever Voices team trained the CarynAI model by analyzing 2,000 hours of Marjorie’s now-deleted YouTube content to build her speech and personality engine. They then used GPT-4 to create the most lifelike version of Marjorie, providing not just believable responses, but interactions that are based on an extensive dataset of Caryn’s own natural behavior. The company has also created chatbots of other famous influencers.
CarynAI’s official site stresses that messages are end-to-end encrypted, making them more difficult for a third party to intercept. This also addresses the concerns of many users about privacy when interacting with AI models.
Researchers from the University of Florida have released a study suggesting that the AI model ChatGPT can reliably predict stock market trends. Using public markets data and news from October 2021 to December 2022, their testing found that trading models powered by ChatGPT could generate returns exceeding 500% in this period. This performance stands in stark contrast to the -12% return from buying and holding an S&P 500 ETF during the same timeframe. The study also underscored ChatGPT's superior performance over other language models, including GPT-1, GPT-2, and BERT, as well as traditional sentiment analysis methods.

Impressive Results Across Several Strategies

The team tested six different investing strategies during the October 2021 to December 2022 period, each driven by the model's reading of news headlines.
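The core mechanic is simple enough to sketch: ask the model whether a headline is good or bad news for a given stock, then trade on the answer. A hedged illustration using the 2023-era `openai` Python API; the prompt is a paraphrase for the example, not the paper's exact wording.

```python
import openai   # 0.x API; reads OPENAI_API_KEY from the environment

def headline_signal(headline: str, ticker: str) -> int:
    """Map a headline to a trading signal: +1 long, -1 short, 0 no position."""
    resp = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        temperature=0,
        messages=[{
            "role": "user",
            "content": f"Is this headline good news, bad news, or unclear for "
                       f"the stock price of {ticker}? Answer GOOD, BAD, or UNCLEAR.\n"
                       f"Headline: {headline}",
        }],
    )
    answer = resp["choices"][0]["message"]["content"].strip().upper()
    return {"GOOD": 1, "BAD": -1}.get(answer, 0)

print(headline_signal("Company X beats earnings expectations", "X"))
```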
When benchmarking against other methods, such as sentiment analysis and older language models like GPT-1, GPT-2, and BERT, ChatGPT consistently outperformed the competition. Traditional sentiment analysis methods produced markedly inferior results across all investment strategies, while GPT-1, GPT-2, and BERT failed to accurately predict returns.

Implications for the Finance Industry

Two Sigma, DE Shaw, and Renaissance Technologies are several prominent hedge funds that incorporate sentiment analysis into their automated trading systems, and numerous other boutique hedge funds also utilize sentiment analysis signals as part of their proprietary strategies. ChatGPT's strong performance in understanding headlines and their implications could add to a new arms race as funds compete to gain an edge from Generative AI.
ChatGPT’s powerful natural language processing abilities could also threaten businesses that have developed their own proprietary sentiment analysis machine learning models, which could find themselves outperformed by a simple ChatGPT prompt. Notably, companies like Lexalytics that claim “world-leading NLP” could find themselves with both an opportunity and a challenge in this market as generative AI tools emerge and make past models obsolete.
The stock-picking edge that ChatGPT has demonstrated could also empower retail traders. Notably, subreddits like r/WallStreetBets are filled with due diligence posts (called “DDs”) and bragging posts on stock returns from various long and short strategies. ChatGPT, with its ability to deduce nuance and second-order implications from just understanding headlines, could help ambitious retail traders in their own efforts to generate outsized returns.
Investors expect the next few years to be very dynamic in the finance space as generative AI takes hold. Unsurprisingly, the technology-focused ARK Investment Management is especially bullish. In her 2023 Big Ideas report, ARK's founder Cathie Wood predicted AI as one of the trends that will define this technological era. “We've been working on artificial intelligence for a long time now,” said Wood, “and I think some of the things we're seeing are just the beginning of the impact that artificial intelligence is going to have on every sector, every industry, and every company.”