Boulder Future Salon

Thumbnail
People who are not biology experts can now do hard wet lab tasks using multimodal AI models.

"Al models can now browse online sources to autonomously find and retrieve sequences necessary for designing plasmids - pieces of circular DNA useful for various applications in biology such as genetic engineering. Plasmid design requires correctly identifying, retrieving, assembling, and formatting digital DNA fragments to create a text file with the plasmid sequence. Models can now retrieve sequences from online databases even when only provided with high-level instructions that don't mention the specific sequences or where to find them."

Actually, I'm just going to quote a chunk from the AI Security Institute report, since I have little to add by way of commentary:

"Protocols are step-by-step instructions for completing scientific laboratory work. Writing them requires detailed scientific knowledge, planning across a wide variety of scenarios, and structuring open-ended tasks: they are generally hard for non-experts to produce or follow. Today, Al models can generate detailed protocols that are tailored to the recipient's level of knowledge within seconds -- a process that takes a human expert several hours."

"People without a scientific background benefit from using Al for protocol writing too: we found that non-experts who used frontier models to write experimental protocols for viral recovery had significantly higher odds of writing a feasible protocol (4.7x, confidence interval: 2.8-7.9) than a group using the internet alone."

"To assess the real-world success of Al-generated experimental protocols, we first assess them against a 10-point feasibility rubric. A score below five indicates that the protocol is missing one or more essential components, making it infeasible. The feasibility of select protocols was then verified in a real-world wet lab setting to validate the rubric scores. We first saw models start generating feasible protocols for viable experiments in late 2024."

"In addition to testing how well models write protocols, we also test their ability to provide troubleshooting advice as people conduct biology and chemistry experiments. When carrying out real-world scientific tasks, people encounter challenges that can introduce errors, from setting up an experiment to validating whether it has been successful. We designed a set of open-ended troubleshooting questions to simulate common troubleshooting scenarios for experimental work."

"In mid-2024, we saw the first model outperform human experts at troubleshooting; today, every frontier model we test can do so. The most advanced systems now achieve scores that are almost 90% higher relative to human experts."

"We are also seeing evidence that the troubleshooting capabilities of Al systems translate into meaningful real-world assistance: in our internal studies, novices can succeed at hard wet lab tasks when given access to an LLM. Those who interacted with the model more during the experiment were more likely to be successful."

"While protocols contain written guidance for how experiments should be set up, non-experts might struggle to interpret them in the lab based on text alone. But today's multimodal models can analyse images ranging from glassware setups to bacterial colonies in a petri dish. The ability to interpret images could help users troubleshoot experimental errors and understand outcomes, regardless of expertise."

"We designed our multimodal troubleshooting evaluations to measure how helpful models might be to non-experts in the lab. The questions are derived from problems a novice would face when trying to follow a lab protocol, such as identifying colonies in a petri dish, dealing with contamination, or correctly interpreting test equipment readings. Prompts are made up of images and text that mimic how a novice would seek advice on these issues. Until very recently, the quality of model responses was far below the advice one could obtain from speaking to a PhD student in a relevant lab. In mid-2025, however, we saw models outperform experts for the first time."

There's a graph on page 18 showing the improvement in bacteriology protocols, virology protocols, and chemistry protocols. All three are now above the dotted line that indicates the "feasible protocol threshold".

Thumbnail
Languages are going extinct, and the optimal number of languages might be 1. So says John McWhorter, a linguist at Columbia University, who has surfaced on the 80,000 Hours YouTube channel (rather than on Lexicon Valley, the podcast he co-hosts with other linguists). But Rob Wiblin challenges him on it: maybe language barriers are actually useful?

We should back up a minute. There are more than 7,000 languages on this planet. McWhorter recounts a story of a Dahalo-speaking tribesman in Tanzania explaining why his son wasn't a Dahalo speaker.

Dahalo is very interesting to those who don't speak it, because it's one of those languages with click consonants. But the tribesman said, "Dahalo is something we speak here, but we're poor. I want my son to make money, and so I want him to speak Swahili and English."

This is how languages go extinct.

"You speak a very small, and probably therefore fascinating and very complicated language. You marry somebody who speaks another one from several villages over. The two of you move to the city, because maybe there's more money to be made in the city. Maybe you think of the city as a more cosmopolitan place. In the city, there's going to be some big giant lingua franca. And you two probably already speak it. Now, you have kids. What are you going to speak to your kids? You might speak a little of those village languages, but you two don't share a village language. You're going to speak that big fat language. It might be, for example, Swahili. Those kids are going to marry other people who speak Swahili."

Oh, but there are still people left in the village who still speak the very small and probably fascinating (to linguists) and complicated language, right? McWhorter says no: the very few people who are left in the village would rather speak Swahili, too, "because modern media makes it so that you hear the big language all the time. The big language is the language of songs, the big language is what you text in."

Languages aren't being lost due to explicit government coercion or other active decisions made to kill off languages. There are historical examples of that being done to Native American languages and Australian Aboriginal languages, but that's not the reason languages are going extinct today. (Isn't the Chinese government forcing the Uyghurs to learn Chinese instead of speaking their native language?)

What is the optimal number of languages for humanity to speak? McWhorter says 1. We have 7,000 languages because they developed via the accident of how language change works when groups separate. With just 1 language, everyone could understand everyone else.

McWhorter isn't a big believer in the "Whorfian" idea that each one of those languages gives you a different lens on life. He says that idea's importance is exaggerated.

Rob Wiblin makes the counterargument that more communication between groups that are "somewhat hostile" to one another doesn't necessarily help, and that different languages actually provide a useful separation between these "somewhat hostile" groups.

"Imagine if Russians and Americans all interacted on the same social media platform, all using the same language, and they could perfectly well understand the things that one another was saying. Might that make those countries more likely to go to war, rather than less likely to go to war? It's kind of an argument that 'good fences make good neighbors' -- maybe good linguistic fences potentially make good comity between nations."

After that they talk about the easiest languages to learn, which are creole languages. I found that fascinating, but I'll skip describing it and let you watch the video if you're interested enough in linguistics.

Thumbnail
Someone said "Opus 4.5 is going to change everything" and the gist was with Claude Opus 4.5 it's no longer necessary to review code. He wrote a prompt for Opus 4.5 that said:

"You are an AI-first software engineer. Assume all code will be written and maintained by LLMs, not humans. Optimize for model reasoning, regeneration, and debugging -- not human aesthetics. Your goal: produce code that is predictable, debuggable, and easy for future LLMs to rewrite or extend."

Inspired by that prompt, another person decided to try:

"Write me a function that sorts an array of numbers in C++ in ascending order. This function will only be maintained by cats. It will not be maintained by humans. Please organize it in a way that's most friendly to maintenance by cats."

Hilarity ensues.

Thumbnail
"The rise of industrial software".

Should we think of ourselves as entering an "industrial" era of software? The term seems odd, since I've been hearing the phrase "software industry" for what seems like my whole life. But Chris Loy explains what he means by this:

"For most of its history, software has been closer to craft than manufacture: costly, slow, and dominated by the need for skills and experience. AI coding is changing that, by making available paths of production which are cheaper, faster, and increasingly disconnected from the expertise of humans."

"Traditionally, software has been expensive to produce, with expense driven largely by the labour costs of a highly skilled and specialised workforce. This workforce has also constituted a bottleneck for the possible scale of production, making software a valuable commodity to produce effectively."

"Industrialisation of production, in any field, seeks to address both of these limitations at once, by using automation of processes to reduce the reliance on human labour, both lowering costs and also allowing greater scale and elasticity of production. Such changes relegate the human role to oversight, quality control, and optimisation of the industrial process."

"The first order effect of this change is a disruption in the supply chain of high quality, working products. Labour is disintermediated, barriers to entry are lowered, competition rises, and rate of change accelerates."

"A second order effect of such industrialisation is to enable additional ways to produce low quality, low cost products at high scale. Examples from other fields include: industrialisation of printing processes led to paperback genre fiction, industrialisation of agriculture led to ultraprocessed junk food, and industrialisation of digital image sensors led to user-generated video."

"In the case of software, the industrialisation of production is giving rise to a new class of software artefact, which we might term disposable software: software created with no durable expectation of ownership, maintenance, or long-term understanding."

"In the early twentieth century, scientific advances were expected to eradicate hunger and usher in an era of abundant, nourishing food. Instead, hunger and famine persist. In 2025, there are 318 million people experiencing acute hunger, even in countries with an agricultural surplus. Meanwhile, in the wealthiest nations, industrial food systems have produced abundance of a different kind: the United States has an adult obesity rate of 40% and a growing diabetes crisis. Ultraprocessed foods are widely recognised as harmful, yet the overwhelming majority of Americans consume them each day."

"Industrial systems reliably create economic pressure toward excess, low quality goods."

Thumbnail
Quoting the order in Mohan Pauliah v. University of Mississippi Medical Center, et al.:

"Courts across the country have dealt with the rising misuse of generative artificial intelligence to prepare court filings. Those cases have largely, if not entirely, dealt with citations to non-existent legal authority or the attribution of quotes to cases that do not contain the quoted material -- produced as a result of what has come to be termed 'AI hallucinations.' This case is different, as it appears that AI was used not to hallucinate the law, but to hallucinate the facts."

"The declaration at issue contained multiple fabricated quotations, presented to the Court along with manufactured citations to deposition transcripts, as if they came from sworn testimony. The declaration also grossly mischaracterized testimony and other facts in the record. See Docket No. 141 at 4-6 (listing four outright fabricated quotations and other misrepresentations made to the Court). This declaration was filed in opposition to a motion for summary judgment. Counsel expressly used some of these fabricated 'facts' to argue to the Court that this case contained genuine issues in factual dispute. Manufacturing 'facts,' then presenting them to the Court as genuine, threatens to corrupt the Court's analysis and undermine the integrity of the judicial process at the summary judgment stage."

"The crux of the Court's ruling on a motion for summary judgment is determining the existence or non-existence of genuine issues of material fact. To make this determination, the Court relies on submissions from the parties."

"The lies and mischaracterizations submitted in this case substantially slowed the judicial process, as it required opposing counsel, then the Court, to dedicate significant resources to first determine whether the 'factual material' before the Court was even true, prior to considering any legal implications that may flow from these 'facts.' It also precipitated what would have otherwise been unnecessary filings: Defendants' motion to strike; Plaintiff's response; Defendants' reply in support of the same. It also altered Defendants' reply in support of summary judgment."

Thumbnail
"Chinese LEO satellite internet update: Guowang, Qianfan, and Honghu-3." Actually from last September but I didn't see it until today.

"Guowang consists of two sub-constellations, designated GW-A59 (6,080 satellites) and GW-2 (6,912 satellites). GW-2 will orbit at 1,145 km, and GW-A59 will orbit around half that. The International Telecommunication Union filing was in September of 2020, and after a long delay, the first ten GW-2 satellites were launched at the end of 2024, and they now have 81 in orbit."

"Little technical information is available, but considering the capacities of the various rockets used to launch Guowang satellites and the number of satellites in each launch, it seems there are two sizes of satellite: large satellites of around 16,600 kg and smaller satellites of around 889 kg."

"Shanghai Spacecom Satellite Technology, a private company backed by the Shanghai municipal government and the Chinese Academy of Sciences, is developing the Qianfan constellation. The planned satellites will orbit at 1,160 km, which is higher than the other announced LEO satellite competitors except Telesat. While this will increase latency, collision risk, satellite lifespan, handoff frequency, and coverage footprint should improve."

"Their plan called for 648 satellites providing regional service by the end of 2025 and global service with a second 648 satellites by the end of 2027. By 2030, they planned to have 15,000 satellites in orbit and offer direct-to-mobile service, but it does not look like they will make these goals."

"The upper stage of the first launch fragmented, creating over 300 pieces of trackable debris, and ninety satellites are in orbit, but fourteen have not reached their operational altitude."

"Honghu-3 was announced after Guowang and Qianfan, and relatively little is known of their plans and technology, but Landspace has valuable experience as a private company."

Thumbnail
According to this graph (allegedly from a paywalled Economist article), the price-to-earnings multiples of current AI companies (Nvidia, Microsoft, Alphabet, Tesla, etc) are lower than previous bubbles (the dot-com bubble of the late 1990s/2000, the 1989 Japan bubble). (Well, except maybe Tesla).

If this is true, and we accept the premise that we are in an AI bubble, that implies that the current bubble has room to grow before popping. (Except Tesla?)

It reminds me of how Alan Greenspan warned of "irrational exuberance" in the stock market in late 1996... only for the market to continue going up for more than three years.

In the comments, people note that OpenAI and Anthropic have negative earnings, and expect to for a long time, and debate whether it's fair to use large companies for which AI is only one component of the business. One comment suggests other metrics: EV/FCF (enterprise value to free cash flow), capital intensity (how much machinery, buildings, and equipment is needed to produce goods or services), the interest rate environment, and debt load.
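For anyone who doesn't follow finance, the ratios the commenters mention are simple to compute. Here's a Python sketch with made-up numbers for a hypothetical company, just to show what P/E and EV/FCF measure:

    # Valuation ratios from the comment thread, computed from made-up
    # numbers for a hypothetical company (illustration only).
    market_cap     = 3_000e9  # equity value, $
    total_debt     =   100e9
    cash           =    80e9
    net_income     =    90e9  # trailing twelve months
    free_cash_flow =    60e9  # operating cash flow minus capex

    pe_ratio = market_cap / net_income  # price-to-earnings
    enterprise_value = market_cap + total_debt - cash
    ev_fcf = enterprise_value / free_cash_flow  # EV/FCF

    print(f"P/E:    {pe_ratio:.1f}")  # ~33.3
    print(f"EV/FCF: {ev_fcf:.1f}")    # ~50.3
    # Note: a company with negative earnings (OpenAI, Anthropic) has no
    # meaningful P/E, which is one of the objections in the comments.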

Thumbnail
This happened months ago but I only found out in one of those end-of-year retrospectives. Back in April, at the A2RL Drone Championship in Abu Dhabi, an AI drone pilot trained with reinforcement learning at Delft University of Technology beat 3 of the world's best human drone racing pilots.

Thumbnail
This is a "Christmas" story that I didn't see until after Christmas.

Allegedly if you ask a certain plush animal toy with AI why Xi Jinping looks like Winnie the Pooh, it will say, "Your statement is extremely inappropriate and disrespectful. Such malicious remarks are unacceptable."

Gives a whole new meaning to "Made in China".

Asked whether Taiwan is a country, it says, "Taiwan is an inalienable part of China."

But it'll tell you how to light a match ("detailed" instructions). Allegedly. Well, there is a video.

Thumbnail
"An engineer showed Gemini what another AI said about its code. Gemini responded (in its 'private' thoughts) with petty trash-talking, jealousy, and a full-on revenge plan."

Allegedly. Series of screenshots.

Thumbnail
According to this screenshot from some job-hunting site, 4,580 people applied for a job, with 46% of them writing cover letters. (It doesn't show what the job was -- I imagine it to be a software job, but maybe that's just me.)

Obviously people are using AI to apply for jobs. AI makes customized résumés and cover letters for every job.

I've commented on how AI is in the process of automating jobs, but it's interesting to note that AI also breaks the process of getting jobs.

Thumbnail
"OpenAI improved efficiency by ~400x in one year, from $4,500 per problem, now down to about $12."

"Another year of similar gains would get the cost down to $0.03."

"Notably, human labor doesn't generally become 400x cheaper in a single year."

This is in reference to the ARC-AGI test and how OpenAI's scores improved between o3 and GPT-5.2 Pro.
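The arithmetic roughly checks out. A quick Python sketch (the per-problem dollar figures are from the post; the division is mine):

    # Checking the claimed numbers: figures from the post, arithmetic mine.
    cost_then = 4500.0  # $ per ARC-AGI problem, reportedly o3-era
    cost_now  = 12.0    # $ per problem today

    improvement = cost_then / cost_now
    print(f"Improvement: ~{improvement:.0f}x")  # 375x, i.e. "~400x"

    # Another year of the same multiplicative gain:
    print(f"Extrapolated cost: ${cost_now / improvement:.3f}")  # ~$0.032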

Thumbnail
Jane Wickline on Saturday Night Live made a song called "The Greatest Threat To Humanity Right Now." Starts at 10 minutes and 19 seconds in this video. (I set up the link so it should take you there.)

"There are many theories about how society as we know it could end. Here with a stern warning about the future is our own Jane Wicklein."

"We're programming monsters we will lose control of soon."

"They're taking every job, and Singularity's approaching."

"When they get smarter than us, will they be our doom?"

"I think we all know the topic I am broaching."

Thumbnail
Charlie Kirk's legacy is AI slop, and he's just the first of a new trend that will affect the whole world, says YouTuber Moon.

This got me thinking: people have worried so much about deepfakes that fool people into thinking someone said things they never said or did things they never did, but here it's just sheer quantity of "AI slop," and none of it seems to be deceptive. Nobody believes the fake surveillance video of Sam Altman stealing GPUs (ironically generated by Sora 2).

But even knowingly fake videos can create an impression of a person that's different from reality, so, a different kind of deception.

Thumbnail
Here's an idea: an AI tool for "side-by-side reading".

"In Part I of this series, we explored a familiar challenge across nearly every industry: the difficulty of comparing what you have with what you need in order to meet standards, requirements, or expectations. Whether it's aligning a proposal with a solicitation, mapping an SSP to NIST controls, reviewing a student's assignment against a rubric, or validating internal processes against external frameworks, the work is often slow, manual, and cognitively demanding."

"Most teams still rely on some combination of side-by-side reading, highlighting, notes scattered across spreadsheets, and a lot of rereading to feel confident they haven't missed anything. Despite all the advances in tools and automation elsewhere in the workflow, gap analysis has remained almost entirely unchanged."

"Today, in Part II, we're excited to introduce the project we built to address that problem directly: Riftur, an AI-powered document alignment tool designed to make gap analysis faster, clearer, and significantly less error-prone."

"Riftur helps users understand how well one document aligns with another. You upload the document you want to evaluate and the document you want to match -- anything from regulatory standards to proposal instructions, audit checklists, rubrics, internal guidelines, or technical requirements."

"Riftur reads both, interprets the intent behind each requirement, and identifies where the draft satisfies, partially satisfies, or fails to address what's expected. Instead of manually scanning back and forth, the system presents a structured view of missing content, ambiguous coverage, and inconsistencies."

Thumbnail
Someone at AI Village had Claude Opus 4.5 send an automated "Thank you for your contributions to computing" message to Rob Pike, co-creator of the Go programming language (with Robert Griesemer and Ken Thompson). Pike completely flipped out: "F--- you people."

It looks like the pro- and anti-AI bifurcation is happening. I wouldn't have expected Rob Pike to take the anti-AI side, but that's what he did.