Too fast; didn't proof

This was potentially lost in the Christmas craziness: OpenAI has unveiled a new model called o3, the first one to be seriously compared to AGI (artificial general intelligence). It still fails at some fairly easy human problems but destroys anyone at "frontier" mathematics - the kind of problems that research mathematicians can take days to solve.

Meanwhile, o1 Pro was praised by a biologist for providing the single best critique of their bleeding-edge research paper.

As a reminder, this is the path to singularity:

  1. AI exists but it is specialized (beats humans at chess, at image recognition; siloed),
  2. AI is more general (multi-modal, can process video or sound) but still not quite at the level of a human - this is where we are,
  3. General AI as good as a research scientist - this is the knee of the curve: this is when AI can be set to work on making AI better, on its own. The acceleration accelerates. If you thought AI development could use a break, enjoy the current pace for as long as it lasts,
  4. AGI beats humans ← we have to solve for alignment before then, otherwise we face the risk that AGI is misaligned with human interests and humans are construed as, at best, obstacles.

Here are two principles we have recently discovered that you might want to use as intuition - until they're proven wrong:

  • Inference-time scaling matters: the longer you let the AI think, the better the outcome (a toy sketch of this follows right after this list)
  • Our current AI progress is blessed with scaling properties:
    • Larger neural networks with more parameters perform better - scaling the model
    • More extensive and diverse datasets lead to better model performance - scaling the data
      • Some AI researchers are calling out a "data wall" on the horizon: the end of this particular scaling property
    • Compute power - the more powerful the cluster, the better the performance
      • Which earned xAI a crown of laurels when they turned on a 100,000-GPU cluster in Tennessee.
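To make the first principle concrete, here is a toy sketch of the simplest form of inference-time scaling: sample a model several times and take a majority vote (self-consistency). The `ask_model` function below is a hypothetical stand-in, not a real API - the point is only that spending more samples, i.e. more compute at answer time, buys you accuracy.

```python
import collections
import random

def ask_model(question: str) -> str:
    """Hypothetical noisy model: right ("42") about 60% of the time, otherwise guesses."""
    return "42" if random.random() < 0.6 else str(random.randint(0, 1000))

def answer_with_budget(question: str, n_samples: int) -> str:
    """Spend more inference compute: sample the model n_samples times
    and return the most common (majority-vote) answer."""
    votes = collections.Counter(ask_model(question) for _ in range(n_samples))
    return votes.most_common(1)[0][0]

# More samples -> more compute at inference time -> better accuracy.
for budget in (1, 5, 25, 125):
    hits = sum(answer_with_budget("toy question", budget) == "42" for _ in range(200))
    print(f"{budget:>3} samples: correct {hits / 200:.0%} of the time")
```

o1 and o3 spend their extra inference compute on long reasoning chains rather than on voting, but the intuition - more thinking time, better answers - is the same.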

These scaling properties do not even take into account possible - and plausible - optimizations.
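And to put rough numbers on the model- and data-scaling bullets, here is a minimal sketch of a Chinchilla-style power law (Hoffmann et al., 2022). The functional form is theirs; the constants are roughly their fitted values, but treat the whole thing as an illustration rather than a prediction.

```python
def predicted_loss(n_params: float, n_tokens: float) -> float:
    """Chinchilla-style scaling law: loss = E + A / N**alpha + B / D**beta,
    where N is the parameter count and D is the number of training tokens.
    Constants are roughly the Hoffmann et al. (2022) fits - illustrative only."""
    E, A, B, alpha, beta = 1.69, 406.4, 410.7, 0.34, 0.28
    return E + A / n_params**alpha + B / n_tokens**beta

# Scaling the model and scaling the data both drive the predicted loss down:
print(predicted_loss(1e9, 2e10))     # ~1B params,  ~20B tokens  -> ~2.6
print(predicted_loss(7e10, 1.4e12))  # ~70B params, ~1.4T tokens -> ~1.9
```

Both knobs push the loss down, which is the whole bet behind ever-bigger training runs - at least until the data wall (or clever optimizations) changes the math.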


A fusion reactor is planned to go online in Virginia in 2030.

...

"Fusion"!

Unrelated to the above, here are a few tidbits I heard about on the latest All-In podcast:

  • A US government budget package worth billions was thwarted or compromised (depending on your perspective) by what can be described as open source scrutiny on X, AKA "you're spending HOW MUCH on WHAT?"
  • Radioactive waste that had gone missing in New Jersey has now been found, sans help from drones

Honey, the Chrome extension that promises to save you money, is a scam. But since Honey is owned by PayPal and the PayPal mafia is practically in the White House, your guess as to whether the FTC will look into it is as good as mine.