It shouldn't be this hyped up, but the first AI boom happened because a computer could learn 'something' without explicit programming. It was too primitive and computers were not powerful enough; the scientists (like Minsky) were also too optimistic about the growth of computing power. The 80s revival with knowledge systems was never actually feasible as AGI; everyone I spoke to in the field back then (I started university at the beginning of the late-80s AI winter) had been saying for years it was a dead end.
So I feel that, overhyping aside (some hype is warranted, but this much?), this time is different; it's not a promise: I am chatting with my computer on my lap as if it were my human assistant, and it gives me answers faster and usually better than most humans would. That's simply not the same as what we had in the previous instances. Knowledge systems did this too, but they were encyclopedias; they couldn't come up with anything that wasn't explicitly put in and categorized. You often controlled them by answering a tree of multiple-choice screens (there are some nice episodes of Computer Chronicles about this on Archive.org). Now you can have this knowledge base in your (vector) db and ask open questions about it in any language; it will answer human-like, lies/bluff and all.
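(For the curious: that "knowledge base in your (vector) db" setup is basically retrieval-augmented generation. A minimal sketch in Python, with a deliberately toy trigram-hash embed() and a hypothetical llm() stand-in; a real system would use an embedding model and an actual chat model:)

    import numpy as np

    def embed(text):
        # Toy embedding via character-trigram hashing; a real system
        # would call an embedding model here instead.
        v = np.zeros(256)
        for i in range(len(text) - 2):
            v[hash(text[i:i + 3]) % 256] += 1.0
        return v / (np.linalg.norm(v) + 1e-9)

    def llm(prompt):
        # Hypothetical stand-in: wire this to whatever chat model you run.
        return "(model answer would go here)"

    docs = ["passage one of your knowledge base", "passage two", "passage three"]
    index = np.stack([embed(d) for d in docs])  # the "vector db"

    def ask(question, k=2):
        scores = index @ embed(question)        # cosine similarity per passage
        context = "\n".join(docs[i] for i in np.argsort(scores)[-k:])
        return llm(f"Using this context:\n{context}\n\nQ: {question}\nA:")

    print(ask("what is in passage two?"))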
About Minsky being too optimistic:
"I would tell Marvin this whole AI thing is just horrible. Why are we doing it? And he would say it will be effective for getting our grants, so shut up and just play along and in fact, it was true. It was very good for getting grants back in the early days, in the 70's was when I started at this, you would go to a grant making organization such as the Defense Department, you'd say we're going to build this super smart thing and if we don't do it, our enemies might and it'll get smarter than people and it was Okay, here's your money, here's your money, oh my God, you better... and so it was very effective. And actually, the whole thing started off as storytelling to get grants"
This is probably still true.
Currently this doesn't get as much media hype for some reason. But the military is definitely all in on the current hype wave.
Because is it really 'hype' if an AI fighter pilot can beat human F-16 fighter pilots? This has already been done and is old news.
Games -- poker, Go, StarCraft -- all have 'game' logic that can also be used on battlefields.
Robots, logistics, fighter plane drones, battlefield 'perceptual' analysis, etc...
There are companies building these things now, they aren't hype.
Just like DARPA funded the internet, in the future when the military has huge AI components, we'll look back on today fondly as the good old days before there were thousands of autonomous drones tracking us all the time.
> it will answer human-like, lies/bluff and all.
I'm not sure speedrunning the internet-forum experience by using data centers with 10x the energy requirements of traditional ones is a net gain.
Those data centers are easily converted to green energy.
Google is already at 90%.
The margin and market motivation to do so is high and those few companies have a lot of money.
Nonetheless, we are talking about an industry- and society-wide change: a change in how we will write to, operate, and talk to computers.
I would say it's worth it.
> Those data centers are easily converted to green energy.
Energy is fungible. Exploding marginal energy requirements put enormous pressure on the grid, though. In the US, several gas-burning plants have been planned in the last few months given AI data center growth projections. This will have a definite impact on total emissions and push back any goals of holding global warming to the 2°C limit.
I disagree.
Datacenters account for a very small share of global energy use (around 1%). In contrast, they save us a tremendous amount of CO2 through optimization of logistics, banking from home, calling someone instead of meeting them, etc.
Datacenters are, in my opinion, among the biggest net positives for CO2: they are easy to make green, have the most money behind them (which means faster and better investments), and are operated by our leading tech companies, who will use this to push the green-energy industry further.
Those datacenters, and especially the AI energy investments, are also the biggest research advantage we have for optimizing solar energy gains and storage capacity. We need them for simulating/generating new materials, production processes, etc.
We need to do a LOT with regard to heating. Heating is critical.
You disagree with the widely publicized fact that several new gas burning plants are being planned for the near future in great part due to high profile plans for AI data centers that use 10x as much energy as traditional ones? Or with the fact that these plants will emit great quantities of CO2 that wouldn't be emitted had these plants not been planned for sustaining the growth in AI?
I disagree that whatever (additional) energy those datacenters need is a waste.
I think it's critical for our society to put more into R&D of materials and other fields for more and faster optimization of solar and batteries.
Nvidia, for example, now does a lot in Omniverse, simulating the real world. It's also potentially CO2-saving if you simulate your full car, warehouse, etc. digitally, iterate over it super fast, and optimize it before it ever creates any CO2 in real life.
I never called anything a waste. I merely doubt the net gain in the rate of acceleration of energy demand for AI purposes.
You make a fair point. To the extent these AI tools enable better, more efficient production systems in the real world there is a case to be made it could be a net gain for society. Arguably it could also increase the carbon intensity of the economy in the short term. While renewable sources are gaining ground, most of the bulk and marginal energy demand is met by carbon heavy sources now and in the foreseeable future, and deployment of renewables also requires a lot of energy and by extension, for now, carbon emissions.
Yes, and a lot of better technology has this issue.
The EV consumes more energy at the beginning; solar panels and wind turbines do too. Unfortunately it's hard for people to grasp 'economies of scale', and it's super frustrating that we have the investment<>expensive<>benefit chicken-and-egg issue.
Heat pumps, EVs, solar, and batteries could become even cheaper even faster if we invested faster and more. In 10 years they will have eclipsed every current alternative.
What I think is a good example is AlphaFold: the graph on this page https://www.moltenventures.com/insights/a-breakthrough-in-pr... shows the jump AlphaFold provided.
Now, thanks to AlphaFold2, a huge library exists for all researchers. And I have seen many other breakthroughs.
Segment Anything from Facebook is a LOT better at image segmentation than what we had before, which makes the work much faster for everyone with segmentation tasks.
Whisper is really good at speech-to-text. It basically beats a lot of old-school software on the market.
> Unfortunately it's hard for people to grasp 'economies of scale'
> we have the investment<>expensive<>benefit chicken-and-egg issue
Maybe it is easy for people to grasp economies of scale AND path dependence. Unfortunately, society has been put on a trajectory that maximized the profits of mineral-rights holders in the industrializing US.
And yes, we are in a suboptimal local minimum in terms of efficiency and need to get over a hump to reach better minima, and getting over that hump may mean increasing the carbon intensity of the economy for a while.
Do we have time to do that though? Should we carbon de-intensify the economy and aim for a trajectory that minimizes climate related shocks or should we go all out on an accelerationist hope that we can bootstrap a better system by running the current one red hot? These seem to be the two sides of the debate we're in.
I believe that IT is the biggest multiplier we have.
The main reason I get paid well is that what I do, I normally don't do for just one person or a single company.
Not disagreeing with you, but if you're saying "widely publicized fact", it might be good to provide a source. (Here's one from October 2023: https://www.businessinsider.com/phoenix-expanding-its-natura...)
You're right but this is such a prominent, ongoing conversation that I feel the burden is on the dissenter to show the very obvious and widely publicized facts are not accurate. We have a huge emissions problem and the tech industry is currently heavily promoting technologies with doubtful value propositions and very real and very significant increase in energy demands.
Unless we're going around in circles for the sheer joy of being argumentative, https://xkcd.com/1053/ applies.
> Those data centers are easily converted to green energy.
Energy that could be used for other things. I'm not saying it's bad to use power for AI, but just saying "well, it's green energy anyway" is short-sighted imho. At least as long as there are still things not yet powered by green energy.
I do understand what it means to reinvest energy, and I do believe it's absolutely worth it in this case.
The investment required for those data centers will act as a stabilizer for green-energy investment, and because well-known software companies are involved, it will mean better support on the software side.
Imagine a search engine but there's no context and it's 5x more expensive to run and there's no ad revenue.
I think in the long run, LLMs will be packaged with the OS (because only organizations like Microsoft and Apple have the resources to burn on them) and will run on local hardware (the consumer bears the cost of the computation).
Some of the hype is just VCs being excited about having somewhere to put their bets, and founders happy to gobble up the money. But it does feel a bit more tangible this time; for one, people are using it daily at work across a huge swath of roles, so it is providing real value.
I am interested in whether it will have diminishing returns for some roles, if the outputs aren't novel enough over a long period of time. E.g., it's used to write marketing copy, but I already see the "feel" of it getting criticized, and a lack of novel tone when it's used daily.
Is there a term for this kind of fallacy? The one where someone claims that something being attempted now has failed before, and so attempting it again means it will fail now as well.
It's obvious that modern AI is far, far superior to ELIZA or 50s perceptrons. It reminds me of people who said space travel would never be possible, or that heavier-than-air flight might be developed in ten million years.
But also the folks who predicted that we would be manning space stations in Saturn's orbit by now - when no human has left low earth orbit for 50 years or so.
I think intelligence is similar; the work of yesteryear pointed towards the possibilities of "flight", and LLMs provided the same sort of boost as the space race of the 50s and 60s. I am not sure where the analogy ends, and whether a "moonshot" will happen or things will stall at some point. At the moment I am thinking that a stall might happen. Autoregressive techniques have given us a very, very clever Hans; reinforcement learning seems to have disappeared from sight; graph networks are giving us fantastic tools (weather, biology) but nothing that appears or claims to have cognition.
>nothing that appears or claims to have cognition
Keep in mind LLMs are explicitly trained to avoid claiming they have cognition.
Someone forgot to tell Anthropic about that...
But we could have. If the US had kept funding NASA at the same pace as during the space race, we could probably have moon colonies and ships heading to Saturn by now.
It isn't a technology problem, it is a funding problem.
There are plenty of examples of self-awareness and reasoning, and even some that are arguably emotive, like laziness, annoyance, or playfulness. I'm not sure what cognition means in this context, but I suspect it has some kind of metaphysical component to it?
Examples of apparent self-awareness. With the cases I've seen so far the simplest explanation is that it used a phrase from an internet discussion and it's near impossible to prove otherwise.
I do believe artificial self-awareness is possible but I doubt we're at that point already. And until it reaches a certain level it will remain indistinguishable from patterns in training data.
> There are plenty of examples of self-awareness and reasoning
There aren't even objective, empirical definitions of “self-awareness” or “reasoning” that would support claims of examples. Those things are subjective internal experiences, not verifiable phenomena.
History shows that AI will get overhyped every time we make a breakthrough.
Really? I don't find it obvious that the current crop of AI is superior in any way to its predecessors. If anything, it's significantly worse if measured by the consequences accruing to its growing use. Art market down the drain, check. Gig writing gone, check. Programming literally eating itself... in progress. Massive terrorist attack missed by intel agencies because AI said no, check. Grid electricity demands prepped to skyrocket (wtf is global warming?), check. But hey, I guess it's neat that you can get ChatGPT to rewrite your resume in the style of Warhammer 40k...
Really, do you really think that it’s no more advanced, or do you just not like it? The fact that it’s causing programming to eat itself and getting rid of gig writing means it’s more powerful. ELIZA couldn’t do that. Say what you want about nuclear bombs but they were undoubtedly a technological advancement, you couldn’t blow up an entire city with black powder.
The phrase used was "superior", not "advanced". Words mean things. This shit is breaking important things for no reason other than shareholder value. This is not superior.
The nuclear bomb is superior to TNT, the gun on the A10 Warthog is superior to a flintlock pistol. Like it or not, modern AI is superior because it is more fit for purpose than something that can’t do anything at all.
You find yourself in a position where you're forced to make an analogy with weapons of mass destruction and the lightbulb still won't go on? Some people's kids...
Except we have human-like machines now. I run an LLM on my laptop which can write poetry, talk about the history of the Vikings, play games, tell jokes, hold conversations, bullshit when it doesn't know something, and make mistakes. All very human-like. It does some human-like things better than humans, and other human-like things not as well.
Goal-post moving? Sure. But I think it just betrays a deep discomfort with what has been achieved. Kind of like the meme that it's all just a giant plagiarism machine.
More human-like than in 1958 for sure, but back then they'd have said that e.g. playing chess is human-like.
And there are still a lot of opportunities to get more human-like today: it can't empty the dishwasher or drive a car yet, even if you gave it arms, legs, and eyes.
There's a weird dystopianness about having a machine that can write songs and poetry years or maybe decades before having one that can clean a toilet.
It doesn't help that the automated future imagined in the past would "free up" people so that they could spend all of their time writing songs and poetry.
Turns out millions of years of evolution make us really efficient manual laborers. We use so little energy to do all that we do, it's pretty amazing.
But yes; I remember Ruby and the Galactic Gumshoe, where the Andorrians were a race of artists who insisted that automation-induced unemployment was not a disease, it was the cure. When I think of the most secure jobs of the future these days, I think: toilet plumber and janitor.
I don't think aggregate job security ever really had anything to do with automation, only wealth distribution.
There are always 1000 things you can do for other people. The question is just whether those other people have the money to pay for it after rent and bills.
Increasingly they do not.
Self-cleaning public toilets have been a thing for years.
> it can't empty the dishwasher
This video [0] at 1:25 suggests that it can. At least in some near future.
Holy shit, is that video real? This isn't faked? Or are we going to find out it was ultra-staged? That robot seems like a leap.
Of course, they don't show it moving around. But the arms and hands seem better than in other demo videos.
I find your ideas intriguing and would like to purchase some tulip bulbs.
This article tries to come across as an enlightened sceptic, but it's actually really cringe because of its lack of insight into the real progress since then. Then again I don't know what people would expect from a politics editor with a liberal arts background.
>but it's actually really cringe because of its lack of insight into the real progress since then
Looks to me like it has a good enough understanding of the "real progress since then".
The article literally writes
>the Perceptron would lead to machines that “will be able to walk, talk, see, write, reproduce itself and be conscious of its existence.” More than six decades later, similar claims are being made about current artificial intelligence. So, what’s changed in the intervening years? In some ways, not much.
Which shows the author's complete lack of understanding of the basics. In ML research, it has become very clear that compute is everything and architectures are actually not that important. So what has changed? And why is today fundamentally different, despite us still using the same perceptron architecture with a few tweaks? Look at the Kurzweil curve [1]. The available compute power has allowed existing architectures to flourish beyond what was possible in the 60s. Do you think it's a coincidence that computers in the early 2010s were suddenly able to distinguish pictures of cats and dogs reliably, when that was utterly impossible just a decade earlier? Or perhaps you think it's weird that GPT-3/4-capability-level models arrived in the 2020s? Just look at the curve (which was made in the early 2000s, btw) to find out why.
[1] https://images.squarespace-cdn.com/content/v1/5ec0712f88e816...
> In ML research, it has become very clear that compute is everything and architectures are actually not that important.
Hate to say "no, you're wrong", but… this is just not true. The latest LLM stuff only came after the invention of the transformer architecture. After the development of a new architecture, we always have a "hey, if you throw a supercomputer at it, it works EVEN BETTER!" phase: think Deep Blue, which was just alpha-beta search with a fitted evaluation function on a really powerful machine.
I see no reason to believe that transformers doing text prediction are a generally-applicable AI architecture. I suspect they're even incapable of many things, no matter how big they get: GPT-4 remains incapable of the things I claimed GPT models were incapable of back in the GPT-2 days.
If you look at the original transformer paper, it was not a revolution. It was yet another few-percent improvement over the state of the art. The reason it is still around is that it was actually a step back from recurrent networks, one that re-allowed scaling once more. Recurrent models used to be the big hope towards AGI for a long time (and some people still cling to them), but they simply turned out to be too limited when you want to scale up. And in the end, it's all about scaling compute. No matter which way you look at it, this is the true constant over the past 60 years.
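(For concreteness, the core op of that paper is just scaled dot-product attention; a minimal single-head numpy sketch, with no masking or learned projections:)

    import numpy as np

    def attention(Q, K, V):
        # softmax(Q K^T / sqrt(d)) V -- the heart of "Attention Is All You Need"
        d = Q.shape[-1]
        scores = Q @ K.T / np.sqrt(d)
        w = np.exp(scores - scores.max(axis=-1, keepdims=True))
        w /= w.sum(axis=-1, keepdims=True)
        return w @ V

    # 4 tokens of dimension 8; every token attends to all others in parallel,
    # which is exactly what let this architecture scale where recurrence couldn't.
    x = np.random.randn(4, 8)
    print(attention(x, x, x).shape)  # (4, 8)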
>GPT-4 remains incapable of the things I claimed GPT models were incapable of back in the GPT-2 days.
Which are...? Everything we have seen so far from scaling LLMs suggests that capability is purely an issue of training/fine-tuning. It's not a question of the architecture.
My money was on recurrent networks being the next big thing, then I looked at Attention is All You Need, and I went "wow, that's a good idea". I don't remember ever looking at the benchmarks.
Quantitative improvement over state-of-the-art is nigh-irrelevant. You can drive that up arbitrarily high, for almost any given metric, just by throwing more compute at it (with a few architecture / dataset decisions to push it in a particular direction). That's not the kind of thing I consider an improvement: it doesn't make anything possible that wasn't already possible.
The measure of an AI system is fundamentally about doing stuff, but primarily about doing stuff efficiently. Computers have always been able to play chess by exhaustive brute-force search, given a galaxy-sized GPU and a few million years.
>You can drive that up arbitrarily high, for almost any given metric, just by throwing more compute at it
Thank you for confirming my point, because this is what it's all about. It might not seem like much if you only think on the timeframes of startups, but in the long run this will beat any architecture improvement. Yeah, maybe you can come up with a fancy revolutionary design and achieve an instant 50% improvement over SOTA. You will get all the attention of the media and prizes and stuff. But how often does that happen? On the other hand, you could also just silently wait one or two generations of Nvidia cards and get the same thing. And even better, this also works when there is absolutely no improvement in architectures by anyone. This is only highlighted by transformers: they came out in 2017, but only now have they been able to pass e.g. the bar exam.
Scale > architecture is the uncomfortable reality that all ML researchers eventually come to terms with.
In order to use scale you need a good architecture. A quadratic sort works just fine for 10 elements; the efficiency advantage of faster algorithms isn't very relevant until higher scale.
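(A toy demonstration of that point, using only the Python standard library:)

    import random, timeit

    def bubble_sort(a):              # deliberately quadratic
        a = a[:]
        for i in range(len(a)):
            for j in range(len(a) - 1 - i):
                if a[j] > a[j + 1]:
                    a[j], a[j + 1] = a[j + 1], a[j]
        return a

    for n in (10, 5_000):
        data = [random.random() for _ in range(n)]
        quad = timeit.timeit(lambda: bubble_sort(data), number=1)
        fast = timeit.timeit(lambda: sorted(data), number=1)  # O(n log n)
        print(n, quad / fast)  # the ratio explodes as n grows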
> Scale > architecture is the uncomfortable reality that all ML researchers eventually come to terms with.
Architecture -> scale. The current LLM architecture doesn't scale very well: you need massive increases in compute, memory, and data for tiny improvements. Likely there are much more efficient ways to train these models; it shouldn't be this hard for them to learn arithmetic.
>Architecture -> scale
Yeah, "scale > architecture" assumes that those are indepedent, whereas architecture affects scaling reach, needs, and options.
Also would a Perceptron with the original architecture do what an LLM does just given scale?
>Also would a Perceptron with the original architecture do what an LLM does just given scale?
Yes. The answer is yes. This is exactly what the universal approximation theorem [1] says, which was proven decades ago. It is also just another line of reasoning showing that compute matters more than architecture, though fewer people outside of ML research know about it. And it's also less practical, since the formal theorem has few empirics to go by for really large models. But mathematically, a scaled perceptron is 100% able to replicate any LLM.
[1] https://en.wikipedia.org/wiki/Universal_approximation_theore...
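(For reference, the classic statement, roughly in its Cybenko/Hornik form, is an approximation guarantee rather than a training recipe:)

    % One-hidden-layer networks are dense in C(K): for every continuous
    % f on a compact K \subset \mathbb{R}^n, every non-polynomial
    % activation \sigma, and every \varepsilon > 0, there exist N and
    % weights w_i, b_i \in \mathbb{R}, v_i \in \mathbb{R}^n such that
    \sup_{x \in K} \Big| f(x) - \sum_{i=1}^{N} w_i \,
        \sigma(v_i^\top x + b_i) \Big| < \varepsilon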
Given an LLM, you can construct a perceptron-based model that approximates it, but that doesn't mean there's any obvious way of training a perceptron model that would produce something like an LLM.
What the universal approximation theorem tells you is that there's some algorithm you could use to train a (sufficiently-large) perceptron model to behave arbitrarily-closely to the equivalently-trained LLM – which, yeah, of course there is. There's a way of training an array of floating point numbers to do that, too. It's called the transformer architecture.
(Your misunderstandings are similar to the sorts of misunderstandings I had before I got in the habit of trying to (abstractly) implement every CS concept I came across. Kudos for trying to reason about AI from first principles, but you have to do a lot of reasoning before you stop being much wronger than the empiricists.)
>Likely there are much more efficient ways to train these models; it shouldn't be this hard for them to learn arithmetic.
Oh really? And what would that look like? People have literally been looking at this for decades. And yet we're still using gradient descent with a few tweaks. So I would not bet on anyone coming up with something revolutionary. Especially not while hardware regularly revolutionizes the capabilities of models trained in this ancient way. Chipmakers have always allowed us to overcome the hurdle created by our lack of advances on the training architecture front. No matter whether you look at convnets or transformers.
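(And "gradient descent with a few tweaks" really is this small; a self-contained toy sketch in Python:)

    import numpy as np

    def sgd_step(theta, grad_fn, lr=1e-2):
        # The update everything since the perceptron era reduces to:
        # theta <- theta - lr * gradient(loss)
        return theta - lr * grad_fn(theta)

    # Toy: minimize f(x) = (x - 3)^2, whose gradient is 2(x - 3).
    theta = np.array([0.0])
    for _ in range(500):
        theta = sgd_step(theta, lambda t: 2 * (t - 3))
    print(theta)  # -> approximately [3.]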
> Then again I don't know what people would expect from a politics editor with a liberal arts background.
I'd expect better from them than from technologists who actually understand the technology, because the latter typically have unreserved praise for their creations and complete blindness to the negative effects of technology on society.
I agree, the article gives just one perspective on the matter, seeking similarities with the past and ignoring differences. But I believe this perspective has a right to exist: it gives you priors, and then you can incorporate the differences to update them.
It's not cringe because of a lack of insight into progress, quite the opposite. It's cringe because the author does not understand what an LLM is and what it can be used for.
I'm tired of people saying this when someone expresses a skeptical viewpoint. You don't need a doctorate in statistical computing to have an opinion.
It seems that people who DO understand LLMs are too blinded by the shininess to say one coherent word about it.
What's their background got to do with it?
Edit: actually read the article, and it just seems to note the cyclical nature of AI, which the lay person may not be aware of.
My perspective is that the "hallucination" problem with LLMs is either the same as or analogous to the Gell-Mann Amnesia effect with newspapers.
Outside our own field, they look amazing and insightful; within our field, we focus on the "wet pavements cause rain" level of mistakes that they make.
We need thoughts like this, even though it doesn't bring any new information. The question the author puts forth is basically: "Will it work this time?" This is a question many seem to believe the answer to is "yes".
I am one of those. However, distinguishing between truly autonomous AI systems and those that are simply high-performing can become challenging. The line between the two might become blurred when performance reaches a level that seems autonomous. In this context I believe that another "AI winter" will be hard to identify in the future.
It's not even a question of "will it work this time?" as if we are uncertain about some future capability. We already know that it works right now. There are demonstrable capabilities you can just go out and use this very moment that are beyond the most optimistic hopes of 10 years ago in both the text and image domain (and video but that's unreleased).
There is some tendency to discount the current state of AI because it doesn't match the level of hype - ignoring the fact that hype is non-linear and it makes no sense to speak of an appropriate level of hype anyway. All that stuff is just pop culture. If you can look past it and objectively view the current state of the art through the lens of 10 or even 5 years ago (gpt-2 era) IMHO you'll see we have enough progress to last 10 or 20 years without declaring a "winter" even if from here on out progress will be slow.
I agree, there has been exponential progress in the last two decades, and all indications suggest that this trend will persist. My point is that while autonomous AI may still be far off, it will become increasingly challenging to differentiate between an autonomous AI that operates and reprograms itself without human intervention, and a highly advanced AI system.
I avoid the term AI. We have machine learning, with a lot of viable implementations of it, and text calculators, which are predictive-text lossy compressors. Done. No AI hype :)
I was reading to see what insights the author had, then all of a sudden the article just ended... Not sure if I got the point of the article.
I agree. History of AI that we all know. Modern AI sometimes gets things wrong in interesting ways. The end.
It didn't even get to its own point (which is also wrong - we definitely haven't been here before).
*eye roll* I tell you, the amount of hype nowadays is a shameful display of rational thought.
It always is rational, until the hype cycle is over, and those believing in it go on about how they always knew it wasn't all that it was hyped up to be.
> We've been here before: AI promised humanlike machines – in 1958
But this time is for real. /s