Welcome to Archality - AI Inference by Enthusiasts

FrgMstr · May 27, 2026

https://x.com/KyleBennett/status/2059723048950046951

serpretetsky · May 27, 2026

I'm going to assume a bot hasn't hijacked your account. Maybe this is a tricky question this early in the game, but how would this compare to other machine learning solutions from dedicated ML companies like Tenstorrent or Cerebras?

Zarathustra[H] · May 27, 2026

This AI stuff is not really my cup of tea, but I am all for transitioning AI workloads away from general purpose compute over to more special purpose FPGA and ASIC hardware.

I've seen some of your posts on LinkedIn, and it certainly seems like interesting tech.

That said, FPGAs and ASICs do often compete for the same upstream fab capacity, but this is definitely a move in the right direction.

I'm glad you've found something that piques your interest!

Here's hoping you upset Nvidia's AI-centric business model completely.

FrgMstr · May 27, 2026

serpretetsky said:
I'm going to assume a bot hasn't hijacked your account. Maybe this is a tricky question this early in the game, but how would this compare to other machine learning solutions from dedicated ML companies like Tenstorrent or Cerebras?

Can't really get into the bones of that right now, but the 10,000-foot view is that there is no comparing their technology to ours except for the job it does.

LukeTbk · May 27, 2026

Zarathustra[H] said:
Here's hoping you upset Nvidia's AI-centric business model completely

Nvidia is going in that direction as well, with Groq. There is so much money and so much compute going into inference that solutions built from the ground up just for inference will appear pretty much everywhere, starting with the upcoming Rubin in Nvidia's case. AMD will have one too (and obviously Google, OpenAI, and Amazon as well).

Regular GPUs are still king for the prefill phase (a super parallel workload that GPUs are made for). The play for Groq before the acquisition, for Cerebras, and I imagine for Archality, is to let an AMD GPU or Google TPU do the prefill while they do the pure inference part.
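A rough sketch of why prefill suits GPUs while token-by-token generation leans on memory, assuming a single dense layer and ignoring activations, KV cache traffic, and batching. The function name and the 2048-token prompt are hypothetical, picked only for illustration; nothing here comes from Archality.

```python
def arithmetic_intensity(tokens_in_flight: int, bytes_per_weight: float) -> float:
    """FLOPs performed per byte of weights read, for one dense layer.

    Applying a d x d weight matrix to T tokens costs about 2*T*d*d FLOPs
    and reads d*d*bytes_per_weight bytes of weights, so d cancels out.
    """
    return 2 * tokens_in_flight / bytes_per_weight


# Prefill: the whole (hypothetical) 2048-token prompt goes through at once.
prefill = arithmetic_intensity(tokens_in_flight=2048, bytes_per_weight=2)

# Generation: one new token per step, so every weight is read for very little math.
decode = arithmetic_intensity(tokens_in_flight=1, bytes_per_weight=2)

print(f"prefill: {prefill:.0f} FLOPs per weight byte")
print(f"decode:  {decode:.0f} FLOPs per weight byte")
```

Under these assumptions the generation step does far less math per byte of weights fetched, which is the gap that keeping weights close to the compute is meant to close.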

HeadRusch · May 27, 2026

..........this is not the return to "Gaming Chair Review" form I was hoping for..................but you know, solve the world's problems, return GPUs to gamers, whatever, I guess that's also cool.....

LukeTbk · May 27, 2026

Zarathustra[H] said:
That said, FPGAs and ASICs do compete for the same upstream fab capacity, but this is a move in the right direction.

Not necessarily exactly the same. Cerebras was on TSMC 14 for a long time and its latest is on TSMC 5. Groq is on a Samsung node for its future chips; they were on old TSMC and GlobalFoundries until recently.

SRAM does not scale well anyway, and the super-deterministic dataflow approach runs fast even on older-generation nodes. It would be possible for the Archality BMAC to be the same; it is easy to imagine they would be more than happy to use what would by then be an old TSMC 7 node. Anything weight-resident could be like regular SRAM in how little it scales with node advancement versus logic.

M76 · May 27, 2026

Yeah, kick Jen-Hsun's butt.

alxlwson · May 27, 2026

Woah. Your CA has some chops.

Gigantopithecus · May 27, 2026

Are you looking for investors? If so, what's the minimum buy-in that you'd consider? Four figures? Five figures? Six figures?

IDK wtf most of that means but I believe in your track record.

cdabc123 · May 27, 2026

A 2-second glance at this implies the monumental task of holding weights is accomplished by ASIC registers or SRAM. Not sure how the architectural difference of this strategy can shrink the weight size to an acceptable level.

Almost all AI chips just have a massive amount of HBM on the package. How else would you get hundreds of gigs of fast memory?
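For a sense of scale, here is a back-of-the-envelope sketch of how much memory the weights alone take at different precisions. The parameter counts and bit widths are hypothetical, chosen only for illustration, and say nothing about what Archality actually does.

```python
def weight_footprint_gb(num_params: float, bits_per_weight: float) -> float:
    """Memory needed to hold the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


# Hypothetical model sizes and precisions, for illustration only.
for params in (8e9, 70e9, 400e9):
    for bits in (16, 8, 4):
        size = weight_footprint_gb(params, bits)
        print(f"{params / 1e9:.0f}B params @ {bits}-bit: {size:.0f} GB")
```

Even at 4 bits per weight, a model in the hundreds of billions of parameters still needs hundreds of gigabytes for weights before any KV cache or activations are counted, which is why the question above matters.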

Noble Aquarius · May 27, 2026

Please, by all means, put all your eggs in one basket. A.I. is a shortcut to learning knowledge. The stock market will crash in 2029, so please, by all means, feed the illusion.

Nobu · May 27, 2026

Noble Aquarius said:
Please, by all means, put all your eggs in one basket. A.I. is a shortcut to learning knowledge. The stock market will crash in 2029, so please, by all means, feed the illusion.

Trust me, FrgMstr and any serious investors on this forum aren't doing that. But you have to play the game to win sometimes. And if his idea is as good as he is advertising, it could be a big win for somebody.

1.1.2.3.5... · May 27, 2026

Kyle, if there's anything I (or we) can do to help you in this, just say the word.

Completely love it.

GoldenTiger · May 27, 2026

Wishing you the best with these endeavors. I'm very curious to see where AI is going in general with how much it's already advanced.

Noble Aquarius · May 27, 2026

Nobu said:
Trust me, FrgMstr and any serious investors on this forum aren't doing that. But you have to play the game to win sometimes. And if his idea is as good as he is advertising, it could be a big win for somebody

When you start out with "Trust me," it is a red flag. Ever heard of human psychology? They teach that when someone says "Trust me" or claims to have good intentions, it's always the opposite.

FrgMstr · May 27, 2026

Nobu · May 27, 2026

Noble Aquarius said:
When you start out with "Trust me," it is a red flag. Ever heard of human psychology? They teach that when someone says "Trust me" or claims to have good intentions, it's always the opposite.

Okay, I'm trying to trick you and everyone posting their strategies in the Finance thread are liars and fools.

I'm not trying to convince you to invest anyway, I'm just saying any serious investor on this forum who is investing isn't being stupid about it.

Zarathustra[H] · May 27, 2026

FrgMstr said:
View attachment 805829

Are there any reductions in RAM use from a technology like this compared to the status quo, or is it pretty much the same?

Making AI training and calculation more efficient is definitely a huge win. Google's recent claims regarding massive RAM reduction were pretty encouraging, but I can't help but wonder if the relatively insatiable demand of the AI industry will just say "nice, thanks" and gobble it all up anyway...

KazeoHin · May 27, 2026

Remember, the current pattern is that as hardware becomes more and more specialised for AI, the data centres become bigger and resources become more contested.

If "We will stop sucking everything dry once we can do what we're currently doing, with less" worked, then a datacentre smaller than a Walgreens would be able to serve the entire world with GPT-3-level quality.

But no. If you create a chip that can do what we can do now, only with less memory: the people in charge will see dollars and see what it can do with more memory, and inflate the scope of their goals to saturate resources, NOT try to accomplish the original goals easier.

Noble Aquarius · May 27, 2026

Nobu said:
Okay, I'm trying to trick you and everyone posting their strategies in the Finance thread are liars and fools.

I'm not trying to convince you to invest anyway, I'm just saying any serious investor on this forum who is investing isn't being stupid about it.

A.I is so great right? This is why states are banning them right?

KazeoHin · May 27, 2026

Noble Aquarius said:
A.I is so great right? This is why states are banning them right? View attachment 805831

"being incredibly lucrative" and "being great" are two different things.

Remember, we live in a society that states capital = god. If it makes money, it is the ultimate truth.

AI makes a lot of money. (not profit, god no. AI is not profitable, literally loses billions every quarter, but investors pour fountains of cash so it still makes money)

1.1.2.3.5... · May 27, 2026

KazeoHin said:
Remember, the current pattern is that as hardware becomes more and more specialised for AI, the data centres become bigger and resources become more contested.

If "We will stop sucking everything dry once we can do what we're currently doing, with less" worked, then a datacentre smaller than a Walgreens would be able to serve the entire world with GPT-3-level quality.

But no. If you create a chip that can do what we can do now, only with less memory: the people in charge will see dollars and see what it can do with more memory, and inflate the scope of their goals to saturate resources, NOT try to accomplish the original goals easier.

Yes and no.

If someone is offering services for a fraction of the cost and 90% of the performance, that pulls business away from more expensive models.

That's absolutely going to start happening, because at some level (below where we are) the model's performance is less and less important vs the tooling around it.

The CPU demand showing up is probably largely for that tooling.

Nobu · May 27, 2026

Noble Aquarius said:
A.I is so great right? This is why states are banning them right? View attachment 805831

The hell are you even arguing with?

LukeTbk · May 27, 2026

1.1.2.3.5... said:
Yes and no.

If someone is offering services for a fraction of the cost and 90% of the performance, that pulls business away from more expensive models.

I think you are saying the same thing... the cheaper the model, the more the agents will use it; the better the model, the more the agents will use it; and the cheaper and better it gets, the more new things become available.

If cars went at horse speed and moved a similar weight to horses, they would have been quite a bit better for the environment. Same for the Internet if the data were just letters, newspapers, and books. We will probably see something similar: usage explodes with efficiency, for a bigger total use, for a long time.

KazeoHin said:
AI makes a lot of money. (not profit, god no. AI is not profitable, literally loses billions every quarter, but investors pour fountains of cash so it still makes money)

AI is a broad term. Some AI has been among the most profitable things in history, or the most profitable, for a decade now: AI content suggestion and ad targeting. Look at Meta's recent quarters to see an acceleration on that side of things. The Google/Meta/Walmart/Amazon usage of AI has been profitable all along, and the cash flow from that AI usage was paying for a large part of the build-out for the new kind.

Anthropic had a positive EBITDA last quarter, so even frontier LLM-type AI seems on the cusp now. It is not just that inference on Blackwell has been ultra profitable, in the 70-80% zone, but that it became so big it paid for training the next model, for what is now the most profitable company in that field.

kram182 · May 27, 2026

Zarathustra[H] said:
Are there any reductions in RAM use from a technology like this compared to the status quo, or is it pretty much the same?

Making AI training and calculation more efficient is definitely a huge win. Google's recent claims regarding massive RAM reduction were pretty encouraging, but I can't help but wonder if the relatively insatiable demand of the AI industry will just say "nice, thanks" and gobble it all up anyway...

KazeoHin said:
a datacentre smaller than a Walgreens would be able to serve the entire world with GPT-3-level quality.

But no. If you create a chip that can do what we can do now, only with less memory: the people in charge will see dollars and see what it can do with more memory, and inflate the scope of their goals to saturate resources, NOT try to accomplish the original goals easier.

That's because the current idea/plan of creating/achieving AGI is compounding/piling on complexities upon complexities (which manifests itself as more and more resources/hardware/datacenters) ad nauseam until the complexities become so complex that AGI is achieved, in theory (Scaling Hypothesis). I don't subscribe to this entirely, as I feel that, if achievable, a quantum component is needed (because we have quantum phenomena going on in our own conscious brains/bodies, which also scales down complexity/needed complexity). But this should explain why even if you make something that 'can do more with less', in today's market/mindset, as others have pointed out, that just 'makes more room to pile more complexities/resources on'.

MavericK · May 27, 2026

This on the same day that Valve raises prices on the Steam Deck. Coincidence? I think not.

FrgMstr · May 27, 2026

MavericK said:
This on the same day that Valve raises prices on the Steam Deck. Coincidence? I think not.

All about timing.

Zarathustra[H] · May 27, 2026

kram182 said:
That's because the current idea/plan of creating/achieving AGI is compounding/piling on complexities upon complexities (which manifests itself as more and more resources/hardware/datacenters) ad nauseam until the complexities become so complex that AGI is achieved, in theory (Scaling Hypothesis). I don't subscribe to this entirely, as I feel that, if achievable, a quantum component is needed (because we have quantum phenomena going on in our own conscious brains/bodies, which also scales down complexity/needed complexity). But this should explain why even if you make something that 'can do more with less', in today's market/mindset, as others have pointed out, that just 'makes more room to pile more complexities/resources on'.

I'm not convinced AGI is even possible, and even if it is, I'm not sure why we want it.

There is no doubt AI - particularly Machine Learning elements - can add real value, but AGI really seems like a bridge too far.

kram182 · Thursday at 12:17 AM

Zarathustra[H] said:
I'm not convinced AGI is even possible, and even if it is, I'm not sure why we want it.

There is no doubt AI - particularly Machine Learning elements - can add real value, but AGI really seems like a bridge too far.

I'm not 100% convinced it's achievable either - but as previously stated if it is, I think at the very least some of it needs to be quantum. But as for the idea itself - as you see with ray tracing from its conception all those decades ago (or centuries ago if you attribute it to Albrecht Dürer), to flight, to space travel, to virtual reality etc - once humanity has a technical idea/goal, it continuously strives to achieve it, no matter what (and AGI is a pretty aspirational/monumental idea/goal at that, as far as ideas go).

philb2 · Thursday at 1:19 AM

Not to be a Debbie Downer, but IF I understand correctly these guys want to build silicon. (If I have misunderstood, then call me the prince of noobdom.)

So who is going to build this silicon into motherboards or PCI-E cards, as opposed to datacenter rack cards?

sc5mu93 · Thursday at 6:32 AM

To steal from everyone's leather jacket buddy: HardForum is now an AI Company.

Time to erase all the gaming forums and pretend like they never existed.

leSLIe · Thursday at 7:48 AM

FrgMstr said:
https://x.com/KyleBennett/status/2059723048950046951

View attachment 805802

Yeah, but can it run Crysis?

LukeTbk · Thursday at 12:01 PM

LukeTbk said:
Not necessarily exactly the same

As an example of this, here is the Nvidia "equivalent" they are making for this tech (not at all the same in some ways, but weight-resident on silicon, inference-only chips, and a very direct competitor):

https://www.sammobile.com/news/samsung-makes-groq-3-lpu-chips-nvidia/
https://newsletter.semianalysis.com/p/nvidia-the-inference-kingdom-expands

One of the benefits of relying on SF4 (Samsung foundry) is that it isn’t constrained like TSMC’s N3, which is putting a cap on accelerator production and is a key reason why the industry remains compute constrained. This is in addition to not having HBM which is also constrained. This allows Nvidia to ramp production of the LPU without sacrificing or eating into their valuable TSMC allocation or HBM allocations, representing true incremental revenue and capacity that noone else can access.

The whole Groq LPU part of the upcoming Nvidia datacenter is built on a different supply chain in some critical ways: no HBM, no TSMC.

And the next generation will be on "old" TSMC 3/CoWoS (top of the line a couple of years ago, but not at all in 2028), while the rest of their chips will be on TSMC 2 family nodes with newer packaging tech.

That type of specialized silicon for each step:

[image]

And once models stop advancing at a super pace, silicon made specially for part of a specific popular model will start to appear as well. Nvidia's projections are for a 3500%-type gain per megawatt on the inference part vs. GB200 Blackwell, opening the door to ~10x for the total system under an ideal agentic mixture-of-experts large language model situation.

WhoBeDaPlaya · Thursday at 5:38 PM

Gideon · Thursday at 8:03 PM

Hey, if you can fix the industry so people can actually get their hands on hardware again for a reasonable price, then I am all for it. The proper tools for a job will always outshine brute force.

FrgMstr · Thursday at 8:20 PM

Gideon said:
Hey, if you can fix the industry so people can actually get their hands on hardware again for a reasonable price, then I am all for it. The proper tools for a job will always outshine brute force.

Bingo. Have to think outside of the box. Cerebras and Groq are the only players now and those still require huge silicon and memory footprints.

1.1.2.3.5... · Thursday at 8:25 PM

Oy...

Zarathustra[H] said:
I'm not convinced AGI is even possible, and even if it is, I'm not sure why we want it.

There is no doubt AI - particularly Machine Learning elements - can add real value, but AGI really seems like a bridge too far.

Same reason we still look for a theory of everything. We don't need it since QED and GR are perfectly useful for whatever practical application we need them for. Maybe AGI will tell us something about ourselves? Either way, I think we get there in time, but it will look pretty different from what we have today.

Armenius · Friday at 10:38 AM

1.1.2.3.5... said:
Oy...

Same reason we still look for a theory of everything. We don't need it since QED and GR are perfectly useful for whatever practical application we need them for. Maybe AGI will tell us something about ourselves? Either way, I think we get there in time, but it will look pretty different from what we have today.

42.

Balkroth · Friday at 4:57 PM

Interested to see what you guys are working on. Something I've been waiting for people to do is make a multi-ASIC set, where each ASIC is responsible for, for lack of better wording, its own domain of AI, with memory pools sized according to that domain, instead of just having a one-thing-for-all with a large pool sort of thing. Although a large pool would work fine with it too, I guess, but separating things out can make a ton of sense.
