
Welcome to Archality - AI Inference by Enthusiasts

FrgMstr · Just Plain Mean · Staff member · 2FA · Joined May 18, 1997 · Messages 58,009
https://x.com/KyleBennett/status/2059723048950046951

 
I'm going to assume a bot hasn't hijacked your account. Maybe this is a tricky question this early in the game, but how would this compare to other machine learning solutions from dedicated ML companies like Tenstorrent or Cerebras?
 
This AI stuff is not really my cup of tea, but I am all for transitioning AI workloads away from general-purpose compute over to more special-purpose FPGA and ASIC hardware.

I've seen some of your posts on LinkedIn, and it certainly seems like interesting tech.

That said, FPGAs and ASICs do often compete for the same upstream fab capacity, but this is definitely a move in the right direction.

I'm glad you've found something that piques your interest!

Here's hoping you upset Nvidia's AI-centric business model completely 😅
 
I'm going to assume a bot hasn't hijacked your account. Maybe this is a tricky question this early in the game, but how would this compare to other machine learning solutions from dedicated ML companies like Tenstorrent or Cerebras?
Can't really get into the bones of that right now, but the 10,000-foot view is that there is no comparing their technology to ours except for the job it does. :)
 
Here's hoping you upset Nvidia's AI-centric business model completely 😅

Nvidia is heading in that direction as well, with Groq. There is so much money and so much compute going into inference that solutions built from the ground up just for inference will appear more or less everywhere. AMD will have one (and obviously Google, OpenAI, and Amazon as well), starting with the upcoming Rubin in Nvidia's case.

Regular GPUs are still king for the prefill phase (the super-parallel workload that GPUs are made for). The play for Groq before the acquisition, for Cerebras, and I imagine for Archality is to have an AMD GPU or Google TPU do the prefill while they do the pure inference part.
 
That said, FPGAs and ASICs do compete for the same upstream fab capacity, but this is a move in the right direction.
Not necessarily exactly the same. Cerebras was on TSMC 14 for a long time and is now on TSMC 5. Groq is on a Samsung node for its future chips; they were on old TSMC and GlobalFoundries nodes until recently.

SRAM does not scale well anyway, and a highly deterministic dataflow runs fast even on older-generation nodes. The Archality BMAC could be the same; it is easy to imagine they would be more than happy to use what will by then be an old TSMC 7 node. Anything weight-resident could behave like regular SRAM in how little it scales with node advancement compared with logic.
 
A 2-second glance at this implies the monumental task of holding weights is accomplished by ASIC registers or SRAM. Not sure how the architectural difference of this strategy can minimize weight size to a level that is acceptable.

Almost all AI chips just have a massive amount of HBM on the package. How else would you get hundreds of gigs of fast mem?
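
For a rough sense of scale on the "hundreds of gigs" point, here is a back-of-the-envelope sketch. The model size (70 billion parameters), the precisions, and the 0.25 GB of on-chip memory per chip are all hypothetical numbers picked for illustration; none of them come from Archality or any vendor.

```python
import math

def weight_footprint_gb(num_params: float, bits_per_weight: int) -> float:
    """Memory needed just to hold the weights, in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9

def chips_needed(footprint_gb: float, on_chip_gb: float) -> int:
    """Chips required if every weight must live in on-chip memory."""
    return math.ceil(footprint_gb / on_chip_gb)

# Hypothetical model size, for illustration only.
PARAMS = 70e9
# Assumed on-chip memory per chip; not a real product spec.
ON_CHIP_GB = 0.25

for bits in (16, 8, 4):
    gb = weight_footprint_gb(PARAMS, bits)
    print(f"{bits}-bit weights: {gb:.0f} GB, {chips_needed(gb, ON_CHIP_GB)} chips")
```

Under these assumptions, lower precision cuts the footprint proportionally, but a weight-resident design still needs either many chips or a much denser way of storing weights than plain SRAM.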
 
Please, by all means, put all your eggs in one basket. A.I. is a shortcut to learning knowledge. The stock market will crash in 2029, so please, by all means, feed the illusion.
 
Please, by all means, put all your eggs in one basket. A.I. is a shortcut to learning knowledge. The stock market will crash in 2029, so please, by all means, feed the illusion.
Trust me, FrgMstr and any serious investors on this forum aren't doing that. But you have to play the game to win sometimes. And if his idea is as good as he is advertising, it could be a big win for somebody.
 
Wishing you the best with these endeavors. I'm very curious to see where AI is going in general with how much it's already advanced.
 
Trust me, FrgMstr and any serious investors on this forum aren't doing that. But you have to play the game to win sometimes. And if his idea is as good as he is advertising, it could be a big win for somebody

When you start out with "Trust me", it is a red flag. Ever heard of human psychology? They teach that when someone says "Trust me" or claims to have good intentions, it's always the opposite.
 
When you start out with "Trust me", it is a red flag. Ever heard of human psychology? They teach that when someone says "Trust me" or claims to have good intentions, it's always the opposite.
Okay, I'm trying to trick you and everyone posting their strategies in the Finance thread are liars and fools. :rolleyes:

I'm not trying to convince you to invest anyway, I'm just saying any serious investor on this forum who is investing isn't being stupid about it.
 

Are there any reductions in RAM use from a technology like this compared to the status quo, or is it pretty much the same?

Making AI training and calculation more efficient is definitely a huge win. Google's recent claims regarding massive RAM reduction were pretty encouraging, but I can't help but wonder if the relatively insatiable demand of the AI industry will just say "nice, thanks" and gobble it all up anyway...
 
Remember, the current pattern is that as hardware becomes more and more specialised for AI, the data centres become bigger and resources become more contested.

If "We will stop sucking everything dry once we can do what we're currently doing, with less" worked, then a datacentre smaller than a Walgreens would be able to serve the entire world with GPT-3-level quality.

But no. If you create a chip that can do what we can do now, only with less memory: the people in charge will see dollars and see what it can do with more memory, and inflate the scope of their goals to saturate resources, NOT try to accomplish the original goals easier.
 
Okay, I'm trying to trick you and everyone posting their strategies in the Finance thread are liars and fools. :rolleyes:

I'm not trying to convince you to invest anyway, I'm just saying any serious investor on this forum who is investing isn't being stupid about it.
A.I. is so great, right? This is why states are banning it, right?
[attached image]
 
A.I. is so great, right? This is why states are banning it, right?
"being incredibly lucrative" and "being great" are two different things.

Remember, we live in a society that states capital = god. If it makes money, it is the ultimate truth.

AI makes a lot of money. (not profit, god no. AI is not profitable, literally loses billions every quarter, but investors pour fountains of cash so it still makes money)
 
Remember, the current pattern is that as hardware becomes more and more specialised for AI, the data centres become bigger and resources become more contested.

If "We will stop sucking everything dry once we can do what we're currently doing, with less" worked, then a datacentre smaller than a Walgreens would be able to serve the entire world with GPT-3-level quality.

But no. If you create a chip that can do what we can do now, only with less memory: the people in charge will see dollars and see what it can do with more memory, and inflate the scope of their goals to saturate resources, NOT try to accomplish the original goals easier.
Yes and no.

If someone is offering services for a fraction of the cost and 90% of the performance, that pulls business from more expensive models.

That's absolutely going to start happening, because at some level (below where we are) the model's performance is less and less important vs the tooling around it.

The CPU demand showing up is probably largely for that tooling.
 
Yes and no.

If someone is offering services for a fraction of the cost and 90% of the performance, that pulls business from more expensive models.
I think you are saying the same thing... the cheaper the model, the more the agents will use it; the better the model, the more the agents will use it; and as it gets cheaper and better, new things become possible.

If cars moved at horse speed carrying a similar weight, they would have been much better for the environment. Same for the Internet if the data were just letters, newspapers, and books. We will probably see something similar: usage explodes with efficiency, for a bigger total use, for a long time.
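
The "usage explodes with efficiency" argument can be put into a toy calculation. This is only a sketch under a constant-elasticity demand model, which is my assumption and not something anyone in the thread proposed; `elasticity` is a hypothetical parameter describing how strongly demand reacts to falling cost.

```python
def total_resource_use(efficiency_gain: float, elasticity: float,
                       baseline: float = 1.0) -> float:
    """Total resource use after an efficiency gain, relative to the baseline.

    efficiency_gain: how many times less resource each unit of work needs.
    elasticity: assumed price elasticity of demand (hypothetical value).
    """
    cost_ratio = 1.0 / efficiency_gain       # cost per unit of work falls
    usage = cost_ratio ** (-elasticity)      # demand grows as cost falls
    return baseline * usage * cost_ratio     # work done x resource per unit of work

# A 10x efficiency gain under weak, neutral, and strong demand response.
for e in (0.5, 1.0, 1.5):
    print(f"elasticity={e}: total use = {total_resource_use(10.0, e):.2f}x baseline")
```

In this model total use only falls when elasticity is below 1. Above 1, the efficiency gain is more than eaten up by new usage, which is the pattern both posters are describing.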

AI makes a lot of money. (not profit, god no. AI is not profitable, literally loses billions every quarter, but investors pour fountains of cash so it still makes money)
AI is a broad term. Some AI has been among the most profitable things in history for a decade now: AI content suggestion and ad targeting. Look at Meta's recent quarters to see an acceleration on that side of things. The Google/Meta/Walmart/Amazon usage of AI has been profitable all along, and the cash flow from that usage has been paying for a large part of the build-out for the new kind.

Anthropic had a positive EBITDA last quarter, so even frontier LLM-type AI seems to be on the cusp now. That is not just because inference on Blackwell has been ultra profitable, in the 70-80% zone, but because it has become so big that it paid for training the next model, for what is now the most profitable company in that field.
 
Are there any reductions in RAM use from a technology like this compared to the status quo, or is it pretty much the same?

Making AI training and calculation more efficient is definitely a huge win. Google's recent claims regarding massive RAM reduction were pretty encouraging, but I can't help but wonder if the relatively insatiable demand of the AI industry will just say "nice, thanks" and gobble it all up anyway...

a datacentre smaller than a Walgreens would be able to serve the entire world with GPT-3-level quality.

But no. If you create a chip that can do what we can do now, only with less memory: the people in charge will see dollars and see what it can do with more memory, and inflate the scope of their goals to saturate resources, NOT try to accomplish the original goals easier.

That's because the current idea/plan of creating/achieving AGI is compounding/piling complexities upon complexities (which manifests itself as more and more resources/hardware/datacenters) ad nauseam until the complexities become so complex that AGI is achieved, in theory (the Scaling Hypothesis). I don't subscribe to this entirely, as I feel that, if it is achievable, a quantum component is needed (because we have quantum phenomena going on in our own conscious brains/bodies, which also scales down the needed complexity). But this should explain why, even if you make something that 'can do more with less', in today's market/mindset, as others have pointed out, that just 'makes more room to pile more complexities/resources on'.
 
This on the same day that Valve raises prices on the Steam Deck. Coincidence? I think not.
 
That's because the current idea/plan of creating/achieving AGI is compounding/piling complexities upon complexities (which manifests itself as more and more resources/hardware/datacenters) ad nauseam until the complexities become so complex that AGI is achieved, in theory (the Scaling Hypothesis). I don't subscribe to this entirely, as I feel that, if it is achievable, a quantum component is needed (because we have quantum phenomena going on in our own conscious brains/bodies, which also scales down the needed complexity). But this should explain why, even if you make something that 'can do more with less', in today's market/mindset, as others have pointed out, that just 'makes more room to pile more complexities/resources on'.

I'm not convinced AGI is even possible, and even if it is, I'm not sure why we want it.

There is no doubt AI - particularly Machine Learning elements - can add real value, but AGI really seems like a bridge too far.
 
I'm not convinced AGI is even possible, and even if it is, I'm not sure why we want it.

There is no doubt AI - particularly Machine Learning elements - can add real value, but AGI really seems like a bridge too far.

I'm not 100% convinced it's achievable either - but as previously stated if it is, I think at the very least some of it needs to be quantum. But as for the idea itself - as you see with ray tracing from its conception all those decades ago (or centuries ago if you attribute it to Albrecht Dürer), to flight, to space travel, to virtual reality etc - once humanity has a technical idea/goal, it continuously strives to achieve it, no matter what (and AGI is a pretty aspirational/monumental idea/goal at that, as far as ideas go).

 
Not to be a Debbie Downer, but IF I understand correctly these guys want to build silicon. (If I have misunderstood, then call me the prince of noobdom.)

So who is going to build this silicon into motherboards or PCI-E cards, as opposed to datacenter rack cards?
 
Not necessarily exactly the same
As an example of this, here is the Nvidia "equivalent" they are making for this tech (not at all the same in some ways, but weight-resident on silicon and an inference-only chip, so a very direct competitor):

https://www.sammobile.com/news/samsung-makes-groq-3-lpu-chips-nvidia/
https://newsletter.semianalysis.com/p/nvidia-the-inference-kingdom-expands

One of the benefits of relying on SF4 (Samsung foundry) is that it isn’t constrained like TSMC’s N3, which is putting a cap on accelerator production and is a key reason why the industry remains compute constrained. This is in addition to not having HBM which is also constrained. This allows Nvidia to ramp production of the LPU without sacrificing or eating into their valuable TSMC allocation or HBM allocations, representing true incremental revenue and capacity that noone else can access.

The whole Groq LPU part of the upcoming Nvidia datacenter is built on a different supply chain in some critical ways: no HBM, no TSMC.

And the next generation will be on "old" TSMC 3/CoWoS (top of the line a couple of years ago, but not at all in 2028), while the rest of their chips will be on a TSMC 2 family node with newer packaging tech.

That type of specialized silicon for each step:
[attached image]


And once models stop advancing at a super pace, silicon made specially for part of a specific popular model will start to appear as well. Nvidia's projections are for a 3500%-type gain per megawatt for the inference part vs GB200 Blackwell, opening the door to a ~10x gain for the total system in an ideal agentic, mixture-of-experts large language model scenario.
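
One way to see how a big gain on one part turns into a smaller gain for the whole system is an Amdahl's-law-style estimate. This is my own framing, not Nvidia's stated method; reading "3500%" as roughly 35x is an assumption, and the fractions below are hypothetical.

```python
def system_gain(accelerated_fraction: float, part_gain: float) -> float:
    """Amdahl-style overall gain when only part of the budget is accelerated.

    accelerated_fraction: share of the baseline budget spent on the part
        that gets faster (hypothetical values below).
    part_gain: speedup of that part alone.
    """
    remaining = 1.0 - accelerated_fraction
    return 1.0 / (remaining + accelerated_fraction / part_gain)

PART_GAIN = 35.0  # assumed reading of the "3500%" figure

for fraction in (0.5, 0.8, 0.93):
    gain = system_gain(fraction, PART_GAIN)
    print(f"accelerated share {fraction:.2f}: system gain ~{gain:.1f}x")
```

Under this sketch, a ~10x total only shows up when the accelerated part dominates the baseline budget, which fits the "under ideal conditions" caveat.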
 
Hey, if you can fix the industry so that people can actually get their hands on hardware again for a reasonable price, then I am all for it. The proper tools for a job will always outshine brute force.
 
Hey, if you can fix the industry so that people can actually get their hands on hardware again for a reasonable price, then I am all for it. The proper tools for a job will always outshine brute force.
Bingo. Have to think outside of the box. Cerebras and Groq are the only players now and those still require huge silicon and memory footprints.
 
Oy...
I'm not convinced AGI is even possible, and even if it is, I'm not sure why we want it.

There is no doubt AI - particularly Machine Learning elements - can add real value, but AGI really seems like a bridge too far.
Same reason we still look for a theory of everything. We don't need it since QED and GR are perfectly useful for whatever practical application we need them for. Maybe AGI will tell us something about ourselves? Either way, I think we get there in time, but it will look pretty different from what we have today.
 
Oy...

Same reason we still look for a theory of everything. We don't need it since QED and GR are perfectly useful for whatever practical application we need them for. Maybe AGI will tell us something about ourselves? Either way, I think we get there in time, but it will look pretty different from what we have today.

42.
 
Interested to see what you guys are working on. Something I've been waiting for people to do is make a multi-ASIC set, where each ASIC is responsible for, for lack of better wording, its own domain of AI, with a memory pool sized for that domain, instead of having one thing for everything with a large pool. A large pool would work fine with it too, I guess, but separating things out can make a ton of sense.
 