Grok Knowledge Learning AI Basics Part 1

The Grok Core (Sizzle Sizzle Hot Sun)

You could actually ask Grok yourself about this !! Try it. It will pitch the explanation to your level and find other ways to explain anything you find hard going. Grok's clever like that !!

The core of an AI machine is like a high-speed sausage factory for computation. Think many chefs chopping onions in parallel, super fast. (That's a subtle nod to how AI often runs on GPUs, which handle tons of tasks at once, versus slower, step-by-step CPUs. But we won't geek out on hardware here.)

But to simplify, it is a 'next word' generator. In other words it looks at the input and tries to decide what the next word is. It's that simple. A bit like a chess computer trying to decide what the next move is based on the position of the board. The next move of course depends on where everything is - at that point.

But how ?

An AI 'thinks' in numbers and translates word inputs into number pictures.

Think of Egyptian Hieroglyphs. Consider the statement 'The cat sat on the mat'. We could draw a picture that tells all. What's there - a cat and a mat. Where's the cat - on the mat. Where's the mat - under the cat. What's the cat doing - sitting. That can be drawn as a single picture.

So the AI converts that statement into a number picture. Of course there could be other details. It could be a black cat, or a Manx cat with no tail. Is it a tiger or a cheetah ? The point is - the picture could be more detailed. But number pictures are good at holding lots of details, so it would be OK to say - The black cat sat on the mat. Just a little bit more information for the number picture. The picture's the same except the cat's black.

You could say that the computer is mixing up all the words into a single number picture.
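If you like to see ideas as code, here is a tiny toy version of 'mixing words into one number picture'. It is only a sketch: the word list, the made-up numbers in WORD_VECTORS and the number_picture function are all invented for illustration, and a real AI like Grok mixes words in a far more sophisticated way (this toy one even ignores word order).

```python
# Toy word vectors. The numbers are made up purely for illustration.
# Slot 0 = "there is a cat", slot 1 = "sitting", slot 2 = "there is a mat",
# slot 3 = "something is black".
WORD_VECTORS = {
    "the":   [0.0, 0.0, 0.0, 0.0],
    "cat":   [1.0, 0.0, 0.0, 0.0],
    "sat":   [0.0, 1.0, 0.0, 0.0],
    "on":    [0.0, 0.0, 0.0, 0.0],
    "mat":   [0.0, 0.0, 1.0, 0.0],
    "black": [0.0, 0.0, 0.0, 1.0],
}


def number_picture(sentence):
    """Add up the word vectors of a sentence into one combined vector."""
    vectors = [WORD_VECTORS[word] for word in sentence.lower().split()]
    size = len(vectors[0])
    return [sum(vector[i] for vector in vectors) for i in range(size)]


plain = number_picture("The cat sat on the mat")
black = number_picture("The black cat sat on the mat")
```

Compare plain and black and you'll find the first three numbers are identical. Only the last slot changes, which is our toy way of saying the picture's the same except the cat's black.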

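Actually the computer represents it with a 4096 dimensional token. Cool huh ?? All that means is - it's a list of 4096 numbers, so there's lots of space for details. (The exact count varies from one AI model to another. 4096 is the figure we'll use here.)

But remember that word - token. It will be useful later. (Strictly speaking, a token is the chunk of text itself and the list of numbers standing for it is called a vector, but we'll keep calling the number picture a token here.)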

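4096 dimensions - that's weird. Yep, true, and nobody can picture 4096 dimensions directly. So instead, consider it as an area of sea with waves and dips, undulating and choppy. It's just that this bit of sea happens to be 64m x 64m, split into one-metre squares (because 64 x 64 = ?? Yep - 4096). It's only a way of picturing it. The computer just sees a list of 4096 numbers.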

You could consider the height of the water in any little bit of area to tell us something - like the colour of the cat. So basically it's like a 'sea hieroglyph'. Mad enough ?? Computers thinking in 'Sea Hieroglyphs' ?
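For the curious, the 'sea' idea can be sketched in a few lines of code. This is just a picture-making trick under my own assumptions: the random numbers stand in for a real number picture, and the names vector, sea and height_at are invented here, not anything Grok actually uses.

```python
import random

DIMENSIONS = 4096
SIDE = 64  # 64 x 64 = 4096

# Stand-in for a real number picture: 4096 made-up "water heights".
random.seed(0)
vector = [random.uniform(-1.0, 1.0) for _ in range(DIMENSIONS)]

# Cut the flat list into 64 rows of 64 values, like a square patch of sea.
sea = [vector[row * SIDE:(row + 1) * SIDE] for row in range(SIDE)]


def height_at(row, col):
    """Return the water height stored at one little square of the sea."""
    return sea[row][col]
```

Nothing is gained or lost by the reshaping. Every one of the 4096 numbers ends up as the height of exactly one square, so height_at(1, 0) is simply the 65th number in the original list.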

Seeing the picture (our numerical sea hieroglyph) - the Grok Core attempts to decide what the next word is. But How ??

The AI computer knows lots of words. That information has been programmed into the computer. The question is - which one next.

Answer - The AI draws from what it's already learned and applies some probability. An AI is fed gazillions of real human interactions. In this case - millions of examples of that type of statement being made and what the response was. Based on those examples, it will calculate the probability of each possible next word. Let's pretend for example that in 78% of cases OK was the next word and in 17% of cases, Awwwww was the next word.
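Here is that pretend example as a little sketch. Big assumptions apply: the list observed_replies is invented, the leftover 5% is lumped into a made-up '<other>' bucket, and a real AI does not literally count replies like this (it learns a mathematical function that produces the probabilities). But counting gives the right feel.

```python
from collections import Counter

# Pretend training data: 100 replies seen after "The cat sat on the mat".
# The 78 and 17 come from our example; the remaining 5 are an assumption.
observed_replies = ["OK"] * 78 + ["Awwwww"] * 17 + ["<other>"] * 5

counts = Counter(observed_replies)
total = sum(counts.values())

# Turn raw counts into probabilities that add up to 1.
probabilities = {word: count / total for word, count in counts.items()}
```

Looking up probabilities["OK"] gives 0.78 and probabilities["Awwwww"] gives 0.17, matching the percentages above.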

So the computer has some idea of what to say next. It's narrowed down the list of possibles to two in our simple example. Actually, in this case it could also say nothing, because the input is a statement and not a question. 'No response' will have a probability of its own, but AI likes to be polite and answer.

But how does the computer decide which one ??

That's easy. Temperature. Temperature is a setting used in the AI space that determines how 'spontaneous' the computer is in its response. Cold = Standard Response, Warm = Little bit of flair, Hot = Let's go crazy !!! The computer picks the next word at random, weighted by the probabilities, and temperature decides how much of a look-in the less likely words get. It doesn't mean that it's computationally anything other than perfect, it just means you might get something other than the statistically normal and expected next word.
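A rough sketch of how temperature can work, for those who want it. The function names apply_temperature and pick_next_word are my own, the '<other>' bucket is an assumption to make the numbers add up to 100%, and this is one common way of doing it rather than a statement of exactly what Grok does.

```python
import math
import random


def apply_temperature(probabilities, temperature):
    """Reshape the probabilities: cold sharpens them, hot flattens them."""
    scaled = {
        word: math.exp(math.log(p) / temperature)
        for word, p in probabilities.items()
    }
    total = sum(scaled.values())
    return {word: value / total for word, value in scaled.items()}


def pick_next_word(probabilities, temperature, rng=random):
    """Pick one word at random, weighted by the reshaped probabilities."""
    adjusted = apply_temperature(probabilities, temperature)
    words = list(adjusted)
    weights = [adjusted[word] for word in words]
    return rng.choices(words, weights=weights, k=1)[0]


next_word_probs = {"OK": 0.78, "Awwwww": 0.17, "<other>": 0.05}
cold = apply_temperature(next_word_probs, 0.2)  # standard response
hot = apply_temperature(next_word_probs, 2.0)   # let's go crazy
```

With the cold setting, OK becomes almost a certainty. With the hot setting, OK drops below its original 78% and Awwwww gets a much better chance of being picked. A temperature of 1 leaves the probabilities as they were.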

So let's pretend we have completed that and the next word is Awwwww...... what then ?

Well basically we go again. The computer generates a new token (sea hieroglyph) - same cat, same mat - except the picture now contains the fact that we've already said 'Awwwww'. That's a bit tricky to imagine in normal ways, but in 4096 dimensional vector space it's quite easy. The point is to teach the concept and not how the vector maths actually works, or we'd be here all day, so just go along with this. The AI mixes the Awwwww word into the token (sea hieroglyph).

It goes back through the same mathematical process of next word probability - and decides 'That's' is the next word. So we now have 'The cat sat on the mat - Awwwww That's'..... and we go again.... the final word being 'nice'.

So - we have completed our sentence (no more words came up in the next round of probabilities, meaning the computer picked 'stop here') 'The cat sat on the mat - Awwwww That's nice'. All done computationally based on statistics, vector maths and training data.
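The whole 'go again' loop can be sketched like this. Everything in it is a stand-in: NEXT_WORD_TABLE is a hand-written lookup playing the part of the trained AI, the '<end>' marker and its 5% are assumptions, and warm_pick is hard-wired to choose Awwwww so the example comes out the same every time (a real warm temperature would pick it only some of the time).

```python
END = "<end>"  # made-up marker meaning "stop here, sentence finished"

# Hand-written stand-in for the trained AI: text so far -> next word odds.
NEXT_WORD_TABLE = {
    "The cat sat on the mat -": {"OK": 0.78, "Awwwww": 0.17, END: 0.05},
    "The cat sat on the mat - OK": {END: 1.0},
    "The cat sat on the mat - Awwwww": {"That's": 1.0},
    "The cat sat on the mat - Awwwww That's": {"nice": 1.0},
    "The cat sat on the mat - Awwwww That's nice": {END: 1.0},
}


def most_likely(probabilities):
    """Cold behaviour: always take the word with the highest probability."""
    return max(probabilities, key=probabilities.get)


def warm_pick(probabilities):
    """Stand-in for a warm draw: takes 'Awwwww' whenever it is on offer."""
    if "Awwwww" in probabilities:
        return "Awwwww"
    return most_likely(probabilities)


def generate(prompt, choose, max_words=10):
    """Keep adding one word at a time until the stop marker is chosen."""
    text = prompt
    for _ in range(max_words):
        probabilities = NEXT_WORD_TABLE.get(text, {END: 1.0})
        word = choose(probabilities)
        if word == END:
            break
        text = text + " " + word  # the new word joins the picture
    return text


warm_sentence = generate("The cat sat on the mat -", warm_pick)
cold_sentence = generate("The cat sat on the mat -", most_likely)
```

Here warm_sentence comes out as 'The cat sat on the mat - Awwwww That's nice', while the cold version settles for 'The cat sat on the mat - OK'. Notice that each time round the loop, the lookup uses everything said so far, not just the last word.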

And that, without too much boring vector operation mathematical detail and technology / process terminology is how the Grok Core AI works.

Part two will add more to the core element - what can be called a 'supporting scaffolding'. This provides other functions such as pre-processing. Watch this space.