Model Architecture
In short
The blueprint or structure of a Model — how it’s organized and what operations it performs. Different architectures suit different tasks.
In Machine Learning, a Model is a system that learns patterns from Data. But before it can learn anything, you need to decide how it’s structured — that’s the architecture.
Think of a body built for weightlifting: it does well lifting heavy weights, but put the same body in a marathon and it will probably struggle. Models work the same way: an architecture that is great for images may do poorly on text.
The architecture defines what the model can and can’t do well. For LLMs, the key architecture is the Transformer, which has proven very effective for text processing. Choosing an architecture is one of the most important decisions when building a model, because it shapes which kinds of patterns the model can learn.
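To make the split between architecture and learned values concrete, here is a minimal sketch in plain Python. It assumes a small fully connected network (not a Transformer), and the names `init_mlp`, `forward`, and `count_params` are hypothetical helpers made up for this illustration. The list of layer sizes plays the role of the architecture; the random numbers that fill it are the parameters.

```python
import random

def init_mlp(layer_sizes, seed=0):
    """Create untrained parameters for a fully connected network.

    layer_sizes is the architecture (an assumption of this sketch):
    [4, 8, 2] means 4 inputs, one hidden layer of 8 units, 2 outputs.
    """
    rng = random.Random(seed)
    params = []
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        # One weight row per output unit, one weight per input.
        weights = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
        biases = [0.0] * n_out
        params.append((weights, biases))
    return params

def forward(params, x):
    """Run an input vector through the layers the architecture prescribes."""
    for i, (weights, biases) in enumerate(params):
        x = [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]
        if i < len(params) - 1:
            x = [max(0.0, v) for v in x]  # ReLU on hidden layers only
    return x

def count_params(params):
    """Total number of weights and biases the architecture calls for."""
    return sum(len(w) * len(w[0]) + len(b) for w, b in params)

# Two architectures for the same task shape: 4 inputs, 2 outputs.
shallow = init_mlp([4, 8, 2])
deep = init_mlp([4, 8, 8, 2])

print(count_params(shallow))  # 58
print(count_params(deep))     # 130
```

Both networks accept the same input and produce an output of the same size, but the structure chosen up front fixes how many parameters exist and how data flows through them. Training would change the values inside `params`, never the layout.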
Related
- Model - the architecture defines its structure
- Transformer - the architecture behind LLMs
- Model Parameters - the learned values that fill in the architecture’s blueprint