A type of machine learning (ML) model trained on a broad dataset. Foundation models are designed for general-purpose use and can be adapted to more specific tasks through fine-tuning.
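
To illustrate the idea of adapting a pretrained model via fine-tuning, here is a deliberately toy sketch in plain Python: a tiny linear model whose "pretrained" parameters (invented for this example) are nudged toward a new task with a few gradient-descent steps. Real foundation models have billions of parameters and are fine-tuned with frameworks such as PyTorch, but the principle of starting from learned weights rather than from scratch is the same.

```python
def predict(w, b, x):
    # A one-parameter-per-weight "model": y = w*x + b.
    return w * x + b

def fine_tune(w, b, data, lr=0.01, steps=500):
    """Adapt pretrained parameters (w, b) to a new task by
    gradient descent on mean squared error."""
    n = len(data)
    for _ in range(steps):
        dw = db = 0.0
        for x, y in data:
            err = predict(w, b, x) - y
            dw += 2 * err * x / n
            db += 2 * err / n
        w -= lr * dw
        b -= lr * db
    return w, b

# Hypothetical "pretrained" parameters from a broad task: roughly y = 2x.
w0, b0 = 2.0, 0.0

# Small task-specific dataset: the new task follows y = 3x + 1.
task_data = [(0.0, 1.0), (1.0, 4.0), (2.0, 7.0), (3.0, 10.0)]

# Fine-tuning moves the parameters toward the new task
# without starting from random initialization.
w1, b1 = fine_tune(w0, b0, task_data)
```

After fine-tuning, the parameters land near the new task's values (w ≈ 3, b ≈ 1), starting from the pretrained ones rather than from zero.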

Foundation models, such as those powering OpenAI’s ChatGPT, can cost hundreds of millions of dollars to train.

The term was coined in August 2021 by researchers at Stanford’s Center for Research on Foundation Models (CRFM).

Foundation models are almost universally based on the Transformer architecture.