
This project trains a small TinyStories-style GPT model from random initialization on an Apple M4 Mac mini with 16 GB of memory, using MLX. It does not call APIs or fine-tune an existing model. Instead, it walks through the entire pipeline: data preparation, tokenizer, model architecture, training loop, checkpointing, and inference.
This write-up is closer to an engineering retrospective: the goal is not to train a chat-capable model, but to verify whether a personal machine can complete a small-scale LLM training run end to end.
Project repository: sergioperezcheco/llm-from-scratch



