Intro
BM25 (Best Matching 25) is an information retrieval ranking algorithm used by search engines to score how relevant a document is to a query. It is an enhancement of TF-IDF that improves ranking quality by incorporating term saturation and document length normalization.
BM25 ranks documents based on:
- Term frequency (TF): how often query terms appear in the document, with diminishing returns as the count grows
- Inverse document frequency (IDF): how rare those terms are across the corpus
- Document length normalization: penalizes overly long documents so they don’t rank higher just due to size
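
To make the three factors above concrete, here is a minimal Python sketch of one common BM25 formulation. It is an illustration under stated assumptions, not a reference implementation: the function name bm25_scores, the parameter defaults k1=1.5 and b=0.75, the whitespace tokenization, and the smoothed IDF variant log(1 + (N - n + 0.5) / (n + 0.5)) are all choices made for this example, and real search engines may differ in each of them. It also assumes a non-empty corpus of non-empty documents.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each tokenized document in `docs` against the tokenized `query`.

    k1 and b are illustrative defaults (assumptions), not values from the text.
    """
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents that contain each term.
    df = Counter(term for d in docs for term in set(d))

    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            # IDF: rarer terms weigh more; the "1 +" keeps the value non-negative.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Length normalization: longer-than-average documents enlarge the denominator.
            norm = k1 * (1 - b + b * len(d) / avg_len)
            # Saturating TF: increases with tf[term] but never exceeds k1 + 1.
            score += idf * tf[term] * (k1 + 1) / (tf[term] + norm)
        scores.append(score)
    return scores

# Toy corpus (made up for this example), tokenized by whitespace.
docs = [
    "the cat sat on the mat".split(),
    "the dog chased the cat around the yard all day".split(),
    "birds sing in the morning".split(),
]
scores = bm25_scores(["cat"], docs)
```

In this toy corpus the first two documents each contain "cat" once, but the shorter one scores higher because of length normalization, and the third document scores zero because it does not contain the query term.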




