We will use DINOv2, a vision transformer model, to extract image embeddings from the Fashion MNIST dataset. We will then cluster these embeddings to visualize image similarity and run similarity search over them with k-nearest neighbors (KNN).
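To make the similarity-search idea concrete, here is a minimal sketch of KNN over embeddings using cosine similarity. It is only an illustration: the vectors are made-up 3-dimensional toy values rather than real DINOv2 outputs, the helper names `cosine_similarity` and `knn_search` are hypothetical, and the notebook may use a library implementation or a different distance metric.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def knn_search(query, embeddings, k=3):
    """Return (index, similarity) pairs for the k embeddings closest to query."""
    scored = [(i, cosine_similarity(query, e)) for i, e in enumerate(embeddings)]
    # Highest similarity first.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy "embeddings", invented for illustration only; real model embeddings
# are much higher-dimensional.
toy_embeddings = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]

neighbors = knn_search([1.0, 0.05, 0.0], toy_embeddings, k=2)
print(neighbors)
```

The query points almost along the first axis, so the two nearest neighbors are the first two toy vectors. With real embeddings the same ranking step applies, only over many more vectors and dimensions.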
Environment Setup
Temp Directory
The virtual environment will be created in the fast, temporary file system located at $TMPDIR.
echo '# Set a temporary directory' >> ~/.bashrc
echo 'export TMPDIR=${TMPDIR:-/tmp}' >> ~/.bashrc
source ~/.bashrc
echo $TMPDIR
Navigate to the user/ directory and link dependencies
# cd user folder
# Create a symbolic link named .venv in your user folder,
# pointing to the actual environment in $TMPDIR
ln -s $TMPDIR/.venv $(pwd)/.venv
# Install project dependencies defined in pyproject.toml
cd ~/dsgt-arc/fall-2025-interest-group-projects/project/02-embeddings
uv pip install -e .
Open a copy of embedding.ipynb in Jupyter and select the linked .venv kernel
jupyter notebook
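Once the notebook is open, it can help to confirm from inside a cell that the selected kernel really runs from the linked .venv. The sketch below uses only the standard library. The helper name `describe_interpreter` is hypothetical, and the check assumes a standard virtual environment, where `sys.prefix` differs from `sys.base_prefix`.

```python
import sys


def describe_interpreter():
    """Summarize which Python interpreter is running this code."""
    return {
        # Full path of the running interpreter binary.
        "executable": sys.executable,
        # Root of the active environment.
        "prefix": sys.prefix,
        # True inside a virtual environment, False for a base interpreter.
        "in_virtualenv": sys.prefix != sys.base_prefix,
    }


info = describe_interpreter()
for key, value in info.items():
    print(f"{key}: {value}")
```

If `in_virtualenv` is False, or `executable` does not point inside a .venv directory, the notebook is probably using a different kernel than the one intended.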
Set up the virtual environment in scratch
Create a .venv in scratch
# create a directory to store the environment
mkdir -p ~/scratch/dsgt-arc/embeddings
# create a virtual environment using uv
uv venv ~/scratch/dsgt-arc/embeddings/.venv
Create alias in ~/.bashrc
alias activate='source ~/scratch/dsgt-arc/embeddings/.venv/bin/activate'
Activate it
activate
# verify
which python
~/scratch/dsgt-arc/embeddings/.venv/bin/python
Link the scratch .venv to the project .venv and install dependencies using pyproject.toml
Link .venv and install dependencies
# navigate to project directory
cd ~/dsgt-arc/fall-2025-interest-group-projects/user/ctio3/
# Delete existing symlink
rm .venv
# create a symbolic link to your environment in this project folder
ln -s ~/scratch/dsgt-arc/embeddings/.venv $(pwd)/.venv
# verify
ls -l .venv
# Install project dependencies defined in pyproject.toml
cd ~/dsgt-arc/fall-2025-interest-group-projects/project/02-embeddings
uv pip install -e .
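`ls -l .venv` shows where the link points, but not whether the target still exists. That matters when the environment lives on temporary or scratch storage, where it may have been removed. The following is a small, hedged sketch of a fuller check. The helper name `check_venv_link` is hypothetical, and the `bin/python` layout is an assumption that holds for typical Linux virtual environments.

```python
from pathlib import Path


def check_venv_link(link):
    """Report whether a .venv path is a symlink that resolves to a usable environment."""
    link = Path(link).expanduser()
    return {
        # True if the path itself is a symbolic link (even a dangling one).
        "is_symlink": link.is_symlink(),
        # exists() follows the link, so this is False for a dangling symlink.
        "target_exists": link.exists(),
        # Assumes the usual Linux layout with the interpreter at bin/python.
        "has_python": (link / "bin" / "python").exists(),
    }


# Run from the folder that contains the .venv link.
print(check_venv_link(".venv"))
```

A result with `is_symlink` True but `target_exists` False indicates a dangling link. In that case the environment needs to be recreated before installing dependencies again.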