import torch
import torch.nn as nn

# Define the number of words in our vocabulary
vocab_size = 100

# Define the size of the embedding vectors
embedding_dim = 10

# Create the embedding layer
embedding = nn.Embedding(vocab_size, embedding_dim)

# Map each word to an index (in practice this mapping is built from the corpus)
word_to_index = {"I": 0, "love": 1, "cats": 2}

# Convert a single sentence into its word indices
sentence = "I love cats"
sentence_indices = [word_to_index[word] for word in sentence.split()]

# Convert the word indices into embedding vectors
embedded = embedding(torch.tensor(sentence_indices))

# Print the shape of the embedded tensor
print(embedded.shape)
# Output: torch.Size([3, 10])

# Print the values of the embedded tensor
# (the exact numbers will differ from run to run, since the weights are randomly initialized)
print(embedded)
# Output:
# tensor([[ 0.0415,  0.0276, -0.0492,  0.0641,  0.0457, -0.0133, -0.0309,  0.0375,
#           0.0287, -0.0390],
#         [ 0.0201,  0.0370, -0.0458,  0.0517,  0.0208, -0.0226, -0.0390,  0.0372,
#           0.0384, -0.0276],
#         [-0.0121,  0.0381, -0.0145,  0.0666,  0.0118, -0.0485, -0.0441,  0.0414,
#          -0.0236,  0.0436]])
Embedding vs Linear Layer
What is nn.Embedding really?
An embedding is essentially the same thing as a linear layer without a bias term, except that instead of multiplying its weight matrix by a one-hot vector, it simply looks up the row of the weight matrix corresponding to each index.
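To make this concrete, here is a small sketch (reusing the vocab_size and embedding_dim from above) that checks the equivalence: looking up indices with nn.Embedding gives the same result as multiplying one-hot vectors with a bias-free nn.Linear layer that shares the same weight matrix.

import torch
import torch.nn as nn
import torch.nn.functional as F

vocab_size = 100
embedding_dim = 10

embedding = nn.Embedding(vocab_size, embedding_dim)

# A bias-free linear layer that shares the embedding's weights.
# nn.Linear stores its weight as (out_features, in_features), so the
# embedding matrix of shape (vocab_size, embedding_dim) must be transposed.
linear = nn.Linear(vocab_size, embedding_dim, bias=False)
with torch.no_grad():
    linear.weight.copy_(embedding.weight.T)

indices = torch.tensor([1, 5, 42])

# Lookup: pick the rows of the weight matrix directly
looked_up = embedding(indices)

# Matrix multiplication: one-hot vectors times the same weight matrix
one_hot = F.one_hot(indices, num_classes=vocab_size).float()
multiplied = linear(one_hot)

print(torch.allclose(looked_up, multiplied))
# Output: True

The lookup is just a cheaper way to get the same answer: multiplying a one-hot vector by a matrix selects a single row, so the embedding layer skips the multiplication and indexes that row directly.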