
Why do Skip-gram models need two embedding layers? #7

Open

@yu45020

Description

Hi SungDong. Thanks for the great posts. I am reading the first two models on skip-gram. Why do you use two embeddings instead of one? The second one, embedding_u, has identical weights in every row after I train it. Based on the formula for this model, I think there should be only one embedding holding all the word vectors. Am I missing some detail?


Is the second matrix there for efficiency? I guess embedding_u could be replaced by a linear transformation with the transposed shape, but since the prediction target is a one-hot vector, computing all of those zeros would be wasted work; a matrix lookup is far more efficient. I tried to sanity-check that equivalence in the sketch below.
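
Here is a small self-contained sketch of what I mean (my own code, not the notebook's; the sizes are arbitrary): a bias-free nn.Linear holding the same matrix produces exactly the dot-product scores that a lookup in the embedding matrix gives.

import torch
import torch.nn as nn

vocab_size, dim = 5, 3
emb = nn.Embedding(vocab_size, dim)
lin = nn.Linear(dim, vocab_size, bias=False)
lin.weight.data = emb.weight.data  # nn.Linear stores its weight as (out, in), so the shapes already match

v = torch.randn(1, dim)                   # a hypothetical center-word vector
scores_linear = lin(v)                    # (1, vocab_size): v times the transposed matrix
scores_lookup = v @ emb.weight.t()        # the same dot products via the embedding matrix
print(torch.allclose(scores_linear, scores_lookup))  # True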

import torch.nn as nn

class Skipgram(nn.Module):

    def __init__(self, vocab_size, projection_dim):
        super(Skipgram, self).__init__()
        self.embedding_v = nn.Embedding(vocab_size, projection_dim)  # center-word ("input") vectors
        self.embedding_u = nn.Embedding(vocab_size, projection_dim)  # context-word ("output") vectors

        self.embedding_v.weight.data.uniform_(-1, 1)  # init center embeddings uniformly in [-1, 1]
        self.embedding_u.weight.data.uniform_(0, 0)   # init context embeddings to all zeros (every row starts identical)
        # self.out = nn.Linear(projection_dim, vocab_size)
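
For reference, here is roughly how I understand the two embeddings enter the full-softmax loss; the function and argument names below are my own guesses, not the notebook's code:

import torch

# A rough sketch of the full-softmax skip-gram loss, assuming `model` is a
# Skipgram instance, `center` and `context` are (batch, 1) index tensors,
# and `all_vocab` is a (batch, V) index tensor covering the whole vocabulary.
def skipgram_loss(model, center, context, all_vocab):
    v = model.embedding_v(center)         # (batch, 1, dim): center-word vectors v_c
    u = model.embedding_u(context)        # (batch, 1, dim): context-word vectors u_o
    u_all = model.embedding_u(all_vocab)  # (batch, V, dim): output vectors u_w for every word w

    # u_o . v_c -- score of the observed (center, context) pair
    score = u.bmm(v.transpose(1, 2)).squeeze(2)             # (batch, 1)
    # u_w . v_c for every word w -- feeds the softmax normalizer
    norm_scores = u_all.bmm(v.transpose(1, 2)).squeeze(2)   # (batch, V)

    # negative log-likelihood of softmax(u_o . v_c), with logsumexp for stability
    return -torch.mean(score - torch.logsumexp(norm_scores, dim=1, keepdim=True))

If that reading is right, embedding_v holds the vectors for words in the center role and embedding_u the vectors for words in the output role. As I understand it, the original word2vec implementation also keeps two such matrices and typically only the input one is kept as the final word vectors.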
