Releases: BlackKakapo/Romanian-Word-Embeddings

Romanian Word Embeddings – SG & FastText (with PCA)

10 Apr 11:09
1197e7f

🔍 Overview
This release contains pretrained word embeddings for the Romanian language, trained with two architectures:

  1. Word2Vec Skip-Gram (SG)
  2. FastText (FT)

Both sets of vectors were then reduced in dimensionality via PCA.

These embeddings are suitable for:

  1. Word-level similarity
  2. Semantic analogy tasks
  3. Input for classic ML models (e.g., classifiers, clustering)
  4. Visualization & exploration
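As a rough illustration of the similarity and analogy use cases above, here is a minimal sketch using NumPy only. The four-dimensional vectors and the helper names `cosine` and `analogy` are hypothetical toy values chosen for readability; they are not taken from the released embeddings, whose file format and loading procedure are not described here.

```python
import numpy as np

# Hypothetical toy vectors (NOT from the release), 4 dimensions for readability.
# Real vectors from this release would have 120 dimensions after PCA.
toy_vectors = {
    "rege":   np.array([1.0, 1.0, 0.0, 0.0]),  # king
    "regină": np.array([1.0, 0.0, 1.0, 0.0]),  # queen
    "bărbat": np.array([0.0, 1.0, 0.0, 0.0]),  # man
    "femeie": np.array([0.0, 0.0, 1.0, 0.0]),  # woman
    "măr":    np.array([0.0, 0.0, 0.0, 1.0]),  # apple
}

def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def analogy(a, b, c, vectors):
    """Return the word closest to vec(a) - vec(b) + vec(c), excluding a, b, c."""
    target = vectors[a] - vectors[b] + vectors[c]
    candidates = (w for w in vectors if w not in (a, b, c))
    return max(candidates, key=lambda w: cosine(vectors[w], target))

similarity = cosine(toy_vectors["rege"], toy_vectors["regină"])
answer = analogy("rege", "bărbat", "femeie", toy_vectors)
print(similarity, answer)
```

With real embeddings, the same two functions would operate on a word-to-vector mapping loaded from the release files; only the dictionary contents would change.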

PCA was applied to reduce the vector dimensionality from 300 to 120, which lowers memory use and speeds up downstream computation.
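The release notes do not say how the PCA step was implemented, so the following is only a sketch of one common approach: center the vectors and project them onto the top principal directions obtained from an SVD. The random matrix stands in for a real 300-dimensional embedding table, and `pca_reduce` is a hypothetical helper name.

```python
import numpy as np

def pca_reduce(vectors, n_components):
    """Project row vectors onto their top `n_components` principal directions."""
    mean = vectors.mean(axis=0)
    centered = vectors - mean
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]
    reduced = centered @ components.T
    return reduced, components, mean

# Placeholder data (assumption): 1000 random "words" with 300 dimensions each.
rng = np.random.default_rng(0)
vectors = rng.normal(size=(1000, 300))

reduced, components, mean = pca_reduce(vectors, n_components=120)
print(reduced.shape)
```

Keeping `components` and `mean` makes it possible to project new 300-dimensional vectors into the same 120-dimensional space later with `(new_vectors - mean) @ components.T`.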
Happy embedding!

FastText

31 Jan 08:00
1364f78
v1.3

Update README.md

SG

24 Jan 13:28
07d0538
v1.2

Update README.md

CBOW

20 Dec 14:41
eae4b09
v1.1

Update README.md

CBOW_300_25_5

11 Dec 10:08
5fb78a9
v1.0

Update README.md