卯卯 | Mastering a peerless blade technique! - Training word2vec with gensim and visualizing it in 2D with PCA


Result: a 2D scatter plot of the word vectors, with each point labeled by its word.

Code:

# -*- coding: utf-8 -*-
from gensim.models import Word2Vec
from sklearn.decomposition import PCA
from matplotlib import pyplot

# Training corpus
sentences = [['this', 'is', 'the', 'first', 'sentence', 'for', 'word2vec'],
             ['this', 'is', 'the', 'second', 'sentence'],
             ['yet', 'another', 'sentence'],
             ['one', 'more', 'sentence'],
             ['and', 'the', 'final', 'sentence']]
# Train the model on the corpus
model = Word2Vec(sentences, window=5, min_count=1)

# Add to the vocabulary (update=True extends the existing vocabulary)
model.build_vocab(sentences, update=True)

# Check that the vocabulary was added
# (gensim >= 4.0 exposes key_to_index; versions before 4.0 used model.wv.vocab)
for k, v in model.wv.key_to_index.items():
    print(k, v)

# Fit a 2D PCA to the word vectors
words = list(model.wv.index_to_key)
X = model.wv[words]
# PCA is unsupervised dimensionality reduction;
# LDA (linear discriminant analysis) is supervised dimensionality reduction
pca = PCA(n_components=2)

result = pca.fit_transform(X)
print(result)
print(result[:, 0])
# Visualize
pyplot.scatter(result[:, 0], result[:, 1])  # scatter plot of the two components
for i, word in enumerate(words):
    pyplot.annotate(word, xy=(result[i, 0], result[i, 1]))  # annotate: label each point
pyplot.show()
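
To show what the PCA step above is doing to the word vectors, here is a minimal NumPy sketch of a 2D PCA projection computed through a singular value decomposition. It is not scikit-learn's implementation; the function name `pca_2d` and the random `toy_vectors` matrix are assumptions made for illustration, standing in for the real word-vector matrix.

```python
import numpy as np


def pca_2d(X):
    """Project the rows of X onto their first two principal components."""
    X = np.asarray(X, dtype=float)
    # Center each feature (column) at zero, as PCA requires.
    X_centered = X - X.mean(axis=0)
    # SVD of the centered data: the rows of Vt are the principal axes,
    # ordered by decreasing singular value (decreasing variance).
    U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:2]
    # Coordinates of every row in the 2D principal subspace.
    projected = X_centered @ components.T
    # Fraction of the total variance captured by each kept component.
    explained = (S[:2] ** 2) / np.sum(S ** 2)
    return projected, explained


# Hypothetical stand-in for the word-vector matrix: 6 "words", 5 dimensions.
rng = np.random.default_rng(0)
toy_vectors = rng.normal(size=(6, 5))

coords, ratio = pca_2d(toy_vectors)
print(coords.shape)  # one (x, y) pair per row of the input
print(ratio)         # share of variance explained by each axis
```

The two columns of `coords` play the same role as `result[:, 0]` and `result[:, 1]` in the plotting code. The sign of each axis may be flipped relative to scikit-learn's output, which mirrors the plot without changing the distances between points.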


2019-01-21 21:18:05


Author: yangli | Category: Natural Language Processing | Views: 1354 | Comments: 0


Please credit the source when reposting this article.