radu/TensorFlow-Models: This repository contains machine learning models implemented in TensorFlow. The models are maintained by their respective authors. To propose a model for inclusion, please submit a pull request. @ 86ecc9730d751c1f72e3bfecac958166390f4125

Neal Wu 86ecc9730d Moving example models from github.com/tensorflow/tensorflow to github.com/tensorflow/models (9 years ago)

This directory contains implementations for unsupervised training of word embeddings, based on the model described in:

(Mikolov et al.) Efficient Estimation of Word Representations in Vector Space, ICLR 2013.

Detailed instructions on how to get started and use them are available in the tutorials. Brief instructions are below.

Word2Vec Tutorial

To download the example text and evaluation data:

```shell
wget http://mattmahoney.net/dc/text8.zip -O text8.zip
unzip text8.zip
wget https://storage.googleapis.com/google-code-archive-source/v2/code.google.com/word2vec/source-archive.zip
unzip -p source-archive.zip word2vec/trunk/questions-words.txt > questions-words.txt
rm source-archive.zip
```
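Before training, it can help to look at the evaluation file. The sketch below assumes the usual layout of `questions-words.txt`: section headers that start with `:` followed by lines of four words each. The function name `parse_analogies` and the `unlabeled` fallback section are hypothetical, chosen for illustration; this is not code from this directory.

```python
from typing import Dict, List, Tuple

Analogy = Tuple[str, ...]


def parse_analogies(lines) -> Dict[str, List[Analogy]]:
    """Group four-word analogy questions under their ': section' headers."""
    sections: Dict[str, List[Analogy]] = {}
    current = "unlabeled"  # hypothetical fallback for questions before any header
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(":"):
            # A header such as ": family" opens a new section.
            current = line[1:].strip()
            sections.setdefault(current, [])
            continue
        # text8 is lowercase, so lowercase the questions to match it.
        words = line.lower().split()
        if len(words) != 4:
            continue  # skip malformed lines
        sections.setdefault(current, []).append(tuple(words))
    return sections


# Sample lines written in the assumed layout of questions-words.txt.
sample = [
    ": capital-common-countries",
    "Athens Greece Baghdad Iraq",
    "Athens Greece Bangkok Thailand",
    ": family",
    "boy girl brother sister",
]
parsed = parse_analogies(sample)
for name, questions in parsed.items():
    print(name, len(questions))
```

To run it on the downloaded file, pass an open file handle instead of the sample list, for example `parse_analogies(open("questions-words.txt"))`.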

Assuming you installed TensorFlow from the pip package and have cloned the git repository, navigate into this directory and run:

```shell
cd tensorflow/models/embedding
python word2vec_optimized.py \
  --train_data=text8 \
  --eval_data=questions-words.txt \
  --save_path=/tmp/
```

To run the code from source using Bazel:

```shell
bazel run -c opt tensorflow/models/embedding/word2vec_optimized -- \
  --train_data=text8 \
  --eval_data=questions-words.txt \
  --save_path=/tmp/
```
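The `--eval_data` file holds analogy questions of the form "a is to b as c is to d". A common way to score them, sketched here in plain Python, is to pick the word whose embedding is closest to `b - a + c` while excluding the three question words. The toy vectors and the helper names `normalize` and `predict_analogy` are made up for illustration; this is a sketch of the general technique, not the evaluation code in this directory.

```python
import math


def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def predict_analogy(emb, a, b, c):
    """Return the word d whose unit vector best matches b - a + c."""
    ea, eb, ec = (normalize(emb[w]) for w in (a, b, c))
    target = [y - x + z for x, y, z in zip(ea, eb, ec)]
    best_word, best_score = None, -math.inf
    for word, vec in emb.items():
        if word in (a, b, c):
            continue  # the question words themselves are not valid answers
        # Dot product with a unit vector ranks candidates the same way
        # cosine similarity does, since the target norm is constant.
        score = sum(t * x for t, x in zip(target, normalize(vec)))
        if score > best_score:
            best_word, best_score = word, score
    return best_word


# Hand-made 3-dimensional embeddings, purely illustrative.
toy = {
    "king": [0.9, 0.8, 0.1],
    "queen": [0.9, 0.1, 0.8],
    "man": [0.1, 0.9, 0.1],
    "woman": [0.1, 0.1, 0.9],
    "apple": [0.5, 0.5, 0.5],
}
prediction = predict_analogy(toy, "man", "woman", "king")
print(prediction)
```

Accuracy on an analogy set is then the fraction of questions where the predicted word equals the fourth word of the question.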

Here is a short overview of what is in this directory.

| File | What's in it? |
| --- | --- |
| `word2vec.py` | A version of word2vec implemented using TensorFlow ops and minibatching. |
| `word2vec_test.py` | Integration test for word2vec. |
| `word2vec_optimized.py` | A version of word2vec implemented using custom C++ ops that does no minibatching. |
| `word2vec_optimized_test.py` | Integration test for word2vec_optimized. |
| `word2vec_kernels.cc` | Kernels for the custom input and training ops. |
| `word2vec_ops.cc` | The declarations of the custom ops. |
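To make "minibatching" concrete, the sketch below generates (center, context) training pairs and groups them into fixed-size batches. It assumes the skip-gram variant from the cited paper and a fixed context window. The helper names `skipgram_pairs` and `minibatches` are hypothetical, and real implementations typically add steps such as subsampling frequent words, so treat this as an illustration rather than a description of `word2vec.py`.

```python
from typing import Iterable, Iterator, List, Tuple

Pair = Tuple[str, str]


def skipgram_pairs(tokens: List[str], window: int) -> Iterator[Pair]:
    """Yield (center, context) pairs for every token within `window` positions."""
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                yield center, tokens[j]


def minibatches(pairs: Iterable[Pair], batch_size: int) -> Iterator[List[Pair]]:
    """Group pairs into lists of at most `batch_size` items."""
    batch: List[Pair] = []
    for pair in pairs:
        batch.append(pair)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch  # final, possibly smaller, batch


tokens = "the quick brown fox".split()
pairs = list(skipgram_pairs(tokens, window=1))
batches = list(minibatches(pairs, batch_size=4))
for batch in batches:
    print(batch)
```

A training loop that does no minibatching would instead update the model after each individual pair.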
