radu/TensorFlow-Models: This repository contains machine learning models implemented in TensorFlow. The models are maintained by their respective authors. To propose a model for inclusion, please submit a pull request. @ 863f6c0919256202de12aee18cce680361f56969


This directory contains models for unsupervised training of word embeddings, based on the approach described in:

(Mikolov et al.) Efficient Estimation of Word Representations in Vector Space, ICLR 2013.

Detailed instructions on how to get started and use them are available in the tutorials. Brief instructions are below.

* Word2Vec Tutorial

To download the example text and evaluation data:

wget http://mattmahoney.net/dc/text8.zip -O text8.zip
unzip text8.zip
wget https://storage.googleapis.com/google-code-archive-source/v2/code.google.com/word2vec/source-archive.zip
unzip -p source-archive.zip  word2vec/trunk/questions-words.txt > questions-words.txt
rm source-archive.zip
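
To sanity-check the downloaded files before training, a short script along the following lines may help. It is only a sketch under stated assumptions: that `text8` is a single stream of whitespace-separated lowercase tokens, and that `questions-words.txt` holds four-word analogy lines grouped under section headers beginning with `:`. The helper names `parse_analogy_questions` and `count_words` are hypothetical and are not part of this directory.

```python
from collections import Counter


def parse_analogy_questions(lines):
    """Return a list of (a, b, c, d) tuples, skipping ': section' headers."""
    questions = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith(":"):
            continue  # blank line or a section header such as ': family'
        # Assumption: lowercase the words so they match a lowercase corpus.
        words = line.lower().split()
        if len(words) == 4:
            questions.append(tuple(words))
    return questions


def count_words(text):
    """Count whitespace-separated tokens in a corpus string."""
    return Counter(text.split())


# Small in-memory samples standing in for the real files.
sample_questions = [
    ": capital-common-countries",
    "Athens Greece Baghdad Iraq",
    ": family",
    "boy girl brother sister",
]
parsed = parse_analogy_questions(sample_questions)
counts = count_words("the cat sat on the mat")
print(parsed)
print(counts.most_common(1))
```

For the real data, the same functions could be called on `open("questions-words.txt")` and on the contents of `text8`.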

Assuming you have cloned the git repository, navigate into this directory and run the optimized training script:

cd models/tutorials/embedding
python word2vec_optimized.py \
  --train_data=text8 \
  --eval_data=questions-words.txt \
  --save_path=/tmp/

To run the code from source using Bazel:

bazel run -c opt models/tutorials/embedding/word2vec_optimized -- \
  --train_data=text8 \
  --eval_data=questions-words.txt \
  --save_path=/tmp/
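
The `--eval_data` file contains analogy questions of the form "a is to b as c is to d". The sketch below shows one common way to answer such a question with word vectors: take `b - a + c` and pick the nearest remaining word. It is illustrative only and is not the code used by the scripts in this directory. The toy embeddings and the function name `predict_analogy` are assumptions made for the example.

```python
import math


def normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def predict_analogy(embeddings, a, b, c):
    """Return the word closest to b - a + c, excluding a, b and c."""
    unit = {w: normalize(v) for w, v in embeddings.items()}
    target = [vb - va + vc for va, vb, vc in zip(unit[a], unit[b], unit[c])]
    best_word, best_score = None, -math.inf
    for word, vec in unit.items():
        if word in (a, b, c):
            continue  # the question words themselves are not valid answers
        # Dot product with unit vectors ranks candidates like cosine similarity.
        score = sum(t * x for t, x in zip(target, vec))
        if score > best_score:
            best_word, best_score = word, score
    return best_word


# Hand-made 3-dimensional vectors, purely hypothetical.
toy = {
    "king": [0.9, 0.8, 0.1],
    "queen": [0.9, 0.1, 0.8],
    "man": [0.1, 0.9, 0.1],
    "woman": [0.1, 0.1, 0.9],
    "apple": [0.5, 0.5, 0.5],
}
answer = predict_analogy(toy, "man", "woman", "king")
print(answer)
```

Accuracy on an analogy set would then be the fraction of questions for which the predicted word equals the fourth word on the line.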

Here is a short overview of what is in this directory.

File | What's in it?
--- | ---
`word2vec.py` | A version of word2vec implemented using TensorFlow ops and minibatching.
`word2vec_test.py` | Integration test for word2vec.
`word2vec_optimized.py` | A version of word2vec implemented using custom C++ ops that does no minibatching.
`word2vec_optimized_test.py` | Integration test for word2vec_optimized.
`word2vec_kernels.cc` | Kernels for the custom input and training ops.
`word2vec_ops.cc` | The declarations of the custom ops.
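
To make the term "minibatching" in the table concrete, here is a minimal sketch of how training examples for a skip-gram style model can be generated and grouped into fixed-size batches. It assumes a fixed context window and plain Python lists. The scripts in this directory may sample windows, subsample frequent words, or batch differently, so treat the names `skipgram_pairs` and `minibatches` as hypothetical.

```python
def skipgram_pairs(tokens, window):
    """Yield (center, context) pairs for tokens within `window` positions."""
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                yield center, tokens[j]


def minibatches(pairs, batch_size):
    """Group an iterable of pairs into lists of at most `batch_size` items."""
    batch = []
    for pair in pairs:
        batch.append(pair)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch  # final, possibly smaller, batch


tokens = "the quick brown fox".split()
pairs = list(skipgram_pairs(tokens, window=1))
batches = list(minibatches(pairs, batch_size=4))
print(pairs)
print([len(b) for b in batches])
```

A version without minibatching would instead process one pair at a time, applying a parameter update after each example.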
