Merge branch 'master' of https://github.com/WillKoehrsen/Data-Analysis

Will Koehrsen 7 years ago
parent
commit
47abea94ae
1 changed file with 74 additions and 59 deletions
+ 74 - 59
stocker/readme.md

@@ -19,7 +19,7 @@ Python 3.6 and the following packages are required:
 	pandas 0.22.0
 	pytrends 4.3.0
 
-These can all be installed with pip from the command line
+These can be installed with pip from the command line
 (some of these might require running the command prompt as 
 administrator). 
 
@@ -32,11 +32,12 @@ distribution, try installing with conda:
 
 `conda update quandl numpy pandas matplotlib pystan`
 
-pytrends and fbprophet can only be installed with pip. 
+pytrends and fbprophet can only be installed with pip. If you run into 
+any other errors installing packages, check out [Stack Overflow](https://stackoverflow.com/)
 
 ## Getting Started
 
-Once the packages have been installed, get started exploring a stock 
+Once the required packages have been installed, get started exploring a stock 
 by running an interactive python session or Jupyter Notebook in the same
 folder as stocker.py. 
 
@@ -45,11 +46,11 @@ Import the stocker class by running
 `from stocker import Stocker`
 
 Instantiate a stocker object by calling Stocker with a valid stock ticker (a list of all 3100
-tickers can be found at data/stock_list.csv:
+tickers can be found at data/stock_list.csv):
 
 `microsoft = Stocker('MSFT')`
 
-If succesful, you will recieve a message with date range of data:
+If succesful, you will recieve a message with the date range of data:
 
 `MSFT Stocker Initialized. Data covers 1986-03-13 to 2018-01-12.`
 
@@ -63,32 +64,25 @@ stock prices. Call any of the following on your stocker object, replacing
 
 `Stocker.plot_stock(start_date=None, end_date=None, stats=['Adj. Close'], plot_type='basic')`
 	
-Prints basic information and plots the history for the specified stat 
-over the specified date range. The default stat is Adjusted Closing price
+Prints basic info for the specified stats and plots the history for the stats
+over the specified date range. The default stat is Adjusted Closing price and 
 default start and end dates are the beginning and ending dates
 of the data. `plot_type` can be either basic, to plot the actual values on the 
-y-axis, or `pct` to plot the percentage change from average. 
-
-### Calculate profit from buy and hold strategy
-
-`Stocker.buy_and_hold(start_date=None, end_date=None, nshares=1)`
-
-Evaluates a buy and hold strategy from the start date to the end date
-with the specified number of shares. If no start date and end date are 
-specified, these default to the start and end date of the data. The buy and
-hold strategy means buying the stock on the start date and hold to the end date
-when we sell the stock. Prints the expected profit and plots the profit over time. 
+y-axis, or `pct` to plot percentage change from the average. 
 
 ### Make basic prophet model
 
 `model, future = Stocker.create_prophet_model(days=0, resample=False)`
 
-Make a Prophet Additive Model using 3 years of training data
-and make predictions number of days into the future. If days > 0, prints the 
-predicted price. Plots the historical data with the predictions and uncertainty overlaid.
+The number of training years for any Prophet model can be set with the 
+`Stocker.training_years` attribute. The default number of training years is 3.
 
-Returns model, the prophet model, and future, the future dataframe which can be used 
+Make a Prophet Additive Model using the specified number of training years
+and make predictions number of days into the future. If days > 0, prints the 
+predicted price. Also plots the historical data with the predictions and uncertainty overlaid.
+Returns the prophet model, and the future dataframe which can be used 
 for plotting components of the time series. 
+
 To see the trends and patterns of the prophet model, call 
 
 `import matplotlib.pyplot as plt
@@ -99,13 +93,14 @@ plt.show()`
 
 `Stocker.changepoint_date_analysis(search=None)`
 
-Finds the most significant changepoints in the dataset from a prophet model 
-using the past 3 years of data. The changepoints represent where the change in the
+Finds the most significant changepoints in the dataset from a prophet model trained 
+using the assigned years of training data. The changepoints represent where the change in the
 rate of change of the data is the greatest in either the negative or positive
-direction. In other words, a changepoint is where the second derivative of the data 
-is at a maximum. This method prints the 5 most significant changepoints by the 
-change in the rate of change and plots the 10 most significant overlaid on top of the 
-stock price data. The changepoints only come from the first 80% of the training data.
+direction. The changepoints occur where the change in the rate of the time series is greatest.
+This method prints the 5 most significant changepoints ranked by the 
+change in the rate and plots the 10 most significant overlaid on top of the 
+stock price data. The changepoints only come from the first 80% of the training data in 
+a Prophet model.
 
 A special bonus feature of this method is a Google Search Trends analysis. If a search term is 
 passed to the method, the method retrieves the Google Search Frequency for the specified term and plots
@@ -113,7 +108,18 @@ on the same graph as the changepoints and the stock price data. It also displays
 search queries and related rising search queires. If no 
 term is specified then this capability is not used. You can use
 this to determine if the stock price is correlated to certain search terms or if the 
-changepoints coincide with particular searches. 
+changepoints coincide with an increase in particular searches. 
+
+### Calculate profit from buy and hold strategy
+
+`Stocker.buy_and_hold(start_date=None, end_date=None, nshares=1)`
+
+Evaluates a buy and hold strategy from the start date to the end date
+with the specified number of shares. If no start date and end date are 
+specified, these default to the start and end date of the data. The buy and
+hold strategy means buying the stock on the start date and holding to the end date
+when we sell the stock. Prints the expected profit and plots the profit over time. 
+Recommended for those planning a trip back in time to maximize profits. 
 
 ### Find the best changepoint prior scale graphically
 
@@ -124,61 +130,70 @@ Makes a prophet model with each of the specified changepoint prior scales (cps).
 The cps controls the amount of overfitting in the model: a higher cps means a more
 flexible model which can lead to overfitting the training data (more variance), 
 and a lower cps leads to less flexibility and the possiblity of underfitting (high bias).
-Each model is fit with 3 years of data and makes predictions for 6 months. Output is 
-a graph showing the original observations, with the predictions from each model 
+Each model is fit with the assigned number of years of data and makes predictions for 6 months.
+Output is a graph showing the original observations, with the predictions from each model 
 and the associated uncertainty.
 
-This may take a little while to run. The results can be used to select the best 
-changepoint prior scale for the model. The cps is an attribute of a stocker object 
-and can be changed using `Stocker.changepoint_prior_scale = 0.05` 
+The cps is an attribute of a stocker object and can be changed using `Stocker.changepoint_prior_scale` 
+The default value for the cps is 0.05 which tends to be low for fitting stock data. 
 
 Altering the changepoint prior scale can have a significant effect on predictions,
 so try a few different values to see how they affect the model.
 
 ### Quantitaively compare different changepoint prior scales
 
-`Stocker.changepoint_prior_validation(changepoint_priors = [0.001, 0.05, 0.1, 0.2])`
+`Stocker.changepoint_prior_validation(self, start_date=None, end_date=None,
+				changepoint_priors = [0.001, 0.05, 0.1, 0.2])`
 
-Similar to the changepoint prior analysis except quantifies the differences between 
-cps values. A model is created with each changepoint prior, trained on 3 years of 
-data (2014-2016) and tested on 2017. The average error on the training and testing
+Quantifies the differences in performance on a validation set of the specified 
+cps values. A model is created with each changepoint prior, trained on the assigned
+number of training years prior to the test period and evaluated on the range
+passed to the method. The default validation period is from two years before the end of the 
+data to one year before the end of the data. The average error on the training and testing
 data for each prior is calculated and displayed as well as the average uncertainty
 (range) of the data for both the training and testing sets. The average error is the 
 mean of the absolute difference between the prediction and the correct value in dollars.
 The uncertainty is the upper estimate minus the lower estimate in dollars.
-A graph of these results is also produced. This method is useful for choosing
-a proper cps in combination with the analysis graphical results. 
+A graph of these results is also produced. This method is useful for choosing a 
+proper cps in combination with the graphical results. 
 
-### Evalaute the Prophet model predictions against real prices and compare profits
+### Evalaute the Prophet model predictions against real prices and play stock marker
 
 `Stocker.evaluate_prediction(start_date=None, end_date=None, nshares=1000)`
 
 Evalutes a trading strategy informed by the prophet model 
-between the specified start and end date. The model is trained on 3 years of data 
-prior to the test period and makes predictions for the specified date range. The 
-default evaluation range is the last year of the data. The predictions for the 
-evaluation period are compared to the known stock price values to determine the profits (or losses) 
-from using the prophet strategy. 
-
-The strategy states that for a given  day, we buy a stock if the model predicts it will increase.
-If the model predictsit will decrease, we do not play the market on that day. 
-Our profit, if we bought the stock, is the change in the price of the stock over that day. 
-Therefore, if we predict the stock will go up and the price does go up, we will make the change
-in price times the number of shares. If the price goes down, we lose the change times
-the number of shares. 
+between the specified start and end date. The start and end date for the evaluation 
+should be different than the start and end date used for validation the prior 
+otherwise you could end up overfitting the test set. The model is trained on the assigned 
+number of years of data prior to the test period and makes predictions for the specified date range. The 
+default evaluation range is the last year of the data. Numerical performance metrics are computed
+using the predictions and known test set values. These are: average absolute error on the testing
+and training data, percentage of time the model predicted the correct direction for the stock, and the
+percentage of the time the actual value was within the 80% confidence interval for the prediction. A 
+graph shows the predictions with uncertainty and the actual values. The final actual and predicted
+prices are also displayed. 
+
+If number of shares is passed to the method, we get to play the stock market over the 
+testing period with the specified number of shares. We compare the strategy informed 
+by the Prophet model with a simple buy and hold approach. 
+
+The strategy from the model states that for a given  day, we buy a stock if the model 
+predicts it will increase. If the model predicts a decrease, we do not play the market on that day. 
+Our earnings, if we bought the stock, will be the change in the price of the stock over that day
+multiplied by the number of shares.  Therefore, if we predict the stock will go up and the price 
+does go up, we will make the change in price times the number of shares. If the price goes down, 
+we lose the change in price times the number of shares. 
 
 Printed output is the final predicted price, the final actual price, the 
 profit from the model strategy, and the profit from a buy and hold strategy over the 
-same period. Graphs of the predictions versus the actual values and the expected 
-profit from both strategies over time are also displayed. 
+same period. A graph of the expected profit from both strategies over time is displayed. 
 
 ### Predict future prices
 
 `Stocker.predict_future(days=30)`
 
 Makes a prediction for the specified number of days in the future 
-using a prophet model trained on the past 3 years of data. Printed output 
-is the final predicted value of the stock, the days on which the stock is 
-expected to increase, and the days when it is expected to decrease.
-A graph also shows these results with uncertainty intervals.
+using a prophet model trained on the assigned number of years of data. Printed output 
+is the days on which the stock is expected to increase and the days when it is expected to decrease.
+A graph also shows these results with confidence intervals for the prediction.
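
Reviewer note: the daily trading rule described for `evaluate_prediction` in the diff above can be sketched roughly as follows. This is a minimal illustration, not the code in stocker.py. The function name `evaluate_strategy`, its inputs, and the choice to read "the model predicts an increase" as "the predicted price is higher than the previous day's predicted price" are all assumptions made for this example.

```python
def evaluate_strategy(actual, predicted, nshares=1000):
    """Compare a prediction-informed strategy with buy and hold.

    actual, predicted: equal-length sequences of daily prices.
    These inputs are hypothetical; Stocker builds them from a Prophet
    model, which this sketch does not attempt to reproduce.
    """
    if len(actual) != len(predicted) or len(actual) < 2:
        raise ValueError("need two equal-length series with at least 2 days")

    model_profit = 0.0
    correct_direction = 0
    n_days = len(actual) - 1

    for t in range(1, len(actual)):
        real_change = actual[t] - actual[t - 1]
        # Assumption: "predicts an increase" means the predicted price rose.
        pred_change = predicted[t] - predicted[t - 1]

        # Play the market only on days with a predicted increase; the
        # earnings are the real price change times the number of shares.
        if pred_change > 0:
            model_profit += nshares * real_change

        # Count days where the predicted and real directions agree.
        if (pred_change > 0) == (real_change > 0):
            correct_direction += 1

    # Buy and hold: buy on the first day, sell on the last day.
    hold_profit = nshares * (actual[-1] - actual[0])

    # Average error: mean absolute difference in dollars.
    mean_abs_error = sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)

    return {
        "model_profit": model_profit,
        "hold_profit": hold_profit,
        "direction_accuracy": correct_direction / n_days,
        "mean_abs_error": mean_abs_error,
    }


if __name__ == "__main__":
    prices = [10.0, 11.0, 10.0, 12.0]
    forecast = [10.0, 10.5, 10.2, 11.0]
    print(evaluate_strategy(prices, forecast, nshares=10))
```

Because the strategy sits out days with a predicted decrease, it can beat buy and hold only when those predictions are right often enough to avoid real losses, which is why the readme reports the direction accuracy alongside the profit.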