WillKoehrsen ecd53446f5 Done with notebook 6 years ago
..
.pytest_cache bb5b1f4a26 Functions developed 6 years ago
data 9e9d442f44 Delete 2019-01-26_stats 6 years ago
images 18e8729860 Working on notebook 6 years ago
Development.ipynb 267ac79cdc Working on analysis 6 years ago
Fitting.ipynb 74c86b750f Ran fitting notebook 6 years ago
Medium Stats Analysis.ipynb 42af1df774 notebook complete 6 years ago
Time Series Analysis.ipynb 2c4d68f74e Time series 6 years ago
Work In Progress.ipynb 5740637a7a Done with notebook for article publication 6 years ago
bargraphs.py b6754548a1 Working on article plotting 6 years ago
data-science-writing-2018.ipynb ecd53446f5 Done with notebook 6 years ago
readme.md c185b4f12f Update readme.md 6 years ago
retrieval.py b6754548a1 Working on article plotting 6 years ago
view_extraction.py 62e7686310 Added graph extraction 6 years ago
visuals.py f10625a6df Better documentation 6 years ago

readme.md

Tools for analyzing Medium article statistics

The Medium stats Python toolkit is a suite of tools for retrieving, analyzing, predicting, and visualizing your Medium article stats. You can also run the tools on my Medium statistics, which are located in data/.

  • Note: on a Mac, you may first need to run export OBJC_DISABLE_INITIALIZE_FORK_SAFETY=YES from the command line to enable parallel processing

  • For complete usage, refer to the Medium Stats Analysis notebook

  • Data retrieval code lives in retrieval.py

  • Visualization and analysis code is in visuals.py

  • See also the Medium article "Medium Analysis in Python"

  • Contributions are welcome and appreciated

  • For help contact wjk68@case.edu or twitter.com/@koehrsen_will
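The Mac note above can be checked from Python before starting any parallel work. This is a minimal sketch and not part of the toolkit: the helper name fork_safety_flag_missing is hypothetical, and because the README asks for the variable to be exported from the command line, the sketch only reports a missing flag and does not try to set it from inside Python.

```python
import os
import platform

# Name of the environment variable mentioned in the README note.
FLAG = "OBJC_DISABLE_INITIALIZE_FORK_SAFETY"


def fork_safety_flag_missing(system=None, env=None):
    """Return True when running on macOS without the flag set to YES.

    Hypothetical helper for illustration; `system` and `env` can be
    passed explicitly so the check is easy to exercise on any platform.
    """
    system = platform.system() if system is None else system
    env = os.environ if env is None else env
    # platform.system() reports "Darwin" on macOS.
    return system == "Darwin" and env.get(FLAG) != "YES"


if fork_safety_flag_missing():
    print(f"Run 'export {FLAG}=YES' in your shell, then restart Python.")
```

On Linux or Windows the check is always False, so the reminder is printed only where the README says it may be needed.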

Basic usage

Use your own Medium statistics

  1. Go to the stats page https://medium.com/me/stats
  2. Scroll all the way down to the bottom so all the articles are loaded
  3. Right-click and choose 'Save as'
  4. Save the file as stats.html in the data/ directory. You can also save the responses to do a similar analysis.

If you don't do this, you can still go to the next step and use the provided data!
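Before moving on, it can help to confirm that the saved page ended up where the steps above put it. The following is a minimal sketch, not part of the toolkit: find_stats_file is a hypothetical helper, and it assumes you run it from the directory that contains data/.

```python
from pathlib import Path


def find_stats_file(data_dir="data", fname="stats.html"):
    """Return the path to the saved stats page, or None if it is missing or empty.

    Hypothetical helper for illustration only.
    """
    path = Path(data_dir) / fname
    if path.is_file() and path.stat().st_size > 0:
        return path
    return None


stats_path = find_stats_file()
if stats_path is None:
    print("No data/stats.html found; you can continue with the provided data.")
else:
    print(f"Found {stats_path} ({stats_path.stat().st_size} bytes)")
```

An empty file is treated as missing, since a failed 'Save as' can leave a zero-byte stats.html behind.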

Retrieving Statistics

  • Open a Jupyter Notebook or Python terminal in the medium/ directory and run:
from retrieval import get_data
df = get_data(fname='stats.html')

Analysis and Visualization

  • Interactive plots are not rendered on GitHub. To view the plots with their full capability, use NBviewer (Medium Stats Analysis on NBviewer)
  • All plots can be opened in the plotly online editor to finish up for publication

  • Histogram: make_hist(df, x, category=None)

  • Cumulative plot: make_cum_plot(df, y, category=None, ranges=False)

  • Scatter plots: make_scatter_plot(df, x, y, fits=None, xlog=False, ylog=False, category=None, scale=None, sizeref=2, annotations=None, ranges=False, title_override=None)

  • Scatter plot with three variables: pass in category or scale to make_scatter_plot

  • Univariate Linear Regression: make_linear_regression(df, x, y, intercept_0)

  • Univariate polynomial fitting: make_poly_fits(df, x, y, degree=6)

  • Multivariate Linear Regression: pass a list of x variables to make_linear_regression

  • Future extrapolation: make_extrapolation(df, y, years, degree=4)

  • More methods will be coming soon!

  • Submit pull requests with your own code, or open issues for suggestions!
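To illustrate what a univariate linear regression with and without an intercept computes, here is a minimal standard-library sketch. It is not the code behind make_linear_regression: fit_line is a hypothetical function, the reading of intercept_0 as "force the line through the origin" is an assumption based on the parameter name, and the numbers are made up.

```python
def fit_line(xs, ys, intercept_0=False):
    """Ordinary least-squares fit of y = slope * x + intercept.

    Hypothetical helper. With intercept_0=True the line is forced
    through the origin (an assumed meaning of the toolkit's parameter).
    Returns (slope, intercept).
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("xs and ys must have the same length (at least 2)")

    if intercept_0:
        # Through the origin: slope = sum(x*y) / sum(x^2).
        slope = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)
        return slope, 0.0

    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # slope = covariance(x, y) / variance(x)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Made-up data for illustration, not real Medium statistics.
x_values = [1, 2, 3, 4]
y_values = [3, 5, 7, 9]

print("with intercept:", fit_line(x_values, y_values))
print("through origin:", fit_line(x_values, y_values, intercept_0=True))
```

Forcing the intercept to zero can make sense when zero input should mean zero output, but it changes the slope whenever the data do not actually pass through the origin.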