Think Stats: Probability and Statistics for Programmers

Think Stats is an introduction to Probability and Statistics for Python programmers. It emphasizes simple techniques you can use to explore real data sets and answer interesting questions. This book presents a case study using data from the National Institutes of Health. Readers are encouraged to work on a project with real datasets.

If you have basic skills in Python, you can use them to learn concepts in probability and statistics. Think Stats is based on a Python library for probability distributions (PMFs and CDFs). Many of the exercises use short programs to run experiments and help readers develop understanding.
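As a rough illustration of what PMFs and CDFs look like in code, here is a minimal sketch in plain Python. It does not use the book's own library; the function names `make_pmf`, `make_cdf`, and `pmf_mean` and the sample data are hypothetical and chosen only for this example.

```python
from collections import Counter


def make_pmf(values):
    """Map each distinct value to its relative frequency (a PMF)."""
    counts = Counter(values)
    total = len(values)
    return {x: n / total for x, n in sorted(counts.items())}


def make_cdf(pmf):
    """Return sorted (value, cumulative probability) pairs (a CDF)."""
    cdf = []
    cumulative = 0.0
    for x in sorted(pmf):
        cumulative += pmf[x]
        cdf.append((x, cumulative))
    return cdf


def pmf_mean(pmf):
    """Mean of the distribution described by the PMF."""
    return sum(x * p for x, p in pmf.items())


# Made-up sample, used only to exercise the functions.
sample = [1, 2, 2, 3, 3, 3]
pmf = make_pmf(sample)
cdf = make_cdf(pmf)

print("PMF:", pmf)
print("CDF:", cdf)
print("Mean:", pmf_mean(pmf))
```

Representing a distribution as a dictionary from values to probabilities keeps operations such as computing a mean or accumulating a CDF to a few lines each, which is the kind of short program the exercises rely on.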

Most introductory books don’t cover Bayesian statistics, but Think Stats is based on the idea that Bayesian methods are too important to postpone. By taking advantage of the PMF and CDF libraries, it is possible for beginners to learn the concepts and solve challenging problems.

The book takes a computational approach, which has several advantages:

  • Students write programs as a way of developing and testing their understanding. For example, they write functions to compute a least squares fit, residuals, and the coefficient of determination. Writing and testing this code requires them to understand the concepts and implicitly corrects misunderstandings.
  • Students run experiments to test statistical behavior. For example, they explore the Central Limit Theorem (CLT) by generating samples from several distributions. When they see that the sum of values from a Pareto distribution doesn’t converge to a normal distribution, they remember the assumptions the CLT is based on.
  • Some ideas that are hard to grasp mathematically are easy to understand by simulation. For example, we approximate p-values by running Monte Carlo simulations, which reinforces the meaning of the p-value.
  • Using discrete distributions and computation makes it possible to present topics like Bayesian estimation that are not usually covered in an introductory class. For example, one exercise asks students to compute the posterior distribution for the ‘German tank problem,’ which is difficult analytically but surprisingly easy computationally.
  • Because students work in a general-purpose programming language (Python), they are able to import data from almost any source. They are not limited to data that has been cleaned and formatted for a particular statistics tool.
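Two of the ideas in the list above can be sketched in a few lines of plain Python. The code below is not taken from the book: the function names, the uniform prior capped at `max_tanks`, the single observed serial number, and the coin-flip data are all assumptions made for illustration. The first function approximates a two-sided p-value by simulating a fair coin; the second computes a discrete posterior for the German tank problem by enumerating every candidate fleet size.

```python
import random


def monte_carlo_p_value(observed_heads, flips, iterations=2000, seed=0):
    """Estimate a two-sided p-value for 'the coin is fair' by simulation.

    Counts how often a simulated fair coin deviates from flips / 2
    at least as much as the observed number of heads does.
    """
    rng = random.Random(seed)
    expected = flips / 2
    observed_deviation = abs(observed_heads - expected)
    extreme = 0
    for _ in range(iterations):
        heads = sum(rng.random() < 0.5 for _ in range(flips))
        if abs(heads - expected) >= observed_deviation:
            extreme += 1
    return extreme / iterations


def german_tank_posterior(observed_serial, max_tanks=1000):
    """Posterior over the number of tanks N after seeing one serial number.

    Assumes a uniform prior on 1..max_tanks and that the observed serial
    is equally likely to be any of 1..N, so the likelihood is 1 / N for
    N >= observed_serial and 0 otherwise.
    """
    prior = 1.0 / max_tanks
    unnormalized = {}
    for n in range(1, max_tanks + 1):
        likelihood = 1.0 / n if n >= observed_serial else 0.0
        unnormalized[n] = prior * likelihood
    total = sum(unnormalized.values())
    return {n: p / total for n, p in unnormalized.items()}


# Hypothetical inputs, chosen only to exercise the functions.
p_value = monte_carlo_p_value(observed_heads=140, flips=250)
posterior = german_tank_posterior(observed_serial=60)
posterior_mean = sum(n * p for n, p in posterior.items())

print("Estimated p-value:", p_value)
print("Most probable N:", max(posterior, key=posterior.get))
print("Posterior mean of N:", posterior_mean)
```

In the posterior, the most probable fleet size is the observed serial number itself, while the posterior mean is larger and depends on the assumed upper bound `max_tanks`, so that choice is worth varying when experimenting.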

Think Stats: Probability and Statistics for Programmers

by Allen B. Downey (PDF, Online reading) – 140 pages
