Create an account


Thread Rating:
  • 0 Vote(s) - 0 Average
  • 1
  • 2
  • 3
  • 4
  • 5
[Tut] How to Use Pandas Rolling – A Simple Illustrated Guide

#1
How to Use Pandas Rolling – A Simple Illustrated Guide


This article will demonstrate how to use a pandas dataframe method called rolling().

What does the pandas.DataFrame.rolling() method do?

In short, it performs rolling windows calculations.

It is often used when working with time-series data or signal processing. I will shortly dive into a few practical examples to clarify what this means in practice.

The method will be given a parameter that specifies how big the window the desired calculations should be performed in.

A simple example of using time series data could be that each row of a pandas dataframe represents a day with some values.

Let’s say that the desired window size is five days. The rolling method is given a five as input, and it will perform the expected calculation based on steps of five days. 

Before an example of this, let’s see the method, its syntax, and its parameters.  

pandas.DataFrame.rolling()


Dataframe.rolling(window, min_periods=None, center=False, win_type=None, on=None, axis=0, closed=None, method=’single’)

Let’s dive into the parameters one by one:

window


window: int, offset, or BaseIndexer subclass

This is the size of the moving window.

If an integer, the fixed number of observations is used for each window.

If an offset, the time period of each window. Each window will be variable-sized based on the observations included in the time period. This is only valid for datetime-like indexes.

If a BaseIndexer subclass, the window boundaries are based on the defined get_window_bounds() method. Additional rolling keywords argument, namely min_periods, center, and closed will be passed to get_window_bounds().

min_periods


min_periods: int, default None

This is the minimum number of observations in the window required to have a value.

Otherwise, the result is assigned np.nan.

  • For a window that is specified by an offset, min_periods will default to 1.
  • For a window specified by an integer, min_periods will default to the size of the window. 

center


center: bool, default False

If False, set the window labels as the right edge of the window index. If True, set the window labels as the center of the window index. 

win_type


win_type: str, default None

If None, all points are evenly weighted.

If a string, it must be a valid window function from scipy.signal

Some of the Scipy window types require additional parameters to be passed in the aggregation function.

The additional parameters must match the keywords specified in the Scipy window type method signature. 

on


on: str, optional

For a Dataframe, a column label or index level on which to calculate the rolling window, rather than the Dataframes index.

The provided integer column is ignored and excluded from the result since an integer index is not used to calculate the rolling window. 

axis


axis: int or str, default 0

If 0 or 'index', roll across the rows. If 1 or 'columns', roll across the columns. 

closed


closed: str, default None

  • If 'right', the first point in the window is excluded from calculations.
  • If 'left', the last point in the window is excluded from calculations.
  • If 'both', then no points in the window are excluded from the calculations.
  • If 'neither', the first and last points in the window are excluded from the calculations.

Default None means 'right'.

method


method: str {'single', 'table'}, default 'single'

Execute the rolling operation per single column or row for 'single' or over the entire object for 'table'.

This argument is only implemented when specifying engine='numba' in the method call. 


This part was obtained from the official pandas documentation

Data


The data I will be working with for this tutorial is historical data for a stock, the amazon stock.

I use the python package yfinance to import the data. I will use data starting from 2021-04-01 and running one year forward in time.

The data only includes trading days, i.e., days when the stock market was open.

# Get the stock data from Yahoo finance
AmazonData1y = yfinance.Ticker("AMZN").history(period='1y', actions=False, end='2022-04-01')
display(AmazonData1y.head(20))

The resulting dataframe contains data about the opening price, the highest price, the lowest price, the closing price, and the trading volume for each day.

Calculating moving averages


The first calculations using the rolling method I will do are some different moving averages values. They are often applied in stock analysis.

? A moving average value is a statistic that captures the average change in a data series over time. (source)

Let’s calculate the moving averages for seven days and 15 days for the stock closing price and add those values as new columns to the existing amazon dataframe.

They are named 'MA7' and 'MA15'.

# Calculating the 7 and 15 day moving averages based on closing price
# and adding them as new columns
AmazonData1y['7MA'] = AmazonData1y['Close'].rolling(7).mean()
AmazonData1y['15MA'] = AmazonData1y['Close'].rolling(15).mean() display(AmazonData1y.head(20))

Since there is no data before 2021-04-01, no seven-day moving average can be calculated before 2021-04-13 and no 15-day moving average before 2021-04-23

Calculating the Sum of Trading Volume


Let’s now instead use the rolling method to calculate the sum of the volume from the last five trading days to spot if there was any spike in volume.

It is done in the same way as for the moving average, but here the sum() method is used together with the rolling method instead of the mean() method.

I will also add this as a new column to the existing Amazon dataframe. 

# Calculating 5 day volume using rolling
AmazonData1y['5VOL'] = AmazonData1y['Volume'].rolling(5).sum() display(AmazonData1y.head(20))

This metric might not be the most useful but it is a good way to explain how you could use the rolling method together with the sum() method. 

Using rolling() with Aggregation


If combining the rolling() method with the aggregation method agg(), it is easy to perform rolling calculations on multiple columns simultaneously.

Say that I would like to find the highest high and the lowest low for the last seven days. 

# Performing rolling calculations on multiple columns at the
# same time using .agg()
SevenHighAndLow = AmazonData1y.rolling(7).agg({'High': 'max', 'Low': 'min'}) display(SevenHighAndLow.head(20))

Plotting the Values


This part will be included to visualize the value calculated. It’s a bit more appealing than simply just looking at columns of a dataframe. 

First, let’s plot the calculated moving averages values alongside the closing price. 

# Plotting the closing price with the 7 and 15 day moving averages
AmazonData1y.plot(y=['Close', '7MA', '15MA'], kind='line', figsize=(14,12)) plt.title('Closing price, 7MA and 15MA', fontsize=16)
plt.xlabel('Date')
plt.ylabel('Stock price($)')
plt.show()

And then the accumulated 5 day volume alongside the closing price. 

# Plotting the closing price alongside the 5 day volume
AmazonData1y.plot(y=['Close', '5VOL'], secondary_y='5VOL', kind='line', ylabel='Stock Price ($)', figsize=(14,12)) plt.title('Closing price and 5 day accumulated volume', fontsize=16)
plt.xlabel('Date')
plt.ylabel('Volume')
plt.show()

Summary


This was a short tutorial on applying the rolling() method on a pandas dataframe using some statistics.

The goal of this article was to demonstrate some simple examples of how the rolling() method works, and I hope that it did accomplish that goal.

The rolling() method can be used for most statistics calculations, so try and explore it using other methods than those used for this article. 




https://www.sickgaming.net/blog/2022/04/...ted-guide/
Reply



Possibly Related Threads…
Thread Author Replies Views Last Post
  [Tut] Python Tuple Concatenation: A Simple Illustrated Guide xSicKxBot 0 1,943 08-21-2023, 10:25 AM
Last Post: xSicKxBot
  [Tut] Making $65 per Hour on Upwork with Pandas xSicKxBot 0 1,324 05-24-2023, 08:16 PM
Last Post: xSicKxBot
  [Tut] Pandas Series Object – A Helpful Guide with Examples xSicKxBot 0 1,314 05-01-2023, 01:30 AM
Last Post: xSicKxBot
  [Tut] Pandas Boolean Indexing xSicKxBot 0 1,311 04-16-2023, 10:54 AM
Last Post: xSicKxBot
  [Tut] Python List of Dicts to Pandas DataFrame xSicKxBot 0 1,534 04-11-2023, 04:15 AM
Last Post: xSicKxBot
  [Tut] PIP Install Django – A Helpful Illustrated Guide xSicKxBot 0 1,264 03-12-2023, 05:27 AM
Last Post: xSicKxBot
  [Tut] 10 Simple Ideas – Coding Projects to Create Real Value xSicKxBot 0 1,251 01-02-2023, 12:57 PM
Last Post: xSicKxBot
  [Tut] Python Library Hijacking – A Simple Demonstration on NumPy xSicKxBot 0 1,252 11-22-2022, 01:48 AM
Last Post: xSicKxBot
  [Tut] Parsing XML Files in Python – 4 Simple Ways xSicKxBot 0 1,181 11-13-2022, 10:49 AM
Last Post: xSicKxBot
  [Tut] How to Filter Data from an Excel File in Python with Pandas xSicKxBot 0 1,227 10-31-2022, 05:36 AM
Last Post: xSicKxBot

Forum Jump:


Users browsing this thread:
1 Guest(s)

Forum software by © MyBB Theme © iAndrew 2016