[Tut] Pandas DataFrame Indexing

Pandas DataFrame Indexing

The Pandas DataFrame is a data structure that organizes data into a two-dimensional format of rows and columns. If you are familiar with Excel or Databases, the setup is similar. Each DataFrame contains a schema that defines a Name and a Data Type for each Column (Field).

This article covers the method and properties available for DataFrame Indexing: head(), at, iat, loc, and iloc. It also assumes Excel is installed on the computer.

Getting Started


Every code snippet in this article needs the Starter Code below as its first line. The line numbers used in the explanations count it as Line [1]. This line allows the code in this article to run error-free:

import pandas as pd

Before any data manipulation can occur, the pandas library must be installed. This library provides the DataFrame and the tools to read data into it.

To install this library, navigate to an IDE terminal. At the command prompt, execute the command below. For the terminal used in this example, the command prompt is a dollar sign ($). Your terminal prompt may be different.

$ pip install pandas

Hit the <Enter> key on the keyboard to start the installation process.

If the installation was successful, a message displays in the terminal indicating the same.
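
A quick way to double-check the installation from Python itself is to import the library and print its version. This is a minimal sketch; the variable name installed_version is only illustrative.

```python
import pandas as pd

# If this import succeeds, pandas is installed in the active environment.
installed_version = pd.__version__
print(installed_version)
```

If the import fails with a ModuleNotFoundError, the library was most likely installed into a different Python environment than the one running the script.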

Create a CSV File


For this article, we will read in a CSV file. This file contains a small representation of Cars and their associated characteristics. To create this file, perform the following steps:

  • Using the mouse, highlight the data below.
  • With the appropriate lines highlighted, press CTRL+C to copy the contents to the system Clipboard.
  • Open Excel and click inside Cell A1.
  • Press CTRL+V to paste the contents of the system clipboard to the Worksheet.
  • From the File menu, click Save As and browse to select the current working directory.
  • From the Save As pop-up, enter the filename cars.csv (the name read by the code in this article) and select CSV as the file type.
  • Click the Save button to complete this task.

Data:

Car;MPG;Cylinders;Displacement;Horsepower;Weight;Acceleration;Model;Origin
Chevrolet Chevelle Malibu;18.0;8;307.0;130.0;3504.;12.0;70;US
Buick Skylark 320;15.0;8;350.0;165.0;3693.;11.5;70;US
Plymouth Satellite;18.0;8;318.0;150.0;3436.;11.0;70;US
AMC Rebel SST;16.0;8;304.0;150.0;3433.;12.0;70;US
Ford Torino;17.0;8;302.0;140.0;3449.;10.5;70;US
Ford Galaxie 500;15.0;8;429.0;198.0;4341.;10.0;70;US
Chevrolet Impala;14.0;8;454.0;220.0;4354.;9.0;70;US
Plymouth Fury iii;14.0;8;440.0;215.0;4312.;8.5;70;US
Pontiac Catalina;14.0;8;455.0;225.0;4425.;10.0;70;US
AMC Ambassador DPL;15.0;8;390.0;190.0;3850.;8.5;70;US
Citroen DS-21 Pallas;12.0;4;133.0;115.0;3090.;17.5;70;Europe
Volkswagen 1131 Deluxe Sedan;26.0;4;97.00;46.00;1835.;20.5;70;Europe
Peugeot 504;25.0;4;110.0;87.00;2672.;17.5;70;Europe
Audi 100 LS;24.0;4;107.0;90.00;2430.;14.5;70;Europe
Saab 99e;25.0;4;104.0;95.00;2375.;17.5;70;Europe
BMW 2002;26.0;4;121.0;113.0;2234.;12.5;70;Europe
AMC Gremlin;21.0;6;199.0;90.00;2648.;15.0;70;US
Ford F250;10.0;8;360.0;215.0;4615.;14.0;70;US
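
If Excel is not available, the same file can also be written directly from Python. The sketch below is an alternative to the steps above, not part of the original workflow. It assumes the file should be named cars.csv (the name the later snippets read) and that the current working directory is writable. The names CARS_DATA, csv_path and check are illustrative.

```python
from pathlib import Path

import pandas as pd

# The semicolon-separated data from this article, header row first.
CARS_DATA = """\
Car;MPG;Cylinders;Displacement;Horsepower;Weight;Acceleration;Model;Origin
Chevrolet Chevelle Malibu;18.0;8;307.0;130.0;3504.;12.0;70;US
Buick Skylark 320;15.0;8;350.0;165.0;3693.;11.5;70;US
Plymouth Satellite;18.0;8;318.0;150.0;3436.;11.0;70;US
AMC Rebel SST;16.0;8;304.0;150.0;3433.;12.0;70;US
Ford Torino;17.0;8;302.0;140.0;3449.;10.5;70;US
Ford Galaxie 500;15.0;8;429.0;198.0;4341.;10.0;70;US
Chevrolet Impala;14.0;8;454.0;220.0;4354.;9.0;70;US
Plymouth Fury iii;14.0;8;440.0;215.0;4312.;8.5;70;US
Pontiac Catalina;14.0;8;455.0;225.0;4425.;10.0;70;US
AMC Ambassador DPL;15.0;8;390.0;190.0;3850.;8.5;70;US
Citroen DS-21 Pallas;12.0;4;133.0;115.0;3090.;17.5;70;Europe
Volkswagen 1131 Deluxe Sedan;26.0;4;97.00;46.00;1835.;20.5;70;Europe
Peugeot 504;25.0;4;110.0;87.00;2672.;17.5;70;Europe
Audi 100 LS;24.0;4;107.0;90.00;2430.;14.5;70;Europe
Saab 99e;25.0;4;104.0;95.00;2375.;17.5;70;Europe
BMW 2002;26.0;4;121.0;113.0;2234.;12.5;70;Europe
AMC Gremlin;21.0;6;199.0;90.00;2648.;15.0;70;US
Ford F250;10.0;8;360.0;215.0;4615.;14.0;70;US
"""

# Write the file into the current working directory as cars.csv.
csv_path = Path("cars.csv")
csv_path.write_text(CARS_DATA, encoding="utf-8")

# Read the file back to confirm it holds 18 data rows and 9 columns.
check = pd.read_csv(csv_path, sep=";")
print(check.shape)
```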

DataFrame head()


If a DataFrame contains a large number of rows, this method comes in handy. This method offers the ability to view a selection of the top n rows in a DataFrame.

The syntax for this method is as follows:

DataFrame.head(n=5)

Parameter    Description
n            Optional integer. If omitted, the first five (5) rows will display. If an integer is passed, the first n rows will display. The number of rows does not include the header row.

For this example, we will display the top three (3) rows of the Cars DataFrame.

import pandas as pd
cols = ['Car', 'MPG', 'Weight', 'Model', 'Origin']
df = pd.read_csv('cars.csv', sep=';', usecols=cols)
print(df.head(3))

  • Line [2] creates a list of the DataFrame columns to view.
  • Line [3] creates a DataFrame by:
    • Reading in the CSV file created earlier.
    • Setting the field separator to the character used in the CSV file: the semicolon (;).
    • Passing the list of columns to view to the usecols parameter.
  • Line [4] outputs the top three (3) DataFrame rows, plus the header row, to the terminal.

Output:


                         Car   MPG  Weight  Model  Origin
0  Chevrolet Chevelle Malibu  18.0  3504.0     70      US
1          Buick Skylark 320  15.0  3693.0     70      US
2         Plymouth Satellite  18.0  3436.0     70      US
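
The n parameter also accepts a negative integer, in which case head() returns every row except the last |n| rows. The sketch below illustrates both the default and a negative value. To stay runnable without cars.csv, it assumes a seven-row subset of the article's data inlined as a string; the names SAMPLE, default_rows and all_but_last_three are illustrative.

```python
import io

import pandas as pd

# A seven-row subset of the article's data, inlined so no file is needed.
SAMPLE = """\
Car;MPG;Weight;Model;Origin
Chevrolet Chevelle Malibu;18.0;3504.;70;US
Buick Skylark 320;15.0;3693.;70;US
Plymouth Satellite;18.0;3436.;70;US
AMC Rebel SST;16.0;3433.;70;US
Ford Torino;17.0;3449.;70;US
Ford Galaxie 500;15.0;4341.;70;US
Chevrolet Impala;14.0;4354.;70;US
"""
df = pd.read_csv(io.StringIO(SAMPLE), sep=";")

# With no argument, n defaults to 5.
default_rows = df.head()

# A negative n drops the last |n| rows instead of counting from the top.
all_but_last_three = df.head(-3)

print(len(default_rows))
print(len(all_but_last_three))
```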

DataFrame at Property


Use the at property when a single value for a row/column label pair is required. The value can be of any type, such as a string or a number. This property is similar to loc: both provide label-based lookups.

The syntax for this property is as follows:

DataFrame.at[row_label, column_label]
  • This property accepts a row label/column label combination.
  • In this article's DataFrame, the row label is an integer (the default index) and the column label is a string (the column name).

Code – Example 1


This example retrieves the Car name located in the row labeled 10 of the DataFrame.

import pandas as pd
cols = ['Car', 'MPG', 'Weight', 'Model', 'Origin']
df = pd.read_csv('cars.csv', sep=';', usecols=cols)
result = df.at[10, 'Car']
print(result)

  • Lines [2-3] are explained above.
  • Line [4] retrieves the Car name from the row labeled 10 using at.
  • Line [5] outputs this value to the terminal.

Output


Citroen DS-21 Pallas

This value comes from the Car column of the following row:

                     Car   MPG  Weight  Model  Origin
10  Citroen DS-21 Pallas  12.0  3090.0     70  Europe

Code – Example 2


This example retrieves the Car weight located in the row labeled 5 of the DataFrame.

import pandas as pd
cols = ['Car', 'MPG', 'Weight', 'Model', 'Origin']
df = pd.read_csv('cars.csv', sep=';', usecols=cols)
result = df.loc[5].at['Weight']
print(result)

  • Lines [2-3] are explained above.
  • Line [4] selects the row labeled 5 using loc, then retrieves its Weight using at.
  • Line [5] outputs this value to the terminal.

Output


4341.0

This value comes from the Weight column of the following row:

                Car   MPG  Weight  Model  Origin
5  Ford Galaxie 500  15.0  4341.0     70      US
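
Because at is label-based, the row label is only an integer when the DataFrame uses the default index. The sketch below shows the same kind of lookup after the Car column becomes the index, and what happens when a label does not exist. It assumes a small inlined subset of the article's data; the names SAMPLE, by_name, torino_mpg and missing_raised are illustrative.

```python
import io

import pandas as pd

# A five-row subset of the article's data, inlined so no file is needed.
SAMPLE = """\
Car;MPG;Weight;Model;Origin
Chevrolet Chevelle Malibu;18.0;3504.;70;US
Buick Skylark 320;15.0;3693.;70;US
Plymouth Satellite;18.0;3436.;70;US
AMC Rebel SST;16.0;3433.;70;US
Ford Torino;17.0;3449.;70;US
"""
df = pd.read_csv(io.StringIO(SAMPLE), sep=";")

# Default index: the row label happens to be an integer.
third_car = df.at[2, "Car"]

# After set_index, the row labels are car names, so at takes a string.
by_name = df.set_index("Car")
torino_mpg = by_name.at["Ford Torino", "MPG"]

# A row label that does not exist raises a KeyError.
try:
    df.at[99, "Car"]
    missing_raised = False
except KeyError:
    missing_raised = True

print(third_car, torino_mpg, missing_raised)
```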

DataFrame iat Property


Use the iat property when a single value for a row/column integer position pair is required. This property is similar to iloc: both provide integer-position-based lookups. If a position is out of range, an IndexError occurs.

The syntax for this property is as follows:

DataFrame.iat[row_position, column_position]
  • This property takes a row and column position combination. Both row and column values must be integers, counted from 0.

Code – Example 1


This example retrieves the Car name located at row position 11 of the DataFrame.

import pandas as pd
cols = ['Car', 'MPG', 'Weight', 'Model', 'Origin']
df = pd.read_csv('cars.csv', sep=';', usecols=cols)
result = df.iat[11, 0]
print(result)

  • Lines [2-3] are explained above.
  • Line [4] retrieves the Car name at row 11, column 0 of the DataFrame using iat.
  • Line [5] outputs this value to the terminal.

Output:


Volkswagen 1131 Deluxe Sedan

This value comes from column 0 (Car) of the following row:

                             Car   MPG  Weight  Model  Origin
11  Volkswagen 1131 Deluxe Sedan  26.0  1835.0     70  Europe

Code – Example 2


This example retrieves the Car weight located on row 12 of the DataFrame.

import pandas as pd
cols = ['Car', 'MPG', 'Weight', 'Model', 'Origin']
df = pd.read_csv('cars.csv', sep=';', usecols=cols)
result = df.loc[12].iat[2]
print(result)

  • Lines [2-3] are explained above.
  • Line [4] selects the row labeled 12 using loc, then retrieves the value at position 2 (Weight) of that row using iat.
  • Line [5] outputs this value to the terminal.

Output:


2672.0

This value comes from column 2 (Weight) of the following row:

            Car   MPG  Weight  Model  Origin
12  Peugeot 504  25.0  2672.0     70  Europe
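
Since iat works on positions rather than labels, it behaves the same whatever the index looks like. The sketch below, again using a small inlined subset of the article's data, looks up a value by position on a DataFrame indexed by car name and shows the IndexError raised for an out-of-range position. The names SAMPLE, by_name, second_mpg and out_of_range_raised are illustrative.

```python
import io

import pandas as pd

# A five-row subset of the article's data, inlined so no file is needed.
SAMPLE = """\
Car;MPG;Weight;Model;Origin
Chevrolet Chevelle Malibu;18.0;3504.;70;US
Buick Skylark 320;15.0;3693.;70;US
Plymouth Satellite;18.0;3436.;70;US
AMC Rebel SST;16.0;3433.;70;US
Ford Torino;17.0;3449.;70;US
"""
df = pd.read_csv(io.StringIO(SAMPLE), sep=";")

# With Car as the index, the remaining columns are MPG, Weight, Model, Origin.
by_name = df.set_index("Car")

# Row position 1, column position 0: the MPG of the second car.
second_mpg = by_name.iat[1, 0]

# A position beyond the last row raises an IndexError.
try:
    df.iat[99, 0]
    out_of_range_raised = False
except IndexError:
    out_of_range_raised = True

print(second_mpg, out_of_range_raised)
```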

DataFrame loc


Use this property to access a group of rows/columns by label(s) or a Boolean array. This property is label-based. A KeyError occurs if an item is missing.

The syntax for this property is as follows:

DataFrame.loc[input]

Acceptable Inputs

  • A single label
  • A list or array of labels
  • A slice object with labels
  • A boolean array
  • A boolean series
  • An alignable Index. The index of the returned selection will be the input
  • A callable function

Code

This example retrieves the Car details from row 8 of the DataFrame using loc.

import pandas as pd
cols = ['Car', 'MPG', 'Weight', 'Model', 'Origin']
df = pd.read_csv('cars.csv', sep=';', usecols=cols)
result = df.loc[8]
print(result)

  • Lines [2-3] are explained above.
  • Line [4] retrieves the information from the row labeled 8 of the DataFrame as a Series.
  • Line [5] outputs this to the terminal.

Output:


Car       Pontiac Catalina
MPG                   14.0
Weight              4425.0
Model                   70
Origin                  US
Name: 8, dtype: object

As noted above, there are numerous ways to retrieve the data. See the pandas documentation for DataFrame.loc for additional details.
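
Several of the acceptable inputs listed above can be sketched on a small inlined subset of the article's data. Note that a label slice with loc includes both endpoints. The names SAMPLE, picked, sliced, europe and thrifty are illustrative.

```python
import io

import pandas as pd

# A six-row subset of the article's data, inlined so no file is needed.
SAMPLE = """\
Car;MPG;Weight;Model;Origin
Chevrolet Chevelle Malibu;18.0;3504.;70;US
Buick Skylark 320;15.0;3693.;70;US
Citroen DS-21 Pallas;12.0;3090.;70;Europe
Volkswagen 1131 Deluxe Sedan;26.0;1835.;70;Europe
Peugeot 504;25.0;2672.;70;Europe
AMC Gremlin;21.0;2648.;70;US
"""
df = pd.read_csv(io.StringIO(SAMPLE), sep=";")

# A list of row labels combined with a list of column labels.
picked = df.loc[[0, 2], ["Car", "MPG"]]

# A label slice: both the start (1) and the end (3) are included.
sliced = df.loc[1:3]

# A boolean Series: keep the rows where the condition is True.
europe = df.loc[df["Origin"] == "Europe"]

# A callable: it receives the DataFrame and returns a valid indexer.
thrifty = df.loc[lambda d: d["MPG"] > 20]

print(picked)
print(sliced)
print(europe)
print(thrifty)
```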

DataFrame iloc


This property provides integer-location-based indexing: rows and columns are selected by position. Positions run from 0 to length-1 of the axis. An IndexError occurs if a requested position is out of bounds, except with slice indexers, which allow out-of-bounds positions.

The syntax for this property is as follows:

DataFrame.iloc[input]

Acceptable Inputs

  • An integer
  • A list/array of integers
  • A slice object with integers
  • A boolean array
  • A callable function

Code

In this example, we display Cars whose origin is Europe. In this dataset, they occupy rows 10 to 15, so a positional slice is one way to accomplish this task.

import pandas as pd
cols = ['Car', 'MPG', 'Weight', 'Model', 'Origin']
df = pd.read_csv('cars.csv', sep=';', usecols=cols)
result = df.iloc[10:16]
print(result)

  • Lines [2-3] are explained above.
  • Line [4] retrieves the information from rows 10 to 15 of the DataFrame. The end position of the slice (16) is not included.
  • Line [5] outputs this to the terminal.

Output:


                             Car   MPG  Weight  Model  Origin
10          Citroen DS-21 Pallas  12.0  3090.0     70  Europe
11  Volkswagen 1131 Deluxe Sedan  26.0  1835.0     70  Europe
12                   Peugeot 504  25.0  2672.0     70  Europe
13                   Audi 100 LS  24.0  2430.0     70  Europe
14                      Saab 99e  25.0  2375.0     70  Europe
15                      BMW 2002  26.0  2234.0     70  Europe
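
A positional slice only finds the European cars as long as they sit in known rows. As a hedged alternative, the sketch below builds a boolean array from the Origin column and passes it to iloc, and also illustrates the out-of-bounds behavior described earlier. It assumes a small inlined subset of the article's data; the names SAMPLE, first_two, past_end, mask, europe and out_of_bounds_raised are illustrative.

```python
import io

import pandas as pd

# A six-row subset of the article's data, inlined so no file is needed.
SAMPLE = """\
Car;MPG;Weight;Model;Origin
Chevrolet Chevelle Malibu;18.0;3504.;70;US
Buick Skylark 320;15.0;3693.;70;US
Citroen DS-21 Pallas;12.0;3090.;70;Europe
Volkswagen 1131 Deluxe Sedan;26.0;1835.;70;Europe
Peugeot 504;25.0;2672.;70;Europe
AMC Gremlin;21.0;2648.;70;US
"""
df = pd.read_csv(io.StringIO(SAMPLE), sep=";")

# An integer slice: positions 0 and 1; the end position (2) is excluded.
first_two = df.iloc[0:2]

# A slice may run past the end of the axis; it simply stops at the last row.
past_end = df.iloc[4:100]

# iloc takes a boolean array, so the boolean Series is converted with to_numpy().
mask = (df["Origin"] == "Europe").to_numpy()
europe = df.iloc[mask]

# A single integer that is out of bounds raises an IndexError.
try:
    df.iloc[100]
    out_of_bounds_raised = False
except IndexError:
    out_of_bounds_raised = True

print(europe)
```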



https://www.sickgaming.net/blog/2021/12/...-indexing/