Temporary columns are often created within a dataframe and later need to be removed. The following examples show how to remove (drop) columns.
A list of columns can be dropped from a dataframe in a single call. In the first example the result is assigned to a new variable, so the original and the reduced dataframe exist at the same time. This can strain system memory for large datasets.
Drop or remove data columns
import pandas as pd

#read test data from csv file and display top 3 rows
df1 = pd.read_csv('data_deposits.csv')
print(df1.head(3))

#list of columns to remove
cols_to_drop = ['city', 'deposit']

#remove two columns and show final dataframe
#(columns= already selects the column axis, so axis=1 is not needed)
df2 = df1.drop(columns=cols_to_drop)
print(df2.head(3))
--[df1 before column removal]----------------
  firstname lastname    city  age  deposit
0    Herman  Sanchez   Miami   52     9300
1      Phil   Parker   Miami   45     5010
2    Bradie  Garnett  Denver   36     6300
--[df2 post column removal]------------------
  firstname lastname  age
0    Herman  Sanchez   52
1      Phil   Parker   45
2    Bradie  Garnett   36
---------------------------------------------------
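The temporary-column workflow mentioned at the start can be sketched roughly as follows. This is a minimal illustration, not part of the original examples: the dataframe is built inline from the three rows shown above instead of reading data_deposits.csv, and the column name deposit_per_year is a hypothetical helper column chosen only for this sketch.

```python
import pandas as pd

# Sample rows copied from the output above (built inline, so no CSV file is needed)
df = pd.DataFrame({
    "firstname": ["Herman", "Phil", "Bradie"],
    "lastname": ["Sanchez", "Parker", "Garnett"],
    "city": ["Miami", "Miami", "Denver"],
    "age": [52, 45, 36],
    "deposit": [9300, 5010, 6300],
})

# Hypothetical temporary column: deposit divided by age
df["deposit_per_year"] = df["deposit"] / df["age"]

# Use the temporary column, here to find the row with the highest ratio
top_name = df.loc[df["deposit_per_year"].idxmax(), "firstname"]

# Drop the temporary column; drop() returns a new dataframe and leaves df unchanged
df_clean = df.drop(columns=["deposit_per_year"])

print(top_name)
print(list(df_clean.columns))
```

Because drop() returns a new object here, df still contains the temporary column afterwards, while df_clean does not.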
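Removing columns with the inplace option avoids keeping a second dataframe variable around, which can help with memory management. Note that pandas may still build an intermediate copy internally, so the saving is not guaranteed.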
Remove columns from dataframe inplace
import pandas as pd

#read test data from csv file and display top 3 rows
df1 = pd.read_csv('data_deposits.csv')
print(df1.head(3))

#list of columns to remove
cols_to_drop = ['city', 'deposit']

#drop columns inplace; drop() returns None here and modifies df1 directly
df1.drop(columns=cols_to_drop, inplace=True)
print(df1.head(3))
--[df1 before column removal]----------------
  firstname lastname    city  age  deposit
0    Herman  Sanchez   Miami   52     9300
1      Phil   Parker   Miami   45     5010
2    Bradie  Garnett  Denver   36     6300
--[df1 post column removal]------------------
  firstname lastname  age
0    Herman  Sanchez   52
1      Phil   Parker   45
2    Bradie  Garnett   36
---------------------------------------------------
The result is the same; we just do not need a new variable. The first example could also have assigned the result of drop() back to df1. In that case the extra memory consumption would be a temporary, transient phenomenon, because the original dataframe is no longer referenced once the name is rebound.
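The reassignment variant described above might look like the following sketch. It assumes the same three sample rows, built inline here so the snippet runs without data_deposits.csv.

```python
import pandas as pd

# Sample rows copied from the earlier output (inline stand-in for the CSV file)
df1 = pd.DataFrame({
    "firstname": ["Herman", "Phil", "Bradie"],
    "lastname": ["Sanchez", "Parker", "Garnett"],
    "city": ["Miami", "Miami", "Denver"],
    "age": [52, 45, 36],
    "deposit": [9300, 5010, 6300],
})

#list of columns to remove
cols_to_drop = ["city", "deposit"]

# Rebind the name df1 to the reduced dataframe; the original object is
# no longer referenced by df1 and can be reclaimed by the garbage collector
df1 = df1.drop(columns=cols_to_drop)
print(df1.head(3))
```

This keeps a single variable name, like the inplace version, while using the default non-inplace form of drop().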