real panda
June 7, 2019 * Python Programming

Pandas - Search and replace values in columns

Conditional replacement of values along a certain dataframe column is very useful. Example can be to remove typographical errors from a string that is scattered amongst rows.

Match a list and replace

For a single column we want to replace all values that match elements in a list, with a single replacement value. Specifically we will replace the city name with Houston, if the current records are either Denver or Seattle.

Find and replace values in dataframe column

import pandas as pd

#load selected data
df1 = pd.read_csv(
  'data_deposits.txt', 
  usecols = [
    'lastname',
    'firstname',
    'city'
  ]
)
print(df1.head(8))

#replace matching values
df1 = df1.replace(
    to_replace={'city':['Seattle','Denver']}, 
    value='Houston'
)
print(df1.head(8))
--[df1 original]---------------
  firstname lastname     city
0    Herman  Sanchez    Miami
1      Phil   Parker    Miami
2    Bradie  Garnett   Denver
3    Rudolf   Crooks    Miami
4    Marcos   Miller  Seattle
5      Chad  Garnett    Miami
6     Sally    Evans   Denver
7      Chad   Parker  Seattle

--[df1 replaced]---------------    
  firstname lastname     city
0    Herman  Sanchez    Miami
1      Phil   Parker    Miami
2    Bradie  Garnett  Houston
3    Rudolf   Crooks    Miami
4    Marcos   Miller  Houston
5      Chad  Garnett    Miami
6     Sally    Evans  Houston
7      Chad   Parker  Houston

-------------------------------

In the above tutorial, we have only loaded part of the column data. To learn more check out the selective data loading tricks.

The replacement of city names was a success, with all 'Denver' and 'Seattle' entries been replaced with 'Houston'. Replace only works for a single column, where we cannot impose filter conditions on different columns. For such capability, look at merge, split and combine examples which are generalized methods of manipulating data.

References