May 1, 2019 * Python Programming

Pandas - Read csv text files into Dataframe

Text files are a simple way to store and share data, although they are less compact and slower to parse than binary formats. The pandas library provides a powerful interface for reading a delimited data file into a DataFrame.

Easy data loading with read_csv() using minimal options

Let us use the function read_csv() with minimal parameters to load and view a csv file. Only the filename, 'data_deposits.csv', is required.

It is assumed that the csv file is well behaved:

  • csv file is text, delimited by comma
  • each row starts on a new line
  • top row is header, translated to column names
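As a rough illustration of what "well behaved" means here, the sketch below parses a small in-memory sample instead of a file on disk. The sample text is an assumption built from a few rows that appear in the output later in this lesson; it is not the full data_deposits.csv, and using io.StringIO as a stand-in for a file is only a convenience.

```python
import io
import pandas as pd

# Hypothetical sample: comma delimited, one row per line, header on top.
csv_text = (
    "firstname,lastname,city,age,deposit\n"
    "Herman,Sanchez,Miami,52,9300\n"
    "Phil,Parker,Miami,45,5010\n"
    "Bradie,Garnett,Denver,36,6300\n"
)

# read_csv accepts a file-like object as well as a filename
df = pd.read_csv(io.StringIO(csv_text))

# the header row becomes the column names
print(df.columns.tolist())

# three data rows, five columns
print(df.shape)
```

If a file breaks any of the three assumptions, the default call may still run but produce a single wide column or wrong column names, which is usually the first sign that extra options are needed.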

Copy the Python code below into loadcsv.py and place the csv data file in the same folder. Run the script with Python. It loads the file, prints the top 3 and bottom 3 rows for all columns, and then prints the column data types and the shape of the DataFrame.

Compact read_csv function with filename

#use the pandas package
import pandas as pd

#basic csv import
df = pd.read_csv(
  'data_deposits.csv'
)

#show top & bottom 3 rows
print(df.head(3))
print(df.tail(3))

#column data types
print(df.dtypes)

#dataframe (rows, cols)
print(df.shape)

Output for code:
--[ df head 3 ]-----------------------------
  firstname lastname    city  age  deposit
0    Herman  Sanchez   Miami   52     9300
1      Phil   Parker   Miami   45     5010
2    Bradie  Garnett  Denver   36     6300

--[ df tail 3 ]-----------------------------
  firstname lastname     city  age  deposit
5      Chad  Garnett    Miami   38     7420
6     Sally    Evans   Denver   25     3170
7      Chad   Parker  Seattle   55    12600

--[ df dtype ]-------------------------------
firstname    object
lastname     object
city         object
age           int64
deposit       int64
dtype: object

--[ df rows,cols ]---------------------------
(8, 5)
---------------------------------------------

The file is effectively loaded with just a single line of code. Data and headers are auto parsed. It is also easy to inspect the contents and structure.
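Beyond head(), tail(), dtypes and shape, a couple of other standard DataFrame calls can help with a first inspection. The sketch below is only an illustration; it uses a small in-memory sample (an assumption, built from three rows of the output above) rather than the real file.

```python
import io
import pandas as pd

# Hypothetical three-row sample taken from the output shown above
csv_text = (
    "firstname,lastname,city,age,deposit\n"
    "Herman,Sanchez,Miami,52,9300\n"
    "Phil,Parker,Miami,45,5010\n"
    "Bradie,Garnett,Denver,36,6300\n"
)
df = pd.read_csv(io.StringIO(csv_text))

# summary statistics for the numeric columns (age, deposit)
summary = df.describe()
print(summary)

# how many rows fall in each city
city_counts = df["city"].value_counts()
print(city_counts)

# average deposit over the sample rows
mean_deposit = df["deposit"].mean()
print(mean_deposit)
```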

Although most applications require only the knowledge above, there is much more to importing data with read_csv(). Many delimited files use alternate separators such as a space, tab or semicolon. They can also contain comment lines, and text inside single or double quotes. read_csv() provides comprehensive options to handle many such situations.
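As a brief preview, the sketch below handles a semicolon-separated sample with a comment line and a single-quoted field, using the sep, comment and quotechar parameters of read_csv(). The sample text, including the 'Miami; FL' value, is hypothetical and made up only to exercise these options.

```python
import io
import pandas as pd

# Hypothetical sample: semicolon separator, a '#' comment line,
# and a single-quoted field that itself contains a semicolon.
raw = (
    "# deposits export\n"
    "firstname;lastname;city;age;deposit\n"
    "Herman;Sanchez;'Miami; FL';52;9300\n"
    "Phil;Parker;Miami;45;5010\n"
)

df = pd.read_csv(
    io.StringIO(raw),
    sep=";",        # fields are separated by semicolons
    comment="#",    # skip anything after a '#'
    quotechar="'",  # text inside single quotes is one field
)

print(df)
print(df.shape)
```

For a tab-delimited file, the same call with sep="\t" would be the usual starting point.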

Check out the following lessons for more in-depth information on how to customize read_csv().
