Top 25 pandas tricks.ipynb¶

url: https://nbviewer.org/github/justmarkham/pandas-videos/blob/master/top_25_pandas_tricks.ipynb

In [1]:

import pandas as pd
import numpy as np

In [2]:

drinks = pd.read_csv('http://bit.ly/drinksbycountry')
movies = pd.read_csv('http://bit.ly/imdbratings')
orders = pd.read_csv('http://bit.ly/chiporders', sep='\t')
orders['item_price'] = orders.item_price.str.replace('$', '', regex=False).astype('float')
stocks = pd.read_csv('http://bit.ly/smallstocks', parse_dates=['Date'])
titanic = pd.read_csv('http://bit.ly/kaggletrain')
ufo = pd.read_csv('http://bit.ly/uforeports', parse_dates=['Time'])

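The `item_price` cleanup above removes a literal dollar sign. Since `$` is also a regex anchor (end of string), it may be safer to state the intent explicitly. The following is a minimal sketch on made-up prices (the `prices` Series is hypothetical, not the real `orders` data):

```python
import pandas as pd

# Hypothetical sample values standing in for orders.item_price
prices = pd.Series(['$2.39', '$16.98'])

# Treat '$' as a plain character, not as a regex anchor
literal = prices.str.replace('$', '', regex=False).astype('float')

# Equivalent regex form: escape the dollar sign
escaped = prices.str.replace(r'\$', '', regex=True).astype('float')
```

Both forms should give the same float Series; `regex=False` is arguably the easier one to read.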
3. Renaming columns¶

In [3]:

# Create a DataFrame
df = pd.DataFrame({'col one':[100, 200], 'col two':[300, 400]})
df

Out[3]:

	col one	col two
0	100	300
1	200	400

rename(): rename columns using a function or a dict¶

Parameters¶

mapper: a function or dict applied to the axis selected by axis, used to rename either column names or index labels.
columns: a function or dict used to rename columns.
index: a function or dict used to rename index labels.
axis: 0 or 'index', 1 or 'columns'; sets which axis mapper is applied to.

In [4]:

df.rename(mapper={'col one': 'col_one', 'col two': 'col_two'}, axis='columns')

Out[4]:

	col_one	col_two
0	100	300
1	200	400

In [5]:

df.rename(columns={'col one': 'col_one', 'col two': 'col_two'})

Out[5]:

	col_one	col_two
0	100	300
1	200	400

In [6]:

df.rename(mapper={0: 1, 1: 2}, axis='index')

Out[6]:

	col one	col two
1	100	300
2	200	400

In [7]:

df.rename(index={0: 1, 1: 2})

Out[7]:

	col one	col two
1	100	300
2	200	400
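The rename() cells above all pass a dict. The parameter list also allows a function, which is called once per label. A minimal sketch under that assumption, reusing the same two-column frame (the variable names `renamed`, `upper`, and `shifted` are made up for illustration):

```python
import pandas as pd

df = pd.DataFrame({'col one': [100, 200], 'col two': [300, 400]})

# A lambda is applied to every column name
renamed = df.rename(columns=lambda name: name.replace(' ', '_'))

# Any callable taking one label works, e.g. the built-in str.upper
upper = df.rename(columns=str.upper)

# The same idea applies to index labels
shifted = df.rename(index=lambda i: i + 1)

# rename() returns a new DataFrame; df itself keeps its original labels
original_columns = list(df.columns)
```

A function tends to be handier than a dict when every label follows the same rule, since there is no need to list each old name.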

Using pandas.DataFrame.columns¶

In [8]:

df = pd.DataFrame({'col one':[100, 200], 'col two':[300, 400]})
df

Out[8]:

	col one	col two
0	100	300
1	200	400

In [9]:

# Assign the two new column names directly
df.columns = ['col_one', 'col_two']
df

Out[9]:

	col_one	col_two
0	100	300
1	200	400

In [10]:

df = pd.DataFrame({'col one':[100, 200], 'col two':[300, 400]})
df

Out[10]:

	col one	col two
0	100	300
1	200	400

In [11]:

df.columns = df.columns.str.replace(' ', '_')
df

Out[11]:

	col_one	col_two
0	100	300
1	200	400
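Because `df.columns.str` exposes the usual string methods, several cleanup steps can be chained in one assignment. A short sketch, assuming a hypothetical frame whose headers have stray whitespace and mixed case:

```python
import pandas as pd

# Hypothetical messy headers: padding spaces and capital letters
messy = pd.DataFrame({' Col One ': [100, 200], 'Col Two': [300, 400]})

# Trim, lowercase, then replace inner spaces with underscores
messy.columns = (
    messy.columns
    .str.strip()
    .str.lower()
    .str.replace(' ', '_', regex=False)
)
```

Unlike rename(), assigning to `columns` changes the DataFrame in place.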

pandas.DataFrame.add_prefix: adds a prefix to every column name¶

Parameters¶

prefix: the prefix to add, given as a str

In [12]:

df.add_prefix(prefix='X_')

Out[12]:

	X_col_one	X_col_two
0	100	300
1	200	400

pandas.DataFrame.add_suffix: adds a suffix to every column name¶

Parameters¶

suffix: the suffix to add, given as a str

In [13]:

df.add_suffix(suffix='_Y')

Out[13]:

	col_one_Y	col_two_Y
0	100	300
1	200	400
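One place where add_prefix and add_suffix can help is keeping column names distinct when two frames with the same headers are placed side by side. The sketch below assumes two hypothetical frames, `train` and `test`, invented for illustration:

```python
import pandas as pd

# Two hypothetical frames that share the column name 'score'
train = pd.DataFrame({'score': [1, 2]})
test = pd.DataFrame({'score': [3, 4]})

# Prefix each frame's columns so they stay distinguishable after concat
combined = pd.concat(
    [train.add_prefix('train_'), test.add_prefix('test_')],
    axis=1,
)

# add_suffix works the same way and also returns a new DataFrame
suffixed = train.add_suffix('_Y')
```

Both methods leave the original frame untouched, so `train` still has a plain `score` column afterwards.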

4. Reversing row order¶

In [14]:

drinks.head()

Out[14]:

	country	beer_servings	spirit_servings	wine_servings	total_litres_of_pure_alcohol	continent
0	Afghanistan	0	0	0	0.0	Asia
1	Albania	89	132	54	4.9	Europe
2	Algeria	25	0	14	0.7	Africa
3	Andorra	245	138	312	12.4	Europe
4	Angola	217	57	45	5.9	Africa

In [15]:

# loc[::-1] reverses the row order
drinks.loc[::-1].head()

Out[15]:

	country	beer_servings	spirit_servings	wine_servings	total_litres_of_pure_alcohol	continent
192	Zimbabwe	64	18	4	4.7	Africa
191	Zambia	32	19	4	2.5	Africa
190	Yemen	6	0	0	0.1	Asia
189	Vietnam	111	2	1	2.0	Asia
188	Venezuela	333	100	3	7.7	South America

In [16]:

# reset_index(drop=True) also renumbers the index so that it starts at 0
drinks.loc[::-1].reset_index(drop=True).head()

Out[16]:

	country	beer_servings	spirit_servings	wine_servings	total_litres_of_pure_alcohol	continent
0	Zimbabwe	64	18	4	4.7	Africa
1	Zambia	32	19	4	2.5	Africa
2	Yemen	6	0	0	0.1	Asia
3	Vietnam	111	2	1	2.0	Asia
4	Venezuela	333	100	3	7.7	South America
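The same reversal can be tried on a tiny frame without downloading the drinks data. This sketch uses a made-up three-row frame and also shows what happens when `drop=True` is left out:

```python
import pandas as pd

# Hypothetical three-row frame standing in for drinks
small = pd.DataFrame({'country': ['A', 'B', 'C'], 'beer': [1, 2, 3]})

# Label-based and position-based slicing both reverse the rows
rev_loc = small.loc[::-1]
rev_iloc = small.iloc[::-1]

# drop=True discards the old index
renumbered = rev_loc.reset_index(drop=True)

# Without drop=True the old index is kept as a new column named 'index'
kept = rev_loc.reset_index()
```

With the default integer index, `loc[::-1]` and `iloc[::-1]` should give the same result; `iloc` is purely positional, so it does not depend on what the index labels are.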

5. Reversing column order¶

In [17]:

# loc[:, ::-1] reverses the column order in the same way
drinks.loc[:, ::-1].head()

Out[17]:

	continent	total_litres_of_pure_alcohol	wine_servings	spirit_servings	beer_servings	country
0	Asia	0.0	0	0	0	Afghanistan
1	Europe	4.9	54	132	89	Albania
2	Africa	0.7	14	0	25	Algeria
3	Europe	12.4	312	138	245	Andorra
4	Africa	5.9	45	57	217	Angola
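For comparison, here is a sketch of three ways to reverse columns on a hypothetical three-column frame. The `iloc` and column-list variants are alternatives that do not appear in the notebook itself:

```python
import pandas as pd

# Hypothetical frame with columns in the order a, b, c
wide = pd.DataFrame({'a': [1, 2], 'b': [3, 4], 'c': [5, 6]})

by_loc = wide.loc[:, ::-1]          # label-based slice, as in the notebook
by_iloc = wide.iloc[:, ::-1]        # position-based slice
by_names = wide[wide.columns[::-1]] # select by the reversed column index
```

All three should produce the columns in the order c, b, a while leaving the rows as they were.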

6. Selecting columns by data type¶

In [18]:

drinks.dtypes

Out[18]:

country                          object
beer_servings                     int64
spirit_servings                   int64
wine_servings                     int64
total_litres_of_pure_alcohol    float64
continent                        object
dtype: object

pandas.DataFrame.select_dtypes: returns only the columns whose dtype matches the dtypes you include or exclude¶

Parameters¶

include, exclude: the dtype or list of dtypes to include or exclude

In [19]:

drinks.dtypes

Out[19]:

country                          object
beer_servings                     int64
spirit_servings                   int64
wine_servings                     int64
total_litres_of_pure_alcohol    float64
continent                        object
dtype: object

In [20]:

# 'number' selects both float and int columns
drinks.select_dtypes(include='number').head()

Out[20]:

	beer_servings	spirit_servings	wine_servings	total_litres_of_pure_alcohol
0	0	0	0	0.0
1	89	132	54	4.9
2	25	0	14	0.7
3	245	138	312	12.4
4	217	57	45	5.9

In [21]:

# 'object' selects object columns (here, the string columns)
drinks.select_dtypes(include='object').head()

Out[21]:

	country	continent
0	Afghanistan	Asia
1	Albania	Europe
2	Algeria	Africa
3	Andorra	Europe
4	Angola	Africa

In [22]:

# Pass a list to select several dtypes at once
drinks.select_dtypes(include=['number', 'object', 'category', 'datetime']).head()

Out[22]:

	country	beer_servings	spirit_servings	wine_servings	total_litres_of_pure_alcohol	continent
0	Afghanistan	0	0	0	0.0	Asia
1	Albania	89	132	54	4.9	Europe
2	Algeria	25	0	14	0.7	Africa
3	Andorra	245	138	312	12.4	Europe
4	Angola	217	57	45	5.9	Africa

In [23]:

# exclude selects every column whose dtype is not numeric
drinks.select_dtypes(exclude='number').head()

Out[23]:

	country	continent
0	Afghanistan	Asia
1	Albania	Europe
2	Algeria	Africa
3	Andorra	Europe
4	Angola	Africa


희승이의 데이터 공부

Top 25 pandas tricks(1)
