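pd.concat([dataFrame1, dataFrame2, …], ignore_index=False)
Parameter description: ignore_index=False keeps each frame's original index labels, so the index of the result is not renumbered; ignore_index=True discards them and renumbers the result 0, 1, 2, … so the index is continuous.
Example:
----------------df--------------------------
   A  B
0  7  8
1  7  3
2  5  9
3  4  0
4  1  8
----------------df1-------------------------
   A  B
0  7  8
1  7  3
3  4  0    # --> index 2 does not appear here; the index is not continuous
4  1  8
----------------df2-------------------------
   A  B
0  7  8
1  7  3
2  4  0
3  1  8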
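The tables above can be reproduced with a short sketch. It assumes that df1 and df2 come from concatenating two slices of df with row 2 left out; the names top and bottom are illustrative and not from the original.

```python
import pandas as pd

df = pd.DataFrame({"A": [7, 7, 5, 4, 1], "B": [8, 3, 9, 0, 8]})

# Assumption: the two pieces are the rows before and after index 2.
top = df.iloc[:2]
bottom = df.iloc[3:]

# ignore_index=False keeps the original labels, leaving a gap at 2.
df1 = pd.concat([top, bottom], ignore_index=False)

# ignore_index=True renumbers the rows 0..n-1.
df2 = pd.concat([top, bottom], ignore_index=True)

print(df1)
print(df2)
```

The data is identical in both results; only the index labels differ.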
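df.append(df1, ignore_index=True)
Parameter description: ignore_index=False keeps the original index labels (the index is not continued); ignore_index=True renumbers the rows so the index continues. Note that DataFrame.append was deprecated in pandas 1.4 and removed in pandas 2.0; pd.concat is its replacement.
----------------df--------------------------
   A  B
0  5  6
1  1  2
2  5  3
3  1  8
4  1  2
----------------df1-------------------------
   A  B
0  5  6
1  1  2
2  5  3
3  1  8
4  1  2
5  8  1
6  3  5
7  1  1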
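Because DataFrame.append is no longer available in pandas 2.0 and later, the same result can be sketched with pd.concat. The frame named extra is hypothetical; it holds the three rows that appear only in the second table.

```python
import pandas as pd

df = pd.DataFrame({"A": [5, 1, 5, 1, 1], "B": [6, 2, 3, 8, 2]})

# Hypothetical frame holding the rows to be appended.
extra = pd.DataFrame({"A": [8, 3, 1], "B": [1, 5, 1]})

# Equivalent of the old df.append(extra, ignore_index=True).
result = pd.concat([df, extra], ignore_index=True)

print(result)
```

With ignore_index=True the appended rows receive the labels 5, 6 and 7 instead of restarting at 0.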
Merging different columns of the same data
Parameters:
pd.merge(left, right, how="inner", on=None, left_on=None, right_on=None, left_index=False, right_index=False, sort=False, suffixes=("_x", "_y"), copy=True, indicator=False, validate=None)
Parameter description:
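A minimal sketch of a few of these parameters in use. The frames and column names (lkey, rkey, value) are made up for illustration and do not come from the original.

```python
import pandas as pd

# Hypothetical frames whose key columns have different names.
left = pd.DataFrame({"lkey": ["a", "b", "c"], "value": [1, 2, 3]})
right = pd.DataFrame({"rkey": ["a", "b", "d"], "value": [10, 20, 40]})

merged = pd.merge(
    left,
    right,
    how="left",                       # keep every row of `left`
    left_on="lkey",                   # key column in `left`
    right_on="rkey",                  # key column in `right`
    suffixes=("_left", "_right"),     # disambiguate the shared `value` column
    indicator=True,                   # add a `_merge` column with the row's origin
)

print(merged)
```

The _merge column reports both for keys found in the two frames and left_only for keys that exist only in left.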
Example 1:
----------------df1-------------------------
  key  data1
0   a      0
1   b      1
2   c      2
----------------df2-------------------------
  key  data2
0   a      0
1   b      1
2   c      2
----------------df--------------------------
  key  data1  data2
0   a      0      0
1   b      1      1
2   c      2      2
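Example 1 can be sketched as a default (inner) merge on the shared key column. The call is an assumption about how the result table was produced.

```python
import pandas as pd

df1 = pd.DataFrame({"key": ["a", "b", "c"], "data1": [0, 1, 2]})
df2 = pd.DataFrame({"key": ["a", "b", "c"], "data2": [0, 1, 2]})

# Inner join on the shared "key" column (how="inner" is the default).
df = pd.merge(df1, df2, on="key")

print(df)
```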
Example 2:
----------------right-----------------------
  key1 key2  lval
0  foo  one     4
1  foo  one     5
2  bar  one     6
3  bar  two     7
----------------left------------------------
  key1 key2  lval
0  foo  one     1
1  foo  two     2
2  bar  one     3
----------------df--------------------------
  key1 key2  lval_x  lval_y
0  foo  one     1.0     4.0
1  foo  one     1.0     5.0
2  foo  two     2.0     NaN
3  bar  one     3.0     6.0
4  bar  two     NaN     7.0
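The result in Example 2 is consistent with an outer merge on both key columns, so the sketch below assumes how="outer" and on=["key1", "key2"]. The row order of an outer merge may differ between pandas versions, so the printed order may not match the table exactly.

```python
import pandas as pd

left = pd.DataFrame({
    "key1": ["foo", "foo", "bar"],
    "key2": ["one", "two", "one"],
    "lval": [1, 2, 3],
})
right = pd.DataFrame({
    "key1": ["foo", "foo", "bar", "bar"],
    "key2": ["one", "one", "one", "two"],
    "lval": [4, 5, 6, 7],
})

# Outer join: keep key pairs from both sides and fill the gaps with NaN.
# The overlapping "lval" column gets the default suffixes _x and _y.
df = pd.merge(left, right, on=["key1", "key2"], how="outer")

print(df)
```

Key pairs present on only one side, such as (foo, two) and (bar, two), appear with NaN in the column that comes from the other frame.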
Removing fully duplicated rows
data.drop_duplicates(inplace=True)
---------------df before deduplication------
     brand style  rating
0  Yum Yum   cup     4.0
1  Yum Yum   cup     4.0
2  Indomie   cup     3.5
3  Indomie  pack    15.0
4  Indomie  pack     5.0
---------------df after deduplication-------
     brand style  rating
0  Yum Yum   cup     4.0
2  Indomie   cup     3.5
3  Indomie  pack    15.0
4  Indomie  pack     5.0
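A sketch that rebuilds the sample frame and removes rows that are identical in every column. It uses the non-inplace form, which returns a new DataFrame; this choice is for illustration only.

```python
import pandas as pd

data = pd.DataFrame({
    "brand": ["Yum Yum", "Yum Yum", "Indomie", "Indomie", "Indomie"],
    "style": ["cup", "cup", "cup", "pack", "pack"],
    "rating": [4.0, 4.0, 3.5, 15.0, 5.0],
})

# Rows count as duplicates only if every column matches;
# the first occurrence is kept by default.
deduped = data.drop_duplicates()

print(deduped)
```

Row 1 is dropped because it repeats row 0 exactly. Rows 3 and 4 both stay because their ratings differ.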
Using subset to remove rows that are duplicated in certain columns
data.drop_duplicates(subset=['A', 'B'], keep='first', inplace=True)
     brand style  rating
0  Yum Yum   cup     4.0
2  Indomie   cup     3.5
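For the sample data, the table above matches deduplication on the brand column alone, so the sketch assumes subset=["brand"] in place of the placeholder columns 'A' and 'B'.

```python
import pandas as pd

data = pd.DataFrame({
    "brand": ["Yum Yum", "Yum Yum", "Indomie", "Indomie", "Indomie"],
    "style": ["cup", "cup", "cup", "pack", "pack"],
    "rating": [4.0, 4.0, 3.5, 15.0, 5.0],
})

# Only the "brand" column is compared; the first row of each brand is kept.
by_brand = data.drop_duplicates(subset=["brand"], keep="first")

print(by_brand)
```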
Using keep to drop duplicates and keep the last occurrence
     brand style  rating
1  Yum Yum   cup     4.0
2  Indomie   cup     3.5
4  Indomie  pack     5.0
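The table above matches deduplication on the brand and style columns with keep="last"; the subset in this sketch is inferred from the output and is not stated in the original.

```python
import pandas as pd

data = pd.DataFrame({
    "brand": ["Yum Yum", "Yum Yum", "Indomie", "Indomie", "Indomie"],
    "style": ["cup", "cup", "cup", "pack", "pack"],
    "rating": [4.0, 4.0, 3.5, 15.0, 5.0],
})

# Compare on (brand, style) and keep the last row of each group.
last_kept = data.drop_duplicates(subset=["brand", "style"], keep="last")

print(last_kept)
```

Passing keep=False instead would drop every row that has a duplicate, keeping only rows whose (brand, style) pair is unique.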