python - pandas: Filling missing values within a group
I have data from an experiment, and within each trial there are single values, surrounded by NAs, that I want to fill out across the entire trial:
    import numpy as np
    import pandas as pd

    df = pd.DataFrame({'trial': [1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3],
                       'cs_name': [np.nan, 'a1', np.nan, np.nan,
                                   np.nan, np.nan, 'b2', np.nan,
                                   'a1', np.nan, np.nan, np.nan]})

    Out[177]:
       cs_name  trial
    0      NaN      1
    1       a1      1
    2      NaN      1
    3      NaN      1
    4      NaN      2
    5      NaN      2
    6       b2      2
    7      NaN      2
    8       a1      3
    9      NaN      3
    10     NaN      3
    11     NaN      3
I'm able to fill these values within the whole trial by using both bfill() and ffill(), but I'm wondering if there is a better way to achieve this.
    df['cs_name'] = df.groupby('trial')['cs_name'].ffill()
    df['cs_name'] = df.groupby('trial')['cs_name'].bfill()
Expected output:

       cs_name  trial
    0       a1      1
    1       a1      1
    2       a1      1
    3       a1      1
    4       b2      2
    5       b2      2
    6       b2      2
    7       b2      2
    8       a1      3
    9       a1      3
    10      a1      3
    11      a1      3
An alternative approach is to use first_valid_index together with transform:
    In [11]: g = df.groupby('trial')

    In [12]: g['cs_name'].transform(lambda s: s.loc[s.first_valid_index()])
    Out[12]:
    0     a1
    1     a1
    2     a1
    3     a1
    4     b2
    5     b2
    6     b2
    7     b2
    8     a1
    9     a1
    10    a1
    11    a1
    Name: cs_name, dtype: object
This ought to be more efficient than using ffill followed by bfill...
You can then use this to overwrite the cs_name column:

    df['cs_name'] = g['cs_name'].transform(lambda s: s.loc[s.first_valid_index()])
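
One caveat with the lambda above: first_valid_index() returns None when a trial has no non-null value at all, and the s.loc[...] lookup is then expected to fail. Here is a minimal sketch of a guarded variant, assuming such all-NaN trials should simply stay NaN. The helper name fill_with_first_valid and the extra all-NaN trial 4 are made up for illustration and are not part of the original data:

```python
import numpy as np
import pandas as pd

# Small frame with one trial (4) that contains no valid value at all.
df = pd.DataFrame({
    'trial': [1, 1, 1, 2, 2, 2, 4, 4],
    'cs_name': [np.nan, 'a1', np.nan, np.nan, np.nan, 'b2', np.nan, np.nan],
})

def fill_with_first_valid(s):
    """Broadcast the first non-null value of s; leave all-NaN groups untouched."""
    idx = s.first_valid_index()
    if idx is None:
        # Nothing to fill from in this group, so return it unchanged.
        return s
    return pd.Series(s.loc[idx], index=s.index)

filled = df.groupby('trial')['cs_name'].transform(fill_with_first_valid)
df['cs_name'] = filled
```

Whether an empty trial should stay NaN or raise is a design choice. Raising might be preferable if every trial is supposed to carry exactly one label.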
Note: I think it would be a nice enhancement to have a method that grabs the first non-null object in pandas. In numpy it's an open request, and I don't think there is such a method at the moment (I could be wrong!)...
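
For the grouped case specifically, the built-in groupby aggregation first may already cover this, since it skips missing values by default. This is a minimal sketch, assuming a pandas version where that default holds. It is a groupby aggregation, not the general-purpose "first non-null" method wished for in the note:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'trial': [1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3],
    'cs_name': [np.nan, 'a1', np.nan, np.nan,
                np.nan, np.nan, 'b2', np.nan,
                'a1', np.nan, np.nan, np.nan],
})

# 'first' picks the first non-null entry per trial; transform broadcasts
# that single value back onto every row of the trial.
df['cs_name'] = df.groupby('trial')['cs_name'].transform('first')
```

Passing the aggregation by name avoids calling a Python lambda once per group, which may matter when there are many trials.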