resample before pct_change() and missing values


























































I have a dataframe:



import pandas as pd
df = pd.DataFrame([['A', 'G1', '2019-01-01', 11],
                   ['A', 'G1', '2019-01-02', 12],
                   ['A', 'G1', '2019-01-04', 14],
                   ['B', 'G2', '2019-01-01', 11],
                   ['B', 'G2', '2019-01-03', 13],
                   ['B', 'G2', '2019-01-06', 16]],
                  columns=['cust', 'group', 'date', 'val'])
df





df = df.groupby(['cust', 'group', 'date']).sum()
df





The dataframe is grouped, and now I would like to calculate pct_change, but only where a row for the previous date exists.
If I do it like this:



df['pct'] = df.groupby(['cust', 'group']).val.pct_change()
df





I will get pct_change, but it does not take the missing dates into account.
For example, in group ('A', 'G1'), pct for date 2019-01-04 should be np.nan because there is no row for the previous date, 2019-01-03.



Maybe the solution would be to resample by day, so that each new row has np.nan as val, and then do pct_change.
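To illustrate that idea on a single group, here is a minimal sketch. The Series `s` is built by hand from the ('A', 'G1') rows, and the names `naive`, `daily` and `gap_aware` are only illustrative. It assumes the dates have been converted to real datetimes.

```python
import pandas as pd

# Values of group ('A', 'G1'); 2019-01-03 is missing on purpose.
s = pd.Series(
    [11, 12, 14],
    index=pd.to_datetime(['2019-01-01', '2019-01-02', '2019-01-04']),
    name='val',
)

# Plain pct_change compares each row with the row before it,
# regardless of how many days lie between them.
naive = s.pct_change()

# asfreq('D') inserts a NaN row for every missing day; with
# fill_method=None the NaN is not filled, so the change on the
# day after a gap stays NaN as well.
daily = s.asfreq('D')
gap_aware = daily.pct_change(fill_method=None)
```

Here `naive` has a value for 2019-01-04 (computed against 2019-01-02), while `gap_aware` is NaN on both 2019-01-03 and 2019-01-04.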



I tried to use df.resample('1D', level=2), but then I get an error:




TypeError: Only valid with DatetimeIndex, TimedeltaIndex or PeriodIndex, but got an instance of 'MultiIndex'




For group ('B', 'G2') all pct_change values should be np.nan because none of the rows has a previous date.

Expected result is:

[image: expected result]

How can I calculate pct_change while respecting missing dates?



Solution:



# Start from the flat frame with cust, group, date and val as columns
# (after the groupby above, call df = df.reset_index() first).
df['date'] = pd.to_datetime(df['date'])  # resample needs real datetimes

frames = []
for x, y in df.groupby(['cust', 'group']):
    by_date = y.set_index('date')
    resampled = by_date.resample('D').val.mean().to_frame().rename({'val': 'resamp_val'}, axis=1)
    resampled = resampled.join(by_date).fillna({'cust': x[0], 'group': x[1]})
    resampled['resamp_val_pct'] = resampled.resamp_val.pct_change(fill_method=None)
    frames.append(resampled)

new_df = pd.concat(frames)
new_df = new_df[['cust', 'group', 'val', 'resamp_val', 'resamp_val_pct']]
new_df









python pandas resampling














asked Jan 18 at 14:21 by user3225309 (edited yesterday)













  • What is expected output?

    – jezrael
    Jan 18 at 14:35











  • I just provided expected result.

    – user3225309
    Jan 18 at 14:52































2 Answers




































































































Use groupby, resample each group first, and then compute the pct change with a Boolean mask, since pct_change (with its default forward fill) skips over NaN:

# Assumes the flat df (cust, group, date, val as columns) with date as datetime64.
d = {}
for x, y in df.groupby(['cust', 'group']):
    s = y.set_index('date').resample('D').val.mean()
    # Blank out the change wherever the current or the previous day is missing.
    d[x] = pd.concat([s, s.pct_change().mask(s.shift().isnull() | s.isnull())], axis=1)
newdf = pd.concat(d)
newdf.columns = ['val', 'pct']
newdf
Out[651]:
                  val       pct
     date
A G1 2019-01-01  11.0       NaN
     2019-01-02  12.0  0.090909
     2019-01-03   NaN       NaN
     2019-01-04  14.0       NaN
B G2 2019-01-01  11.0       NaN
     2019-01-02   NaN       NaN
     2019-01-03  13.0       NaN
     2019-01-04   NaN       NaN
     2019-01-05   NaN       NaN
     2019-01-06  16.0       NaN


You can add reset_index(inplace=True) at the end to turn all index levels back into columns.
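A loop-free variant of the same idea could look like the sketch below. It is an assumption-laden illustration, not the answer's own code: it starts from the flat frame in the question, converts `date` with `pd.to_datetime`, and uses the made-up names `daily`, `prev` and `out`. The percent change is computed by hand as value / previous value - 1, so no fill method is involved and any NaN on either side simply propagates.

```python
import pandas as pd

df = pd.DataFrame([['A', 'G1', '2019-01-01', 11],
                   ['A', 'G1', '2019-01-02', 12],
                   ['A', 'G1', '2019-01-04', 14],
                   ['B', 'G2', '2019-01-01', 11],
                   ['B', 'G2', '2019-01-03', 13],
                   ['B', 'G2', '2019-01-06', 16]],
                  columns=['cust', 'group', 'date', 'val'])
df['date'] = pd.to_datetime(df['date'])  # resample needs real datetimes

# Resample every (cust, group) to daily frequency; days without data become NaN.
daily = (
    df.set_index('date')
      .groupby(['cust', 'group'])['val']
      .resample('D')
      .mean()
)

# Previous day's value within the same group (NaN on each group's first day).
prev = daily.groupby(level=['cust', 'group']).shift()

# NaN in either the current or the previous day yields NaN in pct.
out = pd.DataFrame({'val': daily, 'pct': daily / prev - 1}).reset_index()
```

Because the index levels produced by groupby and resample are already named cust, group and date, the final reset_index() returns them as ordinary columns.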

















answered Jan 18 at 15:10 by W-B













  • First I read the answer from AI_Learning in which I asked for resampling by group, the solution you have provided. I modified your example a bit, I will edit my question in order to present the solution.

    – user3225309
    yesterday













































































Maybe you could check whether the difference between consecutive rows is not equal to 1 day and, if so, set the pct_change to NaN.

# Assumes the flat df (cust, group, date, val as columns) with date as datetime64.
# Note: aggregating the grouping key 'date' may raise a KeyError on newer pandas versions.
df = (df.groupby(['cust', 'group', 'date'])
        .agg({'val': 'sum', 'date': ['min', 'max']})
        .reset_index())
# Flatten the two-level column names, e.g. ('val', 'sum') -> 'val_sum'.
df.columns = ['%s%s' % (a, '_%s' % b if b else '') for a, b in df.columns]

df['date_diff'] = df['date'].diff()
df['pct_change_val'] = df.val_sum.pct_change()
# Keep the change only where the previous row is exactly one day earlier.
df['pct_change_final'] = df['pct_change_val'].where(df['date_diff'] == pd.Timedelta(days=1))


#output:

   cust  group  date        date_min    date_max    val_sum  date_diff                    pct_change_val       pct_change_final
0  A     G1     2019-01-01  2019-01-01  2019-01-01  11
1  A     G1     2019-01-02  2019-01-02  2019-01-02  12       1 days 00:00:00.000000000    0.09090909090909083  0.09090909090909083
2  A     G1     2019-01-04  2019-01-04  2019-01-04  14       2 days 00:00:00.000000000    0.16666666666666674
3  B     G2     2019-01-01  2019-01-01  2019-01-01  11       -3 days +00:00:00.000000000  -0.2142857142857143
4  B     G2     2019-01-03  2019-01-03  2019-01-03  13       2 days 00:00:00.000000000    0.18181818181818188
5  B     G2     2019-01-06  2019-01-06  2019-01-06  16       3 days 00:00:00.000000000    0.23076923076923084
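In the output above, date_diff and pct_change_val are computed across the whole frame, so the first row of group B is compared with the last row of group A (the -3 days entry). A per-group variant could be sketched as follows. This is only an illustration under the assumption that `date` has been converted with `pd.to_datetime`; the names `keys`, `gap`, `prev` and `pct` are made up for the example.

```python
import pandas as pd

df = pd.DataFrame([['A', 'G1', '2019-01-01', 11],
                   ['A', 'G1', '2019-01-02', 12],
                   ['A', 'G1', '2019-01-04', 14],
                   ['B', 'G2', '2019-01-01', 11],
                   ['B', 'G2', '2019-01-03', 13],
                   ['B', 'G2', '2019-01-06', 16]],
                  columns=['cust', 'group', 'date', 'val'])
df['date'] = pd.to_datetime(df['date'])
df = df.sort_values(['cust', 'group', 'date'])

keys = ['cust', 'group']
gap = df.groupby(keys)['date'].diff()   # NaT on each group's first row
prev = df.groupby(keys)['val'].shift()  # previous value within the group

# Keep the change only where the previous row of the same group
# is exactly one day earlier; everything else becomes NaN.
df['pct'] = (df['val'] / prev - 1).where(gap == pd.Timedelta(days=1))
```

This keeps the original rows (no resampling), so it fits when only the pct column is needed and the missing days do not have to appear in the result.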














answered Jan 18 at 15:30 by AI_Learning (edited yesterday)













  • This works. Thanks. I got an idea for another approach. Would it be possible to find the min/max dates for each group and then resample by day? Afterwards, I could use pct_change. For example, if for group X the min date is 2019-01-01 and the max is 2019-01-05, I could resample that group, and then do the same for the rest of the groups. That way I will have a dataframe in the proper format for pct_change (and some other operations).

    – user3225309
    yesterday













  • I have updated the solution. hope it helps.

    – AI_Learning
    yesterday
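The min/max idea from the comment above can be sketched roughly like this. It is a hedged illustration rather than the code either poster used: it assumes the flat frame from the question with `date` converted by `pd.to_datetime`, and the names `parts`, `days`, `daily` and `pct` are invented for the example.

```python
import pandas as pd

df = pd.DataFrame([['A', 'G1', '2019-01-01', 11],
                   ['A', 'G1', '2019-01-02', 12],
                   ['A', 'G1', '2019-01-04', 14],
                   ['B', 'G2', '2019-01-01', 11],
                   ['B', 'G2', '2019-01-03', 13],
                   ['B', 'G2', '2019-01-06', 16]],
                  columns=['cust', 'group', 'date', 'val'])
df['date'] = pd.to_datetime(df['date'])

parts = {}
for key, g in df.groupby(['cust', 'group']):
    s = g.set_index('date')['val']
    # Every calendar day between this group's first and last date.
    days = pd.date_range(s.index.min(), s.index.max(), freq='D')
    # Days that are not in the data get NaN.
    parts[key] = s.reindex(days)

# Stack the groups and name the three index levels.
daily = pd.concat(parts).rename_axis(['cust', 'group', 'date'])

# Change versus the previous day within each group; gaps propagate as NaN.
pct = daily / daily.groupby(level=['cust', 'group']).shift() - 1
```

The resulting `daily` Series has one row per group and calendar day, which is the regular shape that pct_change and similar row-to-row operations expect.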




































