Resample before pct_change() and missing values
I have a dataframe:

import pandas as pd

df = pd.DataFrame([['A', 'G1', '2019-01-01', 11],
                   ['A', 'G1', '2019-01-02', 12],
                   ['A', 'G1', '2019-01-04', 14],
                   ['B', 'G2', '2019-01-01', 11],
                   ['B', 'G2', '2019-01-03', 13],
                   ['B', 'G2', '2019-01-06', 16]],
                  columns=['cust', 'group', 'date', 'val'])
df

df = df.groupby(['cust', 'group', 'date']).sum()
df
The dataframe is grouped, and now I would like to calculate pct_change, but only if there is a previous date. If I do it like this:

df['pct'] = df.groupby(['cust', 'group']).val.pct_change()
df

I get the pct_change, but it takes no account of the missing dates. For example, in group ('A', 'G1') the pct for date 2019-01-04 should be np.nan, because there is no previous date 2019-01-03.
Maybe the solution would be to resample by day, so that each new row gets np.nan as val, and then do pct_change. I tried df.resample('1D', level=2), but then I get an error:

TypeError: Only valid with DatetimeIndex, TimedeltaIndex or PeriodIndex, but got an instance of 'MultiIndex'
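For reference, the error seems to come from the 'date' level holding plain strings rather than datetimes, so resample has nothing it can treat as a time axis. A minimal sketch (using the sample data above) that parses the dates and resamples each (cust, group) group instead:

```python
import pandas as pd

df = pd.DataFrame([['A', 'G1', '2019-01-01', 11],
                   ['A', 'G1', '2019-01-02', 12],
                   ['A', 'G1', '2019-01-04', 14],
                   ['B', 'G2', '2019-01-01', 11],
                   ['B', 'G2', '2019-01-03', 13],
                   ['B', 'G2', '2019-01-06', 16]],
                  columns=['cust', 'group', 'date', 'val'])

# resample() needs real datetimes; the sample data holds plain strings
df['date'] = pd.to_datetime(df['date'])

# resample per (cust, group): days missing inside a group come back as NaN rows
s = (df.set_index('date')
       .groupby(['cust', 'group'])['val']
       .resample('D')
       .mean())
```

Missing days inside each group come back as NaN rows, which is exactly the shape needed before pct_change.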
For group ('B', 'G2') all pct_change values should be np.nan, because none of the rows has a previous date.

Expected result is: (image omitted)

How can I calculate pct_change while respecting missing dates?
Solution:

# run on the flat frame (before the groupby([...]).sum() step),
# with the 'date' column parsed via pd.to_datetime
new_df = pd.DataFrame()
for x, y in df.groupby(['cust', 'group']):
    resampled = (y.set_index('date').resample('D')
                  .val.mean().to_frame()
                  .rename({'val': 'resamp_val'}, axis=1))
    resampled = resampled.join(y.set_index('date')).fillna({'cust': x[0], 'group': x[1]})
    resampled['resamp_val_pct'] = resampled.resamp_val.pct_change(fill_method=None)
    new_df = pd.concat([new_df, resampled])

new_df = new_df[['cust', 'group', 'val', 'resamp_val', 'resamp_val_pct']]
new_df
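A loop-free sketch of the same idea (my own variant, not from the original post): resample inside the groupby, then take pct_change with fill_method=None so that a change computed across a filled-in NaN day stays NaN:

```python
import pandas as pd

df = pd.DataFrame([['A', 'G1', '2019-01-01', 11],
                   ['A', 'G1', '2019-01-02', 12],
                   ['A', 'G1', '2019-01-04', 14],
                   ['B', 'G2', '2019-01-01', 11],
                   ['B', 'G2', '2019-01-03', 13],
                   ['B', 'G2', '2019-01-06', 16]],
                  columns=['cust', 'group', 'date', 'val'])
df['date'] = pd.to_datetime(df['date'])

# resample each (cust, group) to daily frequency; gaps become NaN rows
out = (df.set_index('date')
         .groupby(['cust', 'group'])['val']
         .resample('D')
         .mean()
         .to_frame('resamp_val'))

# fill_method=None stops pct_change from forward-filling the gaps,
# so any change that spans a missing day stays NaN
out['pct'] = (out.groupby(['cust', 'group'])['resamp_val']
                 .pct_change(fill_method=None))
```

For ('A', 'G1') only 2019-01-02 gets a value (1/11 ≈ 0.0909); every row of ('B', 'G2') stays NaN, matching the expected result.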
python pandas resampling
What is expected output? – jezrael, Jan 18 at 14:35
I just provided expected result. – user3225309, Jan 18 at 14:52
asked Jan 18 at 14:21 by user3225309 (edited yesterday)
2 Answers
Check with groupby: you need to resample first, then take the pct change with a Boolean mask, since pct_change on its own forward-fills over the NaN rows:
d = {}
for x, y in df.groupby(['cust', 'group']):
    # assumes 'date' is a datetime column on the flat (un-grouped) frame
    s = y.set_index('date').resample('D').val.mean()
    # a pct is only valid when both this day and the previous day have data
    d[x] = pd.concat([s, s.pct_change().mask(s.shift().isnull() | s.isnull())], axis=1)

newdf = pd.concat(d)
newdf.columns = ['val', 'pct']
newdf
Out[651]:
val pct
date
A G1 2019-01-01 11.0 NaN
2019-01-02 12.0 0.090909
2019-01-03 NaN NaN
2019-01-04 14.0 NaN
B G2 2019-01-01 11.0 NaN
2019-01-02 NaN NaN
2019-01-03 13.0 NaN
2019-01-04 NaN NaN
2019-01-05 NaN NaN
2019-01-06 16.0 NaN
You can add reset_index(inplace=True) at the end to turn the index levels back into columns.
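For instance, on a small stand-in for newdf (the rename_axis call and the level names are my addition, so the reset columns are not left as level_0/level_1):

```python
import pandas as pd

# stand-in for the per-group series produced by the loop above
s1 = pd.Series([11.0, 12.0], index=pd.to_datetime(['2019-01-01', '2019-01-02']))
s2 = pd.Series([11.0], index=pd.to_datetime(['2019-01-01']))
newdf = pd.concat({('A', 'G1'): s1, ('B', 'G2'): s2}).to_frame('val')

# name the three index levels before resetting, so they become proper columns
newdf = newdf.rename_axis(['cust', 'group', 'date']).reset_index()
```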
First I read the answer from AI_Learning, in which I asked about resampling by group, i.e. the solution you have provided. I modified your example a bit; I will edit my question to present the solution. – user3225309, yesterday
Maybe you could try checking whether the difference between consecutive rows is not equal to 1 day, and blank out the pct_change where it is not.
import numpy as np

# assumes 'date' is a datetime column on the flat frame
df = (df.groupby(['cust', 'group', 'date'])
        .agg({'val': 'sum', 'date': [min, max]})
        .reset_index())
# flatten the MultiIndex columns produced by the multi-function agg
df.columns = ['%s%s' % (a, '_%s' % b if b else '') for a, b in df.columns]
df['date_diff'] = df['date'].diff()
df['pct_change_val'] = df.val_sum.pct_change()
# keep the pct only when the previous row is exactly one day earlier
df['pct_change_final'] = df.apply(
    lambda row: np.NaN if pd.isnull(row.date_diff)
    else np.NaN if row.date_diff != np.timedelta64(1, 'D')
    else row.pct_change_val, axis=1)
#output:
  cust group       date   date_min   date_max  val_sum                    date_diff       pct_change_val     pct_change_final
0    A    G1 2019-01-01 2019-01-01 2019-01-01       11                          NaT                  NaN                  NaN
1    A    G1 2019-01-02 2019-01-02 2019-01-02       12    1 days 00:00:00.000000000  0.09090909090909083  0.09090909090909083
2    A    G1 2019-01-04 2019-01-04 2019-01-04       14    2 days 00:00:00.000000000  0.16666666666666674                  NaN
3    B    G2 2019-01-01 2019-01-01 2019-01-01       11  -3 days +00:00:00.000000000  -0.2142857142857143                  NaN
4    B    G2 2019-01-03 2019-01-03 2019-01-03       13    2 days 00:00:00.000000000  0.18181818181818188                  NaN
5    B    G2 2019-01-06 2019-01-06 2019-01-06       16    3 days 00:00:00.000000000  0.23076923076923084                  NaN
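The row-wise apply can also be expressed with a vectorized Series.where; a sketch on a minimal stand-in frame (column names follow the output above):

```python
import pandas as pd

df = pd.DataFrame({
    'date': pd.to_datetime(['2019-01-01', '2019-01-02', '2019-01-04',
                            '2019-01-01', '2019-01-03', '2019-01-06']),
    'val_sum': [11, 12, 14, 11, 13, 16],
})
df['date_diff'] = df['date'].diff()
df['pct_change_val'] = df['val_sum'].pct_change(fill_method=None)

# keep the pct only where the previous row is exactly one day earlier;
# everywhere else (including the NaT in the first row) .where yields NaN
df['pct_change_final'] = df['pct_change_val'].where(
    df['date_diff'] == pd.Timedelta(days=1))
```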
This works. Thanks. I got an idea for another approach: would it be possible to find min/max dates for each group and then resample by day? Afterwards, I could use pct_change. For example, if for group X the min date is 2019-01-01 and the max is 2019-01-05, I could resample that group and then do the same for the rest of the groups. That way I would have a dataframe in the proper format for pct_change (and some other operations). – user3225309, yesterday
I have updated the solution. Hope it helps. – AI_Learning, yesterday
answered Jan 18 at 15:10 by W-B
answered Jan 18 at 15:30 by AI_Learning (edited yesterday)