Need to reshape my dataframe (lots of column names)

I am trying to reshape a dataframe in pandas. I currently have one id variable, and the rest of the variables are named in the format "variableyear", where year is between 2000 and 2016. I want to make a new variable year (extracted from the "variableyear" column names) and end up with one column per variable. Here is an example dataset that looks similar to my real dataset (my real data is confidential):



    |  name   | income2015 | income2016 | children2015 | children2016 | education2015 | education2016
 ---|---------|------------|------------|--------------|--------------|---------------|---------------
  0 | John    |          1 |          4 |            7 |           10 |            13 |            16
  1 | Phillip |          2 |          5 |            8 |           11 |            14 |            17
  2 | Carl    |          3 |          6 |            9 |           12 |            15 |            18

This is what I want:

    |  name   | year | income | children | education
 ---|---------|------|--------|----------|-----------
  0 | John    | 2015 |      1 |        7 |        13
  1 | Phillip | 2015 |      2 |        8 |        14
  2 | Carl    | 2015 |      3 |        9 |        15
  3 | John    | 2016 |      4 |       10 |        16
  4 | Phillip | 2016 |      5 |       11 |        17
  5 | Carl    | 2016 |      6 |       12 |        18

I have already tried the following:

df2 = pd.melt(df, id_vars=['name'], value_vars=df.columns[1:])
df2['year'] = df2['variable'].map(lambda x: x[-4:])
df2['variable'] = df2['variable'].map(lambda x: x[:-4])

which gives me this:

     | name    | variable  | value | year
 ----|---------|-----------|-------|------
   0 | John    | income    |     1 | 2015
   1 | Phillip | income    |     2 | 2015
   2 | Carl    | income    |     3 | 2015
   3 | John    | income    |     4 | 2016
   4 | Phillip | income    |     5 | 2016
   5 | Carl    | income    |     6 | 2016
   6 | John    | children  |     7 | 2015
   7 | Phillip | children  |     8 | 2015
   8 | Carl    | children  |     9 | 2015
   9 | John    | children  |    10 | 2016
  10 | Phillip | children  |    11 | 2016
  11 | Carl    | children  |    12 | 2016
  12 | John    | education |    13 | 2015
  13 | Phillip | education |    14 | 2015
  14 | Carl    | education |    15 | 2015
  15 | John    | education |    16 | 2016
  16 | Phillip | education |    17 | 2016
  17 | Carl    | education |    18 | 2016

But now I have to reshape again... Is there an easier way to do this?

Also, here is my df in dictionary format:

{'children2015': {0: 7, 1: 8, 2: 9}, 'children2016': {0: 10, 1: 11, 2: 12}, 'education2015': {0: 13, 1: 14, 2: 15}, 'education2016': {0: 16, 1: 17, 2: 18}, 'income2015': {0: 1, 1: 2, 2: 3}, 'income2016': {0: 4, 1: 5, 2: 6}, 'name': {0: 'John', 1: 'Phillip', 2: 'Carl'}}
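For reference, the second reshape that the melt route still needs could look roughly like the following. This is only a sketch, assuming the dictionary above is loaded as `df`; the names `data`, `melted` and `wide` are made up for illustration, and `set_index` plus `unstack` is just one of several ways to pivot the variable names back into columns.

```python
import pandas as pd

# The dictionary from the question, loaded back into a DataFrame.
data = {'children2015': {0: 7, 1: 8, 2: 9}, 'children2016': {0: 10, 1: 11, 2: 12},
        'education2015': {0: 13, 1: 14, 2: 15}, 'education2016': {0: 16, 1: 17, 2: 18},
        'income2015': {0: 1, 1: 2, 2: 3}, 'income2016': {0: 4, 1: 5, 2: 6},
        'name': {0: 'John', 1: 'Phillip', 2: 'Carl'}}
df = pd.DataFrame(data)

# Step 1: melt every non-id column, then split "variableyear" into two columns.
melted = df.melt(id_vars='name')
melted['year'] = melted['variable'].str[-4:].astype(int)
melted['variable'] = melted['variable'].str[:-4]

# Step 2: move the variable names back out into columns (the "reshape again").
wide = (
    melted.set_index(['name', 'year', 'variable'])['value']
    .unstack('variable')
    .reset_index()
)
wide.columns.name = None  # drop the leftover "variable" label on the columns

print(wide)
```

It works, but it takes two reshaping steps plus the string handling, which is what prompts the question.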

asked Jan 19 at 20:49

Jimbo

python python-3.x pandas dataframe reshape

1 Answer

You can use pd.wide_to_long for exactly this. For the stubnames argument, pass the variable names with the last 4 characters (the year) dropped, leaving out the name column: set([x[:-4] for x in df.columns[1:]]). Note that df.columns[1:] assumes name is the first column, and that a set has no guaranteed order, so the order of the resulting columns may vary.

pd.wide_to_long(df, stubnames=set([x[:-4] for x in df.columns[1:]]), i=['name'], j='year').reset_index()

Output:

    name    year    education   income  children
0   John    2015    13          1       7
1   Phillip 2015    14          2       8
2   Carl    2015    15          3       9
3   John    2016    16          4       10
4   Phillip 2016    17          5       11
5   Carl    2016    18          6       12
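If you want something you can paste and run, here is a minimal self-contained sketch of the same idea. It rebuilds the example frame by hand, and the names `stubs` and `long_df` are only illustrative. It differs from the one-liner above in two ways: the stub names are sorted so the column order is reproducible, and the name column is excluded by label rather than by position. `sep=""` and `suffix=r"\d+"` are spelled out for clarity.

```python
import pandas as pd

# Example data from the question, rebuilt by hand.
df = pd.DataFrame({
    "name": ["John", "Phillip", "Carl"],
    "income2015": [1, 2, 3],
    "income2016": [4, 5, 6],
    "children2015": [7, 8, 9],
    "children2016": [10, 11, 12],
    "education2015": [13, 14, 15],
    "education2016": [16, 17, 18],
})

# Drop the 4-digit year from every non-id column name. Sorting gives a
# stable order, since a plain set has no guaranteed iteration order.
stubs = sorted({col[:-4] for col in df.columns if col != "name"})

# sep="" because nothing separates stub and year; suffix=r"\d+" matches the year.
long_df = pd.wide_to_long(
    df, stubnames=stubs, i="name", j="year", sep="", suffix=r"\d+"
).reset_index()

print(long_df)
```

With real data, make sure the id column has no duplicate values and that every other column really ends in a 4-digit year, otherwise the `[:-4]` slice produces the wrong stub names.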

answered Jan 19 at 20:52

Joe Patten

