Combine/merge DataFrames with different indexes and different column names

I have problems to merge two dataframes in the desired way. I unsuccessfully tried out a lot with merge and join methods but I did not achieve the desired result.

import pandas as pd



d = {'A': [1, 1, 0, 1, 0, 1, 0],

     'B': [0, 0, 0, 0, 0, 1, 1]

     }

df = pd.DataFrame(data=d, index=["A", "B", "C", "D", "E", "F", "G"])

print(df)



d = {'A2': ["D", "A", "A", "B", "C", "C", "E", "X", "F", "G"],

     'B2': ["DD", "AA", "AA", "BB", "CC", "CC", "EE", "XX", "FF", "GG"],

     'C3': [1, 1, 11, 35, 53, 2, 76, 45, 5, 34]}





df2 = pd.DataFrame(data=d)

print(df2)

Console output:

   A  B

A  1  0

B  1  0

C  0  0

D  1  0

E  0  0

F  1  1

G  0  1



  A2  B2  C3

0  A  AA   1

1  A  AA  11

2  B  BB  35

3  C  CC  53

4  C  CC   2

5  E  EE  76

6  X  XX  45

7  F  FF   5

8  G  GG  34

I'm looking for a way to compute the following: Via the index of df I can look up in column A2 of df2 the value of B2 which should be added to df.

Desired result:

   A  B  B2

A  1  0  AA

B  1  0  BB

C  0  0  CC

D  1  0  DD

E  0  0  EE

F  1  1  FF

G  0  1  GG

(This is only dummy data, just duplicating the index and write it in column B2 of df is not sufficient)

edited Jan 18 at 20:30

asked Jan 18 at 20:22

d4rty

1,32821232

1

where is DD come from

– W-B
Jan 18 at 20:28

df2 is duplicated with respect to 'A2' so it's not guaranteed to be a unique mapping. What do we do in this case (i.e. is just drop_duplicates good or can A ever be mapped to multiple values like AA and AB?)

– ALollz
Jan 18 at 20:29

@W-B i fixed the issue in the MWE

– d4rty
Jan 18 at 20:30

1

@ALollz In df2 if two rows have the same A2 value than it follows that they also have the same B2 value. This is guaranteed by the data

– d4rty
Jan 18 at 20:32

add a comment |

I have problems to merge two dataframes in the desired way. I unsuccessfully tried out a lot with merge and join methods but I did not achieve the desired result.

import pandas as pd



d = {'A': [1, 1, 0, 1, 0, 1, 0],

     'B': [0, 0, 0, 0, 0, 1, 1]

     }

df = pd.DataFrame(data=d, index=["A", "B", "C", "D", "E", "F", "G"])

print(df)



d = {'A2': ["D", "A", "A", "B", "C", "C", "E", "X", "F", "G"],

     'B2': ["DD", "AA", "AA", "BB", "CC", "CC", "EE", "XX", "FF", "GG"],

     'C3': [1, 1, 11, 35, 53, 2, 76, 45, 5, 34]}





df2 = pd.DataFrame(data=d)

print(df2)

Console output:

   A  B

A  1  0

B  1  0

C  0  0

D  1  0

E  0  0

F  1  1

G  0  1



  A2  B2  C3

0  A  AA   1

1  A  AA  11

2  B  BB  35

3  C  CC  53

4  C  CC   2

5  E  EE  76

6  X  XX  45

7  F  FF   5

8  G  GG  34

I'm looking for a way to compute the following: Via the index of df I can look up in column A2 of df2 the value of B2 which should be added to df.

Desired result:

   A  B  B2

A  1  0  AA

B  1  0  BB

C  0  0  CC

D  1  0  DD

E  0  0  EE

F  1  1  FF

G  0  1  GG

(This is only dummy data, just duplicating the index and write it in column B2 of df is not sufficient)

edited Jan 18 at 20:30

asked Jan 18 at 20:22

d4rty

1,32821232

1

where is DD come from

– W-B
Jan 18 at 20:28

df2 is duplicated with respect to 'A2' so it's not guaranteed to be a unique mapping. What do we do in this case (i.e. is just drop_duplicates good or can A ever be mapped to multiple values like AA and AB?)

– ALollz
Jan 18 at 20:29

@W-B i fixed the issue in the MWE

– d4rty
Jan 18 at 20:30

1

@ALollz In df2 if two rows have the same A2 value than it follows that they also have the same B2 value. This is guaranteed by the data

– d4rty
Jan 18 at 20:32

add a comment |

I have problems to merge two dataframes in the desired way. I unsuccessfully tried out a lot with merge and join methods but I did not achieve the desired result.

import pandas as pd



d = {'A': [1, 1, 0, 1, 0, 1, 0],

     'B': [0, 0, 0, 0, 0, 1, 1]

     }

df = pd.DataFrame(data=d, index=["A", "B", "C", "D", "E", "F", "G"])

print(df)



d = {'A2': ["D", "A", "A", "B", "C", "C", "E", "X", "F", "G"],

     'B2': ["DD", "AA", "AA", "BB", "CC", "CC", "EE", "XX", "FF", "GG"],

     'C3': [1, 1, 11, 35, 53, 2, 76, 45, 5, 34]}





df2 = pd.DataFrame(data=d)

print(df2)

Console output:

   A  B

A  1  0

B  1  0

C  0  0

D  1  0

E  0  0

F  1  1

G  0  1



  A2  B2  C3

0  A  AA   1

1  A  AA  11

2  B  BB  35

3  C  CC  53

4  C  CC   2

5  E  EE  76

6  X  XX  45

7  F  FF   5

8  G  GG  34

I'm looking for a way to compute the following: Via the index of df I can look up in column A2 of df2 the value of B2 which should be added to df.

Desired result:

   A  B  B2

A  1  0  AA

B  1  0  BB

C  0  0  CC

D  1  0  DD

E  0  0  EE

F  1  1  FF

G  0  1  GG

(This is only dummy data, just duplicating the index and write it in column B2 of df is not sufficient)

edited Jan 18 at 20:30

asked Jan 18 at 20:22

d4rty

1,32821232

I have problems to merge two dataframes in the desired way. I unsuccessfully tried out a lot with merge and join methods but I did not achieve the desired result.

import pandas as pd



d = {'A': [1, 1, 0, 1, 0, 1, 0],

     'B': [0, 0, 0, 0, 0, 1, 1]

     }

df = pd.DataFrame(data=d, index=["A", "B", "C", "D", "E", "F", "G"])

print(df)



d = {'A2': ["D", "A", "A", "B", "C", "C", "E", "X", "F", "G"],

     'B2': ["DD", "AA", "AA", "BB", "CC", "CC", "EE", "XX", "FF", "GG"],

     'C3': [1, 1, 11, 35, 53, 2, 76, 45, 5, 34]}





df2 = pd.DataFrame(data=d)

print(df2)

Console output:

   A  B

A  1  0

B  1  0

C  0  0

D  1  0

E  0  0

F  1  1

G  0  1



  A2  B2  C3

0  A  AA   1

1  A  AA  11

2  B  BB  35

3  C  CC  53

4  C  CC   2

5  E  EE  76

6  X  XX  45

7  F  FF   5

8  G  GG  34

I'm looking for a way to compute the following: Via the index of df I can look up in column A2 of df2 the value of B2 which should be added to df.

Desired result:

   A  B  B2

A  1  0  AA

B  1  0  BB

C  0  0  CC

D  1  0  DD

E  0  0  EE

F  1  1  FF

G  0  1  GG

(This is only dummy data, just duplicating the index and write it in column B2 of df is not sufficient)

pandas dataframe

edited Jan 18 at 20:30

asked Jan 18 at 20:22

d4rty

1,32821232

edited Jan 18 at 20:30

asked Jan 18 at 20:22

d4rty

1,32821232

edited Jan 18 at 20:30

asked Jan 18 at 20:22

d4rty

1,32821232

asked Jan 18 at 20:22

d4rty

1,32821232

asked Jan 18 at 20:22

d4rty

1,32821232

1

where is DD come from

– W-B
Jan 18 at 20:28

df2 is duplicated with respect to 'A2' so it's not guaranteed to be a unique mapping. What do we do in this case (i.e. is just drop_duplicates good or can A ever be mapped to multiple values like AA and AB?)

– ALollz
Jan 18 at 20:29

@W-B i fixed the issue in the MWE

– d4rty
Jan 18 at 20:30

1

@ALollz In df2 if two rows have the same A2 value than it follows that they also have the same B2 value. This is guaranteed by the data

– d4rty
Jan 18 at 20:32

add a comment |

1

where is DD come from

– W-B
Jan 18 at 20:28

df2 is duplicated with respect to 'A2' so it's not guaranteed to be a unique mapping. What do we do in this case (i.e. is just drop_duplicates good or can A ever be mapped to multiple values like AA and AB?)

– ALollz
Jan 18 at 20:29

@W-B i fixed the issue in the MWE

– d4rty
Jan 18 at 20:30

1

@ALollz In df2 if two rows have the same A2 value than it follows that they also have the same B2 value. This is guaranteed by the data

– d4rty
Jan 18 at 20:32

where is DD come from

– W-B
Jan 18 at 20:28

df2 is duplicated with respect to 'A2' so it's not guaranteed to be a unique mapping. What do we do in this case (i.e. is just drop_duplicates good or can A ever be mapped to multiple values like AA and AB?)

– ALollz
Jan 18 at 20:29

@W-B i fixed the issue in the MWE

– d4rty
Jan 18 at 20:30

@ALollz In df2 if two rows have the same A2 value than it follows that they also have the same B2 value. This is guaranteed by the data

– d4rty
Jan 18 at 20:32

add a comment |

2 Answers
2

active

oldest

votes

set_index and assign it

df['B2']=df2.drop_duplicates('A2').set_index('A2')['B2']

df

Out[728]: 

   A  B  B2

A  1  0  AA

B  1  0  BB

C  0  0  CC

D  1  0  DD

E  0  0  EE

F  1  1  FF

G  0  1  GG

answered Jan 18 at 20:32

W-B

107k83165

add a comment |

I know this has been already answered by W-B in a very elegant way.

However, since I have spent the time to solve this in a less professional way, let me post also my solution.

From:

I'm looking for a way to compute the following: Via the index of df I
can look up in column A2 of df2 the value of B2 which should be added
to df.

I understood I should do:

get index list form df. So A, B, C...

look values in df2['B2'] in the same index than df2['A2'] for each element of df index

create a new column ['B2'] in df, where we copy these values from df2['B2'] matching the index from df to the elements on df2['A2']

This is my code:

import pandas as pd



d = {'A': [1, 1, 0, 1, 0, 1, 0],

     'B': [0, 0, 0, 0, 0, 1, 1]

     }

df = pd.DataFrame(data=d, index=["A", "B", "C", "D", "E", "F", "G"])

print(df)



d = {'A2': ["D", "A", "A", "B", "C", "C", "E", "X", "F", "G"],

     'B2': ["DD", "AA", "AA", "BB", "CC", "CC", "EE", "XX", "FF", "GG"],

     'C3': [1, 1, 11, 35, 53, 2, 76, 45, 5, 34]}





df2 = pd.DataFrame(data=d)

print(df2)



llista=

for i in df.index:

    m=df2['A2'].loc[df2['A2']==i].index

    if m[0]:

        print m[0],i

        llista.append(df2['B2'].iloc[m[0]])

    else:

        llista.append()

df['B2'] = llista

Output is:

   A  B  B2

A  1  0  AA

B  1  0  BB

C  0  0  CC

D  1  0  

E  0  0  EE

F  1  1  FF

G  0  1  GG

As you can see is different than the accepted post. This is because there is no 'D' index in df2['A2']

answered Jan 18 at 21:20

daniel

4471717

Both answers are right and they deserve a vote up. However, I have to say that I found this one to be the right one.

– stram hp
Jan 19 at 14:23

add a comment |

Your Answer

StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");

StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});

function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});

}
});

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f54260920%2fcombine-merge-dataframes-with-different-indexes-and-different-column-names%23new-answer', 'question_page');
}
);

Post as a guest

Name

Required, but never shown

2 Answers
2

active

oldest

votes

2 Answers
2

active

oldest

votes

set_index and assign it

df['B2']=df2.drop_duplicates('A2').set_index('A2')['B2']

df

Out[728]: 

   A  B  B2

A  1  0  AA

B  1  0  BB

C  0  0  CC

D  1  0  DD

E  0  0  EE

F  1  1  FF

G  0  1  GG

answered Jan 18 at 20:32

W-B

107k83165

add a comment |

set_index and assign it

df['B2']=df2.drop_duplicates('A2').set_index('A2')['B2']

df

Out[728]: 

   A  B  B2

A  1  0  AA

B  1  0  BB

C  0  0  CC

D  1  0  DD

E  0  0  EE

F  1  1  FF

G  0  1  GG

answered Jan 18 at 20:32

W-B

107k83165

add a comment |

set_index and assign it

df['B2']=df2.drop_duplicates('A2').set_index('A2')['B2']

df

Out[728]: 

   A  B  B2

A  1  0  AA

B  1  0  BB

C  0  0  CC

D  1  0  DD

E  0  0  EE

F  1  1  FF

G  0  1  GG

answered Jan 18 at 20:32

W-B

107k83165

set_index and assign it

df['B2']=df2.drop_duplicates('A2').set_index('A2')['B2']

df

Out[728]: 

   A  B  B2

A  1  0  AA

B  1  0  BB

C  0  0  CC

D  1  0  DD

E  0  0  EE

F  1  1  FF

G  0  1  GG

answered Jan 18 at 20:32

W-B

107k83165

answered Jan 18 at 20:32

W-B

107k83165

answered Jan 18 at 20:32

W-B

107k83165

answered Jan 18 at 20:32

W-B

107k83165

add a comment |

I know this has been already answered by W-B in a very elegant way.

However, since I have spent the time to solve this in a less professional way, let me post also my solution.

From:

I'm looking for a way to compute the following: Via the index of df I
can look up in column A2 of df2 the value of B2 which should be added
to df.

I understood I should do:

get index list form df. So A, B, C...

look values in df2['B2'] in the same index than df2['A2'] for each element of df index

create a new column ['B2'] in df, where we copy these values from df2['B2'] matching the index from df to the elements on df2['A2']

This is my code:

import pandas as pd



d = {'A': [1, 1, 0, 1, 0, 1, 0],

     'B': [0, 0, 0, 0, 0, 1, 1]

     }

df = pd.DataFrame(data=d, index=["A", "B", "C", "D", "E", "F", "G"])

print(df)



d = {'A2': ["D", "A", "A", "B", "C", "C", "E", "X", "F", "G"],

     'B2': ["DD", "AA", "AA", "BB", "CC", "CC", "EE", "XX", "FF", "GG"],

     'C3': [1, 1, 11, 35, 53, 2, 76, 45, 5, 34]}





df2 = pd.DataFrame(data=d)

print(df2)



llista=

for i in df.index:

    m=df2['A2'].loc[df2['A2']==i].index

    if m[0]:

        print m[0],i

        llista.append(df2['B2'].iloc[m[0]])

    else:

        llista.append()

df['B2'] = llista

Output is:

   A  B  B2

A  1  0  AA

B  1  0  BB

C  0  0  CC

D  1  0  

E  0  0  EE

F  1  1  FF

G  0  1  GG

As you can see is different than the accepted post. This is because there is no 'D' index in df2['A2']

answered Jan 18 at 21:20

daniel

4471717

Both answers are right and they deserve a vote up. However, I have to say that I found this one to be the right one.

– stram hp
Jan 19 at 14:23

add a comment |

I know this has been already answered by W-B in a very elegant way.

However, since I have spent the time to solve this in a less professional way, let me post also my solution.

From:

I'm looking for a way to compute the following: Via the index of df I
can look up in column A2 of df2 the value of B2 which should be added
to df.

I understood I should do:

get index list form df. So A, B, C...

look values in df2['B2'] in the same index than df2['A2'] for each element of df index

create a new column ['B2'] in df, where we copy these values from df2['B2'] matching the index from df to the elements on df2['A2']

This is my code:

import pandas as pd



d = {'A': [1, 1, 0, 1, 0, 1, 0],

     'B': [0, 0, 0, 0, 0, 1, 1]

     }

df = pd.DataFrame(data=d, index=["A", "B", "C", "D", "E", "F", "G"])

print(df)



d = {'A2': ["D", "A", "A", "B", "C", "C", "E", "X", "F", "G"],

     'B2': ["DD", "AA", "AA", "BB", "CC", "CC", "EE", "XX", "FF", "GG"],

     'C3': [1, 1, 11, 35, 53, 2, 76, 45, 5, 34]}





df2 = pd.DataFrame(data=d)

print(df2)



llista=

for i in df.index:

    m=df2['A2'].loc[df2['A2']==i].index

    if m[0]:

        print m[0],i

        llista.append(df2['B2'].iloc[m[0]])

    else:

        llista.append()

df['B2'] = llista

Output is:

   A  B  B2

A  1  0  AA

B  1  0  BB

C  0  0  CC

D  1  0  

E  0  0  EE

F  1  1  FF

G  0  1  GG

As you can see is different than the accepted post. This is because there is no 'D' index in df2['A2']

answered Jan 18 at 21:20

daniel

4471717

Both answers are right and they deserve a vote up. However, I have to say that I found this one to be the right one.

– stram hp
Jan 19 at 14:23

add a comment |

I know this has been already answered by W-B in a very elegant way.

However, since I have spent the time to solve this in a less professional way, let me post also my solution.

From:

I'm looking for a way to compute the following: Via the index of df I
can look up in column A2 of df2 the value of B2 which should be added
to df.

I understood I should do:

get index list form df. So A, B, C...

look values in df2['B2'] in the same index than df2['A2'] for each element of df index

create a new column ['B2'] in df, where we copy these values from df2['B2'] matching the index from df to the elements on df2['A2']

This is my code:

import pandas as pd



d = {'A': [1, 1, 0, 1, 0, 1, 0],

     'B': [0, 0, 0, 0, 0, 1, 1]

     }

df = pd.DataFrame(data=d, index=["A", "B", "C", "D", "E", "F", "G"])

print(df)



d = {'A2': ["D", "A", "A", "B", "C", "C", "E", "X", "F", "G"],

     'B2': ["DD", "AA", "AA", "BB", "CC", "CC", "EE", "XX", "FF", "GG"],

     'C3': [1, 1, 11, 35, 53, 2, 76, 45, 5, 34]}





df2 = pd.DataFrame(data=d)

print(df2)



llista=

for i in df.index:

    m=df2['A2'].loc[df2['A2']==i].index

    if m[0]:

        print m[0],i

        llista.append(df2['B2'].iloc[m[0]])

    else:

        llista.append()

df['B2'] = llista

Output is:

   A  B  B2

A  1  0  AA

B  1  0  BB

C  0  0  CC

D  1  0  

E  0  0  EE

F  1  1  FF

G  0  1  GG

As you can see is different than the accepted post. This is because there is no 'D' index in df2['A2']

answered Jan 18 at 21:20

daniel

4471717

I know this has been already answered by W-B in a very elegant way.

However, since I have spent the time to solve this in a less professional way, let me post also my solution.

From:

I'm looking for a way to compute the following: Via the index of df I
can look up in column A2 of df2 the value of B2 which should be added
to df.

I understood I should do:

get index list form df. So A, B, C...

look values in df2['B2'] in the same index than df2['A2'] for each element of df index

create a new column ['B2'] in df, where we copy these values from df2['B2'] matching the index from df to the elements on df2['A2']

This is my code:

import pandas as pd



d = {'A': [1, 1, 0, 1, 0, 1, 0],

     'B': [0, 0, 0, 0, 0, 1, 1]

     }

df = pd.DataFrame(data=d, index=["A", "B", "C", "D", "E", "F", "G"])

print(df)



d = {'A2': ["D", "A", "A", "B", "C", "C", "E", "X", "F", "G"],

     'B2': ["DD", "AA", "AA", "BB", "CC", "CC", "EE", "XX", "FF", "GG"],

     'C3': [1, 1, 11, 35, 53, 2, 76, 45, 5, 34]}





df2 = pd.DataFrame(data=d)

print(df2)



llista=

for i in df.index:

    m=df2['A2'].loc[df2['A2']==i].index

    if m[0]:

        print m[0],i

        llista.append(df2['B2'].iloc[m[0]])

    else:

        llista.append()

df['B2'] = llista

Output is:

   A  B  B2

A  1  0  AA

B  1  0  BB

C  0  0  CC

D  1  0  

E  0  0  EE

F  1  1  FF

G  0  1  GG

As you can see is different than the accepted post. This is because there is no 'D' index in df2['A2']

answered Jan 18 at 21:20

daniel

4471717

answered Jan 18 at 21:20

daniel

4471717

answered Jan 18 at 21:20

daniel

4471717

answered Jan 18 at 21:20

daniel

4471717

Both answers are right and they deserve a vote up. However, I have to say that I found this one to be the right one.

– stram hp
Jan 19 at 14:23

add a comment |

Both answers are right and they deserve a vote up. However, I have to say that I found this one to be the right one.

– stram hp
Jan 19 at 14:23

Both answers are right and they deserve a vote up. However, I have to say that I found this one to be the right one.

– stram hp
Jan 19 at 14:23

add a comment |

draft saved

draft discarded

Thanks for contributing an answer to Stack Overflow!

Please be sure to answer the question. Provide details and share your research!

But avoid …

Asking for help, clarification, or responding to other answers.

Making statements based on opinion; back them up with references or personal experience.

To learn more, see our tips on writing great answers.

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Name

Required, but never shown

Name

Required, but never shown

This page is only for reference, If you need detailed information, please check here

搜尋此網誌

Brtdku