How can I compare two data frames in pandas and update values based on keys?



























I have two data frames, and I want to use pandas methods to compare them and copy values from the larger data frame into the smaller one wherever the keys match.



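import numpy
import pandas as pd

# Raw strings keep the backslashes literal ('\t' would otherwise be read as a tab).
temp = pd.read_csv(r'.\..\..\test.csv')
temp2 = pd.read_excel(r'.\..\..\main.xlsx')

lenOfFile = len(temp.iloc[:, 1])
lenOfFile2 = len(temp2.iloc[:, 1])
dict1 = {}
dict2 = {}

# Map column 0 (key) to column 1 (value) for each frame.
for i in range(lenOfFile):
    dict1[temp.iloc[i, 0]] = temp.iloc[i, 1]

for i in range(lenOfFile2):
    dict2[temp2.iloc[i, 0]] = temp2.iloc[i, 1]

# Overwrite each value in dict1 with the one from dict2, or flag it as missing.
for i in dict1:
    if i in dict2:
        dict1[i] = dict2[i]
    else:
        dict1[i] = "Not in dict2"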


I want the same behavior as the code above.










      python pandas dataframe
















asked Jan 18 at 23:56 by paul
























          1 Answer














You should have included a Minimal, Complete, and Verifiable Example. In the future, please make sure we can run your code just by pasting it into our IDE. I spent way too much time on that question, haha.



import pandas as pd

temp = pd.DataFrame({'A': [20, 4, 60, 4, 8], 'B': [2, 4, 5, 6, 7]})
temp2 = pd.DataFrame({'A': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10], 'B': [1, 2, 3, 10, 5, 6, 70, 8, 9, 10]})
print(temp)
print(temp2)
#     A  B
# 0  20  2
# 1   4  4
# 2  60  5
# 3   4  6
# 4   8  7

#     A   B
# 0   1   1
# 1   2   2
# 2   3   3
# 3   4  10
# 4   5   5
# 5   6   6
# 6   7  70
# 7   8   8
# 8   9   9
# 9  10  10

# Build a key -> value mapping from the second data frame.
mapping = dict(zip(temp2['A'], temp2['B']))

# Apply the mapping to each key in temp. If the key is found, take its value; otherwise use the default.
temp['B'] = temp['A'].apply(lambda x: mapping.get(x, 'No matching'))
print(temp)
#     A            B
# 0  20  No matching
# 1   4           10
# 2  60  No matching
# 3   4           10
# 4   8            8
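
The same lookup can also be written without a Python-level lambda. The following is only a sketch under a few assumptions: it reuses the sample frames from above, assumes the key column is 'A' and the value column is 'B', and treats a NaN value in temp2['B'] the same as a missing key. The names lookup and matched are made up for illustration. Because unmatched keys pass through NaN, the matched values come back as floats rather than ints.

```python
import pandas as pd

temp = pd.DataFrame({'A': [20, 4, 60, 4, 8], 'B': [2, 4, 5, 6, 7]})
temp2 = pd.DataFrame({'A': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
                      'B': [1, 2, 3, 10, 5, 6, 70, 8, 9, 10]})

# Key -> value lookup Series built from the larger frame.
# Series.map needs a unique index, so duplicate keys are dropped first;
# keep='last' mirrors how a dict would keep the last value seen.
lookup = temp2.drop_duplicates('A', keep='last').set_index('A')['B']

# Series.map leaves NaN wherever a key from temp is absent from the lookup.
matched = temp['A'].map(lookup)

# Cast to object so the string default can sit next to the numbers,
# then replace the unmatched (NaN) positions with the default.
temp['B'] = matched.astype(object).where(matched.notna(), 'No matching')

print(temp['B'].tolist())
# ['No matching', 10.0, 'No matching', 10.0, 8.0]
```

If the float conversion matters, the dict-based version above keeps the original integer values.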





answered Jan 19 at 2:39 by IMCoins





























