How can a decision tree classifier work with global constraints?

I generated a decision tree classifier with scikit-learn in Python, and it works well in terms of accuracy. I train the classifier on the optimal solution of a linear program, which returns an optimal assignment of items to classes subject to a global cost constraint (i.e., assigning item 1 to class A comes at a cost x, and the total cost over all items and classes must stay below a value y).
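
To make the constraint concrete, this is roughly the check that fails after prediction (the names `cost` and `budget` are illustrative, not from my actual code):

    import numpy as np

    # cost[i, c] = cost of assigning item i to class c (illustrative name)
    # budget    = the global limit y
    def total_assignment_cost(predicted_classes, cost):
        """Sum the per-item cost of each predicted class."""
        rows = np.arange(len(predicted_classes))
        return cost[rows, predicted_classes].sum()

    # After predicting with the tree:
    # preds = clf.predict(X)
    # total_assignment_cost(preds, cost) <= budget   # violated in most runs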



After reclassifying all items with the classifier, the accuracy is acceptable, but the global cost constraint is violated in most classification runs. Naturally so, since the standard decision tree in scikit-learn does not consider the constraint.



Is there a way to incorporate global constraints so that they are upheld after classification? Is there a way to force the tree to consider all already-classified items when making the next assignment choice? I assume this would require establishing some sort of cost or penalty function to be checked by the tree during classification.

python machine-learning scikit-learn classification decision-tree

asked Jan 19 at 11:43 by JoeS · edited Jan 19 at 21:51
1 Answer

Decision trees as implemented in scikit-learn are built solely from a local splitting criterion, such as the Gini impurity or entropy (information gain). Custom loss functions are not possible.



However, gradient-boosted trees such as XGBoost, LightGBM, and CatBoost allow you to specify your own loss function. A tutorial can be found here:
https://towardsdatascience.com/custom-loss-functions-for-gradient-boosting-f79c1b40466d



          You would then incorporate a penalty term for violating your constraint into the loss function.
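
As a rough illustration (not a drop-in solution), a custom XGBoost objective for a binary problem could add a hinge penalty whenever the expected total cost of positive assignments exceeds the budget. The names `COSTS`, `BUDGET`, and `LAM` are assumptions standing in for your LP data:

    import numpy as np
    import xgboost as xgb

    # Assumptions, standing in for the question's LP data:
    # COSTS[i] = cost of assigning item i to the positive class
    # BUDGET   = the global limit y
    # LAM      = strength of the constraint penalty (a tuning knob)
    rng = np.random.default_rng(0)
    X = rng.normal(size=(1000, 5))
    y = (X[:, 0] > 0).astype(float)
    COSTS = rng.random(1000)
    BUDGET = 100.0
    LAM = 10.0

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def penalized_logloss(preds, dtrain):
        """Binary log-loss plus a hinge penalty on the expected total cost."""
        label = dtrain.get_label()
        p = sigmoid(preds)            # preds are raw margin scores
        grad = p - label              # gradient of the plain log-loss
        hess = p * (1.0 - p)          # hessian of the plain log-loss
        # Expected total cost if item i is positive with probability p[i].
        expected_cost = float(np.sum(COSTS * p))
        if expected_cost > BUDGET:
            # Gradient of LAM * (sum_i COSTS[i] * sigmoid(z_i) - BUDGET) w.r.t. z_i;
            # the penalty's second derivative is ignored for simplicity.
            grad = grad + LAM * COSTS * p * (1.0 - p)
        return grad, hess

    dtrain = xgb.DMatrix(X, label=y)
    booster = xgb.train({"max_depth": 3, "eta": 0.1}, dtrain,
                        num_boost_round=50, obj=penalized_logloss)

Note that the penalty only encourages the constraint during training; it does not guarantee the budget holds at prediction time, so a post-hoc repair step (e.g., flipping the lowest-confidence positive predictions until the budget is met) may still be needed.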






answered Jan 23 at 0:48 by jonnor
• thank you very much for the comment, @jonnor! very helpful and much appreciated! – JoeS, Jan 24 at 9:10










