How can a decision tree classifier work with global constraints?
I built a decision tree classifier with sklearn in Python, and it works well in terms of accuracy. I train the classifier on the optimal solution of a linear program, which returns an optimal assignment of items to classes while respecting a global cost constraint (e.g. assigning item 1 to class A costs x, and the total cost over all items and classes must stay below a value y).
When I reclassify all items with the trained tree, the accuracy is acceptable, but the global cost constraint is violated in most classification runs. That is no surprise, since the standard decision tree in sklearn knows nothing about the constraint.
Is there a way to incorporate global constraints so that they are upheld after classification? Can the tree be forced to take the items it has already classified into account when making the next assignment? I assume this would require some sort of cost or penalty function that the tree checks during classification.
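For concreteness, my setup looks roughly like this (`X`, `lp_labels`, `cost`, and `budget` are placeholder names for my feature matrix, the LP-derived class assignments, the per-item/per-class cost matrix, and the limit y):

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# lp_labels[i] is the class that the LP's optimal solution assigns to item i
clf = DecisionTreeClassifier().fit(X, lp_labels)

# reclassify and check the global cost constraint after the fact
pred = clf.predict(X)
total_cost = cost[np.arange(len(pred)), pred].sum()
print(total_cost <= budget)  # frequently False: the tree never sees the budget
```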
python machine-learning scikit-learn classification decision-tree
asked Jan 19 at 11:43 by JoeS (edited Jan 19 at 21:51)
1 Answer
Decision trees as implemented in sklearn are grown purely from a local splitting criterion (Gini impurity, entropy, or information gain). Custom loss functions are not possible.
However, gradient-boosted tree libraries such as XGBoost, LightGBM, and CatBoost allow you to specify your own loss function. A tutorial can be found here:
https://towardsdatascience.com/custom-loss-functions-for-gradient-boosting-f79c1b40466d
You would then incorporate a penalty term for violating your constraint into the loss function.
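As a minimal sketch of that idea for the binary case, using XGBoost's custom-objective hook: `make_budget_objective`, `item_costs`, `budget`, and `lam` are illustrative names and choices of mine, not part of XGBoost's API, and the Hessian of the penalty is only crudely approximated.

```python
import numpy as np
import xgboost as xgb

def make_budget_objective(costs, budget, lam=10.0):
    """Binary logloss plus a quadratic penalty when the expected total cost
    of class-1 assignments exceeds the budget. `costs` must be aligned with
    the rows of the training DMatrix."""
    def objective(margins, dtrain):
        y = dtrain.get_label()
        p = 1.0 / (1.0 + np.exp(-margins))     # predicted P(class 1)
        grad = p - y                           # logloss gradient
        hess = p * (1.0 - p)                   # logloss hessian
        sens = costs * p * (1.0 - p)           # d(expected cost)/d(margin)
        excess = costs @ p - budget            # soft total cost minus budget
        if excess > 0.0:
            grad += 2.0 * lam * excess * sens  # gradient of lam * excess^2
            hess += 2.0 * lam * sens ** 2      # crude diagonal Hessian term
        return grad, hess
    return objective

# hypothetical usage: item_costs[i] is the cost of putting item i in class 1
dtrain = xgb.DMatrix(X, label=y)
booster = xgb.train({"max_depth": 4, "eta": 0.1}, dtrain,
                    num_boost_round=200,
                    obj=make_budget_objective(item_costs, budget))
pred = (booster.predict(dtrain, output_margin=True) > 0.0).astype(int)
```

Note that such a penalty only discourages budget violations during training; it does not hard-enforce the constraint at prediction time, so the total cost of the final assignments should still be checked.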
answered Jan 23 at 0:48 by jonnor

Thank you very much for the comment, @jonnor! Very helpful and much appreciated!
– JoeS, Jan 24 at 9:10