pandas DataFrame.query expression that returns all rows by default
I have discovered the pandas DataFrame.query method, and it does almost exactly what I need. (I had implemented my own parser for this because I hadn't realized query existed, but really I should be using the standard method.)
I would like my users to be able to specify the query in a configuration file. The syntax seems intuitive enough that I can expect my non-programmer (but engineer) users to figure it out.
There's just one thing missing: a way to select everything in the dataframe. Sometimes what my users want to use is every row, so they would put 'All' or something into that configuration option. In fact, that will be the default option.
I tried df.query('True') but that raised a KeyError. I tried df.query('1') but that returned the row with index 1. The empty string raised a ValueError.
The only things I can think of are 1) put an if clause every time I need to do this type of query (probably 3 or 4 times in the code) or 2) subclass DataFrame and either reimplement query, or add a query_with_all method:
    import pandas as pd
    class MyDataFrame(pd.DataFrame):
        def query_with_all(self, query_string):
            if query_string.lower() == 'all':
                return self
            else:
                return self.query(query_string)
And then use my own class every time instead of the pandas one. Is this the only way to do this?
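For context, the configuration side of this could look roughly like the sketch below. It is only an illustration: the section names, the option name query, and the helper read_query are all made up here, and 'all' is assumed to be the default keyword as described above. It uses the standard-library configparser with a fallback value, so a section that omits the option yields 'all'.

```python
import configparser

# Hypothetical config text; section and option names are invented for this sketch.
CONFIG_TEXT = """
[filter_a]
query = temperature > 20 and status == "ok"

[filter_b]
# no query given, so the default applies
"""


def read_query(parser, section, default="all"):
    """Return the query string for a section, falling back to the default."""
    return parser.get(section, "query", fallback=default).strip()


parser = configparser.ConfigParser()
parser.read_string(CONFIG_TEXT)

query_a = read_query(parser, "filter_a")  # the user-supplied expression
query_b = read_query(parser, "filter_b")  # falls back to the default keyword
```

Each resulting string would then be handed to whatever function applies the query to a dataframe.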
python pandas dataframe
edited Dec 20 '18 at 10:09
moink
asked Oct 19 '17 at 3:31
moink
If the users know the column names upfront, they could use df.query('a == a'), where a is one of the columns, but that doesn't seem clean. Ah, and it may not work for rows with null.
– Zero
Oct 19 '17 at 3:38
Or, have a global all_true = [True]*len(df) and then refer to it with df.query('@all_true'), perhaps? Or, have a reserved all-True column, if that isn't a constraint, and refer to it with df.query('_all_true_col')?
– Zero
Oct 19 '17 at 3:42
Zero, the columns will change, but there is one column that is absolutely required to be there and not be null, so I will keep that in mind as an option. I don't think I would make my users put that in the config file, but rather would replace 'all' with that for internal use. But it's still not as clean as I would like, as you mention.
– moink
Oct 19 '17 at 3:43
Zero, as to your second suggestion, I would need to use the same query on different dataframes of different lengths, without knowing the length ahead of time.
– moink
Oct 19 '17 at 3:48
@Thomas, I ended up implementing my own module with something quite similar to the code I showed (though without inheritance), plus several other functions on queries.
– moink
Jun 22 '18 at 12:02
2 Answers
Keep things simple, and use a function:
    def query_with_all(data_frame, query_string):
        if query_string == "all":
            return data_frame
        return data_frame.query(query_string)
Whenever you need to use this type of query, just call the function with the data frame and the query string. There's no need to use any extra if statements or to subclass pd.DataFrame.
If you're restricted to using df.query, you can use a global variable:
    ALL = slice(None)
    df.query('@ALL', engine='python')
If you're not allowed to use global variables, and if your DataFrame isn't MultiIndexed, you can use
    df.query('tuple()')
All of these will properly handle NaN values.
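A minimal sketch of how the helper might be called, assuming a made-up sample frame. The case-insensitive match and the treatment of None or an empty string as 'all' are extra assumptions, not part of the function above.

```python
import pandas as pd


def query_with_all(data_frame, query_string):
    """Return every row for 'all' (any case), None, or an empty string; otherwise run the query."""
    # Assumption: blank or missing config values should behave like 'all'.
    if query_string is None or query_string.strip().lower() in ("", "all"):
        return data_frame
    return data_frame.query(query_string)


# Made-up sample data, for illustration only.
df = pd.DataFrame({"a": [1, 2, 3], "b": [10.0, None, 30.0]})

everything = query_with_all(df, "All")   # keyword: no filtering at all
subset = query_with_all(df, "a > 1")     # ordinary query string
```

Because the helper takes the dataframe as an argument, the same query string can be reused on frames of different lengths.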
edited Dec 20 '18 at 19:01
coldspeed
answered Dec 20 '18 at 1:15
Joshua
Well, this is an obvious choice (and also in the OP), but the idea would be to keep this inside query if at all possible?
– coldspeed
Dec 20 '18 at 5:54
@coldspeed Sorry for not reading your post / the comments thoroughly. I've added two solutions that stay (mostly) inside the query.
– Joshua
Dec 20 '18 at 6:51
Hmm, I've tried both, and both throw errors. Did you use any options with query? The first one gives "ValueError: unknown type object" and the second one "TypeError: unsupported expression type: <class 'tuple'>". Any idea?
– coldspeed
Dec 20 '18 at 6:53
What versions are running? I have pd.__version__ = '0.23.4', np.__version__ = '1.15.4', sys.version='3.7.1 (default, Oct 23 2018, 14:07:42) \n[Clang 4.0.1 (tags/RELEASE_401/final)]'.
– Joshua
Dec 20 '18 at 16:45
Same versions, no difference. I think these will work if you add engine='python' as an argument. Your second option will not work on MultiIndexed dataframes.
– coldspeed
Dec 20 '18 at 18:59
df.query('ilevel_0 in ilevel_0') will always return the full dataframe, even when the index contains NaN values or when the dataframe is completely empty.
In your particular case, you could then define a global variable all_true = 'ilevel_0 in ilevel_0' (as suggested in the comments by Zero) so that your engineers could use the name of the global variable in their config file instead.
This statement is just a dirty way of querying True, as you already tried to do. ilevel_0 is a more formal way of making sure you are referring to the index. See the docs for more details on using in and ilevel_0: https://pandas.pydata.org/pandas-docs/stable/indexing.html#the-query-method
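A rough sketch of that idea, under the assumption that 'all' is the keyword users write in the config file and that the dataframe's index is unnamed (the case in which, as far as I understand the docs, pandas exposes it as ilevel_0). The names ALL_ROWS and resolve_query are hypothetical.

```python
import pandas as pd

# Expression from the answer above; it is meant to select every row.
ALL_ROWS = "ilevel_0 in ilevel_0"


def resolve_query(query_string):
    """Translate the user-facing keyword 'all' into a query expression."""
    if query_string.strip().lower() == "all":
        return ALL_ROWS
    return query_string


# Made-up sample data with a default, unnamed index.
df = pd.DataFrame({"a": [1.0, None, 3.0]})

full = df.query(resolve_query("all"))     # every row, including the NaN one
part = df.query(resolve_query("a > 1"))   # ordinary filter
```

This keeps every call site on plain df.query, at the cost of one translation step before the call.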
answered Dec 20 '18 at 7:32
jorijnsmit