pandas: complex filter on rows of DataFrame

I would like to filter rows by a function of each row, e.g.

def f(row):
    return np.sin(row['velocity']) / np.prod(row['masses']) > 5

df = pandas.DataFrame(...)
filtered = df[apply_to_all_rows(df, f)]

Or, for a more complex, contrived example,

def g(row):
    if row['col1'].method1() == 1:
        val = row['col1'].method2() / row['col1'].method3(row['col3'], row['col4'])
    else:
        val = row['col2'].method5(row['col6'])
    return np.sin(val)

df = pandas.DataFrame(...)
filtered = df[apply_to_all_rows(df, g)]

How can I do so?

edited Mar 8 at 17:21

JJJ

73611221

asked Jul 10 '12 at 16:56

duckworthd

6,206114461


python pandas


5 Answers

100

You can do this using DataFrame.apply, which applies a function along a given axis:

In [1]: import numpy as np

In [2]: import pandas

In [3]: df = pandas.DataFrame(np.random.randn(5, 3), columns=['a', 'b', 'c'])

In [4]: df
Out[4]:
          a         b         c
0 -0.001968 -1.877945 -1.515674
1 -0.540628  0.793913 -0.983315
2 -1.313574  1.946410  0.826350
3  0.015763 -0.267860 -2.228350
4  0.563111  1.195459  0.343168

In [6]: df[df.apply(lambda x: x['b'] > x['c'], axis=1)]
Out[6]:
          a         b         c
1 -0.540628  0.793913 -0.983315
2 -1.313574  1.946410  0.826350
3  0.015763 -0.267860 -2.228350
4  0.563111  1.195459  0.343168
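The same pattern works with a named function that branches on the row's contents, which is closer to the contrived example in the question. This is only a minimal sketch: the column names `kind`, `x`, `y`, the sample values, and the helper name `keep_row` are all assumptions made for illustration.

```python
import pandas as pd

# Hypothetical sample data; the values are chosen only for illustration.
df = pd.DataFrame({
    "kind": ["ratio", "diff", "ratio"],
    "x": [4.0, 3.0, 1.0],
    "y": [2.0, 2.5, 2.0],
})


def keep_row(row):
    """Return True for rows to keep; `row` is a Series holding one row."""
    if row["kind"] == "ratio":
        val = row["x"] / row["y"]
    else:
        val = row["x"] - row["y"]
    return val > 1


# axis=1 passes each row to keep_row; the result is a boolean mask.
mask = df.apply(keep_row, axis=1)
filtered = df[mask]
```

Because `keep_row` is plain Python, it can hold any per-row logic, at the cost of being slower than a vectorized expression on large frames.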

answered Jul 13 '12 at 17:33

duckworthd

6,206114461

12

There is no need for apply in this situation. A regular boolean index will work just fine: df[df['b'] > df['c']]. There are very few situations that actually require apply, and even fewer that need it with axis=1.

– Ted Petrou
Nov 6 '17 at 17:28

@TedPetrou What if you're not sure that every element in your DataFrame is of the right type? Does a regular boolean index support exception handling?

– D. Ror.
Oct 23 '18 at 19:48
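A possible way to combine both comments, sketched under the assumption that "wrong type" means a value that cannot be parsed as a number: coerce the columns with `pd.to_numeric(..., errors="coerce")` first, then use the plain boolean index. The sample data below is hypothetical.

```python
import pandas as pd

# Hypothetical data: column "b" contains one value that is not a number.
df = pd.DataFrame({"b": [1.0, "oops", 3.0], "c": [0.5, 0.1, 5.0]})

# Coerce both columns to numeric; values that cannot be parsed become NaN.
b = pd.to_numeric(df["b"], errors="coerce")
c = pd.to_numeric(df["c"], errors="coerce")

# Vectorized comparison, no apply needed. Comparisons with NaN are False,
# so the malformed row is dropped instead of raising an exception.
filtered = df[b > c]
```

If the malformed rows should be reported rather than silently dropped, `b.isna()` identifies them before filtering.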


Suppose I had a DataFrame as follows:

In [39]: df
Out[39]:
      mass1     mass2  velocity
0  1.461711 -0.404452  0.722502
1 -2.169377  1.131037  0.232047
2  0.009450 -0.868753  0.598470
3  0.602463  0.299249  0.474564
4 -0.675339 -0.816702  0.799289

I can use np.sin and DataFrame.prod to create a boolean mask:

In [40]: mask = (np.sin(df.velocity) / df.iloc[:, 0:2].prod(axis=1)) > 0

In [41]: mask
Out[41]:
0    False
1    False
2    False
3     True
4     True

Then use the mask to select from the DataFrame:

In [42]: df[mask]
Out[42]:
      mass1     mass2  velocity
3  0.602463  0.299249  0.474564
4 -0.675339 -0.816702  0.799289

answered Jul 10 '12 at 19:35

Chang She

11.1k33322

2

Actually, this was probably a bad example: np.sin automatically broadcasts to all elements. What if I replaced it with a less intelligent function that could only handle one input at a time?

– duckworthd
Jul 10 '12 at 21:07
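For a function that only handles one input at a time, one option is to map it over the column element by element and keep the rest of the mask vectorized. The sketch below uses `math.sin` as a stand-in for such a scalar-only function; the sample values are hypothetical and only mirror the column names from the answer above.

```python
import math

import pandas as pd

# Hypothetical data mirroring the columns used in the answer above.
df = pd.DataFrame({
    "mass1": [1.0, -2.0, 0.5],
    "mass2": [-0.5, 1.0, 0.25],
    "velocity": [0.7, 0.2, 0.4],
})

# math.sin only accepts one number at a time, so map it over the column
# element by element instead of calling it on the whole Series.
sin_velocity = df["velocity"].map(math.sin)

# The remaining arithmetic and the comparison stay vectorized.
mask = (sin_velocity / df[["mass1", "mass2"]].prod(axis=1)) > 0
filtered = df[mask]
```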


Specify reduce=True to handle empty DataFrames as well.

import pandas as pd

t = pd.DataFrame(columns=['a', 'b'])
t[t.apply(lambda x: x['a'] > 1, axis=1, reduce=True)]

https://crosscompute.com/n/jAbsB6OIm6oCCJX9PBIbY5FECFKCClyV/-/apply-custom-filter-on-rows-of-dataframe

edited May 16 '18 at 21:22

answered Oct 21 '17 at 17:31

Roy Hyunjin Han

3,35621917


I cannot comment on duckworthd's answer, but it does not work perfectly. It crashes when the DataFrame is empty:

df = pandas.DataFrame(columns=['a', 'b', 'c'])
df[df.apply(lambda x: x['b'] > x['c'], axis=1)]

Outputs:

ValueError: Must pass DataFrame with boolean values only

To me it looks like a bug in pandas, since an empty result is definitely a valid set of boolean values.
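One way to sidestep the empty-frame problem, sketched here under the assumption that row-by-row evaluation is acceptable, is to build the mask yourself as a Series with an explicit bool dtype, so it is a valid boolean indexer even when there are no rows. The helper name `filter_rows` is hypothetical.

```python
import pandas as pd


def filter_rows(df, predicate):
    """Keep the rows of `df` for which `predicate(row)` is truthy."""
    # Build the mask explicitly so it is a boolean Series even when
    # `df` has no rows (the list is then simply empty).
    mask = pd.Series(
        [bool(predicate(row)) for _, row in df.iterrows()],
        index=df.index,
        dtype=bool,
    )
    return df[mask]


empty = pd.DataFrame(columns=["a", "b", "c"])
full = pd.DataFrame({"a": [1, 2], "b": [5, 1], "c": [3, 4]})

empty_result = filter_rows(empty, lambda row: row["b"] > row["c"])
full_result = filter_rows(full, lambda row: row["b"] > row["c"])
```

The empty input comes back as an empty frame with its columns intact, and the non-empty input is filtered as usual.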

edited May 23 '17 at 12:34

Community♦

answered Jul 10 '15 at 12:16

cglacet

1,619820


The best approach I've found is, instead of using reduce=True to avoid errors for an empty df (since this argument is deprecated anyway), to check that the df is non-empty before applying the filter:

def my_filter(row):
    # `something` is the value you want to match
    return row.columnA == something

if len(df.index) > 0:
    df = df[df.apply(my_filter, axis=1)]
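The length check can be wrapped in a small helper so callers always get a DataFrame back. This is a sketch only: the function name `filter_if_not_empty`, the column `column_a`, and the target value `1` are placeholders, not part of the original answer.

```python
import pandas as pd


def filter_if_not_empty(df, predicate):
    """Apply a row-wise filter, skipping the apply call for empty frames."""
    if len(df.index) == 0:
        # Nothing to filter; return a copy so the caller gets a new object.
        return df.copy()
    return df[df.apply(predicate, axis=1)]


# `column_a` and the target value 1 are placeholders for illustration.
df = pd.DataFrame({"column_a": [1, 2, 1]})
empty = pd.DataFrame(columns=["column_a"])

kept = filter_if_not_empty(df, lambda row: row.column_a == 1)
kept_empty = filter_if_not_empty(empty, lambda row: row.column_a == 1)
```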

answered Jan 14 at 19:04

user553965

585612
