


How are iloc, ix and loc different?











478















Can someone explain how these three methods of slicing are different?

I've seen the docs, and I've seen these answers, but I still find myself unable to explain how the three are different. To me, they seem largely interchangeable, because they are at the lower levels of slicing.



For example, say we want to get the first five rows of a DataFrame. How is it that all three of these work?



df.loc[:5]
df.ix[:5]
df.iloc[:5]


Can someone present three cases where the distinction in use is clearer?





























  • 3





    very important to mention the SettingWithCopyWarning scenarios: stackoverflow.com/questions/20625582/… and stackoverflow.com/questions/23688307/…

    – Paul
    May 20 '16 at 13:08






  • 6





    Note that ix is now planned for deprecation: github.com/pandas-dev/pandas/issues/14218

    – JohnE
    Dec 20 '16 at 17:57






  • 4





    For those new to this question, you can check my new solution with a very detailed explanation: stackoverflow.com/a/46915810/3707607

    – Ted Petrou
    Oct 24 '17 at 17:03
























python pandas indexing dataframe














edited Mar 8 at 17:41 by nbro

asked Jul 23 '15 at 16:34 by AZhao



















4 Answers


















739














Note: in pandas version 0.20.0 and above, ix is deprecated and the use of loc and iloc is encouraged instead. I have left the parts of this answer that describe ix intact as a reference for users of earlier versions of pandas. Examples have been added below showing alternatives to ix.




First, here's a recap of the three methods:




  • loc gets rows (or columns) with particular labels from the index.


  • iloc gets rows (or columns) at particular positions in the index (so it only takes integers).


  • ix usually tries to behave like loc but falls back to behaving like iloc if a label is not present in the index.

It's important to note some subtleties that can make ix slightly tricky to use:



  • if the index is of integer type, ix will only use label-based indexing and not fall back to position-based indexing. If the label is not in the index, an error is raised.


  • if the index does not contain only integers, then given an integer, ix will immediately use position-based indexing rather than label-based indexing. If however ix is given another type (e.g. a string), it can use label-based indexing.



To illustrate the differences between the three methods, consider the following Series:



>>> import numpy as np
>>> import pandas as pd
>>> s = pd.Series(np.nan, index=[49,48,47,46,45, 1, 2, 3, 4, 5])
>>> s
49 NaN
48 NaN
47 NaN
46 NaN
45 NaN
1 NaN
2 NaN
3 NaN
4 NaN
5 NaN


We'll look at slicing with the integer value 3.



In this case, s.iloc[:3] returns the first 3 rows (since it treats 3 as a position) and s.loc[:3] returns the first 8 rows (since it treats 3 as a label):



>>> s.iloc[:3] # slice the first three rows
49 NaN
48 NaN
47 NaN

>>> s.loc[:3] # slice up to and including label 3
49 NaN
48 NaN
47 NaN
46 NaN
45 NaN
1 NaN
2 NaN
3 NaN

>>> s.ix[:3] # the integer is in the index so s.ix[:3] works like loc
49 NaN
48 NaN
47 NaN
46 NaN
45 NaN
1 NaN
2 NaN
3 NaN


Notice s.ix[:3] returns the same Series as s.loc[:3] since it looks for the label first rather than working on the position (and the index for s is of integer type).
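On pandas versions that no longer ship ix, the loc and iloc half of this comparison can still be checked directly. The following is a minimal sketch using the same Series; the variable names by_position and by_label are illustrative only.

```python
import numpy as np
import pandas as pd

# Same Series as above: integer labels that are deliberately out of order.
s = pd.Series(np.nan, index=[49, 48, 47, 46, 45, 1, 2, 3, 4, 5])

by_position = s.iloc[:3]  # positions 0, 1, 2; the stop position is excluded
by_label = s.loc[:3]      # everything up to and including the label 3

print(by_position.index.tolist())  # [49, 48, 47]
print(by_label.index.tolist())     # [49, 48, 47, 46, 45, 1, 2, 3]
```

The same integer 3 therefore yields three rows or eight rows, depending only on which indexer interprets it.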



What if we try with an integer label that isn't in the index (say 6)?



Here s.iloc[:6] returns the first 6 rows of the Series as expected. However, s.loc[:6] raises a KeyError since 6 is not in the index.



>>> s.iloc[:6]
49 NaN
48 NaN
47 NaN
46 NaN
45 NaN
1 NaN

>>> s.loc[:6]
KeyError: 6

>>> s.ix[:6]
KeyError: 6


As per the subtleties noted above, s.ix[:6] now raises a KeyError because it tries to work like loc but can't find a 6 in the index. Because our index is of integer type ix doesn't fall back to behaving like iloc.
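As far as I can tell, the KeyError from loc here depends on the index being unsorted: with a sorted index, pandas can work out where a missing label would fall and the slice succeeds. The sketch below illustrates both cases; the names raised_key_error and sorted_slice are made up for the example.

```python
import numpy as np
import pandas as pd

s = pd.Series(np.nan, index=[49, 48, 47, 46, 45, 1, 2, 3, 4, 5])

# Unsorted index: the missing label 6 cannot be placed, so loc raises.
try:
    s.loc[:6]
    raised_key_error = False
except KeyError:
    raised_key_error = True

# Sorted index: pandas can locate where 6 would sit, so the slice works
# and returns every label that sorts at or before 6.
sorted_slice = s.sort_index().loc[:6]

print(raised_key_error)             # True
print(sorted_slice.index.tolist())  # [1, 2, 3, 4, 5]
```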



If, however, our index was of mixed type, given an integer ix would behave like iloc immediately instead of raising a KeyError:



>>> s2 = pd.Series(np.nan, index=['a','b','c','d','e', 1, 2, 3, 4, 5])
>>> s2.index.is_mixed() # index is mix of different types
True
>>> s2.ix[:6] # now behaves like iloc given integer
a NaN
b NaN
c NaN
d NaN
e NaN
1 NaN


Keep in mind that ix can still accept non-integers and behave like loc:



>>> s2.ix[:'c'] # behaves like loc given non-integer
a NaN
b NaN
c NaN


As general advice, if you're only indexing using labels, or only indexing using integer positions, stick with loc or iloc to avoid unexpected results - try not to use ix.




Combining position-based and label-based indexing



Sometimes given a DataFrame, you will want to mix label and positional indexing methods for the rows and columns.



For example, consider the following DataFrame. How best to slice the rows up to and including 'c' and take the first four columns?



>>> df = pd.DataFrame(np.nan,
...                   index=list('abcde'),
...                   columns=['x','y','z', 8, 9])
>>> df
x y z 8 9
a NaN NaN NaN NaN NaN
b NaN NaN NaN NaN NaN
c NaN NaN NaN NaN NaN
d NaN NaN NaN NaN NaN
e NaN NaN NaN NaN NaN


In earlier versions of pandas (before 0.20.0) ix lets you do this quite neatly - we can slice the rows by label and the columns by position (note that for the columns, ix will default to position-based slicing since 4 is not a column name):



>>> df.ix[:'c', :4]
x y z 8
a NaN NaN NaN NaN
b NaN NaN NaN NaN
c NaN NaN NaN NaN


In later versions of pandas, we can achieve this result using iloc and the help of another method:



>>> df.iloc[:df.index.get_loc('c') + 1, :4]
x y z 8
a NaN NaN NaN NaN
b NaN NaN NaN NaN
c NaN NaN NaN NaN


get_loc() is an index method meaning "get the position of the label in this index". Note that since slicing with iloc is exclusive of its endpoint, we must add 1 to this value if we want row 'c' as well.
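The conversion can go in either direction: turn the row label into a position and use iloc, or turn the column positions into labels and use loc. Here is a small sketch of both, assuming the same DataFrame as above; via_iloc and via_loc are illustrative names.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.nan, index=list("abcde"), columns=["x", "y", "z", 8, 9])

# Option 1: translate the row label into a position, then use iloc throughout.
stop = df.index.get_loc("c") + 1  # +1 because iloc excludes its endpoint
via_iloc = df.iloc[:stop, :4]

# Option 2: translate the column positions into labels, then use loc throughout.
via_loc = df.loc[:"c", df.columns[:4]]

print(via_iloc.shape)            # (3, 4)
print(via_iloc.equals(via_loc))  # True
```

Which direction reads better is a matter of taste; the loc form avoids the off-by-one adjustment because label slices include their endpoint.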



There are further examples in pandas' documentation here.


























  • 7





    Great explanation! One related question I've always had is what relation, if any, loc, iloc and ix have with SettingWithCopy warnings? There is some documentation but to be honest I'm still a little confused pandas.pydata.org/pandas-docs/stable/…

    – measureallthethings
    Jul 23 '15 at 18:36






  • 2





    @measureallthethings: loc, iloc and ix might still trigger the warning if they are chained together. Using the example DataFrame in the linked docs dfmi.loc[:, 'one'].loc[:, 'second'] triggers the warning just like dfmi['one']['second'] because a copy of data (rather than a view) might be returned by the first indexing operation.

    – Alex Riley
    Jul 23 '15 at 18:56












  • What do you use if you want to lookup a DateIndex with a Date, or something like df.ix[date, 'Cash']?

    – cjm2671
    Apr 29 '16 at 8:51











  • @cjm2671: either loc or ix should work in that case. For example, df.loc['2016-04-29', 'Cash'] will return all rows with that particular date from the 'Cash' column. (You can be as specific as you like when retrieving indexes with strings, e.g. '2016-01' will select all datetimes falling in January 2016, '2016-01-02 11' will select datetimes on January 2 2016 with time 11:??:??.)

    – Alex Riley
    Apr 29 '16 at 9:18











  • In case you want to update this answer at some point, there are suggestions here for how to use loc/iloc instead of ix github.com/pandas-dev/pandas/issues/14218

    – JohnE
    Dec 20 '16 at 18:00


















104














iloc works based on integer positioning. So no matter what your row labels are, you can always, e.g., get the first row by doing



df.iloc[0]


or the last five rows by doing



df.iloc[-5:]


You can also use it on the columns. This retrieves the 3rd column:



df.iloc[:, 2] # the : in the first position indicates all rows


You can combine them to get intersections of rows and columns:



df.iloc[:3, :3] # The upper-left 3 X 3 entries (assuming df has 3+ rows and columns)


On the other hand, .loc uses named indices. Let's set up a data frame with strings as row and column labels:



df = pd.DataFrame(index=['a', 'b', 'c'], columns=['time', 'date', 'name'])


Then we can get the first row by



df.loc['a'] # equivalent to df.iloc[0]


and the second two rows of the 'date' column by



df.loc['b':, 'date'] # equivalent to df.iloc[1:, 1]


and so on. Now, it's probably worth pointing out that the default row and column indices for a DataFrame are integers from 0, and in this case iloc and loc address the same rows. This is why all three of your examples work. They are not quite equivalent, though: loc slices include the end label, so df.loc[:5] returns the six rows labelled 0 through 5, while df.iloc[:5] returns five. If you had a non-numeric index such as strings or datetimes, df.loc[:5] would raise an error.
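A quick way to see how the two indexers treat the default integer index is to count the rows each slice returns. This is only a sketch; the ten-row frame and the column name "value" are invented for the example.

```python
import pandas as pd

# Ten rows with the default index 0..9.
df = pd.DataFrame({"value": range(10)})

first_by_position = df.iloc[:5]  # positions 0-4, stop excluded
first_by_label = df.loc[:5]      # labels 0-5, end label included

print(len(first_by_position))  # 5
print(len(first_by_label))     # 6
```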



Also, you can do column retrieval just by using the data frame's __getitem__:



df['time'] # equivalent to df.loc[:, 'time']


Now suppose you want to mix position and named indexing, that is, indexing using names on rows and positions on columns (to clarify, I mean select from our data frame, rather than creating a data frame with strings in the row index and integers in the column index). This is where .ix comes in:



df.ix[:2, 'time'] # the first two rows of the 'time' column


I think it's also worth mentioning that you can pass boolean vectors to the loc method as well. For example:



b = [True, False, True]
df.loc[b]


This returns the 1st and 3rd rows of df. It is equivalent to df[b] for selection, but it can also be used for assigning via boolean vectors:



df.loc[b, 'name'] = 'Mary', 'John'
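Put together, boolean selection and assignment with loc might look like the following sketch. It reuses the empty three-row frame from earlier in this answer and passes the new values as a list, which is my choice for the example.

```python
import pandas as pd

df = pd.DataFrame(index=["a", "b", "c"], columns=["time", "date", "name"])
b = [True, False, True]

# Selection: keeps the rows whose mask entry is True.
selected = df.loc[b]

# Assignment: one value per selected row, written into the 'name' column.
df.loc[b, "name"] = ["Mary", "John"]

print(selected.index.tolist())  # ['a', 'c']
print(df["name"].tolist())
```

Row 'b' is left untouched because its mask entry is False.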






























  • Is df.iloc[:, :] equivalent to all rows and columns?

    – Alvis
    May 3 '17 at 10:03











  • It is, as would be df.loc[:, :]. It can be used to re-assign the values of the entire DataFrame or create a view of it.

    – JoeCondron
    May 3 '17 at 20:45



















77














In my opinion, the accepted answer is confusing, since it uses a DataFrame with only missing values. I also do not like the term position-based for .iloc and instead, prefer integer location as it is much more descriptive and exactly what .iloc stands for. The key word is INTEGER - .iloc needs INTEGERS.



See my extremely detailed blog series on subset selection for more




.ix is deprecated and ambiguous and should never be used



Because .ix is deprecated we will only focus on the differences between .loc and .iloc.



Before we talk about the differences, it is important to understand that DataFrames have labels that help identify each column and each index. Let's take a look at a sample DataFrame:



import pandas as pd

df = pd.DataFrame({'age':[30, 2, 12, 4, 32, 33, 69],
                   'color':['blue', 'green', 'red', 'white', 'gray', 'black', 'red'],
                   'food':['Steak', 'Lamb', 'Mango', 'Apple', 'Cheese', 'Melon', 'Beans'],
                   'height':[165, 70, 120, 80, 180, 172, 150],
                   'score':[4.6, 8.3, 9.0, 3.3, 1.8, 9.5, 2.2],
                   'state':['NY', 'TX', 'FL', 'AL', 'AK', 'TX', 'TX']},
                  index=['Jane', 'Nick', 'Aaron', 'Penelope', 'Dean', 'Christina', 'Cornelia'])


Every column name and every index value in this DataFrame is a label. The labels age, color, food, height, score and state are used for the columns. The other labels, Jane, Nick, Aaron, Penelope, Dean, Christina and Cornelia, are used for the index.




The primary ways to select particular rows in a DataFrame are with the .loc and .iloc indexers. Each of these indexers can also be used to simultaneously select columns, but it is easier to just focus on rows for now. Also, each indexer uses a set of brackets immediately following its name to make its selections.



.loc selects data only by labels



We will first talk about the .loc indexer which only selects data by the index or column labels. In our sample DataFrame, we have provided meaningful names as values for the index. Many DataFrames will not have any meaningful names and will instead, default to just the integers from 0 to n-1, where n is the length of the DataFrame.



There are three different inputs you can use for .loc



  • A string

  • A list of strings

  • Slice notation using strings as the start and stop values

Selecting a single row with .loc with a string



To select a single row of data, place the index label inside of the brackets following .loc.



df.loc['Penelope']


This returns the row of data as a Series



age 4
color white
food Apple
height 80
score 3.3
state AL
Name: Penelope, dtype: object


Selecting multiple rows with .loc with a list of strings



df.loc[['Cornelia', 'Jane', 'Dean']]


This returns a DataFrame with the rows in the order specified in the list (Cornelia, Jane, Dean).



Selecting multiple rows with .loc with slice notation



Slice notation is defined by start, stop and step values. When slicing by label, pandas includes the stop value in the return. The following slices from Aaron to Dean, inclusive. Its step size is not explicitly defined, so it defaults to 1.



df.loc['Aaron':'Dean']





Complex slices can be taken in the same manner as Python lists.
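The three kinds of .loc input can be tried side by side. The sketch below rebuilds the sample DataFrame so that it runs on its own; the names single, several and sliced are illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "age": [30, 2, 12, 4, 32, 33, 69],
        "color": ["blue", "green", "red", "white", "gray", "black", "red"],
        "food": ["Steak", "Lamb", "Mango", "Apple", "Cheese", "Melon", "Beans"],
        "height": [165, 70, 120, 80, 180, 172, 150],
        "score": [4.6, 8.3, 9.0, 3.3, 1.8, 9.5, 2.2],
        "state": ["NY", "TX", "FL", "AL", "AK", "TX", "TX"],
    },
    index=["Jane", "Nick", "Aaron", "Penelope", "Dean", "Christina", "Cornelia"],
)

single = df.loc["Penelope"]                     # a string: one row, as a Series
several = df.loc[["Cornelia", "Jane", "Dean"]]  # a list: rows in the listed order
sliced = df.loc["Aaron":"Dean"]                 # a slice: the stop label is included

print(type(single).__name__)   # Series
print(several.index.tolist())  # ['Cornelia', 'Jane', 'Dean']
print(sliced.index.tolist())   # ['Aaron', 'Penelope', 'Dean']
```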



.iloc selects data only by integer location



Let's now turn to .iloc. Every row and column of data in a DataFrame has an integer location that defines it. This is in addition to the label that is visually displayed in the output. The integer location is simply the number of rows/columns from the top/left beginning at 0.



There are three different inputs you can use for .iloc



  • An integer

  • A list of integers

  • Slice notation using integers as the start and stop values

Selecting a single row with .iloc with an integer



df.iloc[4]


This returns the 5th row (integer location 4) as a Series



age 32
color gray
food Cheese
height 180
score 1.8
state AK
Name: Dean, dtype: object


Selecting multiple rows with .iloc with a list of integers



df.iloc[[2, -2]]


This returns a DataFrame of the third and second-to-last rows (Aaron and Christina).



Selecting multiple rows with .iloc with slice notation



df.iloc[:5:3]


This selects every third row among the first five, that is, integer locations 0 and 3 (Jane and Penelope).
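The same three kinds of input can be checked for .iloc. Again this is a self-contained sketch that rebuilds the sample DataFrame; the result names are illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "age": [30, 2, 12, 4, 32, 33, 69],
        "color": ["blue", "green", "red", "white", "gray", "black", "red"],
        "food": ["Steak", "Lamb", "Mango", "Apple", "Cheese", "Melon", "Beans"],
        "height": [165, 70, 120, 80, 180, 172, 150],
        "score": [4.6, 8.3, 9.0, 3.3, 1.8, 9.5, 2.2],
        "state": ["NY", "TX", "FL", "AL", "AK", "TX", "TX"],
    },
    index=["Jane", "Nick", "Aaron", "Penelope", "Dean", "Christina", "Cornelia"],
)

single = df.iloc[4]         # an integer: the 5th row, as a Series
several = df.iloc[[2, -2]]  # a list: negative integers count from the end
sliced = df.iloc[:5:3]      # a slice: stop excluded, every third row

print(single.name)             # Dean
print(several.index.tolist())  # ['Aaron', 'Christina']
print(sliced.index.tolist())   # ['Jane', 'Penelope']
```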




Simultaneous selection of rows and columns with .loc and .iloc



A useful feature of both .loc and .iloc is that they can select rows and columns simultaneously. In the examples above, all the columns were returned from each selection. We can choose columns with the same types of inputs as we do for rows. We simply need to separate the row and column selection with a comma.



For example, we can select rows Jane and Dean with just the columns height, score and state like this:



df.loc[['Jane', 'Dean'], 'height':]


This uses a list of labels for the rows and slice notation for the columns.



We can naturally do similar operations with .iloc using only integers.



df.iloc[[1,4], 2]
Nick Lamb
Dean Cheese
Name: food, dtype: object
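Both simultaneous selections can be verified with a short script. This sketch rebuilds the sample DataFrame; by_label and by_position are names chosen for the example.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "age": [30, 2, 12, 4, 32, 33, 69],
        "color": ["blue", "green", "red", "white", "gray", "black", "red"],
        "food": ["Steak", "Lamb", "Mango", "Apple", "Cheese", "Melon", "Beans"],
        "height": [165, 70, 120, 80, 180, 172, 150],
        "score": [4.6, 8.3, 9.0, 3.3, 1.8, 9.5, 2.2],
        "state": ["NY", "TX", "FL", "AL", "AK", "TX", "TX"],
    },
    index=["Jane", "Nick", "Aaron", "Penelope", "Dean", "Christina", "Cornelia"],
)

# Rows by a list of labels, columns by a label slice running to the last column.
by_label = df.loc[["Jane", "Dean"], "height":]

# Rows by a list of integer locations, a single column by integer location.
by_position = df.iloc[[1, 4], 2]

print(by_label.columns.tolist())  # ['height', 'score', 'state']
print(by_position.tolist())       # ['Lamb', 'Cheese']
```

Note that a single column selector returns a Series, while a list or slice of columns returns a DataFrame.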



Simultaneous selection with labels and integer location



.ix was used to make selections simultaneously with labels and integer location, which was useful but at times confusing and ambiguous; thankfully it has been deprecated. If you need to make a selection with a mix of labels and integer locations, you will have to convert so that both selections are labels or both are integer locations.



For instance, if we want to select rows Nick and Cornelia along with columns 2 and 4, we could use .loc by converting the integers to labels with the following:



col_names = df.columns[[2, 4]]
df.loc[['Nick', 'Cornelia'], col_names]


Or alternatively, convert the index labels to integers with the get_loc index method.



labels = ['Nick', 'Cornelia']
index_ints = [df.index.get_loc(label) for label in labels]
df.iloc[index_ints, [2, 4]]
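The two conversions should give the same result, which can be confirmed with a sketch like this one. It also uses Index.get_indexer, which looks up several labels in one call; that substitution for the list comprehension is my own suggestion.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "age": [30, 2, 12, 4, 32, 33, 69],
        "color": ["blue", "green", "red", "white", "gray", "black", "red"],
        "food": ["Steak", "Lamb", "Mango", "Apple", "Cheese", "Melon", "Beans"],
        "height": [165, 70, 120, 80, 180, 172, 150],
        "score": [4.6, 8.3, 9.0, 3.3, 1.8, 9.5, 2.2],
        "state": ["NY", "TX", "FL", "AL", "AK", "TX", "TX"],
    },
    index=["Jane", "Nick", "Aaron", "Penelope", "Dean", "Christina", "Cornelia"],
)
labels = ["Nick", "Cornelia"]

# Integers -> labels, then select with .loc.
col_names = df.columns[[2, 4]]
via_loc = df.loc[labels, col_names]

# Labels -> integers, then select with .iloc.
row_positions = df.index.get_indexer(labels)
via_iloc = df.iloc[row_positions, [2, 4]]

print(col_names.tolist())        # ['food', 'score']
print(via_loc.equals(via_iloc))  # True
```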


Boolean Selection



The .loc indexer can also do boolean selection. For instance, if we are interested in finding all the rows where age is above 30 and returning just the food and score columns, we can do the following:



df.loc[df['age'] > 30, ['food', 'score']] 


You can replicate this with .iloc, but you cannot pass it a boolean Series. You must convert the boolean Series into a numpy array like this:



df.iloc[(df['age'] > 30).values, [2, 4]] 
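The boolean selections can be compared as follows. This sketch rebuilds the sample DataFrame and uses Series.to_numpy() in place of .values, which is my substitution; both produce a numpy array.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "age": [30, 2, 12, 4, 32, 33, 69],
        "color": ["blue", "green", "red", "white", "gray", "black", "red"],
        "food": ["Steak", "Lamb", "Mango", "Apple", "Cheese", "Melon", "Beans"],
        "height": [165, 70, 120, 80, 180, 172, 150],
        "score": [4.6, 8.3, 9.0, 3.3, 1.8, 9.5, 2.2],
        "state": ["NY", "TX", "FL", "AL", "AK", "TX", "TX"],
    },
    index=["Jane", "Nick", "Aaron", "Penelope", "Dean", "Christina", "Cornelia"],
)

mask = df["age"] > 30  # boolean Series aligned on the index

by_loc = df.loc[mask, ["food", "score"]]     # .loc accepts the Series directly
by_iloc = df.iloc[mask.to_numpy(), [2, 4]]   # .iloc needs a plain boolean array

print(by_loc.index.tolist())   # ['Dean', 'Christina', 'Cornelia']
print(by_loc.equals(by_iloc))  # True
```

Jane is excluded because her age is exactly 30, not above it.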



Selecting all rows



It is possible to use .loc/.iloc for just column selection. You can select all the rows by using a colon like this:



df.loc[:, 'color':'score':2]






The indexing operator, [], can select rows and columns too but not simultaneously.



Most people are familiar with the primary purpose of the DataFrame indexing operator, which is to select columns. A string selects a single column as a Series and a list of strings selects multiple columns as a DataFrame.



df['food']

Jane Steak
Nick Lamb
Aaron Mango
Penelope Apple
Dean Cheese
Christina Melon
Cornelia Beans
Name: food, dtype: object


Using a list selects multiple columns



df[['food', 'score']]





What people are less familiar with is that, when slice notation is used, selection happens by row labels or by integer location. This is very confusing and something that I almost never use, but it does work.



df['Penelope':'Christina'] # slice rows by label





df[2:6:2] # slice rows by integer location





The explicitness of .loc/.iloc for selecting rows is highly preferred. The indexing operator alone is unable to select rows and columns simultaneously.



df[3:5, 'color']
TypeError: unhashable type: 'slice'
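The four working uses of the plain indexing operator can be summarised in one sketch. It rebuilds the sample DataFrame and leaves out the failing df[3:5, 'color'] call, since the exact exception it raises may differ between pandas versions.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "age": [30, 2, 12, 4, 32, 33, 69],
        "color": ["blue", "green", "red", "white", "gray", "black", "red"],
        "food": ["Steak", "Lamb", "Mango", "Apple", "Cheese", "Melon", "Beans"],
        "height": [165, 70, 120, 80, 180, 172, 150],
        "score": [4.6, 8.3, 9.0, 3.3, 1.8, 9.5, 2.2],
        "state": ["NY", "TX", "FL", "AL", "AK", "TX", "TX"],
    },
    index=["Jane", "Nick", "Aaron", "Penelope", "Dean", "Christina", "Cornelia"],
)

one_column = df["food"]                   # string: a column, as a Series
two_columns = df[["food", "score"]]       # list of strings: columns, as a DataFrame
rows_by_label = df["Penelope":"Christina"]  # label slice: rows, stop included
rows_by_position = df[2:6:2]              # integer slice: rows, stop excluded

print(type(one_column).__name__)        # Series
print(two_columns.columns.tolist())     # ['food', 'score']
print(rows_by_label.index.tolist())     # ['Penelope', 'Dean', 'Christina']
print(rows_by_position.index.tolist())  # ['Aaron', 'Dean']
```

The switch from selecting columns to selecting rows happens purely because a slice was passed, which is the ambiguity that .loc and .iloc avoid.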





































    -2














    ix comes from earlier versions of pandas; recent versions rely on iloc and loc instead.



    • ix: this selects data from the DataFrame using either labels or integer row and column positions in a single call. This caused problems in cases where the column index and row index both combined numbers and string labels.

      Example: df.ix[:2, 'time']

    Now for loc.



    • This selects data using labels as the index, whether for columns or rows.

      Example: df.loc[:, 'color':'score':2]

    Now for iloc.



    • Here we provide both the row and the column as integer positions.

      Example: df.iloc[[1,4], 2]





    share|improve this answer

























      Your Answer






      StackExchange.ifUsing("editor", function ()
      StackExchange.using("externalEditor", function ()
      StackExchange.using("snippets", function ()
      StackExchange.snippets.init();
      );
      );
      , "code-snippets");

      StackExchange.ready(function()
      var channelOptions =
      tags: "".split(" "),
      id: "1"
      ;
      initTagRenderer("".split(" "), "".split(" "), channelOptions);

      StackExchange.using("externalEditor", function()
      // Have to fire editor after snippets, if snippets enabled
      if (StackExchange.settings.snippets.snippetsEnabled)
      StackExchange.using("snippets", function()
      createEditor();
      );

      else
      createEditor();

      );

      function createEditor()
      StackExchange.prepareEditor(
      heartbeatType: 'answer',
      autoActivateHeartbeat: false,
      convertImagesToLinks: true,
      noModals: true,
      showLowRepImageUploadWarning: true,
      reputationToPostImages: 10,
      bindNavPrevention: true,
      postfix: "",
      imageUploader:
      brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
      contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
      allowUrls: true
      ,
      onDemand: true,
      discardSelector: ".discard-answer"
      ,immediatelyShowMarkdownHelp:true
      );



      );













      draft saved

      draft discarded


















      StackExchange.ready(
      function ()
      StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f31593201%2fhow-are-iloc-ix-and-loc-different%23new-answer', 'question_page');

      );

      Post as a guest















      Required, but never shown

























      4 Answers
      4






      active

      oldest

      votes








      4 Answers
      4






      active

      oldest

      votes









      active

      oldest

      votes






      active

      oldest

      votes









      739














      Note: in pandas version 0.20.0 and above, ix is deprecated and the use of loc and iloc is encouraged instead. I have left the parts of this answer that describe ix intact as a reference for users of earlier versions of pandas. Examples have been added below showing alternatives to ix.




      First, here's a recap of the three methods:




      • loc gets rows (or columns) with particular labels from the index.


      • iloc gets rows (or columns) at particular positions in the index (so it only takes integers).


      • ix usually tries to behave like loc but falls back to behaving like iloc if a label is not present in the index.

      It's important to note some subtleties that can make ix slightly tricky to use:



      • if the index is of integer type, ix will only use label-based indexing and not fall back to position-based indexing. If the label is not in the index, an error is raised.


      • if the index does not contain only integers, then given an integer, ix will immediately use position-based indexing rather than label-based indexing. If however ix is given another type (e.g. a string), it can use label-based indexing.



      To illustrate the differences between the three methods, consider the following Series:



      >>> s = pd.Series(np.nan, index=[49,48,47,46,45, 1, 2, 3, 4, 5])
      >>> s
      49 NaN
      48 NaN
      47 NaN
      46 NaN
      45 NaN
      1 NaN
      2 NaN
      3 NaN
      4 NaN
      5 NaN


      We'll look at slicing with the integer value 3.



      In this case, s.iloc[:3] returns us the first 3 rows (since it treats 3 as a position) and s.loc[:3] returns us the first 8 rows (since it treats 3 as a label):



      >>> s.iloc[:3] # slice the first three rows
      49 NaN
      48 NaN
      47 NaN

      >>> s.loc[:3] # slice up to and including label 3
      49 NaN
      48 NaN
      47 NaN
      46 NaN
      45 NaN
      1 NaN
      2 NaN
      3 NaN

      >>> s.ix[:3] # the integer is in the index so s.ix[:3] works like loc
      49 NaN
      48 NaN
      47 NaN
      46 NaN
      45 NaN
      1 NaN
      2 NaN
      3 NaN


      Notice s.ix[:3] returns the same Series as s.loc[:3] since it looks for the label first rather than working on the position (and the index for s is of integer type).



      What if we try with an integer label that isn't in the index (say 6)?



      Here s.iloc[:6] returns the first 6 rows of the Series as expected. However, s.loc[:6] raises a KeyError since 6 is not in the index.



      >>> s.iloc[:6]
      49 NaN
      48 NaN
      47 NaN
      46 NaN
      45 NaN
      1 NaN

      >>> s.loc[:6]
      KeyError: 6

      >>> s.ix[:6]
      KeyError: 6


      As per the subtleties noted above, s.ix[:6] now raises a KeyError because it tries to work like loc but can't find a 6 in the index. Because our index is of integer type, ix doesn't fall back to behaving like iloc.
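      The KeyError for a missing label depends on the index being unsorted. The following is a minimal sketch, assuming a recent pandas release in which only loc and iloc are available; the names raised, sorted_s and upto6 are made up for illustration:

```python
import numpy as np
import pandas as pd

s = pd.Series(np.nan, index=[49, 48, 47, 46, 45, 1, 2, 3, 4, 5])

# Unsorted index: the stop label 6 is missing, so the slice cannot be resolved.
try:
    s.loc[:6]
    raised = False
except KeyError:
    raised = True

# Sorted index: pandas can locate where 6 would fall, so the slice succeeds
# and returns every label up to (and, if present, including) 6.
sorted_s = s.sort_index()
upto6 = sorted_s.loc[:6]
print(raised, list(upto6.index))
```

      So the error is a property of slicing an unsorted index with a missing label, not of loc in general.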



      If, however, our index were of mixed type, then ix, given an integer, would behave like iloc immediately instead of raising a KeyError:



      >>> s2 = pd.Series(np.nan, index=['a','b','c','d','e', 1, 2, 3, 4, 5])
      >>> s2.index.is_mixed() # index is mix of different types
      True
      >>> s2.ix[:6] # now behaves like iloc given integer
      a NaN
      b NaN
      c NaN
      d NaN
      e NaN
      1 NaN


      Keep in mind that ix can still accept non-integers and behave like loc:



      >>> s2.ix[:'c'] # behaves like loc given non-integer
      a NaN
      b NaN
      c NaN


      As general advice, if you're only indexing using labels, or only indexing using integer positions, stick with loc or iloc to avoid unexpected results, and try not to use ix.




      Combining position-based and label-based indexing



      Sometimes given a DataFrame, you will want to mix label and positional indexing methods for the rows and columns.



      For example, consider the following DataFrame. How best to slice the rows up to and including 'c' and take the first four columns?



      >>> df = pd.DataFrame(np.nan,
      ...                   index=list('abcde'),
      ...                   columns=['x','y','z', 8, 9])
      >>> df
      x y z 8 9
      a NaN NaN NaN NaN NaN
      b NaN NaN NaN NaN NaN
      c NaN NaN NaN NaN NaN
      d NaN NaN NaN NaN NaN
      e NaN NaN NaN NaN NaN


      In earlier versions of pandas (before 0.20.0), ix lets you do this quite neatly: we can slice the rows by label and the columns by position (note that for the columns, ix will default to position-based slicing since 4 is not a column name):



      >>> df.ix[:'c', :4]
      x y z 8
      a NaN NaN NaN NaN
      b NaN NaN NaN NaN
      c NaN NaN NaN NaN


      In later versions of pandas, we can achieve this result using iloc and the help of another method:



      >>> df.iloc[:df.index.get_loc('c') + 1, :4]
      x y z 8
      a NaN NaN NaN NaN
      b NaN NaN NaN NaN
      c NaN NaN NaN NaN


      get_loc() is an index method meaning "get the position of the label in this index". Note that since slicing with iloc is exclusive of its endpoint, we must add 1 to this value if we want row 'c' as well.
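      The same selection can also be written the other way round, with loc doing the work and the positional part converted to labels. This is a sketch of one possible alternative rather than the only idiom; by_iloc and by_loc are names chosen here for illustration:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.nan,
                  index=list('abcde'),
                  columns=['x', 'y', 'z', 8, 9])

# Position-based: turn the row label 'c' into a position, then slice with iloc.
by_iloc = df.iloc[:df.index.get_loc('c') + 1, :4]

# Label-based: turn the first four column positions into labels, then use loc.
# loc includes the stop label, so no "+ 1" is needed for the rows.
by_loc = df.loc[:'c', df.columns[:4]]

print(list(by_loc.index), list(by_loc.columns))
```

      Which form reads better depends on whether most of the selection is naturally expressed in labels or in positions.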



      There are further examples in pandas' documentation.


























      • 7





        Great explanation! One related question I've always had is what relation, if any, loc, iloc and ix have with SettingWithCopy warnings? There is some documentation but to be honest I'm still a little confused pandas.pydata.org/pandas-docs/stable/…

        – measureallthethings
        Jul 23 '15 at 18:36






      • 2





        @measureallthethings: loc, iloc and ix might still trigger the warning if they are chained together. Using the example DataFrame in the linked docs dfmi.loc[:, 'one'].loc[:, 'second'] triggers the warning just like dfmi['one']['second'] because a copy of data (rather than a view) might be returned by the first indexing operation.

        – Alex Riley
        Jul 23 '15 at 18:56












      • What do you use if you want to lookup a DateIndex with a Date, or something like df.ix[date, 'Cash']?

        – cjm2671
        Apr 29 '16 at 8:51











      • @cjm2671: both loc and ix should work in that case. For example, df.loc['2016-04-29', 'Cash'] will return all row indexes with that particular date from the 'Cash' column. (You can be as specific as you like when retrieving indexes with strings, e.g. '2016-01' will select all datetimes falling in January 2016, '2016-01-02 11' will select datetimes on January 2 2016 with time 11:??:??.)

        – Alex Riley
        Apr 29 '16 at 9:18











      • In case you want to update this answer at some point, there are suggestions here for how to use loc/iloc instead of ix github.com/pandas-dev/pandas/issues/14218

        – JohnE
        Dec 20 '16 at 18:00























      edited Dec 16 '17 at 19:06

























      answered Jul 23 '15 at 16:59









      Alex Riley




















      104














      iloc works based on integer positioning. So no matter what your row labels are, you can always, e.g., get the first row by doing



      df.iloc[0]


      or the last five rows by doing



      df.iloc[-5:]


      You can also use it on the columns. This retrieves the 3rd column:



      df.iloc[:, 2] # the : in the first position indicates all rows


      You can combine them to get intersections of rows and columns:



      df.iloc[:3, :3] # The upper-left 3 X 3 entries (assuming df has 3+ rows and columns)
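      To make the positional examples concrete, here is a small sketch on a made-up 6x4 frame. The frame demo and its labels are assumptions for illustration only; the point is that the row labels play no part in any of these selections:

```python
import numpy as np
import pandas as pd

# Hypothetical frame: string row labels, values 0..23 laid out row by row.
demo = pd.DataFrame(
    np.arange(24).reshape(6, 4),
    index=list('uvwxyz'),
    columns=['c0', 'c1', 'c2', 'c3'],
)

first_row = demo.iloc[0]      # the row labelled 'u', returned as a Series
last_five = demo.iloc[-5:]    # rows 'v' through 'z'
third_col = demo.iloc[:, 2]   # the column labelled 'c2'
corner = demo.iloc[:3, :3]    # upper-left 3x3 block

print(first_row.tolist(), list(last_five.index), third_col.name, corner.shape)
```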


      On the other hand, .loc uses named indices. Let's set up a data frame with strings as row and column labels:



      df = pd.DataFrame(index=['a', 'b', 'c'], columns=['time', 'date', 'name'])


      Then we can get the first row by



      df.loc['a'] # equivalent to df.iloc[0]


      and the last two rows of the 'date' column by



      df.loc['b':, 'date'] # equivalent to df.iloc[1:, 1]


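      and so on. Now, it's probably worth pointing out that the default row and column indices for a DataFrame are integers from 0, and in this case a given integer refers to the same row whether it is read as a label (loc) or as a position (iloc). Slices still differ at the endpoint: loc includes the stop label, whereas iloc excludes the stop position. This is why your three examples are nearly equivalent. If you had a non-numeric index such as strings or datetimes, df.loc[:5] would raise an error.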
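      A short sketch of how loc and iloc compare on a default integer index, and of the error on a string index. The frames df_default and df_str are hypothetical, and the exact exception type may vary between pandas versions, so the code only records that one was raised:

```python
import pandas as pd

# Default RangeIndex 0..9: labels and positions coincide.
df_default = pd.DataFrame({'value': range(10)})

by_label = df_default.loc[:5]       # label slice, stop label 5 is included
by_position = df_default.iloc[:5]   # positional slice, stop position 5 is excluded

# String index: an integer is not a meaningful label to slice on.
df_str = pd.DataFrame({'value': range(3)}, index=['a', 'b', 'c'])
try:
    df_str.loc[:5]
    error_name = None
except (TypeError, KeyError) as exc:
    error_name = type(exc).__name__

print(len(by_label), len(by_position), error_name)
```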



      Also, you can do column retrieval just by using the data frame's __getitem__:



      df['time'] # equivalent to df.loc[:, 'time']


      Now suppose you want to mix positional and named indexing, that is, indexing using positions on rows and names on columns (to clarify, I mean selecting from our data frame, rather than creating a data frame with strings in the row index and integers in the column index). This is where .ix comes in (note that ix is deprecated as of pandas 0.20.0):



      df.ix[:2, 'time'] # the first two rows of the 'time' column
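      On pandas versions where ix is deprecated or unavailable, the same selection can be sketched with loc or iloc by converting one side of the selection. The names via_loc and via_iloc are illustrative only:

```python
import pandas as pd

df = pd.DataFrame(index=['a', 'b', 'c'], columns=['time', 'date', 'name'])

# Convert the row positions to labels, then select by label.
via_loc = df.loc[df.index[:2], 'time']

# Or convert the column label to a position, then select by position.
via_iloc = df.iloc[:2, df.columns.get_loc('time')]

print(list(via_loc.index), via_loc.name)
print(list(via_iloc.index), via_iloc.name)
```

      Both give the first two rows of the 'time' column as a Series.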


      I think it's also worth mentioning that you can pass boolean vectors to the loc method as well. For example:



      b = [True, False, True]
      df.loc[b]


      This returns the 1st and 3rd rows of df. It is equivalent to df[b] for selection, but loc can also be used for assignment via boolean vectors:



      df.loc[b, 'name'] = 'Mary', 'John'
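      Putting the boolean selection and assignment together as a runnable sketch, using the same three-row frame as above. The values are passed as a list here, which is an assumption made for clarity rather than a requirement; selected is a name introduced for illustration:

```python
import pandas as pd

df = pd.DataFrame(index=['a', 'b', 'c'], columns=['time', 'date', 'name'])
b = [True, False, True]

# Selection: keeps the rows where the mask is True ('a' and 'c').
selected = df.loc[b]

# Assignment: one value per selected row, written into the 'name' column.
df.loc[b, 'name'] = ['Mary', 'John']

print(list(selected.index))
print(df['name'].tolist())
```

      Row 'b' is left untouched because its mask entry is False.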






























      • Is df.iloc[:, :] equivalent to all rows and columns?

        – Alvis
        May 3 '17 at 10:03











      • It is, as would be df.loc[:, :]. It can be used to re-assign the values of the entire DataFrame or create a view of it.

        – JoeCondron
        May 3 '17 at 20:45


















      Then we can get the first row by



      df.loc['a'] # equivalent to df.iloc[0]


      and the second two rows of the 'date' column by



      df.loc['b':, 'date'] # equivalent to df.iloc[1:, 1]


      and so on. Now, it's probably worth pointing out that the default row and column indices for a DataFrame are integers from 0 and in this case iloc and loc would work in the same way. This is why your three examples are equivalent. If you had a non-numeric index such as strings or datetimes, df.loc[:5] would raise an error.



      Also, you can do column retrieval just by using the data frame's __getitem__:



      df['time'] # equivalent to df.loc[:, 'time']


      Now suppose you want to mix position and named indexing, that is, indexing using names on rows and positions on columns (to clarify, I mean select from our data frame, rather than creating a data frame with strings in the row index and integers in the column index). This is where .ix comes in:



      df.ix[:2, 'time'] # the first two rows of the 'time' column


      I think it's also worth mentioning that you can pass boolean vectors to the loc method as well. For example:



       b = [True, False, True]
      df.loc[b]


      Will return the 1st and 3rd rows of df. This is equivalent to df[b] for selection, but it can also be used for assigning via boolean vectors:



      df.loc[b, 'name'] = 'Mary', 'John'





      share|improve this answer















      iloc works based on integer positioning. So no matter what your row labels are, you can always, e.g., get the first row by doing



      df.iloc[0]


      or the last five rows by doing



      df.iloc[-5:]


      You can also use it on the columns. This retrieves the 3rd column:



      df.iloc[:, 2] # the : in the first position indicates all rows


      You can combine them to get intersections of rows and columns:



      df.iloc[:3, :3] # The upper-left 3 X 3 entries (assuming df has 3+ rows and columns)


      On the other hand, .loc use named indices. Let's set up a data frame with strings as row and column labels:



      df = pd.DataFrame(index=['a', 'b', 'c'], columns=['time', 'date', 'name'])


      Then we can get the first row by



      df.loc['a'] # equivalent to df.iloc[0]


      and the second two rows of the 'date' column by



      df.loc['b':, 'date'] # equivalent to df.iloc[1:, 1]


      and so on. Now, it's probably worth pointing out that the default row and column indices for a DataFrame are integers from 0 and in this case iloc and loc would work in the same way. This is why your three examples are equivalent. If you had a non-numeric index such as strings or datetimes, df.loc[:5] would raise an error.



      Also, you can do column retrieval just by using the data frame's __getitem__:



      df['time'] # equivalent to df.loc[:, 'time']


      Now suppose you want to mix position and named indexing, that is, indexing using names on rows and positions on columns (to clarify, I mean select from our data frame, rather than creating a data frame with strings in the row index and integers in the column index). This is where .ix comes in:



      df.ix[:2, 'time'] # the first two rows of the 'time' column


      I think it's also worth mentioning that you can pass boolean vectors to the loc method as well. For example:



       b = [True, False, True]
      df.loc[b]


      Will return the 1st and 3rd rows of df. This is equivalent to df[b] for selection, but it can also be used for assigning via boolean vectors:



      df.loc[b, 'name'] = 'Mary', 'John'






      share|improve this answer














      share|improve this answer



      share|improve this answer








      edited Dec 7 '18 at 10:29 by nbro
      answered Jul 23 '15 at 17:17 by JoeCondron












      • Is df.iloc[:, :] equivalent to all rows and columns?

        – Alvis
        May 3 '17 at 10:03











      • It is, as would be df.loc[:, :]. It can be used to re-assign the values of the entire DataFrame or create a view of it.

        – JoeCondron
        May 3 '17 at 20:45






























      77














      In my opinion, the accepted answer is confusing, since it uses a DataFrame with only missing values. I also do not like the term position-based for .iloc and prefer integer location instead, as it is much more descriptive and is exactly what .iloc stands for. The key word is INTEGER: .iloc needs INTEGERS.



      See my extremely detailed blog series on subset selection for more




      .ix is deprecated and ambiguous and should never be used



      Because .ix is deprecated (since pandas 0.20, and removed entirely in pandas 1.0), we will only focus on the differences between .loc and .iloc.



      Before we talk about the differences, it is important to understand that DataFrames have labels that help identify each column and each row. Let's take a look at a sample DataFrame:



      import pandas as pd

      df = pd.DataFrame({'age': [30, 2, 12, 4, 32, 33, 69],
                         'color': ['blue', 'green', 'red', 'white', 'gray', 'black', 'red'],
                         'food': ['Steak', 'Lamb', 'Mango', 'Apple', 'Cheese', 'Melon', 'Beans'],
                         'height': [165, 70, 120, 80, 180, 172, 150],
                         'score': [4.6, 8.3, 9.0, 3.3, 1.8, 9.5, 2.2],
                         'state': ['NY', 'TX', 'FL', 'AL', 'AK', 'TX', 'TX']},
                        index=['Jane', 'Nick', 'Aaron', 'Penelope', 'Dean', 'Christina', 'Cornelia'])


      The labels age, color, food, height, score and state are used for the columns. The other labels, Jane, Nick, Aaron, Penelope, Dean, Christina and Cornelia, are used for the index.




      The primary ways to select particular rows in a DataFrame are with the .loc and .iloc indexers. Each of these indexers can also be used to simultaneously select columns, but it is easier to just focus on rows for now. Also, each of the indexers uses a set of brackets that immediately follows its name to make its selections.



      .loc selects data only by labels



      We will first talk about the .loc indexer, which selects data only by the index or column labels. In our sample DataFrame, we have provided meaningful names as values for the index. Many DataFrames will not have any meaningful names and will instead default to just the integers from 0 to n-1, where n is the length of the DataFrame.
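To make the distinction between labels and the default integer index concrete, here is a small sketch. It uses a cut-down, hypothetical version of the sample frame (the names `named` and `default` are illustrative only):

```python
import pandas as pd

# Explicit, meaningful row labels.
named = pd.DataFrame({'age': [30, 2, 12], 'color': ['blue', 'green', 'red']},
                     index=['Jane', 'Nick', 'Aaron'])

# No index given: pandas falls back to the integers 0..n-1.
default = pd.DataFrame({'age': [30, 2, 12], 'color': ['blue', 'green', 'red']})

print(named.index.tolist())      # ['Jane', 'Nick', 'Aaron']
print(named.columns.tolist())    # ['age', 'color']
print(default.index.tolist())    # [0, 1, 2]

# With the default index, the labels happen to be integers, so .loc[1] and
# .iloc[1] pick the same row. That is a coincidence of the labels, not a rule.
print(default.loc[1].equals(default.iloc[1]))   # True
```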



      There are three different inputs you can use for .loc



      • A string

      • A list of strings

      • Slice notation using strings as the start and stop values

      Selecting a single row with .loc with a string



      To select a single row of data, place the index label inside of the brackets following .loc.



      df.loc['Penelope']


      This returns the row of data as a Series:

      age           4
      color     white
      food      Apple
      height       80
      score       3.3
      state        AL
      Name: Penelope, dtype: object


      Selecting multiple rows with .loc with a list of strings



      df.loc[['Cornelia', 'Jane', 'Dean']]


      This returns a DataFrame with the rows Cornelia, Jane and Dean, in the order specified in the list.



      Selecting multiple rows with .loc with slice notation



      Slice notation is defined by start, stop and step values. When slicing by label, pandas includes the stop value in the return. The following slice runs from Aaron to Dean, inclusive, so it returns the rows Aaron, Penelope and Dean. Its step size is not explicitly defined, so it defaults to 1.



      df.loc['Aaron':'Dean']





      Complex slices can be taken in the same manner as Python lists.
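The three kinds of .loc input can be checked side by side. This is a sketch on a two-column subset of the sample frame (the subset and the variable names are chosen here only to keep the example short):

```python
import pandas as pd

df = pd.DataFrame(
    {'age': [30, 2, 12, 4, 32, 33, 69],
     'food': ['Steak', 'Lamb', 'Mango', 'Apple', 'Cheese', 'Melon', 'Beans']},
    index=['Jane', 'Nick', 'Aaron', 'Penelope', 'Dean', 'Christina', 'Cornelia'])

one_row = df.loc['Penelope']                       # single label -> Series
some_rows = df.loc[['Cornelia', 'Jane', 'Dean']]   # list of labels -> DataFrame, in list order
label_slice = df.loc['Aaron':'Dean']               # label slice -> the stop label is included
stepped = df.loc['Jane':'Dean':2]                  # a step works as it does for list slices

print(type(one_row).__name__)        # Series
print(some_rows.index.tolist())      # ['Cornelia', 'Jane', 'Dean']
print(label_slice.index.tolist())    # ['Aaron', 'Penelope', 'Dean']
print(stepped.index.tolist())        # ['Jane', 'Aaron', 'Dean']
```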



      .iloc selects data only by integer location



      Let's now turn to .iloc. Every row and column of data in a DataFrame has an integer location that defines it. This is in addition to the label that is visually displayed in the output. The integer location is simply the number of rows/columns from the top/left beginning at 0.



      There are three different inputs you can use for .iloc



      • An integer

      • A list of integers

      • Slice notation using integers as the start and stop values

      Selecting a single row with .iloc with an integer



      df.iloc[4]


      This returns the 5th row (integer location 4) as a Series:

      age           32
      color       gray
      food      Cheese
      height       180
      score        1.8
      state         AK
      Name: Dean, dtype: object


      Selecting multiple rows with .iloc with a list of integers



      df.iloc[[2, -2]]


      This returns a DataFrame of the third and second-to-last rows (Aaron and Christina).



      Selecting multiple rows with .iloc with slice notation



      df.iloc[:5:3]
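The same three kinds of input for .iloc, again sketched on a two-column subset of the sample frame (subset and variable names are illustrative). Note that, unlike label slices with .loc, the stop position of an integer slice is excluded:

```python
import pandas as pd

df = pd.DataFrame(
    {'age': [30, 2, 12, 4, 32, 33, 69],
     'food': ['Steak', 'Lamb', 'Mango', 'Apple', 'Cheese', 'Melon', 'Beans']},
    index=['Jane', 'Nick', 'Aaron', 'Penelope', 'Dean', 'Christina', 'Cornelia'])

one_row = df.iloc[4]          # single integer -> Series (the row labelled Dean)
picked = df.iloc[[2, -2]]     # list of integers -> DataFrame; negatives count from the end
stepped = df.iloc[:5:3]       # integer slice with a step -> positions 0 and 3
first_three = df.iloc[0:3]    # the stop position is excluded

print(one_row.name)                 # Dean
print(picked.index.tolist())        # ['Aaron', 'Christina']
print(stepped.index.tolist())       # ['Jane', 'Penelope']
print(first_three.index.tolist())   # ['Jane', 'Nick', 'Aaron']
```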






      Simultaneous selection of rows and columns with .loc and .iloc



      A useful feature of both .loc and .iloc is their ability to select rows and columns simultaneously. In the examples above, all the columns were returned from each selection. We can choose columns with the same types of inputs as we do for rows; we simply need to separate the row and column selections with a comma.

      For example, we can select rows Jane and Dean with just the columns height, score and state like this:



      df.loc[['Jane', 'Dean'], 'height':]


      This uses a list of labels for the rows and slice notation for the columns, returning the height, score and state columns for Jane and Dean.



      We can naturally do similar operations with .iloc using only integers.



      df.iloc[[1,4], 2]

      Nick      Lamb
      Dean    Cheese
      Name: food, dtype: object
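A runnable version of the two selections above, plus a single-cell lookup, might look like this (the variable names are illustrative; the frame is the sample one from this answer):

```python
import pandas as pd

df = pd.DataFrame(
    {'age': [30, 2, 12, 4, 32, 33, 69],
     'color': ['blue', 'green', 'red', 'white', 'gray', 'black', 'red'],
     'food': ['Steak', 'Lamb', 'Mango', 'Apple', 'Cheese', 'Melon', 'Beans'],
     'height': [165, 70, 120, 80, 180, 172, 150],
     'score': [4.6, 8.3, 9.0, 3.3, 1.8, 9.5, 2.2],
     'state': ['NY', 'TX', 'FL', 'AL', 'AK', 'TX', 'TX']},
    index=['Jane', 'Nick', 'Aaron', 'Penelope', 'Dean', 'Christina', 'Cornelia'])

# Rows by a list of labels, columns by an open-ended label slice.
by_label = df.loc[['Jane', 'Dean'], 'height':]

# Rows by a list of positions, a single column by position -> Series.
by_position = df.iloc[[1, 4], 2]

# One row label and one column label -> a single value.
cell = df.loc['Dean', 'score']

print(by_label.columns.tolist())   # ['height', 'score', 'state']
print(by_position.tolist())        # ['Lamb', 'Cheese']
print(cell)                        # 1.8
```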



      Simultaneous selection with labels and integer location



      .ix was used to make selections simultaneously with labels and integer locations. This was useful but at times confusing and ambiguous, and thankfully it has been deprecated. If you need to make a selection with a mix of labels and integer locations, you will have to convert one of them so that both selections are labels or both are integer locations.



      For instance, if we want to select rows Nick and Cornelia along with columns 2 and 4, we could use .loc by converting the integers to labels with the following:



      col_names = df.columns[[2, 4]]
      df.loc[['Nick', 'Cornelia'], col_names]


      Or alternatively, convert the index labels to integers with the get_loc index method.



      labels = ['Nick', 'Cornelia']
      index_ints = [df.index.get_loc(label) for label in labels]
      df.iloc[index_ints, [2, 4]]
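Both conversions should give the same result. The sketch below checks that on the sample frame; it swaps the list comprehension over `get_loc` for `Index.get_indexer`, which looks up several labels at once (an alternative chosen here, not the method used in the answer, and it assumes the index labels are unique):

```python
import pandas as pd

df = pd.DataFrame(
    {'age': [30, 2, 12, 4, 32, 33, 69],
     'color': ['blue', 'green', 'red', 'white', 'gray', 'black', 'red'],
     'food': ['Steak', 'Lamb', 'Mango', 'Apple', 'Cheese', 'Melon', 'Beans'],
     'height': [165, 70, 120, 80, 180, 172, 150],
     'score': [4.6, 8.3, 9.0, 3.3, 1.8, 9.5, 2.2],
     'state': ['NY', 'TX', 'FL', 'AL', 'AK', 'TX', 'TX']},
    index=['Jane', 'Nick', 'Aaron', 'Penelope', 'Dean', 'Christina', 'Cornelia'])

# Option 1: turn the column positions into labels, then use .loc.
col_names = df.columns[[2, 4]]
via_loc = df.loc[['Nick', 'Cornelia'], col_names]

# Option 2: turn the row labels into positions, then use .iloc.
row_positions = df.index.get_indexer(['Nick', 'Cornelia'])
via_iloc = df.iloc[row_positions, [2, 4]]

print(col_names.tolist())         # ['food', 'score']
print(row_positions.tolist())     # [1, 6]
print(via_loc.equals(via_iloc))   # True
```

Note that `get_indexer` returns -1 for a label it cannot find rather than raising, so it is worth checking for -1 before passing the result to .iloc.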


      Boolean Selection



      The .loc indexer can also do boolean selection. For instance, if we are interested in finding all the rows where age is above 30 and returning just the food and score columns, we can do the following:



      df.loc[df['age'] > 30, ['food', 'score']] 


      You can replicate this with .iloc, but you cannot pass it a boolean Series. You must convert the boolean Series into a NumPy array like this:



      df.iloc[(df['age'] > 30).values, [2, 4]] 
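The two boolean selections can be compared directly. This sketch uses `Series.to_numpy()` in place of `.values`; both return the underlying NumPy array, and the choice here is only a matter of style (the names `mask`, `via_loc` and `via_iloc` are illustrative):

```python
import pandas as pd

df = pd.DataFrame(
    {'age': [30, 2, 12, 4, 32, 33, 69],
     'color': ['blue', 'green', 'red', 'white', 'gray', 'black', 'red'],
     'food': ['Steak', 'Lamb', 'Mango', 'Apple', 'Cheese', 'Melon', 'Beans'],
     'height': [165, 70, 120, 80, 180, 172, 150],
     'score': [4.6, 8.3, 9.0, 3.3, 1.8, 9.5, 2.2],
     'state': ['NY', 'TX', 'FL', 'AL', 'AK', 'TX', 'TX']},
    index=['Jane', 'Nick', 'Aaron', 'Penelope', 'Dean', 'Christina', 'Cornelia'])

mask = df['age'] > 30   # boolean Series aligned on the row labels

# .loc takes the boolean Series directly, plus column labels.
via_loc = df.loc[mask, ['food', 'score']]

# .iloc needs a plain boolean array, plus column positions.
via_iloc = df.iloc[mask.to_numpy(), [2, 4]]

print(via_loc.index.tolist())     # ['Dean', 'Christina', 'Cornelia']
print(via_loc.equals(via_iloc))   # True
```

Jane is not selected because her age is exactly 30 and the comparison is strict.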



      Selecting all rows



      It is possible to use .loc/.iloc for just column selection. You can select all the rows by using a colon like this:



      df.loc[:, 'color':'score':2]
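As a sketch of what that column slice returns on the sample frame, alongside an equivalent positional slice (the .iloc form is added here for comparison and is not part of the original answer):

```python
import pandas as pd

df = pd.DataFrame(
    {'age': [30, 2, 12, 4, 32, 33, 69],
     'color': ['blue', 'green', 'red', 'white', 'gray', 'black', 'red'],
     'food': ['Steak', 'Lamb', 'Mango', 'Apple', 'Cheese', 'Melon', 'Beans'],
     'height': [165, 70, 120, 80, 180, 172, 150],
     'score': [4.6, 8.3, 9.0, 3.3, 1.8, 9.5, 2.2],
     'state': ['NY', 'TX', 'FL', 'AL', 'AK', 'TX', 'TX']},
    index=['Jane', 'Nick', 'Aaron', 'Penelope', 'Dean', 'Christina', 'Cornelia'])

# All rows; every second column from 'color' through 'score' (inclusive).
by_label = df.loc[:, 'color':'score':2]

# The same columns by position: 1 and 3 (the stop position 5 is excluded).
by_position = df.iloc[:, 1:5:2]

print(by_label.columns.tolist())       # ['color', 'height']
print(len(by_label) == len(df))        # True
print(by_label.equals(by_position))    # True
```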






      The indexing operator, [], can select rows and columns too but not simultaneously.



      Most people are familiar with the primary purpose of the DataFrame indexing operator, which is to select columns. A string selects a single column as a Series and a list of strings selects multiple columns as a DataFrame.



      df['food']

      Jane          Steak
      Nick           Lamb
      Aaron         Mango
      Penelope      Apple
      Dean         Cheese
      Christina     Melon
      Cornelia      Beans
      Name: food, dtype: object


      Using a list selects multiple columns:



      df[['food', 'score']]





      What people are less familiar with is that, when slice notation is used, selection happens on the rows instead, either by row labels or by integer location. This is very confusing and something that I almost never use, but it does work.



      df['Penelope':'Christina'] # slice rows by label: Penelope, Dean, Christina

      df[2:6:2] # slice rows by integer location: Aaron, Dean



      The explicitness of .loc/.iloc for selecting rows is highly preferred. The indexing operator alone is unable to select rows and columns simultaneously.



      df[3:5, 'color']
      TypeError: unhashable type: 'slice'
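The different behaviours of the plain indexing operator can be collected in one sketch on a two-column subset of the sample frame. The exact exception raised by the last, invalid call depends on the pandas version (the answer shows a TypeError, and newer releases may raise a different error class), so the sketch only records that it fails:

```python
import pandas as pd

df = pd.DataFrame(
    {'color': ['blue', 'green', 'red', 'white', 'gray', 'black', 'red'],
     'food': ['Steak', 'Lamb', 'Mango', 'Apple', 'Cheese', 'Melon', 'Beans']},
    index=['Jane', 'Nick', 'Aaron', 'Penelope', 'Dean', 'Christina', 'Cornelia'])

one_col = df['food']                        # string -> one column as a Series
two_cols = df[['food', 'color']]            # list of strings -> DataFrame of columns
rows_by_label = df['Penelope':'Christina']  # label slice -> rows, stop included
rows_by_position = df[2:6:2]                # integer slice -> rows at positions 2 and 4

# Rows and columns together are not supported by [] alone.
failed = False
try:
    df[3:5, 'color']
except Exception:
    failed = True

print(rows_by_label.index.tolist())      # ['Penelope', 'Dean', 'Christina']
print(rows_by_position.index.tolist())   # ['Aaron', 'Dean']
print(failed)                            # True
```

With .iloc the intended selection would be written as `df.iloc[3:5, 0]`, since color sits at column position 0 in this subset.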





          In my opinion, the accepted answer is confusing, since it uses a DataFrame with only missing values. I also do not like the term position-based for .iloc and instead prefer integer location, as it is much more descriptive and exactly what .iloc stands for. The key word is INTEGER: .iloc needs INTEGERS.



          See my extremely detailed blog series on subset selection for more




          .ix is deprecated and ambiguous and should never be used



          Because .ix is deprecated (and was removed entirely in pandas 1.0), we will only focus on the differences between .loc and .iloc.



          Before we talk about the differences, it is important to understand that DataFrames have labels that help identify each column and each row. Let's take a look at a sample DataFrame:



          import pandas as pd

          # Sample data: one row per person, with the names used as the index labels
          df = pd.DataFrame({'age': [30, 2, 12, 4, 32, 33, 69],
                             'color': ['blue', 'green', 'red', 'white', 'gray', 'black', 'red'],
                             'food': ['Steak', 'Lamb', 'Mango', 'Apple', 'Cheese', 'Melon', 'Beans'],
                             'height': [165, 70, 120, 80, 180, 172, 150],
                             'score': [4.6, 8.3, 9.0, 3.3, 1.8, 9.5, 2.2],
                             'state': ['NY', 'TX', 'FL', 'AL', 'AK', 'TX', 'TX']},
                            index=['Jane', 'Nick', 'Aaron', 'Penelope', 'Dean', 'Christina', 'Cornelia'])





          When the DataFrame is displayed, all the words in bold are the labels. The labels age, color, food, height, score and state are used for the columns. The other labels, Jane, Nick, Aaron, Penelope, Dean, Christina and Cornelia, are used for the index.




          The primary ways to select particular rows in a DataFrame are with the .loc and .iloc indexers. Each of these indexers can also be used to simultaneously select columns, but it is easier to just focus on rows for now. Also, each of the indexers uses a set of brackets that immediately follows its name to make its selections.



          .loc selects data only by labels



          We will first talk about the .loc indexer, which selects data only by index or column labels. In our sample DataFrame, we have provided meaningful names as values for the index. Many DataFrames will not have any meaningful names and will instead default to the integers from 0 to n-1, where n is the length of the DataFrame.
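To make the default-index case concrete, here is a minimal sketch. The frame `people` and its values are made up for illustration and are not the sample data above. With integer labels, .loc still looks up labels (so it accepts integers in that case), while .iloc keeps counting positions. The two only disagree once the rows are reordered:

```python
import pandas as pd

# Hypothetical frame with no explicit index: pandas assigns the labels 0..n-1.
people = pd.DataFrame({'name': ['Jane', 'Nick', 'Aaron'], 'age': [30, 2, 12]})
print(list(people.index))          # [0, 1, 2]

# Sorting reorders the rows, so the integer labels no longer match positions.
by_age = people.sort_values('age')
print(list(by_age.index))          # [1, 2, 0]

label_row = by_age.loc[0]          # label 0 -> Jane, now the last row
position_row = by_age.iloc[0]      # position 0 -> Nick, the first row
print(label_row['name'], position_row['name'])   # Jane Nick
```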



          There are three different inputs you can use for .loc



          • A string

          • A list of strings

          • Slice notation using strings as the start and stop values

          Selecting a single row with .loc with a string



          To select a single row of data, place the index label inside of the brackets following .loc.



          df.loc['Penelope']


          This returns the row of data as a Series



          age 4
          color white
          food Apple
          height 80
          score 3.3
          state AL
          Name: Penelope, dtype: object


          Selecting multiple rows with .loc with a list of strings



          df.loc[['Cornelia', 'Jane', 'Dean']]


          This returns a DataFrame with the rows in the order specified in the list:






          Selecting multiple rows with .loc with slice notation



          Slice notation is defined by start, stop and step values. When slicing by label, pandas includes the stop value in the result. The following slices from Aaron to Dean, inclusive. Its step size is not explicitly given, so it defaults to 1.



          df.loc['Aaron':'Dean']





          Complex slices can be taken in the same manner as Python lists.
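As a rough illustration of label slicing, the sketch below rebuilds the sample frame with a single column (only the index matters here) and shows the inclusive stop, a step, and an open-ended slice:

```python
import pandas as pd

# Trimmed copy of the sample frame: only the index matters for row slicing.
df = pd.DataFrame(
    {'age': [30, 2, 12, 4, 32, 33, 69]},
    index=['Jane', 'Nick', 'Aaron', 'Penelope', 'Dean', 'Christina', 'Cornelia'])

inclusive = df.loc['Aaron':'Dean']            # the stop label is included
every_other = df.loc['Jane':'Christina':2]    # step of 2 over the label range
open_ended = df.loc['Dean':]                  # from Dean to the last row

print(list(inclusive.index))     # ['Aaron', 'Penelope', 'Dean']
print(list(every_other.index))   # ['Jane', 'Aaron', 'Dean']
print(list(open_ended.index))    # ['Dean', 'Christina', 'Cornelia']
```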



          .iloc selects data only by integer location



          Let's now turn to .iloc. Every row and column of data in a DataFrame has an integer location that defines it. This is in addition to the label that is visually displayed in the output. The integer location is simply the number of rows/columns from the top/left beginning at 0.



          There are three different inputs you can use for .iloc



          • An integer

          • A list of integers

          • Slice notation using integers as the start and stop values

          Selecting a single row with .iloc with an integer



          df.iloc[4]


          This returns the 5th row (integer location 4) as a Series



          age 32
          color gray
          food Cheese
          height 180
          score 1.8
          state AK
          Name: Dean, dtype: object


          Selecting multiple rows with .iloc with a list of integers



          df.iloc[[2, -2]]


          This returns a DataFrame of the third and second to last rows:






          Selecting multiple rows with .iloc with slice notation



          df.iloc[:5:3]
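The integer-location selections in this section can be checked with a short sketch. It rebuilds the sample frame with one column so that it runs on its own, and it adds one slice, `df.iloc[2:4]`, purely for illustration, to show that an integer slice excludes its stop position, unlike a label slice:

```python
import pandas as pd

# Trimmed copy of the sample frame: row positions are what matter here.
df = pd.DataFrame(
    {'age': [30, 2, 12, 4, 32, 33, 69]},
    index=['Jane', 'Nick', 'Aaron', 'Penelope', 'Dean', 'Christina', 'Cornelia'])

single = df.iloc[4]          # one row, returned as a Series named 'Dean'
picked = df.iloc[[2, -2]]    # third row and second-to-last row
stepped = df.iloc[:5:3]      # positions 0 and 3
exclusive = df.iloc[2:4]     # positions 2 and 3; position 4 is left out

print(single.name)               # Dean
print(list(picked.index))        # ['Aaron', 'Christina']
print(list(stepped.index))       # ['Jane', 'Penelope']
print(list(exclusive.index))     # ['Aaron', 'Penelope']
```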






          Simultaneous selection of rows and columns with .loc and .iloc



          One excellent feature of both .loc and .iloc is their ability to select rows and columns simultaneously. In the examples above, all the columns were returned from each selection. We can choose columns with the same types of inputs as we do for rows. We simply need to separate the row and column selection with a comma.



          For example, we can select rows Jane and Dean with just the columns height, score and state like this:



          df.loc[['Jane', 'Dean'], 'height':]





          This uses a list of labels for the rows and slice notation for the columns



          We can naturally do similar operations with .iloc using only integers.



          df.iloc[[1,4], 2]
          Nick Lamb
          Dean Cheese
          Name: food, dtype: object



          Simultaneous selection with labels and integer location



          .ix was used to make selections simultaneously with labels and integer locations, which was useful but at times confusing and ambiguous; thankfully, it has been deprecated. If you need to make a selection with a mix of labels and integer locations, you will have to convert one side so that both selections are labels or both are integer locations.



          For instance, if we want to select rows Nick and Cornelia along with columns 2 and 4, we could use .loc by converting the integers to labels with the following:



          col_names = df.columns[[2, 4]]
          df.loc[['Nick', 'Cornelia'], col_names]


          Or alternatively, convert the index labels to integers with the get_loc index method.



          labels = ['Nick', 'Cornelia']
          index_ints = [df.index.get_loc(label) for label in labels]
          df.iloc[index_ints, [2, 4]]
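As a quick check that the two conversions agree, the sketch below rebuilds the sample frame and compares both results. It uses `Index.get_indexer` in place of the list comprehension; that is a substitution made here, not part of the answer. One caveat: `get_indexer` returns -1 for labels it cannot find, and .iloc would read -1 as the last row, so this assumes every label exists.

```python
import pandas as pd

df = pd.DataFrame(
    {'age': [30, 2, 12, 4, 32, 33, 69],
     'color': ['blue', 'green', 'red', 'white', 'gray', 'black', 'red'],
     'food': ['Steak', 'Lamb', 'Mango', 'Apple', 'Cheese', 'Melon', 'Beans'],
     'height': [165, 70, 120, 80, 180, 172, 150],
     'score': [4.6, 8.3, 9.0, 3.3, 1.8, 9.5, 2.2],
     'state': ['NY', 'TX', 'FL', 'AL', 'AK', 'TX', 'TX']},
    index=['Jane', 'Nick', 'Aaron', 'Penelope', 'Dean', 'Christina', 'Cornelia'])

# Route 1: turn the column positions into labels, then use .loc.
col_names = df.columns[[2, 4]]
by_label = df.loc[['Nick', 'Cornelia'], col_names]

# Route 2: turn the row labels into positions, then use .iloc.
index_ints = df.index.get_indexer(['Nick', 'Cornelia'])
by_position = df.iloc[index_ints, [2, 4]]

print(list(col_names))                 # ['food', 'score']
print(by_label.equals(by_position))    # True
```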


          Boolean Selection



          The .loc indexer can also do boolean selection. For instance, if we are interested in finding all the rows where age is above 30 and returning just the food and score columns, we can do the following:



          df.loc[df['age'] > 30, ['food', 'score']] 


          You can replicate this with .iloc, but you cannot pass it a boolean Series. You must convert the boolean Series into a NumPy array like this:



          df.iloc[(df['age'] > 30).values, [2, 4]] 
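The following sketch puts the two boolean selections side by side on the sample frame. It uses `Series.to_numpy()` rather than `.values` for the conversion. That is a choice made for this example, and either should give the same boolean array here.

```python
import pandas as pd

df = pd.DataFrame(
    {'age': [30, 2, 12, 4, 32, 33, 69],
     'color': ['blue', 'green', 'red', 'white', 'gray', 'black', 'red'],
     'food': ['Steak', 'Lamb', 'Mango', 'Apple', 'Cheese', 'Melon', 'Beans'],
     'height': [165, 70, 120, 80, 180, 172, 150],
     'score': [4.6, 8.3, 9.0, 3.3, 1.8, 9.5, 2.2],
     'state': ['NY', 'TX', 'FL', 'AL', 'AK', 'TX', 'TX']},
    index=['Jane', 'Nick', 'Aaron', 'Penelope', 'Dean', 'Christina', 'Cornelia'])

mask = df['age'] > 30                            # boolean Series aligned on the index

by_label = df.loc[mask, ['food', 'score']]       # .loc accepts the Series directly
by_position = df.iloc[mask.to_numpy(), [2, 4]]   # .iloc needs a plain boolean array

print(list(by_label.index))            # ['Dean', 'Christina', 'Cornelia']
print(by_label.equals(by_position))    # True
```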



          Selecting all rows



          It is possible to use .loc/.iloc for just column selection. You can select all the rows by using a colon like this:



          df.loc[:, 'color':'score':2]
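For reference, a small sketch of what that column slice returns on the sample frame, next to an .iloc equivalent. The .iloc version stops at position 5 because integer slices exclude the stop, whereas the label slice includes 'score'.

```python
import pandas as pd

df = pd.DataFrame(
    {'age': [30, 2, 12, 4, 32, 33, 69],
     'color': ['blue', 'green', 'red', 'white', 'gray', 'black', 'red'],
     'food': ['Steak', 'Lamb', 'Mango', 'Apple', 'Cheese', 'Melon', 'Beans'],
     'height': [165, 70, 120, 80, 180, 172, 150],
     'score': [4.6, 8.3, 9.0, 3.3, 1.8, 9.5, 2.2],
     'state': ['NY', 'TX', 'FL', 'AL', 'AK', 'TX', 'TX']},
    index=['Jane', 'Nick', 'Aaron', 'Penelope', 'Dean', 'Christina', 'Cornelia'])

# All rows, every second column from 'color' through 'score' (inclusive).
by_label = df.loc[:, 'color':'score':2]

# Same selection by position: columns 1 and 3 (the stop, 5, is excluded).
by_position = df.iloc[:, 1:5:2]

print(list(by_label.columns))          # ['color', 'height']
print(by_label.equals(by_position))    # True
```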






          The indexing operator, [], can select rows and columns too but not simultaneously.



          Most people are familiar with the primary purpose of the DataFrame indexing operator, which is to select columns. A string selects a single column as a Series and a list of strings selects multiple columns as a DataFrame.



          df['food']

          Jane Steak
          Nick Lamb
          Aaron Mango
          Penelope Apple
          Dean Cheese
          Christina Melon
          Cornelia Beans
          Name: food, dtype: object


          Using a list selects multiple columns



          df[['food', 'score']]





          What people are less familiar with is that, when slice notation is used, selection happens by row labels or by integer location. This is very confusing and something that I almost never use, but it does work.



          df['Penelope':'Christina'] # slice rows by label





          df[2:6:2] # slice rows by integer location
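To show what these two row slices return, and what the explicit equivalents look like, here is a sketch on a one-column copy of the sample frame. It assumes the behaviour described above, where a slice inside [] selects rows:

```python
import pandas as pd

# Trimmed copy of the sample frame: only the index matters for row slicing.
df = pd.DataFrame(
    {'age': [30, 2, 12, 4, 32, 33, 69]},
    index=['Jane', 'Nick', 'Aaron', 'Penelope', 'Dean', 'Christina', 'Cornelia'])

label_slice = df['Penelope':'Christina']   # rows by label, stop included
int_slice = df[2:6:2]                      # rows by position, stop excluded

print(list(label_slice.index))   # ['Penelope', 'Dean', 'Christina']
print(list(int_slice.index))     # ['Aaron', 'Dean']

# The explicit spellings make the intent visible at the call site.
print(label_slice.equals(df.loc['Penelope':'Christina']))   # True
print(int_slice.equals(df.iloc[2:6:2]))                     # True
```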





          The explicitness of .loc/.iloc for selecting rows is highly preferred. The indexing operator alone is unable to select rows and columns simultaneously.



          df[3:5, 'color']
          TypeError: unhashable type: 'slice'
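One way to express that selection with the explicit indexers is sketched below on a two-column copy of the sample frame. The exact exception raised by the failing call may differ between pandas and Python versions, so the sketch does not trigger it and only shows the working alternatives.

```python
import pandas as pd

# Two-column copy of the sample frame, enough to select rows 3-4 of 'color'.
df = pd.DataFrame(
    {'age': [30, 2, 12, 4, 32, 33, 69],
     'color': ['blue', 'green', 'red', 'white', 'gray', 'black', 'red']},
    index=['Jane', 'Nick', 'Aaron', 'Penelope', 'Dean', 'Christina', 'Cornelia'])

# Everything as integer locations: look up the column position first.
by_position = df.iloc[3:5, df.columns.get_loc('color')]

# Everything as labels: turn the row positions into index labels first.
by_label = df.loc[df.index[3:5], 'color']

print(list(by_position.index))         # ['Penelope', 'Dean']
print(list(by_position))               # ['white', 'gray']
print(by_position.equals(by_label))    # True
```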






          edited Oct 27 '18 at 1:53

























          answered Oct 24 '17 at 16:39









          Ted Petrou





















              -2














              .ix comes from earlier versions of pandas; .iloc and .loc are the indexers provided in its latest versions.

              • .ix: selects data from the DataFrame using either labels or integer row and column positions in the same call. This caused problems in cases where the row or column index was a mix of numbers and string labels.

                Example: df.ix[:2, 'time']

              Now for .loc.

              • .loc: selects data using labels as the index, for both rows and columns.

                Example: df.loc[:, 'color':'score':2]

              Now for .iloc.

              • .iloc: selects data using integer positions for both rows and columns.

                Example: df.iloc[[1,4], 2]





                  edited Feb 3 at 16:23

























                  answered Feb 3 at 9:37









                  Shobhit Srivastava


























