
SyntaxError When Accessing Column Named "class" In Pandas DataFrame

I have a pandas DataFrame named 'dataset' that contains a column named 'class'. When I execute the following line, I get SyntaxError: invalid syntax: print('Unique values in the Class

Solution 1:

class is a keyword in Python. A rule of thumb: whenever a column name cannot be used as a valid variable name in Python, you must use bracket notation to access it: dataset['class'].unique().
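As a minimal sketch of the bracket notation described above: only the names dataset and 'class' come from the question; the sample rows and the extra 'value' column are made up for illustration.

```python
import keyword

import pandas as pd

# Hypothetical sample data; only the frame name 'dataset' and the
# column name 'class' are taken from the question.
dataset = pd.DataFrame({"class": ["a", "b", "a"], "value": [1, 2, 3]})

# 'class' is a reserved keyword, so writing dataset.class fails at
# parse time with SyntaxError before pandas is ever involved.
print(keyword.iskeyword("class"))  # True

# Bracket notation takes the column name as a plain string, so it works.
unique_classes = dataset["class"].unique()
print("Unique values in the class column:", unique_classes)
```

The keyword.iskeyword check is only there to show why the parser rejects the dotted form; it is not needed in real code.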

There are, of course, exceptions here, but they work against you. For example, min and max are valid variable names in Python (even though they shadow builtins). In pandas, however, you cannot refer to a column with such a name using attribute access notation, because the name conflicts with an existing method. There are more such exceptions; they're enumerated in the documentation.

A good place to begin further reading is the documentation on Attribute Access, specifically the red Warning box, which I'm adding here for posterity:

  • You can use this access only if the index element is a valid Python identifier, e.g. s.1 is not allowed. See here for an explanation of valid identifiers.

  • The attribute will not be available if it conflicts with an existing method name, e.g. s.min is not allowed, but s['min'] is possible.

  • Similarly, the attribute will not be available if it conflicts with any of the following list: index, major_axis, minor_axis, items.

  • In any of these cases, standard indexing will still work, e.g. s['1'], s['min'], and s['index'] will access the corresponding element or column.
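The cases in the warning can be tried out on a small Series. This is only a sketch: the Series s and its labels '1', 'min' and 'price' are invented here to hit each case, and are not from the documentation.

```python
import pandas as pd

# Hypothetical Series whose index labels illustrate the cases above.
s = pd.Series([10, 20, 30], index=["1", "min", "price"])

# 'price' is a valid identifier and does not clash with a method,
# so attribute access and bracket access return the same element.
print(s.price == s["price"])  # True

# 'min' clashes with the Series.min method: attribute access returns
# the bound method, while bracket access returns the element.
print(callable(s.min))  # True
print(s["min"])         # 20

# '1' is not a valid Python identifier, so s.1 would not even parse;
# bracket access still works.
print("1".isidentifier())  # False
print(s["1"])              # 10
```

In short, bracket access behaves the same for every label, which is why it is the safer default when column names are not under your control.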

Solution 2:

class is a reserved word. You can use dataset['class'].unique() instead.
