Python Split Function. Too Many Values To Unpack Error

I have a Python function that must read data from a file, split each line into a key and a value, and then store them in a dictionary. Example file:

http://google.com 2
http://python.org 3

Solution 1:

You are trying to unpack the list returned by split() into these two variables:

url, count = line.split()

What if the line has no space, or two or more spaces? Where would the extra words go?

data = "abcd"
print(data.split())    # ['abcd']
data = "ab cd"
print(data.split())    # ['ab', 'cd']
data = "a b c d"
print(data.split())    # ['a', 'b', 'c', 'd']

You can check the length of the list before unpacking it:

with open(urls_file_path, "r") as f:
    for idx, line in enumerate(f, 1):
        split_list = line.split()
        # Exactly two fields (one separator) are expected on each line
        if len(split_list) != 2:
            raise ValueError("Line {}: '{}' has {} spaces, expected 1"
                .format(idx, line.rstrip(), len(split_list) - 1))
        else:
            url, count = split_list
            print("Read Data: {} {}".format(url, count))

With the input file,

http://google.com 2
http://python.org 3
http://python.org 4 Welcome
http://python.org 5

This program produces,

$ python Test.py
Read Data: http://google.com 2
Read Data: http://python.org 3
Traceback (most recent call last):
  File "Test.py", line 6, in <module>
    .format(idx, line.rstrip(), len(split_list) - 1))
ValueError: Line 3: 'http://python.org 4 Welcome' has 2 spaces, expected 1
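Since the goal in the question is a dictionary, the same length check can be folded into a small parser. This is only a sketch: the function name parse_counts, the choice to collect malformed lines instead of raising, and the conversion of the count to int are assumptions, not part of the original answer.

```python
def parse_counts(lines):
    """Build a {url: count} dict; collect malformed lines instead of raising."""
    counts = {}
    skipped = []  # (line number, raw text) for lines without exactly two fields
    for idx, line in enumerate(lines, 1):
        fields = line.split()
        if len(fields) != 2:
            skipped.append((idx, line.rstrip()))
            continue
        url, count = fields
        counts[url] = int(count)  # assumption: the second field is an integer
    return counts, skipped


# Same content as the input file shown above, given as a list of lines
sample = [
    "http://google.com 2\n",
    "http://python.org 3\n",
    "http://python.org 4 Welcome\n",
    "http://python.org 5\n",
]
counts, skipped = parse_counts(sample)
print(counts)
print(skipped)
```

Note that a repeated URL overwrites the earlier entry, so with this input http://python.org ends up with the value from the last valid line. An open file object can be passed in place of the list, since both yield lines when iterated.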

Following @abarnert's comment, you can use the str.partition method like this:

url, _, count = data.partition(" ")

If there is more than one space, count will hold the rest of the string after the first space. If there is no space, count will be an empty string.
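The three cases can be checked directly. In this sketch the helper name split_pair is hypothetical, and stripping the line first is an added assumption so that a trailing newline read from a file does not end up in count.

```python
def split_pair(text):
    """Split on the first space only; str.partition always returns a 3-tuple."""
    url, sep, count = text.strip().partition(" ")
    return url, sep, count


no_space = split_pair("abcd")
one_space = split_pair("http://google.com 2\n")
two_spaces = split_pair("http://python.org 4 Welcome\n")

print(no_space)
print(one_space)
print(two_spaces)
```

Because partition never raises on a missing separator, an empty sep is the signal that the line had no space at all. Unlike split() with no arguments, partition(" ") matches a single literal space, so tabs or runs of spaces are not collapsed.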

If you are using Python 3.x, you can do something like this:

first, second, *rest = data.split()

The first two values are assigned to first and second, and the remaining items of the list are assigned to rest. This syntax is available only in Python 3.x.
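A short Python 3 sketch of how the starred name behaves with extra fields, with exactly two fields, and with too few. The sample strings and variable names are illustrative only.

```python
# Extra fields are collected into a list
first, second, *rest = "http://python.org 4 Welcome to Python".split()

# With exactly two fields the starred name gets an empty list
first2, second2, *rest2 = "http://google.com 2".split()

# The starred form still needs at least two fields
try:
    first3, second3, *rest3 = "abcd".split()
    too_few_failed = False
except ValueError:
    too_few_failed = True

print(first, second, rest)
print(first2, second2, rest2)
print(too_few_failed)
```

So the starred form guards against lines with too many fields, but a line with fewer than two fields still raises ValueError and needs its own check.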

Solution 2:

The "too many values to unpack" error is raised when the number of names on the left side of the assignment is smaller than the number of items produced by split(), that is, when the line contains more separators (here, blank spaces) than expected. A line with one separator yields two items, so two names on the left require exactly one separator. Keep the number of names on the left equal to the number of items the split produces.
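To make the counting concrete: one separator yields two fields, which is what a two-name assignment needs. The helper below is a hypothetical sketch that returns the error message instead of letting the exception propagate.

```python
def try_unpack(line):
    """Return (url, count), or the ValueError message if the field count is not 2."""
    try:
        url, count = line.split()
    except ValueError as exc:
        return str(exc)
    return url, count


ok = try_unpack("http://google.com 2")              # one separator, two fields
too_many = try_unpack("http://python.org 4 Welcome")  # two separators, three fields
too_few = try_unpack("abcd")                         # no separator, one field

print(ok)
print(too_many)
print(too_few)
```

The exact wording of the messages differs between Python versions, so code that needs to distinguish the two cases is better off comparing len(line.split()) with 2 than parsing the message.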

Solution 3:

The "too many values to unpack" error can also be raised when unpacking the result of the str.split method of a pandas Series (a data frame column).

For example, splitting a column of strings on the "," pattern:

import pandas
df = pandas.DataFrame({"x": ["a", "a, b", "a,b,c"]})
df.x.str.split(",")

# 0          [a]
# 1      [a,  b]
# 2    [a, b, c]


df.x.str.split(",", n=1)

# 0         [a]
# 1     [a,  b]
# 2    [a, b,c]


df.x.str.split(",", expand=True)

#    0     1     2
# 0  a  None  None
# 1  a     b  None
# 2  a     b     c

df.x.str.split(",", n=1, expand=True)

#    0     1
# 0  a  None
# 1  a     b
# 2  a   b,c

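The following version does not do what it appears to. Without expand=True the result is a single Series of lists, and unpacking iterates over its rows, so the assignment would only succeed if the Series had exactly 2 rows (and even then it would assign rows, not split parts, to the columns). In this example, with 3 rows, it fails with the error “too many values to unpack (expected 2)”: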

df["y"], df["z"] = df.x.str.split(",", n=1)

The last version, with both n=1 and expand=True, is the one to use for assigning to multiple columns. It is equivalent to tidyr::separate in R.

df[["y", "z"]] = df.x.str.split(",", n=1, expand=True)
df
#        x  y     z
# 0      a  a  None
# 1   a, b  a     b
# 2  a,b,c  a   b,c

According to the documentation of pandas.Series.str.split:

"If for a certain row the number of found splits < n, append None for padding up to n if expand=True."
