Splitting A String With A Unicode Delimiter?
Given the string str = 'Led Zeppelin — Blackdog', how do I split it at —, ending up with ['Led Zeppelin', 'Blackdog']? Note that — is not a hyphen; it is encoded as u'\u2014'.
Solution 1:
You can split on exactly the character you were given, written as an escape sequence if you want to make it clear that it is not a hyphen. Include the surrounding spaces in the delimiter if they always accompany the character. Also, don't shadow the built-in str by using it as a variable name.
>>> s = 'Led Zeppelin — Blackdog'
>>> s.split(u' \u2014 ')
['Led Zeppelin', 'Blackdog']
>>> s.split(' — ') # perhaps less explicit
['Led Zeppelin', 'Blackdog']
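If the spaces around the dash are not always present, splitting on the literal ' \u2014 ' will not match. One possible workaround, sketched here under the assumption of Python 3, is to split with a regular expression that treats the whitespace as optional. The helper name split_on_em_dash is made up for this illustration and is not part of the original answer.

```python
import re

# U+2014 EM DASH, written as an escape so it cannot be confused with a hyphen.
EM_DASH = "\u2014"

# Match the em dash together with any whitespace on either side of it.
_EM_DASH_RE = re.compile(r"\s*" + re.escape(EM_DASH) + r"\s*")


def split_on_em_dash(text):
    """Split text on em dashes, trimming whitespace around each piece.

    Hypothetical helper for illustration only.
    """
    return [part.strip() for part in _EM_DASH_RE.split(text)]


print(split_on_em_dash("Led Zeppelin \u2014 Blackdog"))  # ['Led Zeppelin', 'Blackdog']
print(split_on_em_dash("Led Zeppelin\u2014Blackdog"))    # ['Led Zeppelin', 'Blackdog']
print(split_on_em_dash("AC-DC"))                         # ['AC-DC'], a plain hyphen is left alone
```

For input that always has exactly one space on each side, the plain s.split(' \u2014 ') shown above is simpler and is probably the better choice.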