
Pyyaml Parses '9:00' As Int

I have a file with the following data: classes: - 9:00 - 10:20 - 12:10 (and so on up to 21:00). I use Python 3 and the yaml module to parse it. Precisely, the source is config = y

Solution 1:

The documentation of YAML is a bit difficult to "parse" so I can imagine you missed this little bit of info about colons:

Normally, YAML insists the “:” mapping value indicator be separated from the value by white space. A benefit of this restriction is that the “:” character can be used inside plain scalars, as long as it is not followed by white space. This allows for unquoted URLs and timestamps. It is also a potential source for confusion as “a:1” is a plain scalar and not a key: value pair.

What you have in your input is a sexagesimal (base-60) integer: your 9:00 is read as 9 minutes and 0 seconds, for a total of 540 seconds.
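The arithmetic can be sketched in plain Python. The pattern below is the base-60 alternative of the YAML 1.1 integer pattern; the helper name sexagesimal_to_int is made up for illustration, and the function only approximates what a YAML 1.1 loader does with such a scalar.

```python
import re

# Base-60 alternative of the YAML 1.1 integer pattern.
SEXAGESIMAL_INT = re.compile(r'^[-+]?[1-9][0-9_]*(?::[0-5]?[0-9])+$')


def sexagesimal_to_int(scalar):
    """Return the base-60 value of scalar, or None if it does not match."""
    if not SEXAGESIMAL_INT.match(scalar):
        return None
    text = scalar.replace('_', '')
    sign = -1 if text.startswith('-') else 1
    total = 0
    # Each colon-separated group is one base-60 digit, most significant first.
    for part in text.lstrip('+-').split(':'):
        total = total * 60 + int(part)
    return sign * total


for scalar in ['9:00', '10:20', '12:10', '09:00']:
    print(scalar, '->', sexagesimal_to_int(scalar))
```

Under this pattern a scalar with a leading zero, such as 09:00, does not match the base-60 alternative, so the helper returns None for it.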

Unfortunately this doesn't get constructed as some special Sexagesimal instance that can be used for calculations as if it were an integer but can be printed in its original form. Therefore, if you want to use this as a string internally you have to single quote them:

classes:
  - '9:00'
  - '10:20'
  - '12:10'

which is what you would get if you dump {'classes': ['9:00', '10:20', '12:10']} (and note that the unambiguous classes doesn't get any quotes).
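A quick way to check the quoting is to dump the structure and load it back. This sketch assumes PyYAML (the yaml module from the question) is installed; the indentation of the dumped list may differ from the hand-written file above.

```python
import yaml  # PyYAML

data = {'classes': ['9:00', '10:20', '12:10']}

# Block style, so each list item ends up on its own line.
dumped = yaml.safe_dump(data, default_flow_style=False)
print(dumped)

# Loading the dumped text gives the strings back, not integers.
reloaded = yaml.safe_load(dumped)
print(reloaded)
```

The dumper quotes the time values because, left plain, they would resolve to integers when loaded again.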

That the BaseLoader gives you strings is not surprising. The BaseConstructor that is used by the BaseLoader handles any scalar as string, including integers, booleans and "your" sexagesimals:

import ruamel.yaml as yaml

yaml_str = """\
classes:
  - 12345
  - 10:20
  - abc
  - True
"""

data = yaml.load(yaml_str, Loader=yaml.BaseLoader)
print(data)
data = yaml.load(yaml_str, Loader=yaml.SafeLoader)
print(data)

gives:

{u'classes': [u'12345', u'10:20', u'abc', u'True']}
{'classes': [12345, 620, 'abc', True]}

If you really don't want to use quotes, then you have to "reset" the implicit resolver for scalars that start with numbers:

import ruamel.yaml as yaml
from ruamel.yaml.resolver import Resolver
import re

yaml_str = """\
classes:
  - 9:00
  - 10:20
  - 12:10
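"""

# Remove all implicit resolvers for scalars that start with a sign or a digit
# (this also drops the float and timestamp resolvers for those characters).
for ch in list(u'-+0123456789'):
    del Resolver.yaml_implicit_resolvers[ch]

# Register the int resolver again, this time without the sexagesimal branch.
Resolver.add_implicit_resolver(
    u'tag:yaml.org,2002:int',
    re.compile(u'''^(?:[-+]?0b[0-1_]+
    |[-+]?0o?[0-7_]+
    |[-+]?(?:0|[1-9][0-9_]*)
    |[-+]?0x[0-9a-fA-F_]+)$''', re.X),  # <- copy from resolver.py without sexagesimal support
    list(u'-+0123456789'))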

data = yaml.load(yaml_str, Loader=yaml.SafeLoader)
print(data)

gives you:

{'classes': ['9:00', '10:20', '12:10']}
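The example above uses ruamel.yaml. With PyYAML itself (the module the question mentions), a comparable effect can probably be had without touching the global resolver table, by subclassing the loader. This is a sketch rather than a confirmed recipe: the class name PlainTimeLoader is made up for illustration, and the code relies on PyYAML keeping its resolvers in the yaml_implicit_resolvers class attribute.

```python
import re
import yaml  # PyYAML

INT_TAG = 'tag:yaml.org,2002:int'

# Integer pattern as in PyYAML's resolver.py, minus the base-60 alternative.
INT_WITHOUT_SEXAGESIMAL = re.compile(r'''^(?:[-+]?0b[0-1_]+
    |[-+]?0[0-7_]+
    |[-+]?(?:0|[1-9][0-9_]*)
    |[-+]?0x[0-9a-fA-F_]+)$''', re.X)


class PlainTimeLoader(yaml.SafeLoader):
    """Hypothetical SafeLoader variant that leaves 9:00 as a string."""


# Copy the resolver table, dropping only the stock int resolver, so that
# yaml.SafeLoader itself stays unchanged.
PlainTimeLoader.yaml_implicit_resolvers = {
    first_char: [(tag, rx) for tag, rx in resolvers if tag != INT_TAG]
    for first_char, resolvers in yaml.SafeLoader.yaml_implicit_resolvers.items()
}
PlainTimeLoader.add_implicit_resolver(
    INT_TAG, INT_WITHOUT_SEXAGESIMAL, list('-+0123456789'))

yaml_str = """\
classes:
  - 9:00
  - 10:20
  - 12:10
room: 101
"""

data = yaml.load(yaml_str, Loader=PlainTimeLoader)
print(data)
```

Unlike deleting the resolver entries outright, this keeps the float and timestamp resolvers in place, and plain integers such as 101 are still loaded as int.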

Solution 2:

You should probably check the YAML documentation.

Colons are used for mapping values.

I presume you want a string and not an integer, so you should double quote your strings.
