Parse YAML and assume a certain path is always a string

I use the YAML parser from http://pyyaml.org, and I want it to always interpret certain fields as strings, but I cannot figure out how add_path_resolver() works.

For example, the parser assumes that "version" is a float:

    network:
    - name: apple
    - name: orange
    version: 2.3
    site: banana

Some files have "version: 2" (which is interpreted as int) or "version: 2.3 alpha" (which is interpreted as str).

I want them to always be interpreted as str.

It seems that yaml.add_path_resolver() should let me say, "When you see version:, always interpret it as a str", but it is not documented very well. My best guess:

 yaml.add_path_resolver(u'!root', ['version'], kind=str) 

But that does not work.

Suggestions on how to make my version field always be a string?

PS Here are some examples of the various "version" lines and their interpretation:

    (Pdb) import yaml
    (Pdb) import pprint
    (Pdb) pprint.pprint(yaml.load("---\nnetwork:\n- name: apple\n- name: orange\nversion: 2\nsite: banana"))
    {'network': [{'name': 'apple'}, {'name': 'orange'}], 'site': 'banana', 'version': 2}
    (Pdb) pprint.pprint(yaml.load("---\nnetwork:\n- name: apple\n- name: orange\nversion: 2.3\nsite: banana"))
    {'network': [{'name': 'apple'}, {'name': 'orange'}], 'site': 'banana', 'version': 2.2999999999999998}
    (Pdb) pprint.pprint(yaml.load("---\nnetwork:\n- name: apple\n- name: orange\nversion: 2.3 alpha\nsite: banana"))
    {'network': [{'name': 'apple'}, {'name': 'orange'}], 'site': 'banana', 'version': '2.3 alpha'}
2 answers

The easiest solution by far is not to use the basic .load() (which is unsafe anyway), but to call it with Loader=BaseLoader, which loads every scalar as a string:

    import yaml

    yaml_str = """\
    network:
    - name: apple
    - name: orange
    version: 2.3
    old: 2
    site: banana
    """

    data = yaml.load(yaml_str, Loader=yaml.BaseLoader)
    print(data)

gives:

 {'network': [{'name': 'apple'}, {'name': 'orange'}], 'version': '2.3', 'old': '2', 'site': 'banana'} 
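
One trade-off worth keeping in mind: BaseLoader applies no implicit typing at all, so booleans and nulls come back as strings too, not just numbers. A minimal sketch of that behaviour follows; the keys `enabled` and `comment` are made up for illustration and are not part of the question's file.

```python
import yaml

# Hypothetical sample document; only `version` comes from the question.
doc = """\
version: 2.3
enabled: true
comment: null
"""

data = yaml.load(doc, Loader=yaml.BaseLoader)

# Every scalar is left as its raw text, not only `version`.
print(data)
```

If other fields need their native types, they would have to be converted by hand after loading.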

From the current source:

  # Note: `add_path_resolver` is experimental. The API could be changed. 

It seems that it is not finished (yet?). The syntax that should work (as far as I can tell) is:

 yaml.add_path_resolver(u'tag:yaml.org,2002:str', ['version'], yaml.ScalarNode) 

However, it does not.

Apparently, the implicit type resolvers are checked first, and if one of them matches, the user-defined path resolvers are never consulted. See resolver.py for more details (look for the resolve function).
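
Since the implicit resolvers take precedence, one possible workaround is to leave resolution alone and retag the node just before it is constructed. The following is only a sketch under stated assumptions, not something PyYAML provides: `VersionAsStrLoader` is a hypothetical subclass of `yaml.SafeLoader` that overrides `construct_mapping`, and it assumes that every key named `version`, at any nesting level, should yield a string.

```python
import yaml

STR_TAG = "tag:yaml.org,2002:str"


class VersionAsStrLoader(yaml.SafeLoader):
    """Hypothetical loader: scalar values under a `version` key stay strings."""

    def construct_mapping(self, node, deep=False):
        if isinstance(node, yaml.MappingNode):
            for key_node, value_node in node.value:
                is_version_key = (
                    isinstance(key_node, yaml.ScalarNode)
                    and key_node.value == "version"
                )
                if is_version_key and isinstance(value_node, yaml.ScalarNode):
                    # Overwrite whatever tag the implicit resolver chose.
                    value_node.tag = STR_TAG
        return super().construct_mapping(node, deep=deep)


doc = "network:\n- name: apple\n- name: orange\nversion: 2.3\nold: 2\nsite: banana\n"
data = yaml.load(doc, Loader=VersionAsStrLoader)
print(data["version"], type(data["version"]))
```

Because the node keeps its original text, the value is not round-tripped through a float first; other keys such as `old` are still typed as usual.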

I suggest changing your version entry to

 version: !!str 2.3 

This will always force it to a string.
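
As a quick check, the tagged and untagged forms can be loaded side by side. This sketch uses `yaml.safe_load`, which the answer above does not mention; it is chosen here only to avoid the unsafe default loader.

```python
import yaml

# The explicit !!str tag overrides implicit typing for this one value.
tagged = yaml.safe_load("version: !!str 2.3\n")
untagged = yaml.safe_load("version: 2.3\n")

print(tagged)
print(untagged)
```

The drawback is that every file has to carry the tag, so this only helps when you control the YAML being written.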
