Set data types in YAML
I have a configuration file in YAML which contains strings, floats, integers and a list. When the YAML is loaded, I would like the list to be returned as a numpy array. So, for example, if the YAML is as follows:
name: 'John Doe'
age: 20
score:
  -- 19
  - 45
  -- 21
  - 12
  -- 32
  - 13
and I read this by
import yaml
def read(CONFIG_FILE):
    with open(CONFIG_FILE) as c:
        return yaml.load(c)
config = read('pathtoyml')
then I would like config['score'] to be a numpy array (numpy.ndarray) instead of a list. Of course, this could easily be done outside YAML with something like numpy.array(config['score']), but I want to avoid that.
I have tried setting the tag as described in the documentation (https://pyyaml.org/wiki/PyYAMLDocumentation) but I cannot make it work. For example, the following fails:
score:!!python/object:numpy.array
  -- 19
  - 45
  -- 21
  - 12
  -- 32
  - 13
Changing the tag to !!python/module:numpy.array or !!python/name:numpy.array doesn't work either.
How can I make this work? I am using Python 3.
python python-3.x yaml
edited Jan 18 at 19:53 by Anthon
asked Jan 18 at 17:30 by Aenaon
There is no list (i.e. sequence) in your YAML. Because there is no space after the dash that starts the value for score, that dash is not a sequence entry indicator, so it and everything after it form one plain multi-line scalar.
– Anthon
Jan 18 at 19:55
1 Answer
Dumping a numpy array holding your data produces a
vastly more complex YAML file than what you get by just adding a
tag. I therefore recommend that you define a tag of your own that
causes the data, as you have it, to load and be converted to numpy on
the fly. That way you don't have to walk over the loaded structure afterwards to find score or its value.
config.yaml:
name: 'John Doe'
age: 20
score: !2darray
  -- 19
  - 45
  -- 21
  - 12
  -- 32
  - 13
You also have to realize that the value for score in that file is a plain multi-line
scalar, which gets loaded as the string '-- 19 - 45 -- 21 - 12 -- 32 - 13'.
import ruamel.yaml
from pathlib import Path
import numpy

config_file = Path('config.yaml')
yaml = ruamel.yaml.YAML(typ='safe')

@yaml.register_class
class Array:
    yaml_tag = '!2darray'

    @classmethod
    def from_yaml(cls, constructor, node):
        # node.value is the folded scalar, e.g. '-- 19 - 45 -- 21 - 12'
        array = []
        for x in node.value.split():
            if x == '--':
                # a double dash starts a new row
                sub_array = []
                array.append(sub_array)
                continue
            if x == '-':
                # a single dash only separates values within a row
                continue
            sub_array.append(int(x))
        return numpy.array(array)

data = yaml.load(config_file)
print(type(data['score']))
print(data)
which gives:
<class 'numpy.ndarray'>
{'name': 'John Doe', 'age': 20, 'score': array([[19, 45],
       [21, 12],
       [32, 13]])}
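The token-walking logic in from_yaml can also be checked on its own, without any YAML library. The following is only a sketch: parse_2d is a hypothetical helper name, not part of ruamel.yaml or of the code above, and unlike that code it raises an explicit ValueError when a number appears before the first '--' (an assumption about what is wanted in that case).

```python
def parse_2d(text):
    """Turn '-- 19 - 45 -- 21 - 12' into [[19, 45], [21, 12]]."""
    rows = []
    for token in text.split():
        if token == '--':
            # a double dash opens a new row
            rows.append([])
        elif token == '-':
            # a single dash only separates values within a row
            continue
        else:
            if not rows:
                # hypothetical choice: fail loudly instead of guessing a row
                raise ValueError(f'value {token!r} appears before any "--"')
            rows[-1].append(int(token))
    return rows


rows = parse_2d('-- 19 - 45 -- 21 - 12 -- 32 - 13')
print(rows)
```

The nested list it returns can be passed to numpy.array() exactly as in from_yaml.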
If the value for score in your input were a sequence of sequences,
which requires a space after each - (only then is it interpreted as
a sequence entry indicator), the file would look like this:
name: 'John Doe'
age: 20
score: !2darray
  - - 19
    - 45
  - - 21
    - 12
  - - 32
    - 13
If that were the input, you would need to adapt the from_yaml method:
@yaml.register_class
class Array:
    yaml_tag = '!2darray'

    @classmethod
    def from_yaml(cls, constructor, node):
        array = constructor.construct_sequence(node, deep=True)
        return numpy.array(array)
This gives exactly the same output as before.
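For readers who stay with PyYAML, as in the question, roughly the same idea can be sketched with a custom constructor registered on SafeLoader. This is not the approach recommended above, just an illustration under assumptions: construct_2darray and the inline document string are hypothetical names chosen for this example, and the input is assumed to be the proper sequence of sequences (with a space after each dash).

```python
import numpy
import yaml


def construct_2darray(loader, node):
    # build the nested Python lists first, then convert them to an array
    rows = loader.construct_sequence(node, deep=True)
    return numpy.array(rows)


# register the tag on SafeLoader only, so yaml.safe_load() recognises it
yaml.add_constructor('!2darray', construct_2darray, Loader=yaml.SafeLoader)

document = """\
name: 'John Doe'
age: 20
score: !2darray
  - - 19
    - 45
  - - 21
    - 12
  - - 32
    - 13
"""

config = yaml.safe_load(document)
print(type(config['score']))
print(config['score'].shape)
```

Registering on SafeLoader keeps loading restricted to plain data plus this one tag, so no !!python/object tags are needed.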
answered Jan 18 at 19:52 by Anthon
The above has the additional advantage that you don't have to use PyYAML. Its yaml.load() is documented to be potentially unsafe, and you would need it to try to use a tag of the form !!python/object:module.type
– Anthon
Jan 18 at 20:09
Many thanks! Very good insight, really helpful!!
– Aenaon
Jan 19 at 12:50