Variable parameterised over a trait not a struct?
I'm trying to wrap my head around Rust's generics. I'm writing something to extract HTML from different web sites. What I want is something like this:
    trait CanGetTitle {
        fn get_title(&self) -> String;
    }

    struct Spider<T: CanGetTitle> {
        pub parser: T
    }

    struct GoogleParser;
    impl CanGetTitle for GoogleParser {
        fn get_title(&self) -> String {
            "title from H1".to_string().clone()
        }
    }

    struct YahooParser;
    impl CanGetTitle for YahooParser {
        fn get_title(&self) -> String {
            "title from H2".to_string().clone()
        }
    }

    enum SiteName {
        Google,
        Yahoo,
    }

    impl SiteName {
        fn from_url(url: &str) -> SiteName {
            SiteName::Google
        }
    }

    fn main() {
        let url = "http://www.google.com";
        let site_name = SiteName::from_url(&url);
        let spider: Spider<_> = match site_name {
            Google => Spider { parser: GoogleParser },
            Yahoo => Spider { parser: YahooParser }
        };
        spider.parser.get_title(); // fails
    }
I'm getting an error about the `match` returning `Spider`s parameterised over two different types. It expects it to return `Spider<GoogleParser>` because that's the return type of the first arm of the pattern match.
How can I declare that `spider` should be any `Spider<T: CanGetTitle>`?
generics rust polymorphism trait-objects
edited Jan 19 at 11:10 by E4_net_or_something_like_that
asked Dec 29 '16 at 16:43 by jbrown
2 Answers
> How can I declare that `spider` should be any `Spider<T: CanGetTitle>`?

You cannot. Simply put, the compiler would have no idea how much space to allocate to store `spider` on the stack.
Instead, you will want to use a trait object, `Box<dyn CanGetTitle>`:

    // Forward the trait through `Box`, so that a boxed trait object
    // satisfies the `T: CanGetTitle` bound on `Spider`.
    impl<T: ?Sized> CanGetTitle for Box<T>
    where
        T: CanGetTitle,
    {
        fn get_title(&self) -> String {
            (**self).get_title()
        }
    }

    fn main() {
        // Both arms now produce the same type: `Box<dyn CanGetTitle>`.
        let innards: Box<dyn CanGetTitle> = match SiteName::Google {
            SiteName::Google => Box::new(GoogleParser),
            SiteName::Yahoo => Box::new(YahooParser),
        };
        let spider = Spider { parser: innards };
    }
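For reference, here is a minimal self-contained sketch of the trait-object approach, combining the definitions from the question with the `Box` forwarding impl. The `make_spider` helper is hypothetical (it is not in the original post) and only exists so the selection logic can be called from more than one place.

```rust
trait CanGetTitle {
    fn get_title(&self) -> String;
}

struct Spider<T: CanGetTitle> {
    pub parser: T,
}

struct GoogleParser;
impl CanGetTitle for GoogleParser {
    fn get_title(&self) -> String {
        "title from H1".to_string()
    }
}

struct YahooParser;
impl CanGetTitle for YahooParser {
    fn get_title(&self) -> String {
        "title from H2".to_string()
    }
}

enum SiteName {
    Google,
    Yahoo,
}

// Forward the trait through `Box` so that `Box<dyn CanGetTitle>` itself
// satisfies the `T: CanGetTitle` bound on `Spider`.
impl<T: ?Sized + CanGetTitle> CanGetTitle for Box<T> {
    fn get_title(&self) -> String {
        (**self).get_title()
    }
}

// Hypothetical helper: picks a parser at runtime and hides the concrete
// type behind a trait object, so every arm yields the same `Spider` type.
fn make_spider(site: SiteName) -> Spider<Box<dyn CanGetTitle>> {
    let parser: Box<dyn CanGetTitle> = match site {
        SiteName::Google => Box::new(GoogleParser),
        SiteName::Yahoo => Box::new(YahooParser),
    };
    Spider { parser }
}

fn main() {
    let spider = make_spider(SiteName::Yahoo);
    println!("{}", spider.parser.get_title());
}
```

Because `Box<dyn CanGetTitle>` is itself a sized type, `Spider` needs no `?Sized` bound in this arrangement.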
edited Jan 20 at 3:33
answered Dec 29 '16 at 16:49 by Shepmaster
I'm still struggling with this. Will it work with multiple traits though? I'll need things like `ParsePage`, `GetQuery`, etc. and will need something that I can extend to cover all the traits that need implementing. – jbrown Dec 29 '16 at 17:32
@jbrown why do you believe it won't work with multiple traits? – Shepmaster Dec 29 '16 at 18:01
just checking. I'm just learning Rust. – jbrown Dec 29 '16 at 18:56
For some reason I needed to add `?Sized` into `Spider` as well, as in `struct Spider<T: ?Sized + CanGetTitle>`. This is great to know though, thanks a lot. – jbrown Dec 29 '16 at 19:32
@jbrown: The `?Sized` should not be necessary for a concrete `T`. – Matthieu M. Dec 30 '16 at 12:19
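On the multiple-traits question, one possible sketch is an umbrella trait with the others as supertraits. A trait object can name only one non-auto trait, so `Box<dyn CanGetTitle + ParsePage>` is rejected by the compiler. Everything here beyond `CanGetTitle` is an assumption for illustration: the thread only names `ParsePage`, so its method signature, the `SiteParser` trait and the `parser_for` function are all hypothetical.

```rust
trait CanGetTitle {
    fn get_title(&self) -> String;
}

// Hypothetical second trait: the signature is an assumption, since the
// comment above only mentions the trait by name.
trait ParsePage {
    fn parse_page(&self, html: &str) -> usize;
}

// Umbrella trait: anything implementing both traits is a `SiteParser`.
trait SiteParser: CanGetTitle + ParsePage {}
impl<T: CanGetTitle + ParsePage> SiteParser for T {}

struct GoogleParser;
impl CanGetTitle for GoogleParser {
    fn get_title(&self) -> String {
        "title from H1".to_string()
    }
}
impl ParsePage for GoogleParser {
    // Counts opening <h1 tags (placeholder logic).
    fn parse_page(&self, html: &str) -> usize {
        html.matches("<h1").count()
    }
}

struct YahooParser;
impl CanGetTitle for YahooParser {
    fn get_title(&self) -> String {
        "title from H2".to_string()
    }
}
impl ParsePage for YahooParser {
    // Counts opening <h2 tags (placeholder logic).
    fn parse_page(&self, html: &str) -> usize {
        html.matches("<h2").count()
    }
}

// Hypothetical selector: returns one trait object exposing both traits.
fn parser_for(url: &str) -> Box<dyn SiteParser> {
    if url.contains("yahoo") {
        Box::new(YahooParser)
    } else {
        Box::new(GoogleParser)
    }
}

fn main() {
    let parser = parser_for("http://www.google.com");
    println!("{}", parser.get_title());
    println!("{}", parser.parse_page("<h1>Hello</h1>"));
}
```

Supertrait methods are callable directly on the `dyn SiteParser` object, so further traits can be added by extending the supertrait list.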
> How can I declare that `spider` should be any `Spider<T: CanGetTitle>`?

Just to add a little to what @Shepmaster already said, `spider` cannot be any `Spider<T>`, because it has to be exactly one `Spider<T>`. Rust implements generics using monomorphization, which means it compiles a separate version of your polymorphic function for each concrete type that is used. If the compiler cannot deduce a unique `T` for a particular call site then it's a compile error. In your case, the compiler deduced that the type must be `Spider<GoogleParser>`, but then the next match arm tries to treat it as `Spider<YahooParser>`.
Using a trait object lets you defer all of that to runtime. By storing the actual object on the heap and using a `Box`, the compiler knows how much space needs to be stack allocated (just the size of a `Box`). But this comes with performance costs: there is extra pointer indirection when the data needs to be accessed and, more significantly, the optimising compiler generally cannot inline virtual calls.
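The size argument can be checked with `std::mem::size_of`. This is a small sketch, not part of the original answer: the `selector` field on `YahooParser` is a hypothetical addition so that the two parsers have different sizes.

```rust
use std::mem::size_of;

trait CanGetTitle {
    fn get_title(&self) -> String;
}

// Unit struct: occupies no bytes.
struct GoogleParser;
impl CanGetTitle for GoogleParser {
    fn get_title(&self) -> String {
        "title from H1".to_string()
    }
}

// Hypothetical variant carrying data, so the two parsers differ in size.
struct YahooParser {
    selector: String,
}
impl CanGetTitle for YahooParser {
    fn get_title(&self) -> String {
        format!("title from {}", self.selector)
    }
}

fn main() {
    // The concrete types have different sizes...
    println!("GoogleParser:         {} bytes", size_of::<GoogleParser>());
    println!("YahooParser:          {} bytes", size_of::<YahooParser>());
    // ...but the boxed trait object is always a data pointer plus a
    // vtable pointer, whatever concrete type sits behind it.
    println!("Box<dyn CanGetTitle>: {} bytes", size_of::<Box<dyn CanGetTitle>>());

    let boxed: Box<dyn CanGetTitle> = Box::new(YahooParser {
        selector: "H2".to_string(),
    });
    println!("{}", boxed.get_title());
}
```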
It is often possible to rejig things so you can work with a monomorphic type anyway. One way to do that in your case is to avoid the temporary assignment to a polymorphic variable, and use the value only at a place where you know its concrete type:

    fn do_stuff<T: CanGetTitle>(spider: Spider<T>) {
        println!("{:?}", spider.parser.get_title());
    }

    fn main() {
        let url = "http://www.google.com";
        let site_name = SiteName::from_url(&url);
        // Each arm calls a different monomorphized copy of `do_stuff`.
        match site_name {
            SiteName::Google => do_stuff(Spider { parser: GoogleParser }),
            SiteName::Yahoo => do_stuff(Spider { parser: YahooParser })
        };
    }
Notice that each time `do_stuff` is called, `T` resolves to a different type. You only write one implementation of `do_stuff`, but the compiler monomorphizes it twice, once for each type that you called it with.
If you use a `Box` then each call to `parser.get_title()` will have to be looked up in the trait object's vtable. But this version will usually be faster by avoiding the need for that lookup, and allowing the compiler the possibility of inlining the body of `parser.get_title()` in each case.
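To put the two dispatch styles side by side, here is a short sketch. The function names `title_static` and `title_dyn` are hypothetical and only serve to contrast a generic parameter with a trait-object parameter.

```rust
trait CanGetTitle {
    fn get_title(&self) -> String;
}

struct GoogleParser;
impl CanGetTitle for GoogleParser {
    fn get_title(&self) -> String {
        "title from H1".to_string()
    }
}

struct YahooParser;
impl CanGetTitle for YahooParser {
    fn get_title(&self) -> String {
        "title from H2".to_string()
    }
}

// Static dispatch: the compiler emits one copy of this function per
// concrete `T`, and the call to `get_title` can be inlined.
fn title_static<T: CanGetTitle>(parser: &T) -> String {
    parser.get_title()
}

// Dynamic dispatch: a single compiled copy; the call to `get_title`
// goes through the vtable carried by the `&dyn` reference.
fn title_dyn(parser: &dyn CanGetTitle) -> String {
    parser.get_title()
}

fn main() {
    println!("{}", title_static(&GoogleParser));
    println!("{}", title_static(&YahooParser));

    // A heterogeneous collection is only possible with trait objects.
    let parsers: [&dyn CanGetTitle; 2] = [&GoogleParser, &YahooParser];
    for parser in parsers.iter() {
        println!("{}", title_dyn(*parser));
    }
}
```

Which style fits depends on whether the concrete type is known where the value is used. The generic version needs that knowledge at each call site, while the `dyn` version only needs it where the value is created.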
edited Mar 3 '17 at 14:47
answered Dec 29 '16 at 21:36 by E4_net_or_something_like_that
Hmm, interesting. I think in this case though there'll be a lot of commonality in what I want to do between sites, with the only differences being things like exactly which HTML selectors to use to extract the data I need depending on the site, etc. – jbrown Dec 30 '16 at 9:28
"at the cost of extra pointer indirection when the data needs to be accessed" => Actually, that's the least cost you pay for it. The greater cost is that, barring an optimizer smart enough to devirtualize the call, this inhibits inlining, which is a key enabler for optimizations. So while the cost of an extra pointer dereference/virtual call is very small, the loss of inlining and optimizations can (in tight loops) be very costly indeed. – Matthieu M. Dec 30 '16 at 12:21
@MatthieuM. Thanks, made a tweak to make that clear. – E4_net_or_something_like_that Dec 30 '16 at 12:26