Variable parameterised over a trait not a struct?












I'm trying to wrap my head around Rust's generics. I'm writing something to extract HTML from different web sites. What I want is something like this:



trait CanGetTitle {
    fn get_title(&self) -> String;
}

struct Spider<T: CanGetTitle> {
    pub parser: T,
}

struct GoogleParser;
impl CanGetTitle for GoogleParser {
    fn get_title(&self) -> String {
        "title from H1".to_string()
    }
}

struct YahooParser;
impl CanGetTitle for YahooParser {
    fn get_title(&self) -> String {
        "title from H2".to_string()
    }
}

enum SiteName {
    Google,
    Yahoo,
}

impl SiteName {
    // Stub: always reports Google, whatever the URL.
    fn from_url(_url: &str) -> SiteName {
        SiteName::Google
    }
}

fn main() {
    let url = "http://www.google.com";
    let site_name = SiteName::from_url(url);
    // Fails to compile: the two match arms have different types.
    let spider: Spider<_> = match site_name {
        SiteName::Google => Spider { parser: GoogleParser },
        SiteName::Yahoo => Spider { parser: YahooParser },
    };

    spider.parser.get_title();
}


I'm getting an error about the match returning Spiders parameterised over two different types. It expects it to return Spider<GoogleParser> because that's the return type of the first arm of the pattern match.



How can I declare that spider should be any Spider<T: CanGetTitle>?










Tags: generics, rust, polymorphism, trait-objects

asked Dec 29 '16 at 16:43 by jbrown; edited Jan 19 at 11:10 by E4_net_or_something_like_that
























          2 Answers















How can I declare that spider should be any Spider<T: CanGetTitle>?

You cannot. Simply put, the compiler would have no idea how much space to allocate to store spider on the stack.

Instead, you will want to use a trait object, Box<dyn CanGetTitle> (written Box<CanGetTitle> in older Rust, before the dyn keyword):

// Forward the trait through Box, so that a Box<dyn CanGetTitle>
// itself satisfies Spider's `T: CanGetTitle` bound.
impl<T: ?Sized> CanGetTitle for Box<T>
where
    T: CanGetTitle,
{
    fn get_title(&self) -> String {
        (**self).get_title()
    }
}

fn main() {
    // Both arms now produce the same type: Box<dyn CanGetTitle>.
    let innards: Box<dyn CanGetTitle> = match SiteName::Google {
        SiteName::Google => Box::new(GoogleParser),
        SiteName::Yahoo => Box::new(YahooParser),
    };
    let spider = Spider { parser: innards };
}
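For reference, here is a self-contained sketch that combines the question's definitions with the boxed trait object, so it can be compiled on its own. Note that build_spider is a hypothetical helper introduced only for this illustration; it is not part of the original code.

```rust
trait CanGetTitle {
    fn get_title(&self) -> String;
}

// Forward the trait through Box so Box<dyn CanGetTitle> satisfies `T: CanGetTitle`.
impl<T: ?Sized + CanGetTitle> CanGetTitle for Box<T> {
    fn get_title(&self) -> String {
        (**self).get_title()
    }
}

struct Spider<T: CanGetTitle> {
    pub parser: T,
}

struct GoogleParser;
impl CanGetTitle for GoogleParser {
    fn get_title(&self) -> String {
        "title from H1".to_string()
    }
}

struct YahooParser;
impl CanGetTitle for YahooParser {
    fn get_title(&self) -> String {
        "title from H2".to_string()
    }
}

enum SiteName {
    Google,
    Yahoo,
}

// Hypothetical helper (an assumption of this sketch): picks the parser at runtime.
// Every arm yields the same type, so the match type-checks.
fn build_spider(site: SiteName) -> Spider<Box<dyn CanGetTitle>> {
    let parser: Box<dyn CanGetTitle> = match site {
        SiteName::Google => Box::new(GoogleParser),
        SiteName::Yahoo => Box::new(YahooParser),
    };
    Spider { parser }
}

fn main() {
    for site in [SiteName::Google, SiteName::Yahoo] {
        let spider = build_spider(site);
        println!("{}", spider.parser.get_title());
    }
}
```

The variable holding the spider has a single static type, Spider<Box<dyn CanGetTitle>>; which parser sits behind the Box is only decided at runtime.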































• I'm still struggling with this. Will it work with multiple traits, though? I'll need things like ParsePage, GetQuery, etc., and will need something that I can extend to cover all the traits that need implementing. – jbrown, Dec 29 '16 at 17:32

• @jbrown Why do you believe it won't work with multiple traits? – Shepmaster, Dec 29 '16 at 18:01

• Just checking. I'm just learning Rust. – jbrown, Dec 29 '16 at 18:56

• For some reason I needed to add ?Sized to Spider as well, as in struct Spider<T: ?Sized + CanGetTitle>. This is great to know though, thanks a lot. – jbrown, Dec 29 '16 at 19:32

• @jbrown: The ?Sized should not be necessary for a concrete T. – Matthieu M., Dec 30 '16 at 12:19
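On the multiple-traits question raised in the comments, one possible approach (a sketch, not necessarily what the commenters had in mind) is to bundle the traits behind a single umbrella trait, because a trait object can name only one non-auto trait. The GetQuery name comes from the comment; its method, the SiteParser trait, the describe function and the returned values are all assumptions made for illustration. Spider is also simplified here to a non-generic struct.

```rust
trait CanGetTitle {
    fn get_title(&self) -> String;
}

// Hypothetical second trait: the name comes from the comment thread,
// the method signature is an assumption made for this sketch.
trait GetQuery {
    fn get_query(&self) -> String;
}

// Hypothetical umbrella trait bundling the traits a parser must provide.
trait SiteParser: CanGetTitle + GetQuery {}

// Anything that implements both traits is automatically a SiteParser.
impl<T: CanGetTitle + GetQuery> SiteParser for T {}

struct GoogleParser;
impl CanGetTitle for GoogleParser {
    fn get_title(&self) -> String {
        "title from H1".to_string()
    }
}
impl GetQuery for GoogleParser {
    fn get_query(&self) -> String {
        "q".to_string() // made-up value for illustration
    }
}

struct YahooParser;
impl CanGetTitle for YahooParser {
    fn get_title(&self) -> String {
        "title from H2".to_string()
    }
}
impl GetQuery for YahooParser {
    fn get_query(&self) -> String {
        "p".to_string() // made-up value for illustration
    }
}

// Simplified, non-generic Spider that always stores a boxed trait object.
struct Spider {
    parser: Box<dyn SiteParser>,
}

// Methods of both supertraits are callable through the one trait object.
fn describe(spider: &Spider) -> String {
    format!(
        "{} (query parameter: {})",
        spider.parser.get_title(),
        spider.parser.get_query()
    )
}

fn main() {
    let spiders = vec![
        Spider { parser: Box::new(GoogleParser) },
        Spider { parser: Box::new(YahooParser) },
    ];
    for spider in &spiders {
        println!("{}", describe(spider));
    }
}
```

Adding another capability later means adding one more supertrait to SiteParser and implementing it for each parser.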


































How can I declare that spider should be any Spider<T: CanGetTitle>?

Just to add a little to what @Shepmaster already said, spider cannot be any Spider<T>, because it has to be exactly one Spider<T>. Rust implements generics using monomorphization, which means it compiles a separate version of each generic function or type for every concrete type that is used. If the compiler cannot deduce a unique T for a particular use, it's a compile error. In your case, the compiler deduced from the first match arm that the type must be Spider<GoogleParser>, but the second arm then tries to produce a Spider<YahooParser>.

Using a trait object lets you defer all of that to runtime. Because the actual object is stored on the heap behind a Box, the compiler knows how much stack space is needed (just the size of the Box). But this comes with performance costs: there is an extra pointer indirection whenever the data is accessed and, more significantly, the compiler usually cannot inline virtual calls unless it manages to devirtualize them.
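To make the size argument concrete, here is a small sketch that prints the sizes involved. The _scratch field on YahooParser and the sizes helper are assumptions added only so that the two parsers differ in size; the two-pointer size of the box is what current compilers produce for a boxed trait object (data pointer plus vtable pointer).

```rust
use std::mem::size_of;

trait CanGetTitle {
    fn get_title(&self) -> String;
}

// Zero-sized parser, as in the question.
struct GoogleParser;
impl CanGetTitle for GoogleParser {
    fn get_title(&self) -> String {
        "title from H1".to_string()
    }
}

// Hypothetical variant with a 64-byte field, only so the two parsers differ in size.
struct YahooParser {
    _scratch: [u8; 64],
}
impl CanGetTitle for YahooParser {
    fn get_title(&self) -> String {
        "title from H2".to_string()
    }
}

// Returns (size of GoogleParser, size of YahooParser, size of the boxed trait object).
fn sizes() -> (usize, usize, usize) {
    (
        size_of::<GoogleParser>(),
        size_of::<YahooParser>(),
        size_of::<Box<dyn CanGetTitle>>(),
    )
}

fn main() {
    let (google, yahoo, boxed) = sizes();
    println!("GoogleParser: {} bytes, YahooParser: {} bytes", google, yahoo);
    println!("Box<dyn CanGetTitle>: {} bytes, whichever parser it holds", boxed);

    // Parsers of different sizes can share one collection once they are boxed.
    let parsers: Vec<Box<dyn CanGetTitle>> = vec![
        Box::new(GoogleParser),
        Box::new(YahooParser { _scratch: [0; 64] }),
    ];
    for parser in &parsers {
        println!("{}", parser.get_title());
    }
}
```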



It is often possible to rejig things so you can work with a monomorphic type anyway. One way to do that in your case is to avoid the temporary assignment to a polymorphic variable, and use the value only at a place where you know its concrete type:

fn do_stuff<T: CanGetTitle>(spider: Spider<T>) {
    println!("{:?}", spider.parser.get_title());
}

fn main() {
    let url = "http://www.google.com";
    let site_name = SiteName::from_url(url);
    // Each arm calls a different monomorphized instance of do_stuff.
    match site_name {
        SiteName::Google => do_stuff(Spider { parser: GoogleParser }),
        SiteName::Yahoo => do_stuff(Spider { parser: YahooParser }),
    };
}


Notice that T resolves to a different type at each call to do_stuff. You write do_stuff only once, but the compiler monomorphizes it twice, once for each type you call it with.
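One way to observe this is to have the generic function report the name of the type it was instantiated with. The sketch below is a variation on do_stuff that returns a String instead of printing (an assumption made so the result can be inspected); std::any::type_name is a best-effort diagnostic, so the exact text it returns is not guaranteed.

```rust
use std::any::type_name;

trait CanGetTitle {
    fn get_title(&self) -> String;
}

struct Spider<T: CanGetTitle> {
    pub parser: T,
}

struct GoogleParser;
impl CanGetTitle for GoogleParser {
    fn get_title(&self) -> String {
        "title from H1".to_string()
    }
}

struct YahooParser;
impl CanGetTitle for YahooParser {
    fn get_title(&self) -> String {
        "title from H2".to_string()
    }
}

// One generic definition; the compiler emits a separate copy per concrete T.
fn do_stuff<T: CanGetTitle>(spider: Spider<T>) -> String {
    format!("{} -> {}", type_name::<T>(), spider.parser.get_title())
}

fn main() {
    // T = GoogleParser in the first call, T = YahooParser in the second.
    println!("{}", do_stuff(Spider { parser: GoogleParser }));
    println!("{}", do_stuff(Spider { parser: YahooParser }));
}
```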



If you use a boxed trait object, each call to parser.get_title() has to be looked up in the trait object's vtable. The generic version will usually be faster, because it avoids that lookup and gives the compiler the chance to inline the body of get_title() in each case.
































• Hmm, interesting. I think in this case, though, there'll be a lot of commonality between sites in what I want to do; the only differences will be things like exactly which HTML selectors to use to extract the data I need. – jbrown, Dec 30 '16 at 9:28

• "at the cost of extra pointer indirection when the data needs to be accessed" => Actually, that's the least of the costs you pay. The greater cost is that, barring an optimizer smart enough to devirtualize the call, this inhibits inlining, which is a key enabler for optimizations. So while the cost of an extra pointer dereference/virtual call is very small, the loss of inlining and optimizations can (in tight loops) be very costly indeed. – Matthieu M., Dec 30 '16 at 12:21

• @MatthieuM. Thanks, made a tweak to make that clear. – E4_net_or_something_like_that, Dec 30 '16 at 12:26











First answer (trait objects): answered Dec 29 '16 at 16:49 by Shepmaster, edited Jan 20 at 3:33













          • I'm still struggling with this. Will it work with multiple traits though? I'll need things like ParsePage, GetQuery, etc. and will need something that I can extend to cover all the traits that need implementing.

            – jbrown
            Dec 29 '16 at 17:32











          • @jbrown why do you believe it wont work with multiple traits?

            – Shepmaster
            Dec 29 '16 at 18:01











          • just checking. I'm just learning rust.

            – jbrown
            Dec 29 '16 at 18:56











          • For some reason I needed to add ?Sized into Spider as well, as in struct Spider<T: ?Sized + CanGetTitle >. This is great to know though, thanks a lot.

            – jbrown
            Dec 29 '16 at 19:32













          • @jbrown: The ?Sized should not be necessary for a concrete T.

            – Matthieu M.
            Dec 30 '16 at 12:19



















          • I'm still struggling with this. Will it work with multiple traits though? I'll need things like ParsePage, GetQuery, etc. and will need something that I can extend to cover all the traits that need implementing.

            – jbrown
            Dec 29 '16 at 17:32











          • @jbrown why do you believe it wont work with multiple traits?

            – Shepmaster
            Dec 29 '16 at 18:01











          • just checking. I'm just learning rust.

            – jbrown
            Dec 29 '16 at 18:56











          • For some reason I needed to add ?Sized into Spider as well, as in struct Spider<T: ?Sized + CanGetTitle >. This is great to know though, thanks a lot.

            – jbrown
            Dec 29 '16 at 19:32













          • @jbrown: The ?Sized should not be necessary for a concrete T.

            – Matthieu M.
            Dec 30 '16 at 12:19

















          I'm still struggling with this. Will it work with multiple traits though? I'll need things like ParsePage, GetQuery, etc. and will need something that I can extend to cover all the traits that need implementing.

          – jbrown
          Dec 29 '16 at 17:32





          I'm still struggling with this. Will it work with multiple traits though? I'll need things like ParsePage, GetQuery, etc. and will need something that I can extend to cover all the traits that need implementing.

          – jbrown
          Dec 29 '16 at 17:32













          @jbrown why do you believe it wont work with multiple traits?

          – Shepmaster
          Dec 29 '16 at 18:01





          @jbrown why do you believe it wont work with multiple traits?

          – Shepmaster
          Dec 29 '16 at 18:01













          just checking. I'm just learning rust.

          – jbrown
          Dec 29 '16 at 18:56





          just checking. I'm just learning rust.

          – jbrown
          Dec 29 '16 at 18:56













          For some reason I needed to add ?Sized into Spider as well, as in struct Spider<T: ?Sized + CanGetTitle >. This is great to know though, thanks a lot.

          – jbrown
          Dec 29 '16 at 19:32







          For some reason I needed to add ?Sized into Spider as well, as in struct Spider<T: ?Sized + CanGetTitle >. This is great to know though, thanks a lot.

          – jbrown
          Dec 29 '16 at 19:32















          @jbrown: The ?Sized should not be necessary for a concrete T.

          – Matthieu M.
          Dec 30 '16 at 12:19





          @jbrown: The ?Sized should not be necessary for a concrete T.

          – Matthieu M.
          Dec 30 '16 at 12:19













          4















          How can I declare that spider should be any Spider<T: CanGetTitle>?




          Just to add a little to what @Shepmaster already said, spider cannot be any Spider<T>, because it has to be exactly one Spider<T>. Rust implements generics using monomorphization (explained here) which means it compiles a separate version of your polymorphic function for each concrete type that is used. If the compiler cannot deduce a unique T for a particular call site then it's a compile error. In your case, the compiler deduced that the type must be Spider<Google>, but then the next line tries to treat it as Spider<Yahoo>.



          Using a trait object lets you defer all of that to runtime. By storing the actual object on the heap and using a Box, the compiler knows how much space needs to be stack allocated (just the size of a Box). But this comes with performance costs: there is extra pointer indirection when the data needs to be accessed and, more significantly, the optimising compiler cannot inline virtual calls.



          It is often possible to rejig things so you can work with a monomorphic type anyway. One way to do that in your case is to avoid the temporary assignment to a polymorphic variable, and use the value only at a place where you know its concrete type:



          fn do_stuff<T: CanGetTitle>(spider: Spider<T>) {
          println!("{:?}", spider.parser.get_title());
          }

          fn main() {
          let url = "http://www.google.com";
          let site_name = SiteName::from_url(&url);
          match site_name {
          SiteName::Google => do_stuff(Spider { parser: GoogleParser }),
          SiteName::Yahoo => do_stuff(Spider { parser: YahooParser })
          };
          }


          Notice that each time do_stuff is called, T resolves to a different type. You only write one implementation of do_stuff, but the compiler monomorphizes it twice - once for each type that you called it with.



          If you use a Box then each call to parser.get_title() will have to be looked up in the Box's vtable. But this version will usually be faster by avoiding the need for that lookup, and allowing the compiler the possibility of inlining the body of parser.get_title() in each case.






          share|improve this answer


























          • Hmm interesting. I think in this case though there'll be a lot of commonality for what I want to do between sites, with the only differences things like exactly which HTML selectors to use to extract the data I need depending on the site, etc.

            – jbrown
            Dec 30 '16 at 9:28











          • at the cost of extra pointer indirection when the data needs to be accessed => Actually, that's the least cost you pay for it. The greater cost is that baring an optimizer smart enough to devirtualize the call, this inhibits inlining, which is a key enabler for optimizations. So while the cost of an extra pointer dereference/virtual call is very small, the loss of inlining and optimizations can (in tight loops) be very costly indeed.

            – Matthieu M.
            Dec 30 '16 at 12:21











          • @MatthieuM. Thanks, made a tweak to make that clear.

            – E4_net_or_something_like_that
            Dec 30 '16 at 12:26
















          4















          How can I declare that spider should be any Spider<T: CanGetTitle>?




          Just to add a little to what @Shepmaster already said, spider cannot be any Spider<T>, because it has to be exactly one Spider<T>. Rust implements generics using monomorphization (explained here) which means it compiles a separate version of your polymorphic function for each concrete type that is used. If the compiler cannot deduce a unique T for a particular call site then it's a compile error. In your case, the compiler deduced that the type must be Spider<Google>, but then the next line tries to treat it as Spider<Yahoo>.



          Using a trait object lets you defer all of that to runtime. By storing the actual object on the heap and using a Box, the compiler knows how much space needs to be stack allocated (just the size of a Box). But this comes with performance costs: there is extra pointer indirection when the data needs to be accessed and, more significantly, the optimising compiler cannot inline virtual calls.



          It is often possible to rejig things so you can work with a monomorphic type anyway. One way to do that in your case is to avoid the temporary assignment to a polymorphic variable, and use the value only at a place where you know its concrete type:



          fn do_stuff<T: CanGetTitle>(spider: Spider<T>) {
          println!("{:?}", spider.parser.get_title());
          }

          fn main() {
          let url = "http://www.google.com";
          let site_name = SiteName::from_url(&url);
          match site_name {
          SiteName::Google => do_stuff(Spider { parser: GoogleParser }),
          SiteName::Yahoo => do_stuff(Spider { parser: YahooParser })
          };
          }


          Notice that each time do_stuff is called, T resolves to a different type. You only write one implementation of do_stuff, but the compiler monomorphizes it twice - once for each type that you called it with.



          If you use a Box then each call to parser.get_title() will have to be looked up in the Box's vtable. But this version will usually be faster by avoiding the need for that lookup, and allowing the compiler the possibility of inlining the body of parser.get_title() in each case.






          share|improve this answer


























          • Hmm interesting. I think in this case though there'll be a lot of commonality for what I want to do between sites, with the only differences things like exactly which HTML selectors to use to extract the data I need depending on the site, etc.

            – jbrown
            Dec 30 '16 at 9:28











          • at the cost of extra pointer indirection when the data needs to be accessed => Actually, that's the least cost you pay for it. The greater cost is that baring an optimizer smart enough to devirtualize the call, this inhibits inlining, which is a key enabler for optimizations. So while the cost of an extra pointer dereference/virtual call is very small, the loss of inlining and optimizations can (in tight loops) be very costly indeed.

            – Matthieu M.
            Dec 30 '16 at 12:21











          • @MatthieuM. Thanks, made a tweak to make that clear.

            – E4_net_or_something_like_that
            Dec 30 '16 at 12:26














          4












          4








          4








          How can I declare that spider should be any Spider<T: CanGetTitle>?




          Just to add a little to what @Shepmaster already said, spider cannot be any Spider<T>, because it has to be exactly one Spider<T>. Rust implements generics using monomorphization (explained here) which means it compiles a separate version of your polymorphic function for each concrete type that is used. If the compiler cannot deduce a unique T for a particular call site then it's a compile error. In your case, the compiler deduced that the type must be Spider<Google>, but then the next line tries to treat it as Spider<Yahoo>.



          Using a trait object lets you defer all of that to runtime. By storing the actual object on the heap and using a Box, the compiler knows how much space needs to be stack allocated (just the size of a Box). But this comes with performance costs: there is extra pointer indirection when the data needs to be accessed and, more significantly, the optimising compiler cannot inline virtual calls.



          It is often possible to rejig things so you can work with a monomorphic type anyway. One way to do that in your case is to avoid the temporary assignment to a polymorphic variable, and use the value only at a place where you know its concrete type:



          fn do_stuff<T: CanGetTitle>(spider: Spider<T>) {
          println!("{:?}", spider.parser.get_title());
          }

          fn main() {
          let url = "http://www.google.com";
          let site_name = SiteName::from_url(&url);
          match site_name {
          SiteName::Google => do_stuff(Spider { parser: GoogleParser }),
          SiteName::Yahoo => do_stuff(Spider { parser: YahooParser })
          };
          }


          Notice that each time do_stuff is called, T resolves to a different type. You only write one implementation of do_stuff, but the compiler monomorphizes it twice - once for each type that you called it with.



          If you use a Box then each call to parser.get_title() will have to be looked up in the Box's vtable. But this version will usually be faster by avoiding the need for that lookup, and allowing the compiler the possibility of inlining the body of parser.get_title() in each case.














          edited Mar 3 '17 at 14:47

























          answered Dec 29 '16 at 21:36









          E4_net_or_something_like_that














          • Hmm, interesting. I think in this case, though, there'll be a lot of commonality in what I want to do between sites, the only differences being things like exactly which HTML selectors to use to extract the data I need for each site.

            – jbrown
            Dec 30 '16 at 9:28











          • "at the cost of extra pointer indirection when the data needs to be accessed" => Actually, that's the least of the costs you pay. The greater cost is that, barring an optimizer smart enough to devirtualize the call, this inhibits inlining, which is a key enabler for other optimizations. So while the cost of an extra pointer dereference/virtual call is very small, the loss of inlining and optimizations can (in tight loops) be very costly indeed.

            – Matthieu M.
            Dec 30 '16 at 12:21











          • @MatthieuM. Thanks, made a tweak to make that clear.

            – E4_net_or_something_like_that
            Dec 30 '16 at 12:26




































