Convert Content-Type header into file extension



























I am trying to convert the Content-Type header of an HTTP response into a file extension. For HTML pages the header typically looks like "text/html; charset=utf-8"; that is the value I get back in Python. I have looked into using the mimetypes module with no success, as it doesn't seem to accommodate what I am looking for.

Rundown:

I want to convert "text/html; charset=utf-8" into ".html".

For images the content type is typically something like "image/jpeg", depending on the image type, but I am not too worried about images, since most URLs specify the image extension in the path. This is more for pages whose URLs don't end in something like "blahahah.html".

I do not want to use any libraries that are not in the Python standard library.










Tags: python, http-headers, mime-types, content-type

asked Apr 16 '15 at 12:35 by Shifty
























          1 Answer














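You could split off the parameters and strip the whitespace before calling guess_extension:

          import requests
          from mimetypes import guess_extension

          r = requests.get("http://stackoverflow.com/questions/29674905/convert-content-type-header-into-file-extension")

          # Keep only the media type: drop "; charset=..." and any surrounding whitespace
          print(guess_extension(r.headers['content-type'].partition(';')[0].strip()))
          .htm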
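Since the question asks for the standard library only, the same split-and-strip idea can be sketched without requests, operating directly on the header string. This is a minimal sketch, not the answerer's code: the helper name `extension_for` and its `default` parameter are hypothetical, and the header value is hard-coded instead of fetched.

```python
from mimetypes import guess_extension


def extension_for(content_type, default=None):
    """Map a raw Content-Type header value to a file extension.

    `extension_for` and `default` are illustrative names, not part of
    the mimetypes API.
    """
    # "text/html; charset=utf-8" -> "text/html"
    media_type = content_type.partition(";")[0].strip().lower()
    # guess_extension returns None for media types it does not know
    return guess_extension(media_type) or default


# Header value taken from the question
print(extension_for("text/html; charset=utf-8"))
# Unknown media types fall back to the caller's default
print(extension_for("application/x-no-such-type-zz", default=".bin"))
```

The header string itself could come from any HTTP client, for example the response headers returned by urllib.request in the standard library.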


























          • Thanks, you're a god. I couldn't for the life of me work out how guess_extension worked. – Shifty, Apr 16 '15 at 12:49

          • @Shifty no worries, guess_extension(r.headers['content-type']) alone will work for certain sites, but splitting should cover more bases. – Padraic Cunningham, Apr 16 '15 at 12:57

          • Weird oddity: the file extension keeps changing between ".htm" and ".html" on the same website. – Shifty, Apr 16 '15 at 13:07

          • What URL are you using it on? – Padraic Cunningham, Apr 16 '15 at 13:20

          • Any URL, try google.com. The file extension switches between .html and .htm at a totally random rate. – Shifty, Apr 16 '15 at 13:28
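The ".htm" versus ".html" switching is never resolved in the thread. One plausible explanation (an assumption, not confirmed here) is that mimetypes knows several extensions for text/html and guess_extension returns only one of them, with the choice depending on internal ordering. A sketch of one way to get a repeatable result is to consult a small table of preferred extensions first and fall back to mimetypes otherwise. The names `PREFERRED` and `stable_extension` are hypothetical.

```python
from mimetypes import guess_all_extensions

# Hypothetical table of extensions to prefer when several are registered
PREFERRED = {
    "text/html": ".html",
    "image/jpeg": ".jpg",
}


def stable_extension(content_type):
    """Return the same extension for a given Content-Type on every run."""
    media_type = content_type.partition(";")[0].strip().lower()
    if media_type in PREFERRED:
        return PREFERRED[media_type]
    # Alphabetical order is arbitrary, but it does not depend on the
    # order in which mimetypes happens to list the candidates
    candidates = sorted(guess_all_extensions(media_type))
    return candidates[0] if candidates else None


# Every extension mimetypes has registered for HTML on this system
print(guess_all_extensions("text/html"))
print(stable_extension("text/html; charset=utf-8"))  # .html
```

The override table only needs entries for the types where a particular extension matters; everything else still goes through mimetypes.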











          edited Jan 19 at 19:28 by Martijn Pieters

          answered Apr 16 '15 at 12:45 by Padraic Cunningham







