Convert Content-Type header into file extension

I am trying to convert the Content-Type header of an HTTP response into a file extension. For HTML pages the header typically looks like "text/html; charset=utf-8"; that is the value I get back in Python. I have looked into the mimetypes module with no success, as it doesn't seem to accommodate what I am looking for.

Rundown:

I want to convert "text/html; charset=utf-8" into this ".html"

An image's content-type is typically something like "image/jpeg", depending on the image type, but I am not too worried about images, since most URLs include the image's extension in the path. This is more for pages whose URLs don't end in something like "blahahah.html".

I do not want to use any libraries outside the Python standard library.

asked Apr 16 '15 at 12:35

Shifty

13012


python http-headers mime-types content-type


1 Answer

You could split and strip:

import requests
from mimetypes import guess_extension

r = requests.get("http://stackoverflow.com/questions/29674905/convert-content-type-header-into-file-extension")

# Drop parameters such as "; charset=utf-8" before looking up the extension
print(guess_extension(r.headers['content-type'].partition(';')[0].strip()))

.htm

edited Jan 19 at 19:28

Martijn Pieters♦

709k13724772299

answered Apr 16 '15 at 12:45

Padraic Cunningham

134k13121197
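
Since the question asks for standard-library code only, here is a minimal sketch of the same idea that takes the header value as a plain string, so it does not depend on how the response was fetched. The function name extension_from_content_type is hypothetical; it leans on email.message.Message to parse the header rather than splitting on ';' by hand.

```python
from email.message import Message
from mimetypes import guess_extension


def extension_from_content_type(header_value):
    """Return a file extension for a Content-Type header value, or None."""
    msg = Message()
    msg["Content-Type"] = header_value
    # get_content_type() drops parameters such as charset and lowercases the type
    return guess_extension(msg.get_content_type())


print(extension_from_content_type("text/html; charset=utf-8"))
```

One thing to be aware of: as far as I know, get_content_type() falls back to "text/plain" when the header value is malformed, so a broken header would likely map to a text extension rather than None. Whether that is acceptable depends on your use case.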


Thanks, you're a god. I couldn't for the life of me work out how that guess_extension worked.

– Shifty
Apr 16 '15 at 12:49

@Shifty no worries. guess_extension(r.headers['content-type']) alone will work for certain sites, but splitting should cover more bases.

– Padraic Cunningham
Apr 16 '15 at 12:57

Weird oddity - The file extension is changing between ".htm" and ".html" on the same website

– Shifty
Apr 16 '15 at 13:07

what url are you using it on?

– Padraic Cunningham
Apr 16 '15 at 13:20

Any URL, try google.com. The file extension switches between .html and .htm seemingly at random.

– Shifty
Apr 16 '15 at 13:28
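
The switching between ".htm" and ".html" described in these comments may come from several extensions being registered for "text/html", with the Python version in use not choosing among them in a fixed order; that explanation is an assumption, not something confirmed in this thread. If a stable result matters, one possible sketch is to pick the extension yourself. The PREFERRED table and the stable_extension name below are hypothetical.

```python
from mimetypes import guess_all_extensions

# Hypothetical preference table; extend it for the types you care about.
PREFERRED = {"text/html": ".html", "image/jpeg": ".jpg"}


def stable_extension(mime_type):
    """Return the same extension for a MIME type on every run, or None."""
    mime_type = mime_type.strip().lower()
    if mime_type in PREFERRED:
        return PREFERRED[mime_type]
    candidates = guess_all_extensions(mime_type)
    # Sorting removes any dependence on the order of the underlying tables
    return sorted(candidates)[0] if candidates else None


print(stable_extension("text/html"))
```

The input here is the bare MIME type, so strip any "; charset=..." parameter first, as in the answer above.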


