How do I access the image and image url in an RSS feed using Python?

-1

I currently have this code in Python using feedparser:

import feedparser

RSS_FEEDS = {'cnn': 'http://rss.cnn.com/rss/edition.rss'}


def get_news_test(publication="cnn"):
    feed = feedparser.parse(RSS_FEEDS[publication])
    articles_cnn = feed['entries']

    for article in articles_cnn:
        print(article)


get_news_test()

The above code returns all the current articles. Here is a sample of one of the articles it returned:

{'title': "China's internet shutdowns tactics are spreading worldwide", 'title_detail': {'type': 'text/plain', 'language': None, 'base': 'http://rss.cnn.com/rss/edition.rss', 'value': "China's internet shutdowns tactics are spreading worldwide"}, 'summary': 'When Hong Kong police fired tear gas at peaceful pro-democracy protesters in 2014, the news moved swiftly through social media. Photos and videos of mostly student demonstrators being gassed helped fuel the outrage that ultimately drove hundreds of thousands of people into the streets.', 'summary_detail': {'type': 'text/html', 'language': None, 'base': 'http://rss.cnn.com/rss/edition.rss', 'value': 'When Hong Kong police fired tear gas at peaceful pro-democracy protesters in 2014, the news moved swiftly through social media. Photos and videos of mostly student demonstrators being gassed helped fuel the outrage that ultimately drove hundreds of thousands of people into the streets.'}, 'links': [{'rel': 'alternate', 'type': 'text/html', 'href': 'https://www.cnn.com/2019/01/17/africa/internet-shutdown-zimbabwe-censorship-intl/index.html'}], 'link': 'https://www.cnn.com/2019/01/17/africa/internet-shutdown-zimbabwe-censorship-intl/index.html', 'id': 'https://www.cnn.com/2019/01/17/africa/internet-shutdown-zimbabwe-censorship-intl/index.html', 'guidislink': False, 'published': 'Fri, 18 Jan 2019 07:40:48 GMT', 'published_parsed': time.struct_time(tm_year=2019, tm_mon=1, tm_mday=18, tm_hour=7, tm_min=40, tm_sec=48, tm_wday=4, tm_yday=18, tm_isdst=0), 'media_content': [{'medium': 'image', 'url': 'https://cdn.cnn.com/cnnnext/dam/assets/190116165508-zimbabwe-protest-0115-01-super-169.jpg', 'height': '619', 'width': '1100'}, {'medium': 'image', 'url': 'https://cdn.cnn.com/cnnnext/dam/assets/190116165508-zimbabwe-protest-0115-01-large-11.jpg', 'height': '300', 'width': '300'}, {'medium': 'image', 'url': 'https://cdn.cnn.com/cnnnext/dam/assets/190116165508-zimbabwe-protest-0115-01-vertical-large-gallery.jpg', 'height': 
'552', 'width': '414'}, {'medium': 'image', 'url': 'https://cdn.cnn.com/cnnnext/dam/assets/190116165508-zimbabwe-protest-0115-01-video-synd-2.jpg', 'height': '480', 'width': '640'}, {'medium': 'image', 'url': 'https://cdn.cnn.com/cnnnext/dam/assets/190116165508-zimbabwe-protest-0115-01-live-video.jpg', 'height': '324', 'width': '576'}, {'medium': 'image', 'url': 'https://cdn.cnn.com/cnnnext/dam/assets/190116165508-zimbabwe-protest-0115-01-t1-main.jpg', 'height': '250', 'width': '250'}, {'medium': 'image', 'url': 'https://cdn.cnn.com/cnnnext/dam/assets/190116165508-zimbabwe-protest-0115-01-vertical-gallery.jpg', 'height': '360', 'width': '270'}, {'medium': 'image', 'url': 'https://cdn.cnn.com/cnnnext/dam/assets/190116165508-zimbabwe-protest-0115-01-story-body.jpg', 'height': '169', 'width': '300'}, {'medium': 'image', 'url': 'https://cdn.cnn.com/cnnnext/dam/assets/190116165508-zimbabwe-protest-0115-01-t1-main.jpg', 'height': '250', 'width': '250'}, {'medium': 'image', 'url': 'https://cdn.cnn.com/cnnnext/dam/assets/190116165508-zimbabwe-protest-0115-01-assign.jpg', 'height': '186', 'width': '248'}, {'medium': 'image', 'url': 'https://cdn.cnn.com/cnnnext/dam/assets/190116165508-zimbabwe-protest-0115-01-hp-video.jpg', 'height': '144', 'width': '256'}]}

I know I can access some portions of this, for instance the title, by calling:

print(article.title)

But I am stumped as to how to get the image data from the feed.

edited Jan 20 at 8:06

asked Jan 20 at 3:24

Obie

486

Please format your question properly for readability.

– Infected Drake
Jan 20 at 3:34

Okay, then exactly how am I to do this? What should I change to make it more readable?

– Obie
Jan 20 at 3:45

Okay done. Sorry that you downvoted my question since I did not think the result was code. I assumed that code was something I had coded, not a result that was returned. It is asinine nitpicking like this that makes me hesitant to use stackoverflow.

– Obie
Jan 20 at 3:49

SO is the best place to get answers afaik. So when you need help, you are expected to make your question readable. Moreover, I did not downvote your answer on the pretext that you had not formatted the question properly; it's because you do not have a minimal idea of what you're doing. The fact that you cannot comprehend that the returned sample is in JSON and you're trying BeautifulSoup on it made me downvote it. In the end I'd just say you should get used to JSON parsing to resolve this. 🙂

– Infected Drake
Jan 20 at 5:23

I tried JSON parsing in the past and could not get it to work and other SO answers on similar questions had suggested BS. Yes I don't know what I am doing but what happened to the idea that SO was supposed to be welcoming per this article: stackoverflow.blog/2018/04/26/… You certainly don't follow the spirit of that post.

– Obie
Jan 20 at 6:11


python rss feedparser


1 Answer
1


Each article entry has a list of assets in media_content. Each asset contains the media type under 'medium' (I only saw 'image'), its 'height' and 'width', its 'url', etc.

To simply list the media type and url for each asset, you can use the following:

import feedparser

feed = feedparser.parse("http://rss.cnn.com/rss/edition.rss")

for article in feed["entries"]:
    # Some entries may carry no media_content, so default to an empty list.
    for media in article.get("media_content", []):
        print(f"medium: {media.get('medium')}")
        print(f"   url: {media.get('url')}")

Output:

medium: image
   url: https://cdn.cnn.com/cnnnext/dam/assets/190107112254-01-game-of-thrones-spain-castle-of-zafra-t1-main.jpg
medium: image
   url: https://cdn.cnn.com/cnnnext/dam/assets/190107112254-01-game-of-thrones-spain-castle-of-zafra-assign.jpg
medium: image
   url: https://cdn.cnn.com/cnnnext/dam/assets/190107112254-01-game-of-thrones-spain-castle-of-zafra-hp-video.jpg
...
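Since each article lists the same picture in several sizes, you may only want one URL per article. The following is a minimal sketch of one way to pick the largest rendition; the helper name largest_image is hypothetical, and it assumes 'width' and 'height' arrive as numeric strings, as they do in the sample from the question. It works on plain dicts, so you can pass it an entry's media_content list directly.

```python
def largest_image(media_content):
    """Return the URL of the image asset with the largest pixel area, or None."""
    best_url, best_area = None, -1
    for media in media_content:
        if media.get("medium") != "image":
            continue
        try:
            # The sample in the question shows dimensions as strings, e.g. '619'.
            area = int(media["width"]) * int(media["height"])
        except (KeyError, TypeError, ValueError):
            continue  # skip assets without usable dimensions
        if area > best_area:
            best_url, best_area = media["url"], area
    return best_url


# Three of the assets from the sample entry in the question.
sample_media = [
    {"medium": "image", "url": "https://cdn.cnn.com/cnnnext/dam/assets/190116165508-zimbabwe-protest-0115-01-large-11.jpg", "height": "300", "width": "300"},
    {"medium": "image", "url": "https://cdn.cnn.com/cnnnext/dam/assets/190116165508-zimbabwe-protest-0115-01-super-169.jpg", "height": "619", "width": "1100"},
    {"medium": "image", "url": "https://cdn.cnn.com/cnnnext/dam/assets/190116165508-zimbabwe-protest-0115-01-hp-video.jpg", "height": "144", "width": "256"},
]

print(largest_image(sample_media))
```

With a parsed feed, a call such as largest_image(article.get("media_content", [])) inside the loop over entries should give one URL (or None) per article.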

If you want to request and save assets of type 'image', you can use requests:

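import os

import feedparser
import requests

feed = feedparser.parse("http://rss.cnn.com/rss/edition.rss")

for article in feed["entries"]:
    # Some entries may carry no media_content, so default to an empty list.
    for media in article.get("media_content", []):
        if media.get("medium") == "image":
            response = requests.get(media["url"], timeout=10)
            response.raise_for_status()  # fail on HTTP errors instead of saving an error page
            # Save under the URL's file name in the current directory.
            with open(os.path.basename(media["url"]), "wb") as handler:
                handler.write(response.content)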
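Two details may matter when saving files this way. The sample entry in the question lists the same t1-main.jpg URL twice, and a URL with a query string would leak that query into the file name if you take the basename of the whole URL. Below is a rough sketch of two helpers that address this using only the standard library; the names filename_from_url and unique_image_urls, and the example.com URL, are hypothetical and only for illustration.

```python
import os
from urllib.parse import urlparse


def filename_from_url(url):
    """Return the last path segment of a URL, ignoring query string and fragment."""
    return os.path.basename(urlparse(url).path)


def unique_image_urls(media_content):
    """Yield each image URL once, in the order it first appears."""
    seen = set()
    for media in media_content:
        url = media.get("url")
        if media.get("medium") == "image" and url and url not in seen:
            seen.add(url)
            yield url


# Two assets from the sample entry in the question; t1-main.jpg appears twice there.
sample_media = [
    {"medium": "image", "url": "https://cdn.cnn.com/cnnnext/dam/assets/190116165508-zimbabwe-protest-0115-01-t1-main.jpg", "height": "250", "width": "250"},
    {"medium": "image", "url": "https://cdn.cnn.com/cnnnext/dam/assets/190116165508-zimbabwe-protest-0115-01-story-body.jpg", "height": "169", "width": "300"},
    {"medium": "image", "url": "https://cdn.cnn.com/cnnnext/dam/assets/190116165508-zimbabwe-protest-0115-01-t1-main.jpg", "height": "250", "width": "250"},
]

for url in unique_image_urls(sample_media):
    print(filename_from_url(url))

# Hypothetical URL with a query string, to show that the query is dropped.
print(filename_from_url("https://example.com/a/b/photo.jpg?width=300#top"))
```

In the download loop, you could iterate over unique_image_urls(article.get("media_content", [])) and open filename_from_url(url) for writing, which avoids fetching the same image twice per article.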

edited Jan 20 at 14:45

answered Jan 20 at 14:13

cody

4,45121124


