How can I simplify this Python code (assignment from a book)?












I am studying the book "Python for Everybody" by Charles R. Severance and I have a question about Exercise 2 from Chapter 7.



The task is to go through the mbox-short.txt file and "When you encounter a line that starts with “X-DSPAM-Confidence:” pull apart the line to extract the floating-point number on the line. Count these lines and then compute the total of the spam confidence values from these lines. When you reach the end of the file, print out the average spam confidence."



Here is my way of doing this task:



    fname = input('Enter the file name: ')
    try:
        fhand = open(fname)
    except:
        print('File cannot be opened:', fname)
        exit()
    count = 0
    values = list()
    for line in fhand:
        if line.startswith('X-DSPAM-Confidence:'):
            string = line
            count = count + 1
            colpos = string.find(":")
            portion = string[colpos+1:]
            portion = float(portion)
            values.append(portion)
    print('Average spam confidence:', sum(values)/count)


I know this code works because I get the same result as in the book. However, I think it can be simpler, because I used a list (I declared it and then stored values in it). "Lists" is the next topic in the book, so when solving this task I didn't know anything about lists and had to google them. I solved the task this way because it is what I'd do in R, which I am already quite familiar with: I'd make a vector and store the values from my iteration in it.



So my question is: Can this code be simplified? Can I do the same task without using a list? If yes, how can I do it?







python simplify














edited Jan 20 at 15:25 by Mad Physicist

asked Jan 20 at 15:12 by Uliana Zaspa








  • For questions about simplifying working code, you can ask on codereview.stackexchange.com/tour

    – cricket_007
    Jan 20 at 15:17











  • You could just make values a variable and += the actual values. A list just looks nicer in my opinion.

    – Bailey Kocin
    Jan 20 at 15:18


























3 Answers
You could change "values" from a list to a float that holds the running total. The overhead of a list is not really needed for this problem.

    values = 0.0

Then, in the loop, use

    values += portion

and at the end divide values by count instead of calling sum(values).

Otherwise, there really is no simpler way: the problem consists of several tasks, and you must complete all of them to solve it.

1. Open the file

2. Check for an error

3. Loop through the lines

4. Find the relevant lines

5. Total up the values on those lines

6. Print the average

If you can do it in 3 lines of code, great, but that doesn't necessarily make what goes on in the background simpler. It will also probably look ugly.
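Put together, the running-total idea might look like the sketch below. The function name average_spam_confidence, the PREFIX constant, the variable name total, and the sample lines are illustrative assumptions, not part of the book's exercise; the function accepts anything that yields lines, so an opened file works as well as the small sample used here.

```python
PREFIX = 'X-DSPAM-Confidence:'  # assumed constant name, for illustration


def average_spam_confidence(lines):
    """Return the mean of the numbers on lines starting with PREFIX.

    Returns None when no line matches, to avoid dividing by zero.
    """
    total = 0.0  # running sum of the confidence values
    count = 0    # number of matching lines seen so far
    for line in lines:
        if line.startswith(PREFIX):
            # float() ignores the leading space and the trailing newline
            total += float(line[len(PREFIX):])
            count += 1
    if count == 0:
        return None
    return total / count


# Made-up sample lines standing in for the contents of a mail file
sample = [
    'From: someone@example.com\n',
    'X-DSPAM-Confidence: 0.8475\n',
    'X-DSPAM-Confidence: 0.7525\n',
]
print('Average spam confidence:', average_spam_confidence(sample))
```

With a real file, the same function could be called on the opened file object, for example average_spam_confidence(fhand), since a file object yields its lines one at a time.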

















answered Jan 20 at 15:25 by Bailey Kocin

























You could filter the file's lines before the loop; then you can collapse the other variables into one and get the values using a list comprehension. Your count is then simply the length of that list.

    interesting_lines = (line for line in fhand if line.startswith('X-DSPAM-Confidence:'))
    values = [float(line[line.find(":") + 1:]) for line in interesting_lines]
    count = len(values)

"Can I do the same task without using list?"

If the output only needs to be an average, yes: you can accumulate the sum and the count in their own variables, and then you don't need a list to call sum(values) on.

Note that open(fname) already gives you an iterable object, so your for loop is walking over the lines of the file one at a time, much as it would walk over a list of lines.
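As a rough illustration of the filtering step and of the file being iterable, here is a small sketch. It uses io.StringIO as a stand-in for open(fname) so that it runs without a file on disk; the sample lines and the average variable are made up for the example.

```python
import io

# io.StringIO stands in for open(fname); it is iterated line by line
# in the same way as a real file object.
fhand = io.StringIO(
    "From: a@example.com\n"
    "X-DSPAM-Confidence: 0.25\n"
    "Subject: hello\n"
    "X-DSPAM-Confidence: 0.75\n"
)

# Generator expression: yields only the matching lines, one at a time,
# without building an intermediate list.
interesting_lines = (line for line in fhand
                     if line.startswith('X-DSPAM-Confidence:'))

# List comprehension: convert the text after the colon to a float.
values = [float(line[line.find(":") + 1:]) for line in interesting_lines]
count = len(values)

# Guard against a file with no matching lines.
average = sum(values) / count if count else None
print(count, average)
```

Note that the condition belongs after the for clause: written as (line.startswith(...) for line in fhand), the generator would yield True/False values rather than the lines themselves.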















edited Jan 20 at 15:27

answered Jan 20 at 15:21 by cricket_007























List comprehensions can often replace for loops that append to a list:

    fname = input('Enter the file name: ')
    try:
        fhand = open(fname)
    except OSError:
        print('File cannot be opened:', fname)
        exit()

    # Keep the number after the colon from every matching line.
    values = [float(line[line.find(":")+1:]) for line in fhand if line.startswith('X-DSPAM-Confidence:')]

    print('Average spam confidence:', sum(values)/len(values))

The comprehension is simply the body of your loop combined into one expression, so it is perhaps less readable.

EDIT: Without using lists, it can be done with "reduce":

    from functools import reduce

    fname = input('Enter the file name: ')
    try:
        fhand = open(fname)
    except OSError:
        print('File cannot be opened:', fname)
        exit()

    # The accumulator is a (total, count) tuple; non-matching lines leave it unchanged.
    total, count = reduce(
        lambda acc, line: (acc[0] + float(line[line.find(":")+1:]), acc[1] + 1)
        if line.startswith('X-DSPAM-Confidence:') else acc,
        fhand,
        (0.0, 0),
    )

    print('Average spam confidence:', total / count)

Reduce is often called "fold" in other languages. It lets you iterate over a collection while carrying an "accumulator" from one item to the next. Here, the accumulator is a tuple of (total, count): for each matching line we add its value to the total and increment the count. The result is named total rather than sum so that it does not shadow the built-in sum function. See the functools.reduce documentation.

All this being said, "simplify" does not necessarily mean as little code as possible, so I would stick with your own code if you're not comfortable with these shorthand notations.
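If the lambda in the reduce version is hard to read, the same fold can be written with a named step function. This is only a sketch: the names step and PREFIX and the sample lines are assumptions made for illustration, and a list of strings stands in for the opened file.

```python
from functools import reduce

PREFIX = 'X-DSPAM-Confidence:'  # assumed constant name, for illustration


def step(acc, line):
    """Fold one line into the (total, count) accumulator."""
    total, count = acc
    if line.startswith(PREFIX):
        return (total + float(line[len(PREFIX):]), count + 1)
    return acc  # non-matching lines leave the accumulator unchanged


# Made-up lines standing in for the file contents
lines = [
    'X-DSPAM-Confidence: 0.25\n',
    'Subject: test\n',
    'X-DSPAM-Confidence: 0.75\n',
]

# reduce calls step(acc, line) for each line, starting from (0.0, 0)
total, count = reduce(step, lines, (0.0, 0))
print('Average spam confidence:', total / count)
```

Naming the step makes it clearer that reduce is doing the same work as the original loop: it visits each line once and updates two numbers.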















edited Jan 20 at 15:33

answered Jan 20 at 15:19 by Jeppe





























