Regex in Python to extract certain codes



























I have written a regex in Python to extract codes like:



I63.9 J45.909 M18.90 Z82.61 Z82.389 A030 A029 S87.02XD H4010X2 S12530K V675XXS



The regex I am using is shown below:



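import re

data = "We have the following codes to extract, I63.9 J45.909 M18.90 Z82.61 Z82.389 A030 A029 S87.02XD H4010X2 S12530K V675XXS September 2018"
# A letter, 1-2 digits, zero or more dots, 1-3 digits, then up to two word characters.
regular_expression = re.compile(r'[a-zA-Z]\d{1,2}\.*\d{1,3}\w{0,2}', re.I)
# findall returns every non-overlapping match as a list of strings.
result = regular_expression.findall(data)
print(result)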


Can someone tell me whether this regex is correct for extracting these codes, or suggest a better and more robust one?



































  • This seems like a job better suited for str.split()

    – Robert Harvey
    Jan 18 at 16:13











  • Is the requirement for a code: a letter followed by 1 or 2 digits, followed by zero or more dots, followed by 1, 2 or 3 digits, followed by 0, 1 or 2 word characters? None of the codes given in the sample needs the last \w{0,2} part of the regex.

    – Matt.G
    Jan 18 at 16:20













  • @Matt.G Sorry for the confusion; I have updated my question. There could also be codes like: S87.02XD H4010X2 S12530K V675XXS

    – Cathy
    Jan 18 at 16:53











  • @RobertHarvey I did not follow; could you explain?

    – Cathy
    Jan 18 at 16:54











  • python-reference.readthedocs.io/en/latest/docs/str/split.html

    – Robert Harvey
    Jan 18 at 17:12
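
For readers wondering what the split() suggestion might look like in practice, here is a minimal sketch. The filter function looks_like_code is hypothetical and not from the thread; it assumes a code is a whitespace-separated token that starts with an uppercase letter immediately followed by a digit, which holds for the sample data but may not hold for other inputs.

```python
data = ("We have the following codes to extract, I63.9 J45.909 M18.90 Z82.61 "
        "Z82.389 A030 A029 S87.02XD H4010X2 S12530K V675XXS September 2018")


def looks_like_code(token: str) -> bool:
    # Hypothetical rule: at least 3 characters, an uppercase letter first,
    # and a digit in second position. This rejects "September" and "2018".
    return len(token) >= 3 and token[0].isupper() and token[1].isdigit()


# Split on whitespace, then trim punctuation that may cling to a token.
tokens = [t.strip(",.;") for t in data.split()]
codes = [t for t in tokens if looks_like_code(t)]
print(codes)
```

This avoids a regex entirely, at the cost of a looser definition of what counts as a code.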
















regex python-3.x data-extraction














asked Jan 18 at 16:10 by Cathy
edited Jan 18 at 16:52

























1 Answer














You can use this regex:



pattern = r'[A-Z]+\d+(\.\d+)?(\w+)?'
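
One way this pattern might be applied to the sample string is sketched below. Two changes here are my own assumptions rather than part of the answer: the groups are made non-capturing, because re.findall returns the group contents instead of the whole match when a pattern contains capturing groups, and \b word boundaries are added so a match cannot start or stop in the middle of a longer token. The name CODE_RE is illustrative.

```python
import re

data = ("We have the following codes to extract, I63.9 J45.909 M18.90 Z82.61 "
        "Z82.389 A030 A029 S87.02XD H4010X2 S12530K V675XXS September 2018")

# Letters, digits, an optional ".digits" part, then any trailing word characters.
# (?:...) keeps the group non-capturing so findall() yields full matches.
CODE_RE = re.compile(r'\b[A-Z]+\d+(?:\.\d+)?\w*\b')

codes = CODE_RE.findall(data)
print(codes)
```

Unlike the \w{0,2} tail in the question's pattern, \w* does not cap the suffix at two characters, so a code such as V675XXS is kept whole.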





answered Jan 19 at 21:26 by lagripe





























