How to exclude all titles with find?



I have a function that gets the titles of all the pages on my website.
I don't want the titles of certain products: those containing the words "OLP NL", "Arcserve", "LicSAPk", or "Symantec".
Is this the right way to do it?



    def get_title(u):
        html = requests.get(u)
        bsObj = BeautifulSoup(html.content, 'xml')
        title = str(bsObj.title).replace('<title>', '').replace('</title>', '')
        if (title.find('Arcserve') or title.find('OLP NL') or title.find('LicSAPk')
                or title.find('Symantec') is not -1):
            return 'null'
        else:
            return title

    # Fragment from the body of the loop over products (i is the product index)
    if (title != 'null'):
        ws1['B1'] = title
        meta_desc = get_metaDesc(u)
        ws1['C1'] = meta_desc
        meta_keyWrds = get_metaKeyWrds(u)
        ws1['D1'] = meta_keyWrds
        print("writing product no." + str(i))
    else:
        print("skipped product no. " + str(i))
        continue


The problem is that the program excludes all of my products, and all I see is "skipped product no.".
Why? Not all of them contain these words.







python beautifulsoup find web-crawler






asked Mar 8 at 16:30 by Dvir Yadai







  • Example URL, please? – QHarr, Mar 8 at 16:38
  • The URL is an XML sitemap. – Dvir Yadai, Mar 8 at 17:24
  • Something you can share? – QHarr, Mar 8 at 17:24
  • cdsoft.co.il/index.php?id_product=300610&controller=product – Dvir Yadai, Mar 8 at 17:30












2 Answers
You can change the condition in the if statement to (title.find('Arcserve') != -1 or title.find('OLP NL') != -1 or title.find('LicSAPk') != -1 or title.find('Symantec') != -1), or you can create a function that checks for the terms you want to find:



    def TermFind(Title):
        terms = ['Arcserve', 'OLP NL', 'LicSAPk', 'Symantec']
        disc = False
        for val in terms:
            if Title.find(val) != -1:
                disc = True
                break
        return disc


When I ran the original if statement, it always evaluated to True regardless of the title value. I couldn't find an explanation for this behavior, but the questions "Python != operation vs "is not"" and "nested "and/or" if statements" may help. Hope it helps.
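One likely explanation, sketched here: str.find returns -1 when the substring is absent, and -1 is truthy in Python. In the original condition only the last call is compared against -1, because comparisons bind tighter than `or`, so the chain short-circuits on the first truthy result and never depends on that comparison. The sample title below is hypothetical, and the last term uses != rather than `is not` to avoid comparing an integer literal by identity.

```python
# Hypothetical title that contains none of the excluded terms.
title = "Microsoft Office 2019 Home"
terms = ['Arcserve', 'OLP NL', 'LicSAPk', 'Symantec']

# str.find returns -1 for a missing substring, so every entry is -1 here.
positions = [title.find(term) for term in terms]

# Same shape as the original condition: only the last term is compared.
# The first -1 is truthy, so `or` short-circuits and the result is truthy.
buggy = bool(positions[0] or positions[1] or positions[2] or positions[3] != -1)

# Comparing every result against -1 gives the intended answer.
fixed = any(pos != -1 for pos in positions)

print(positions, buggy, fixed)  # [-1, -1, -1, -1] True False
```

With the original shape, the title is reported as excluded even though no term matches, which is consistent with every product being skipped.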

















answered Mar 8 at 19:28 by TavoGLC












  • Amazing, now it's working. Thanks. – Dvir Yadai, Mar 8 at 20:14

















A similar idea, using any():



    import requests
    from bs4 import BeautifulSoup

    url = 'https://www.cdsoft.co.il/index.php?id_product=300610&controller=product'
    html = requests.get(url)
    bsObj = BeautifulSoup(html.content, 'lxml')
    # Strip the surrounding <title> tags from the serialized element
    title = str(bsObj.title).replace('<title>', '').replace('</title>', '')
    items = ['Arcserve', 'OLP NL', 'LicSAPk', 'Symantec']

    # Print the title only if none of the excluded terms appears in it
    if not any(item in title for item in items):
        print(title)
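The same check could be wrapped in a small helper so that it can be reused from get_title. This is only a sketch: the name is_excluded, the EXCLUDED_TERMS constant, and the case-insensitive comparison are assumptions added here (the question writes one term as "symantec" in the prose and 'Symantec' in the code), not part of the original answer.

```python
EXCLUDED_TERMS = ['Arcserve', 'OLP NL', 'LicSAPk', 'Symantec']

def is_excluded(title, terms=EXCLUDED_TERMS):
    """Return True if the title contains any excluded term, ignoring case."""
    lowered = title.lower()
    return any(term.lower() in lowered for term in terms)

# Hypothetical titles, for illustration only.
print(is_excluded("Symantec Endpoint Protection"))  # True
print(is_excluded("symantec backup exec"))          # True
print(is_excluded("Windows 10 Pro"))                # False
```

If exact-case matching is what you want, dropping the two lower() calls leaves the same logic as the any() expression above.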
















answered Mar 9 at 16:08 by QHarr


























