How to exclude titles containing certain words using find?
I have a function that gets all the titles from my website.
I don't want to get the titles of some products.
Is this the right way?
I don't want titles from products containing the words "OLP NL", "Arcserve", "LicSAPk", or "Symantec".
def get_title(u):
    html = requests.get(u)
    bsObj = BeautifulSoup(html.content, 'xml')
    title = str(bsObj.title).replace('<title>', '').replace('</title>', '')
    if (title.find('Arcserve') or title.find('OLP NL') or title.find('LicSAPk') or title.find('Symantec') is not -1):
        return 'null'
    else:
        return title

if (title != 'null'):
    ws1['B1'] = title
    meta_desc = get_metaDesc(u)
    ws1['C1'] = meta_desc
    meta_keyWrds = get_metaKeyWrds(u)
    ws1['D1'] = meta_keyWrds
    print("writing product no." + str(i))
else:
    print("skipped product no. " + str(i))
    continue
The problem is that the program excludes all my products, and all I see is "skipped product no.".
Why? Not all of them contain these words.
python beautifulsoup find web-crawler
asked Mar 8 at 16:30
Dvir Yadai
Example URL, please?
– QHarr
Mar 8 at 16:38
The URL is an XML sitemap.
– Dvir Yadai
Mar 8 at 17:24
Is it something you can share?
– QHarr
Mar 8 at 17:24
cdsoft.co.il/index.php?id_product=300610&controller=product
– Dvir Yadai
Mar 8 at 17:30
2 Answers
You can change the if statement so that every call is compared with -1: (title.find('Arcserve') != -1 or title.find('OLP NL') != -1 or title.find('LicSAPk') != -1 or title.find('Symantec') != -1)
Or you can create a function that checks for the terms you want to find:
def TermFind(Title):
    # Terms that mark a title for exclusion
    terms = ['Arcserve', 'OLP NL', 'LicSAPk', 'Symantec']
    disc = False
    for val in terms:
        # str.find returns -1 when val does not occur in Title
        if Title.find(val) != -1:
            disc = True
            break
    return disc
When I used the original if statement, the condition was always True regardless of the title value. I couldn't find an explanation for that behavior, but these two questions may help: "Python != operation vs "is not"" and "nested "and/or" if statements". Hope it helps.
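One plausible reading of the always-True result, offered as a sketch rather than a definitive diagnosis: `str.find` returns -1 when the substring is absent, and -1 is truthy, so the `or` chain can stop at the first call without ever comparing anything with -1. The comparison in the question also binds tighter than `or`, so it applies only to the last call. The function names and sample titles below are hypothetical, and `!=` stands in for the question's `is not` to avoid an identity test against a literal.

```python
TERMS = ['Arcserve', 'OLP NL', 'LicSAPk', 'Symantec']


def original_condition(title):
    # How the question's condition groups: only the last find() is
    # compared with -1; the other results are used as truth values.
    return (title.find('Arcserve') or title.find('OLP NL')
            or title.find('LicSAPk') or (title.find('Symantec') != -1))


def fixed_condition(title):
    # Compare every find() result with -1, as the answer suggests.
    return any(title.find(term) != -1 for term in TERMS)


if __name__ == '__main__':
    for sample in ['Plain product', 'Arcserve Backup']:
        print(sample,
              bool(original_condition(sample)),
              fixed_condition(sample))
```

With a title that contains none of the terms, the first `find` already yields -1, which `or` treats as true, so the product is skipped.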
answered Mar 8 at 19:28
TavoGLC
Amazing, now it's working. Thanks.
– Dvir Yadai
Mar 8 at 20:14
A similar idea using any:
import requests
from bs4 import BeautifulSoup

url = 'https://www.cdsoft.co.il/index.php?id_product=300610&controller=product'
html = requests.get(url)
bsObj = BeautifulSoup(html.content, 'lxml')
# Strip the surrounding <title> tags from the serialized element
title = str(bsObj.title).replace('<title>', '').replace('</title>', '')
items = ['Arcserve', 'OLP NL', 'LicSAPk', 'Symantec']
# Print the title only if none of the excluded terms occurs in it
if not any(item in title for item in items):
    print(title)
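The question lists "symantec" in lowercase while the code tests for 'Symantec'. If case-insensitive matching is what is wanted (an assumption, the question does not say), the same `any` idea could be wrapped in a small helper. The names `EXCLUDED_TERMS`, `is_excluded`, and `filter_titles` are hypothetical, and the sample titles are made up for illustration.

```python
EXCLUDED_TERMS = ('Arcserve', 'OLP NL', 'LicSAPk', 'Symantec')


def is_excluded(title, terms=EXCLUDED_TERMS):
    """Return True if any term occurs in title, ignoring case."""
    folded = title.casefold()
    return any(term.casefold() in folded for term in terms)


def filter_titles(titles, terms=EXCLUDED_TERMS):
    """Keep only the titles that contain none of the terms."""
    return [t for t in titles if not is_excluded(t, terms)]


if __name__ == '__main__':
    samples = ['Symantec Endpoint', 'symantec backup',
               'Office OLP NL', 'Plain product']
    print(filter_titles(samples))
```

Returning a boolean from the check, instead of the string 'null', keeps the skip decision separate from the title text itself.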
answered Mar 9 at 16:08
QHarr