Node.js Puppeteer & Cheerio Div Table Scraping
I have been working on a Node.js scraper using Puppeteer and Cheerio, but I am having trouble pulling some information out of div-based tables. I need to pull the Fruit and Vegetables tables but not the Meat table, and the three tables are not always all present.
<div class="specs__title">
<h4>Fruit</h4>
</div>
<div class="specs__table">
<div class="specs__group col-12 col-lg-6">
<div class="col-6 specs__cell specs__cell--label">Apples</div>
<div class="col-6 specs__cell">4lbs</div>
</div>
<div class="specs__group col-12 col-lg-6">
<div class="col-6 specs__cell specs__cell--label">Grapes</div>
<div class="col-6 specs__cell">3lbs</div>
</div>
</div>
<div class="specs__title">
<h4>Vegetables</h4>
</div>
<div class="specs__table">
<div class="specs__group col-12 col-lg-6">
<div class="col-6 specs__cell specs__cell--label">Carrots</div>
<div class="col-6 specs__cell">7lbs</div>
</div>
<div class="specs__group col-12 col-lg-6">
<div class="col-6 specs__cell specs__cell--label">Corn</div>
<div class="col-6 specs__cell">5lbs</div>
</div>
</div>
<div class="specs__title">
<h4>Meat</h4>
</div>
<div class="specs__table">
<div class="specs__group col-12 col-lg-6">
<div class="col-6 specs__cell specs__cell--label">Turkey</div>
<div class="col-6 specs__cell">2lbs</div>
</div>
<div class="specs__group col-12 col-lg-6">
<div class="col-6 specs__cell specs__cell--label">Beef</div>
<div class="col-6 specs__cell">1lb</div>
</div>
</div>
Any help would be appreciated.
html node.js web-scraping puppeteer cheerio
asked Mar 7 at 22:55 by Matt h
Do you have any code to share so we can see where you may have gone wrong?
– Andy Danger Gagne
Mar 7 at 23:45
The problem is that I am not sure where to begin. I need something that will pull the HTML of the div.specs__table if the div.specs__title before it has Vegetables or Fruit in its h4 tag. I am not sure how to do that with Cheerio.
– Matt h
Mar 8 at 0:03
2 Answers
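It should look something like this (not tested). Note that the heading in the question's HTML reads "Fruit", so the selector matches on that; :contains("Fruit") also matches "Fruits".
$('h4:contains("Fruit"), h4:contains("Vegetables")').map((i, h4) => {
  // The table is the sibling that follows the h4's parent, div.specs__title
  return $(h4).parent().next('.specs__table').html()
}).get()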
answered Mar 8 at 1:55 by pguardiario
I am not sure if this is the best way to do it, but this is how I got it working.
// Assumes at most three title/table pairs inside #specsContainer, in matching order.
for (let i = 0; i < 3; i++) {
  // Exact match on the heading text (the sample HTML in the question shows "Fruit").
  const title = $('#specsContainer > div.specs__title > h4', html).eq(i).text();
  if (title === "Fruits" || title === "Vegetables") {
    console.log($('#specsContainer > div.specs__table', html).eq(i).html());
  }
}
answered Mar 8 at 4:10 by Matt h
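Since the question says the three sections are not always present, the index pairing in the loop above could be written without the hard-coded 3. A minimal sketch, assuming the headings and tables have already been collected into plain arrays in page order (for example with $(selector).map((i, el) => $(el).text()).get() for the titles and the same with .html() for the tables); pickTables is a made-up name and the strings in the example are placeholders, not real page output.

```javascript
// Hypothetical helper: pair headings with tables by position and keep
// only the wanted ones. Works for any number of sections.
function pickTables(titles, tables, wanted) {
  const picked = {};
  titles.forEach((title, i) => {
    const name = title.trim();
    // Ignore headings that have no table at the same position.
    if (wanted.includes(name) && i < tables.length) {
      picked[name] = tables[i];
    }
  });
  return picked;
}

// Example where only two of the three sections are on the page.
// The HTML comments stand in for the real table markup.
const result = pickTables(
  ['Vegetables', 'Meat'],
  ['<!-- vegetables table -->', '<!-- meat table -->'],
  ['Fruit', 'Vegetables']
);
console.log(result);
```

This still relies on the titles and tables appearing in the same order, which is the same assumption the loop above makes.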