Running Logstash on multiple nodes with JDBC input plugin


I have a basic HA setup for Logstash: two identical nodes in two separate AWS availability zones. Each node runs a pipeline that extracts a dataset from a DB cluster and outputs it downstream to an Elasticsearch cluster for indexing. This works fine with a single Logstash node, but two nodes running in parallel send the same data to ES twice, because each node tracks :sql_last_value separately. Since I use the same ID as the document ID on both nodes, the repeated data is simply updated instead of being inserted twice; in other words, there is one insert and one update per dataset. This is obviously inefficient and puts unnecessary load on the ELK resources, and it gets worse as additional Logstash nodes are added.
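For reference, a minimal sketch of the kind of pipeline described, with hypothetical connection details, table, and column names (the actual statement and schema are not shown in the question):

```conf
input {
  jdbc {
    jdbc_connection_string => "jdbc:postgresql://db-cluster:5432/app"
    jdbc_user => "logstash"
    jdbc_driver_class => "org.postgresql.Driver"
    schedule => "*/5 * * * *"
    # Each node keeps its own copy of :sql_last_value on local disk,
    # which is why every node re-extracts the same rows.
    use_column_value => true
    tracking_column => "updated_at"
    tracking_column_type => "timestamp"
    statement => "SELECT * FROM events WHERE updated_at > :sql_last_value"
  }
}

output {
  elasticsearch {
    hosts => ["es-node:9200"]
    index => "events"
    # Same document ID on every node, so duplicates become updates
    # rather than duplicate documents.
    document_id => "%{id}"
  }
}
```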



Does anyone know a better way to set up parallel Logstash nodes, so that a node doesn't extract a dataset that has already been extracted by another node? One poor man's solution could be a shared NFS folder between the Logstash nodes, with each node writing :sql_last_value there, but I am not sure what side effects I might run into with that setup, especially under higher loads. Thank you!
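The shared-folder idea can be expressed with the jdbc input's `last_run_metadata_path` option, which controls where :sql_last_value is persisted. A sketch, assuming a hypothetical NFS mount point (connection settings omitted); note the plugin does not lock this file, so two nodes whose schedules fire at the same moment can still read the same old :sql_last_value and extract the same rows:

```conf
input {
  jdbc {
    # Connection settings omitted for brevity.
    # Point the tracker file at a mount shared by all nodes
    # (hypothetical path). Staggering the schedules across nodes
    # reduces, but does not eliminate, the read-then-update race.
    last_run_metadata_path => "/mnt/nfs/logstash/.events_last_run"
    schedule => "*/5 * * * *"
    statement => "SELECT * FROM events WHERE updated_at > :sql_last_value"
  }
}
```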
































  • Looks like you found the answer in this feature request from 2015: github.com/elastic/logstash/issues/2632
    – Alain Collins, Mar 11 at 18:18

  • No, I can't say I found the answer to this. Still sending duplicate data to ES from each Logstash node.
    – demisx, Mar 14 at 18:22

  • I believe the answer is that there's a 4-year-old feature request that hasn't been addressed.
    – Alain Collins, Mar 14 at 23:02

  • I don't see how this qualifies as an answer.
    – demisx, Mar 15 at 3:52

  • It's not an answer - it's a comment. It does, however, serve as a pointer to the information that shows the current status of the answer and may someday - if you're a real optimist - show the solution. If anyone else finds your question here, they will have more information than they had without the link.
    – Alain Collins, Mar 16 at 17:49

















elasticsearch logstash elastic-stack logstash-jdbc






asked Mar 8 at 1:56









demisx











