Is there a way to write to Kiba CSV destination line by line or in batches instead of all at once?
Kiba is really cool!
I'm trying to set up an ETL process in my Rails app where I'll dump a large amount of data from my SQL DB to a CSV file. If I were to implement this myself, I'd use something like find_each to load, say, 1000 records at a time and write them to the file in batches. Is there a way to do this using Kiba? From my understanding, by default all of the rows from the Source get passed to the Destination at once, which wouldn't be feasible for my situation.
kiba-etl
asked Mar 8 at 16:14
user1376350
1 Answer
Glad you like Kiba!
I'm going to make you happy by stating that your understanding is incorrect.
The rows are yielded & processed one by one in Kiba.
To see exactly how things work, I suggest you try this code:
class MySource
  def initialize(enumerable)
    @enumerable = enumerable
  end

  # Kiba calls each and expects one row per yield.
  def each
    @enumerable.each do |item|
      puts "Source is reading #{item}"
      yield item
    end
  end
end

class MyDestination
  # Kiba calls write once for every row that reaches the destination.
  def write(row)
    puts "Destination is writing #{row}"
  end
end

source MySource, (1..10)
destination MyDestination
Run this and you'll see that each item is read then written.
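If you want to see the same interleaving without running Kiba at all, here is a minimal plain-Ruby sketch. It is only a simplified model of a row-by-row pipeline, not Kiba's actual runner code, and the names LoggingSource, LoggingDestination and run_pipeline are hypothetical.

```ruby
# Simplified model of a row-by-row pipeline (not Kiba's actual runner code).
class LoggingSource
  def initialize(enumerable, log)
    @enumerable = enumerable
    @log = log
  end

  # Yields one item at a time, recording each read in the shared log.
  def each
    @enumerable.each do |item|
      @log << "read #{item}"
      yield item
    end
  end
end

class LoggingDestination
  def initialize(log)
    @log = log
  end

  # Records each write in the shared log.
  def write(row)
    @log << "write #{row}"
  end
end

# Hypothetical helper: hands each row to the destination as soon as it is yielded.
def run_pipeline(source, destination)
  source.each { |row| destination.write(row) }
end

log = []
run_pipeline(LoggingSource.new(1..3, log), LoggingDestination.new(log))
puts log
```

Each "read" entry is followed directly by the matching "write" entry, so the full set of rows never has to be held in memory at once.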
Now to your actual concrete case - what's above means that you can implement your source this way:
class ActiveRecordSource
  def initialize(model:)
    @model = model
  end

  # find_each loads records in batches behind the scenes and yields them one by one.
  def each
    @model.find_each do |record|
      yield record
    end
  end
end
then you can use it like this:
source ActiveRecordSource, model: Person.where("age > 21")
(You could also leverage find_in_batches if you wanted each row to be an array of multiple records, but that's probably not what you need here).
Hope this properly answers your question!
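On the destination side, the same one-row-at-a-time flow means a CSV destination can append each row as it arrives. Below is a minimal sketch using Ruby's standard csv library. The class name StreamingCsvDestination, its filename: and headers: arguments, and the temporary file path are all assumptions made for illustration, and each row is assumed to be a Hash keyed by header name. The last lines drive the class by hand so the sketch runs without Kiba.

```ruby
require 'csv'
require 'tmpdir'

# Hypothetical destination: writes each row to the CSV file as soon as it arrives.
class StreamingCsvDestination
  def initialize(filename:, headers:)
    @headers = headers
    @csv = CSV.open(filename, 'w')
    @csv << headers
  end

  # Called once per row; the row is assumed to be a Hash keyed by header name.
  def write(row)
    @csv << row.values_at(*@headers)
  end

  # Closes the file once all rows have been written.
  def close
    @csv.close
  end
end

# Manual driver standing in for the pipeline (assumed example data).
path = File.join(Dir.tmpdir, 'kiba_streaming_example.csv')
destination = StreamingCsvDestination.new(filename: path, headers: %w[id name])
destination.write({ 'id' => 1, 'name' => 'Ada' })
destination.write({ 'id' => 2, 'name' => 'Grace' })
destination.close
puts File.read(path)
```

In a Kiba job you would declare it with the destination keyword in the same way as the source above. If your source yields ActiveRecord objects rather than hashes, you would likely need a transform that converts each record to a Hash first.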
answered Mar 8 at 17:19
Thibaut Barrère
Thanks! Clearly I made a bad assumption. I really appreciate you taking the time to clear that up.
– user1376350
Mar 8 at 19:51
You're welcome :-) Happy hacking!
– Thibaut Barrère
Mar 15 at 14:42