Selecting multiple rows with a list of IDs in PySpark



I have a table in Spark with two attributes, ID and numOfReq.
ID values range from 1 to 100, are not in order, and each ID can appear many times in the table. I want to extract the rows whose ID is 1, 47, 54, or 89. I can do it with a for loop, as in this pseudocode:



idList = [1, 47, 54, 89]  # the IDs to extract
temp = [None, None, None, None]
i = 0
for id in idList:
    # one filtered DataFrame per ID
    temp[i] = table.filter(table['ID'] == id)
    i += 1


But this took a long time.
Is there a filter or library that does this faster? What should I change in my code? I need a PySpark solution.










  • Do you want 4 different tables for 1, 47, 54 and 89 respectively? Secondly, you use id in the for loop and then use temp[i]? i is undefined. You mention that it took you a lot of time, so did you try it in PySpark?

    – cph_sto
    2 days ago












  • i is the iteration counter, and that's pseudocode. Yes, I need exactly those four tables, and in PySpark it took a long time to run.

    – Mohammad Hassan Bigdeli Shamlo
    2 days ago












  • i is not the problem; the problem is selecting those 4 tables.

    – Mohammad Hassan Bigdeli Shamlo
    2 days ago











  • Your pseudo-code looks fine though.

    – cph_sto
    2 days ago











  • Check this; it may solve your problem. The logic is quite similar, but it uses a dictionary instead: stackoverflow.com/questions/54743574/…

    – cph_sto
    2 days ago















Tags: python, pyspark






asked 2 days ago by Mohammad Hassan Bigdeli Shamlo; edited 2 days ago











