Bigquery to get Speed based on time/location data (rows above/below current row) Announcing the arrival of Valued Associate #679: Cesar Manara Planned maintenance scheduled April 23, 2019 at 23:30 UTC (7:30pm US/Eastern) Data science time! April 2019 and salary with experience The Ask Question Wizard is Live!How to get the current time in PythonGet current time in milliseconds in Python?BigQuery Data Location Settingget list of pandas dataframe columns based on data typeCondition based on nth row above or below current row - PythonIs there a way to resample time series data in BigQuery?BigQuery speed - time from __SHUFFLE1_REPARTITION0BigQuery: SELECT in WHERE-clause with filter based on a value in the current rowbigquery for time series dataPandas time operations

Why does my GNOME settings mention "Moto C Plus"?

Lights are flickering on and off after accidentally bumping into light switch

Output the slug and name of a CPT single post taxonomy term

Is Vivien of the Wilds + Wilderness Reclamation a competitive combo?

Compiling and throwing simple dynamic exceptions at runtime for JVM

Help Recreating a Table

Can gravitational waves pass through a black hole?

Proving inequality for positive definite matrix

Does the Pact of the Blade warlock feature allow me to customize the properties of the pact weapon I create?

How to break 信じようとしていただけかも知れない into separate parts?

Coin Game with infinite paradox

Will I be more secure with my own router behind my ISP's router?

A German immigrant ancestor has a "Registration Affidavit of Alien Enemy" on file. What does that mean exactly?

Assertions In A Mock Callout Test

How to produce a PS1 prompt in bash or ksh93 similar to tcsh

How can I introduce the names of fantasy creatures to the reader?

Has a Nobel Peace laureate ever been accused of war crimes?

Recursive calls to a function - why is the address of the parameter passed to it lowering with each call?

Suing a Police Officer Instead of the Police Department

Why these surprising proportionalities of integrals involving odd zeta values?

What *exactly* is electrical current, voltage, and resistance?

How to ask rejected full-time candidates to apply to teach individual courses?

How to keep bees out of canned beverages?

Can I ask an author to send me his ebook?



Bigquery to get Speed based on time/location data (rows above/below current row)



Announcing the arrival of Valued Associate #679: Cesar Manara
Planned maintenance scheduled April 23, 2019 at 23:30 UTC (7:30pm US/Eastern)
Data science time! April 2019 and salary with experience
The Ask Question Wizard is Live!How to get the current time in PythonGet current time in milliseconds in Python?BigQuery Data Location Settingget list of pandas dataframe columns based on data typeCondition based on nth row above or below current row - PythonIs there a way to resample time series data in BigQuery?BigQuery speed - time from __SHUFFLE1_REPARTITION0BigQuery: SELECT in WHERE-clause with filter based on a value in the current rowbigquery for time series dataPandas time operations



.everyoneloves__top-leaderboard:empty,.everyoneloves__mid-leaderboard:empty,.everyoneloves__bot-mid-leaderboard:empty height:90px;width:728px;box-sizing:border-box;








3















I have a table in Bigquery with tracking data for Nascar drivers (dummy data for a project I am working on). The x and y coordinates are taken 10 times a second. The capture_frame signifies the current frame, and each sequential capture_frame should be 100 milliseconds apart because the data is taken every 100 ms.



I want to calculate each driver's speed per lap. I know how to do this in pandas but I think this is possible in bigquery. To calculate speed, I am looking at 2 rows before capture_frame and 2 rows after and then dividing by the difference in epoch time, which should be 400 milliseconds.



Here is an example of a few capture frames for 1 race for one driver for the first lap. There are a few hundred capture frames per lap and then 20 drivers mixed in as well, but it is easier to understand if we look at just one driver/race/lap.



+------+---------+-----+--------+----+------+-----+------------+-------------+-------------+
| Race | Capture | Lap | Driver | … | X | Y | Epoch_time | Delta_dist | Curr_speed |
| | _frame | | | | | | | | |
+------+---------+-----+--------+----+------+-----+------------+-------------+-------------+
| I500 | 1 | 1 | Logano | …. | 2.1 | 1 | 1552089720 | NULL | Null |
+------+---------+-----+--------+----+------+-----+------------+-------------+-------------+
| I500 | 2 | 1 | Logano | … | 2.2 | 1.1 | 1552089820 | NULL | Null |
+------+---------+-----+--------+----+------+-----+------------+-------------+-------------+
| I500 | 3 | 1 | Logano | … | 2.22 | 1.2 | 1552089920 | 2.265921446 | 0.005664804 |
+------+---------+-----+--------+----+------+-----+------------+-------------+-------------+
| I500 | 4 | 1 | Logano | .. | 3.22 | 1.5 | 1552090020 | 3.124163888 | 0.00781041 |
+------+---------+-----+--------+----+------+-----+------------+-------------+-------------+
| I500 | 5 | 1 | Logano | .. | 4.22 | 1.8 | 1552090120 | NULL | null |
+------+---------+-----+--------+----+------+-----+------------+-------------+-------------+
| I500 | 6 | 1 | Logano | .. | 5.22 | 1.9 | 1552090220 | NULL | null |
+------+---------+-----+--------+----+------+-----+------------+-------------+-------------+


The delta_dist for frame 3 is calculated by sqrt((4.22-2.1)^2 + (1.8-1)^2)/1 and the the curr_speed is that number divided by 400. The first/last 2 distances and speeds of the race will be null because there are no prior x or y coordinates which is okay as there isn't really any speed when you are .1 second from starting or stopping.



In pandas I would do (this is not great code as I am just bringing each driver and race in on its own):



#laps_per_race dictionary with num laps per race
for driver in driver_list:
for race in race_list:
driver_race_query = “SELECT * from nascar_xyz where driver=driver and Race=race”.format(driver=driver, race=race)
df_entire_race = client.query(driver_race_query).to_dataframe()
num_laps = laps_per_race[race]
for lap in num_laps:
#get subset of dataframe just for this lap
df = df_entire_race.loc[df_entire_race['Lap'] == lap]
df.sort_values(‘Epoch_time’, inplace=True)
df[‘prev_x’] = df[‘X’].shift(2)
df[‘next_x’] = df[‘X’].shift(-2)
df[‘prev_y’] = df[‘Y’].shift(2)
df[‘next_y’] = df[‘Y’].shift(-2)
#this is just distance function sqrt((x2-x1)^2 + (y2-y1)^2)
df['delta_dist'] = np.sqrt((df[‘X’].shift(-2) - df[‘X’].shift(2))**2 + (df[‘Y’].shift(-2) - df[‘Y’].shift(2))**2))

#400.0 is the time actual difference
df['Curr_speed'] = df['delta_dist']/400.0


I think in my sql query I either have to do a group by or partition by to because I want to look in each race by driver_id, then lap (if that level of abstraction makes sense). Maybe for the speed and looking capture_frames ahead I can do something with windowing (https://cloud.google.com/bigquery/docs/reference/standard-sql/analytic-function-concepts) or something called lag which seems like the equivalent of .shift() in pandas.










share|improve this question



















  • 2





    Tangential comment: I'm surprised this question got 3 upvotes in less than 5 minutes.

    – Felipe Hoffa
    Mar 9 at 1:11











  • it is not clear - what output you expect to have. can you provide example please, so we can help without speculating too much

    – Mikhail Berlyant
    Mar 9 at 12:17

















3















I have a table in Bigquery with tracking data for Nascar drivers (dummy data for a project I am working on). The x and y coordinates are taken 10 times a second. The capture_frame signifies the current frame, and each sequential capture_frame should be 100 milliseconds apart because the data is taken every 100 ms.



I want to calculate each driver's speed per lap. I know how to do this in pandas but I think this is possible in bigquery. To calculate speed, I am looking at 2 rows before capture_frame and 2 rows after and then dividing by the difference in epoch time, which should be 400 milliseconds.



Here is an example of a few capture frames for 1 race for one driver for the first lap. There are a few hundred capture frames per lap and then 20 drivers mixed in as well, but it is easier to understand if we look at just one driver/race/lap.



+------+---------+-----+--------+----+------+-----+------------+-------------+-------------+
| Race | Capture | Lap | Driver | … | X | Y | Epoch_time | Delta_dist | Curr_speed |
| | _frame | | | | | | | | |
+------+---------+-----+--------+----+------+-----+------------+-------------+-------------+
| I500 | 1 | 1 | Logano | …. | 2.1 | 1 | 1552089720 | NULL | Null |
+------+---------+-----+--------+----+------+-----+------------+-------------+-------------+
| I500 | 2 | 1 | Logano | … | 2.2 | 1.1 | 1552089820 | NULL | Null |
+------+---------+-----+--------+----+------+-----+------------+-------------+-------------+
| I500 | 3 | 1 | Logano | … | 2.22 | 1.2 | 1552089920 | 2.265921446 | 0.005664804 |
+------+---------+-----+--------+----+------+-----+------------+-------------+-------------+
| I500 | 4 | 1 | Logano | .. | 3.22 | 1.5 | 1552090020 | 3.124163888 | 0.00781041 |
+------+---------+-----+--------+----+------+-----+------------+-------------+-------------+
| I500 | 5 | 1 | Logano | .. | 4.22 | 1.8 | 1552090120 | NULL | null |
+------+---------+-----+--------+----+------+-----+------------+-------------+-------------+
| I500 | 6 | 1 | Logano | .. | 5.22 | 1.9 | 1552090220 | NULL | null |
+------+---------+-----+--------+----+------+-----+------------+-------------+-------------+


The delta_dist for frame 3 is calculated by sqrt((4.22-2.1)^2 + (1.8-1)^2)/1 and the the curr_speed is that number divided by 400. The first/last 2 distances and speeds of the race will be null because there are no prior x or y coordinates which is okay as there isn't really any speed when you are .1 second from starting or stopping.



In pandas I would do (this is not great code as I am just bringing each driver and race in on its own):



#laps_per_race dictionary with num laps per race
for driver in driver_list:
for race in race_list:
driver_race_query = “SELECT * from nascar_xyz where driver=driver and Race=race”.format(driver=driver, race=race)
df_entire_race = client.query(driver_race_query).to_dataframe()
num_laps = laps_per_race[race]
for lap in num_laps:
#get subset of dataframe just for this lap
df = df_entire_race.loc[df_entire_race['Lap'] == lap]
df.sort_values(‘Epoch_time’, inplace=True)
df[‘prev_x’] = df[‘X’].shift(2)
df[‘next_x’] = df[‘X’].shift(-2)
df[‘prev_y’] = df[‘Y’].shift(2)
df[‘next_y’] = df[‘Y’].shift(-2)
#this is just distance function sqrt((x2-x1)^2 + (y2-y1)^2)
df['delta_dist'] = np.sqrt((df[‘X’].shift(-2) - df[‘X’].shift(2))**2 + (df[‘Y’].shift(-2) - df[‘Y’].shift(2))**2))

#400.0 is the time actual difference
df['Curr_speed'] = df['delta_dist']/400.0


I think in my sql query I either have to do a group by or partition by to because I want to look in each race by driver_id, then lap (if that level of abstraction makes sense). Maybe for the speed and looking capture_frames ahead I can do something with windowing (https://cloud.google.com/bigquery/docs/reference/standard-sql/analytic-function-concepts) or something called lag which seems like the equivalent of .shift() in pandas.










share|improve this question



















  • 2





    Tangential comment: I'm surprised this question got 3 upvotes in less than 5 minutes.

    – Felipe Hoffa
    Mar 9 at 1:11











  • it is not clear - what output you expect to have. can you provide example please, so we can help without speculating too much

    – Mikhail Berlyant
    Mar 9 at 12:17













3












3








3


1






I have a table in Bigquery with tracking data for Nascar drivers (dummy data for a project I am working on). The x and y coordinates are taken 10 times a second. The capture_frame signifies the current frame, and each sequential capture_frame should be 100 milliseconds apart because the data is taken every 100 ms.



I want to calculate each driver's speed per lap. I know how to do this in pandas but I think this is possible in bigquery. To calculate speed, I am looking at 2 rows before capture_frame and 2 rows after and then dividing by the difference in epoch time, which should be 400 milliseconds.



Here is an example of a few capture frames for 1 race for one driver for the first lap. There are a few hundred capture frames per lap and then 20 drivers mixed in as well, but it is easier to understand if we look at just one driver/race/lap.



+------+---------+-----+--------+----+------+-----+------------+-------------+-------------+
| Race | Capture | Lap | Driver | … | X | Y | Epoch_time | Delta_dist | Curr_speed |
| | _frame | | | | | | | | |
+------+---------+-----+--------+----+------+-----+------------+-------------+-------------+
| I500 | 1 | 1 | Logano | …. | 2.1 | 1 | 1552089720 | NULL | Null |
+------+---------+-----+--------+----+------+-----+------------+-------------+-------------+
| I500 | 2 | 1 | Logano | … | 2.2 | 1.1 | 1552089820 | NULL | Null |
+------+---------+-----+--------+----+------+-----+------------+-------------+-------------+
| I500 | 3 | 1 | Logano | … | 2.22 | 1.2 | 1552089920 | 2.265921446 | 0.005664804 |
+------+---------+-----+--------+----+------+-----+------------+-------------+-------------+
| I500 | 4 | 1 | Logano | .. | 3.22 | 1.5 | 1552090020 | 3.124163888 | 0.00781041 |
+------+---------+-----+--------+----+------+-----+------------+-------------+-------------+
| I500 | 5 | 1 | Logano | .. | 4.22 | 1.8 | 1552090120 | NULL | null |
+------+---------+-----+--------+----+------+-----+------------+-------------+-------------+
| I500 | 6 | 1 | Logano | .. | 5.22 | 1.9 | 1552090220 | NULL | null |
+------+---------+-----+--------+----+------+-----+------------+-------------+-------------+


The delta_dist for frame 3 is calculated by sqrt((4.22-2.1)^2 + (1.8-1)^2)/1 and the the curr_speed is that number divided by 400. The first/last 2 distances and speeds of the race will be null because there are no prior x or y coordinates which is okay as there isn't really any speed when you are .1 second from starting or stopping.



In pandas I would do (this is not great code as I am just bringing each driver and race in on its own):



#laps_per_race dictionary with num laps per race
for driver in driver_list:
for race in race_list:
driver_race_query = “SELECT * from nascar_xyz where driver=driver and Race=race”.format(driver=driver, race=race)
df_entire_race = client.query(driver_race_query).to_dataframe()
num_laps = laps_per_race[race]
for lap in num_laps:
#get subset of dataframe just for this lap
df = df_entire_race.loc[df_entire_race['Lap'] == lap]
df.sort_values(‘Epoch_time’, inplace=True)
df[‘prev_x’] = df[‘X’].shift(2)
df[‘next_x’] = df[‘X’].shift(-2)
df[‘prev_y’] = df[‘Y’].shift(2)
df[‘next_y’] = df[‘Y’].shift(-2)
#this is just distance function sqrt((x2-x1)^2 + (y2-y1)^2)
df['delta_dist'] = np.sqrt((df[‘X’].shift(-2) - df[‘X’].shift(2))**2 + (df[‘Y’].shift(-2) - df[‘Y’].shift(2))**2))

#400.0 is the time actual difference
df['Curr_speed'] = df['delta_dist']/400.0


I think in my sql query I either have to do a group by or partition by to because I want to look in each race by driver_id, then lap (if that level of abstraction makes sense). Maybe for the speed and looking capture_frames ahead I can do something with windowing (https://cloud.google.com/bigquery/docs/reference/standard-sql/analytic-function-concepts) or something called lag which seems like the equivalent of .shift() in pandas.










share|improve this question
















I have a table in Bigquery with tracking data for Nascar drivers (dummy data for a project I am working on). The x and y coordinates are taken 10 times a second. The capture_frame signifies the current frame, and each sequential capture_frame should be 100 milliseconds apart because the data is taken every 100 ms.



I want to calculate each driver's speed per lap. I know how to do this in pandas but I think this is possible in bigquery. To calculate speed, I am looking at 2 rows before capture_frame and 2 rows after and then dividing by the difference in epoch time, which should be 400 milliseconds.



Here is an example of a few capture frames for 1 race for one driver for the first lap. There are a few hundred capture frames per lap and then 20 drivers mixed in as well, but it is easier to understand if we look at just one driver/race/lap.



+------+---------+-----+--------+----+------+-----+------------+-------------+-------------+
| Race | Capture | Lap | Driver | … | X | Y | Epoch_time | Delta_dist | Curr_speed |
| | _frame | | | | | | | | |
+------+---------+-----+--------+----+------+-----+------------+-------------+-------------+
| I500 | 1 | 1 | Logano | …. | 2.1 | 1 | 1552089720 | NULL | Null |
+------+---------+-----+--------+----+------+-----+------------+-------------+-------------+
| I500 | 2 | 1 | Logano | … | 2.2 | 1.1 | 1552089820 | NULL | Null |
+------+---------+-----+--------+----+------+-----+------------+-------------+-------------+
| I500 | 3 | 1 | Logano | … | 2.22 | 1.2 | 1552089920 | 2.265921446 | 0.005664804 |
+------+---------+-----+--------+----+------+-----+------------+-------------+-------------+
| I500 | 4 | 1 | Logano | .. | 3.22 | 1.5 | 1552090020 | 3.124163888 | 0.00781041 |
+------+---------+-----+--------+----+------+-----+------------+-------------+-------------+
| I500 | 5 | 1 | Logano | .. | 4.22 | 1.8 | 1552090120 | NULL | null |
+------+---------+-----+--------+----+------+-----+------------+-------------+-------------+
| I500 | 6 | 1 | Logano | .. | 5.22 | 1.9 | 1552090220 | NULL | null |
+------+---------+-----+--------+----+------+-----+------------+-------------+-------------+


The delta_dist for frame 3 is calculated by sqrt((4.22-2.1)^2 + (1.8-1)^2)/1 and the the curr_speed is that number divided by 400. The first/last 2 distances and speeds of the race will be null because there are no prior x or y coordinates which is okay as there isn't really any speed when you are .1 second from starting or stopping.



In pandas I would do (this is not great code as I am just bringing each driver and race in on its own):



#laps_per_race dictionary with num laps per race
for driver in driver_list:
for race in race_list:
driver_race_query = “SELECT * from nascar_xyz where driver=driver and Race=race”.format(driver=driver, race=race)
df_entire_race = client.query(driver_race_query).to_dataframe()
num_laps = laps_per_race[race]
for lap in num_laps:
#get subset of dataframe just for this lap
df = df_entire_race.loc[df_entire_race['Lap'] == lap]
df.sort_values(‘Epoch_time’, inplace=True)
df[‘prev_x’] = df[‘X’].shift(2)
df[‘next_x’] = df[‘X’].shift(-2)
df[‘prev_y’] = df[‘Y’].shift(2)
df[‘next_y’] = df[‘Y’].shift(-2)
#this is just distance function sqrt((x2-x1)^2 + (y2-y1)^2)
df['delta_dist'] = np.sqrt((df[‘X’].shift(-2) - df[‘X’].shift(2))**2 + (df[‘Y’].shift(-2) - df[‘Y’].shift(2))**2))

#400.0 is the time actual difference
df['Curr_speed'] = df['delta_dist']/400.0


I think in my sql query I either have to do a group by or partition by to because I want to look in each race by driver_id, then lap (if that level of abstraction makes sense). Maybe for the speed and looking capture_frames ahead I can do something with windowing (https://cloud.google.com/bigquery/docs/reference/standard-sql/analytic-function-concepts) or something called lag which seems like the equivalent of .shift() in pandas.







python sql pandas google-bigquery geospatial






share|improve this question















share|improve this question













share|improve this question




share|improve this question








edited Mar 9 at 2:58









Pikachu the Purple Wizard

2,12461529




2,12461529










asked Mar 9 at 0:51









H. RoseH. Rose

191




191







  • 2





    Tangential comment: I'm surprised this question got 3 upvotes in less than 5 minutes.

    – Felipe Hoffa
    Mar 9 at 1:11











  • it is not clear - what output you expect to have. can you provide example please, so we can help without speculating too much

    – Mikhail Berlyant
    Mar 9 at 12:17












  • 2





    Tangential comment: I'm surprised this question got 3 upvotes in less than 5 minutes.

    – Felipe Hoffa
    Mar 9 at 1:11











  • it is not clear - what output you expect to have. can you provide example please, so we can help without speculating too much

    – Mikhail Berlyant
    Mar 9 at 12:17







2




2





Tangential comment: I'm surprised this question got 3 upvotes in less than 5 minutes.

– Felipe Hoffa
Mar 9 at 1:11





Tangential comment: I'm surprised this question got 3 upvotes in less than 5 minutes.

– Felipe Hoffa
Mar 9 at 1:11













it is not clear - what output you expect to have. can you provide example please, so we can help without speculating too much

– Mikhail Berlyant
Mar 9 at 12:17





it is not clear - what output you expect to have. can you provide example please, so we can help without speculating too much

– Mikhail Berlyant
Mar 9 at 12:17












1 Answer
1






active

oldest

votes


















1














You are in the right path. I'll take a public dataset of buses moving around Staten Island - and I'll use the geographical distance by looking at their lat,lon:



WITH data AS (
SELECT bus, ST_GeogPoint(longitude, latitude) point
, PARSE_TIMESTAMP('%Y%m%d %H%M%S',FORMAT('%i %06d', day, time)) ts
FROM `fh-bigquery.mta_nyc_si.201410_bustime`
WHERE day=20141014
AND bus IN (7043, 7086, 7076, 2421, 7052, 7071)
)


SELECT *
FROM (
SELECT bus, ts, distance/time speed
FROM (
SELECT bus, ts
, ST_DISTANCE(point, LAG(point, 3) OVER(PARTITION BY bus ORDER BY ts)) distance
, TIMESTAMP_DIFF(ts, LAG(ts, 3) OVER(PARTITION BY bus ORDER BY ts), SECOND) time
FROM data
)
WHERE time IS NOT null
)
WHERE speed < 500


enter image description here






share|improve this answer























    Your Answer






    StackExchange.ifUsing("editor", function ()
    StackExchange.using("externalEditor", function ()
    StackExchange.using("snippets", function ()
    StackExchange.snippets.init();
    );
    );
    , "code-snippets");

    StackExchange.ready(function()
    var channelOptions =
    tags: "".split(" "),
    id: "1"
    ;
    initTagRenderer("".split(" "), "".split(" "), channelOptions);

    StackExchange.using("externalEditor", function()
    // Have to fire editor after snippets, if snippets enabled
    if (StackExchange.settings.snippets.snippetsEnabled)
    StackExchange.using("snippets", function()
    createEditor();
    );

    else
    createEditor();

    );

    function createEditor()
    StackExchange.prepareEditor(
    heartbeatType: 'answer',
    autoActivateHeartbeat: false,
    convertImagesToLinks: true,
    noModals: true,
    showLowRepImageUploadWarning: true,
    reputationToPostImages: 10,
    bindNavPrevention: true,
    postfix: "",
    imageUploader:
    brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
    contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
    allowUrls: true
    ,
    onDemand: true,
    discardSelector: ".discard-answer"
    ,immediatelyShowMarkdownHelp:true
    );



    );













    draft saved

    draft discarded


















    StackExchange.ready(
    function ()
    StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f55072921%2fbigquery-to-get-speed-based-on-time-location-data-rows-above-below-current-row%23new-answer', 'question_page');

    );

    Post as a guest















    Required, but never shown

























    1 Answer
    1






    active

    oldest

    votes








    1 Answer
    1






    active

    oldest

    votes









    active

    oldest

    votes






    active

    oldest

    votes









    1














    You are in the right path. I'll take a public dataset of buses moving around Staten Island - and I'll use the geographical distance by looking at their lat,lon:



    WITH data AS (
    SELECT bus, ST_GeogPoint(longitude, latitude) point
    , PARSE_TIMESTAMP('%Y%m%d %H%M%S',FORMAT('%i %06d', day, time)) ts
    FROM `fh-bigquery.mta_nyc_si.201410_bustime`
    WHERE day=20141014
    AND bus IN (7043, 7086, 7076, 2421, 7052, 7071)
    )


    SELECT *
    FROM (
    SELECT bus, ts, distance/time speed
    FROM (
    SELECT bus, ts
    , ST_DISTANCE(point, LAG(point, 3) OVER(PARTITION BY bus ORDER BY ts)) distance
    , TIMESTAMP_DIFF(ts, LAG(ts, 3) OVER(PARTITION BY bus ORDER BY ts), SECOND) time
    FROM data
    )
    WHERE time IS NOT null
    )
    WHERE speed < 500


    enter image description here






    share|improve this answer



























      1














      You are in the right path. I'll take a public dataset of buses moving around Staten Island - and I'll use the geographical distance by looking at their lat,lon:



      WITH data AS (
      SELECT bus, ST_GeogPoint(longitude, latitude) point
      , PARSE_TIMESTAMP('%Y%m%d %H%M%S',FORMAT('%i %06d', day, time)) ts
      FROM `fh-bigquery.mta_nyc_si.201410_bustime`
      WHERE day=20141014
      AND bus IN (7043, 7086, 7076, 2421, 7052, 7071)
      )


      SELECT *
      FROM (
      SELECT bus, ts, distance/time speed
      FROM (
      SELECT bus, ts
      , ST_DISTANCE(point, LAG(point, 3) OVER(PARTITION BY bus ORDER BY ts)) distance
      , TIMESTAMP_DIFF(ts, LAG(ts, 3) OVER(PARTITION BY bus ORDER BY ts), SECOND) time
      FROM data
      )
      WHERE time IS NOT null
      )
      WHERE speed < 500


      enter image description here






      share|improve this answer

























        1












        1








        1







        You are in the right path. I'll take a public dataset of buses moving around Staten Island - and I'll use the geographical distance by looking at their lat,lon:



        WITH data AS (
        SELECT bus, ST_GeogPoint(longitude, latitude) point
        , PARSE_TIMESTAMP('%Y%m%d %H%M%S',FORMAT('%i %06d', day, time)) ts
        FROM `fh-bigquery.mta_nyc_si.201410_bustime`
        WHERE day=20141014
        AND bus IN (7043, 7086, 7076, 2421, 7052, 7071)
        )


        SELECT *
        FROM (
        SELECT bus, ts, distance/time speed
        FROM (
        SELECT bus, ts
        , ST_DISTANCE(point, LAG(point, 3) OVER(PARTITION BY bus ORDER BY ts)) distance
        , TIMESTAMP_DIFF(ts, LAG(ts, 3) OVER(PARTITION BY bus ORDER BY ts), SECOND) time
        FROM data
        )
        WHERE time IS NOT null
        )
        WHERE speed < 500


        enter image description here






        share|improve this answer













        You are in the right path. I'll take a public dataset of buses moving around Staten Island - and I'll use the geographical distance by looking at their lat,lon:



        WITH data AS (
        SELECT bus, ST_GeogPoint(longitude, latitude) point
        , PARSE_TIMESTAMP('%Y%m%d %H%M%S',FORMAT('%i %06d', day, time)) ts
        FROM `fh-bigquery.mta_nyc_si.201410_bustime`
        WHERE day=20141014
        AND bus IN (7043, 7086, 7076, 2421, 7052, 7071)
        )


        SELECT *
        FROM (
        SELECT bus, ts, distance/time speed
        FROM (
        SELECT bus, ts
        , ST_DISTANCE(point, LAG(point, 3) OVER(PARTITION BY bus ORDER BY ts)) distance
        , TIMESTAMP_DIFF(ts, LAG(ts, 3) OVER(PARTITION BY bus ORDER BY ts), SECOND) time
        FROM data
        )
        WHERE time IS NOT null
        )
        WHERE speed < 500


        enter image description here







        share|improve this answer












        share|improve this answer



        share|improve this answer










        answered Mar 9 at 1:44









        Felipe HoffaFelipe Hoffa

        23k253125




        23k253125





























            draft saved

            draft discarded
















































            Thanks for contributing an answer to Stack Overflow!


            • Please be sure to answer the question. Provide details and share your research!

            But avoid


            • Asking for help, clarification, or responding to other answers.

            • Making statements based on opinion; back them up with references or personal experience.

            To learn more, see our tips on writing great answers.




            draft saved


            draft discarded














            StackExchange.ready(
            function ()
            StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f55072921%2fbigquery-to-get-speed-based-on-time-location-data-rows-above-below-current-row%23new-answer', 'question_page');

            );

            Post as a guest















            Required, but never shown





















































            Required, but never shown














            Required, but never shown












            Required, but never shown







            Required, but never shown

































            Required, but never shown














            Required, but never shown












            Required, but never shown







            Required, but never shown







            Popular posts from this blog

            1928 у кіно

            Захаров Федір Захарович

            Ель Греко