Selecting records with highest timestamp - for a join

Started by Alexander Farberover 9 years ago7 messagesgeneral
Jump to latest
#1Alexander Farber
alexander.farber@gmail.com

Good evening,

I have a question please on which kind of statement to use -

In a table I store user info coming from social networks (Facebook,
Twitter, ...):

CREATE TABLE words_social (
sid varchar(255) NOT NULL,

social integer NOT NULL CHECK (0 <= social AND social <= 6),
female integer NOT NULL CHECK (female = 0 OR female = 1),
given varchar(255) NOT NULL CHECK (given ~ '\S'),
family varchar(255),
photo varchar(255) CHECK (photo ~* '^https?://...'),
place varchar(255),
stamp integer NOT NULL, /* HOW TO USE THE LATEST stamp? */

uid integer NOT NULL REFERENCES words_users ON DELETE CASCADE,
PRIMARY KEY(sid, social)
);

I.e. a user can have several records in the above table, but I always use
the most recent one (the one with the highest "stamp") to display that user
in my game.

Then I use a custom function to retrieve current games info for a
particular user:

CREATE OR REPLACE FUNCTION words_get_games(in_uid integer)
RETURNS TABLE (
out_gid integer,
out_created integer,
out_finished integer,
out_player1 integer,
out_player2 integer,
out_played1 integer,
out_played2 integer,
out_score1 integer,
out_score2 integer,
out_hand1 text,
out_hand2 text,
out_letters varchar[15][15],
out_values integer[15][15],
out_bid integer,
out_last_tiles jsonb,
out_last_score integer
) AS
$func$
SELECT
g.gid,
EXTRACT(EPOCH FROM g.created)::int,
EXTRACT(EPOCH FROM g.finished)::int,
g.player1,
g.player2, -- can be NULL
EXTRACT(EPOCH FROM g.played1)::int,
EXTRACT(EPOCH FROM g.played2)::int,
g.score1,
g.score2,
ARRAY_TO_STRING(g.hand1, ''),
REGEXP_REPLACE(ARRAY_TO_STRING(g.hand2, ''), '.', '?', 'g'),
g.letters,
g.values,
g.bid,
m.tiles,
m.score
FROM words_games g LEFT JOIN words_moves m USING(mid)
WHERE g.player1 = in_uid
AND (g.finished IS NULL OR g.finished > CURRENT_TIMESTAMP -
INTERVAL '1 day')
UNION SELECT
g.gid,
EXTRACT(EPOCH FROM g.created)::int,
EXTRACT(EPOCH FROM g.finished)::int,
g.player2,
g.player1, -- can not be NULL
EXTRACT(EPOCH FROM g.played2)::int,
EXTRACT(EPOCH FROM g.played1)::int,
g.score2,
g.score1,
ARRAY_TO_STRING(g.hand2, ''),
REGEXP_REPLACE(ARRAY_TO_STRING(g.hand1, ''), '.', '?', 'g'),
g.letters,
g.values,
g.bid,
m.tiles,
m.score
FROM words_games g LEFT JOIN words_moves m USING(mid)
WHERE g.player2 = in_uid
AND (g.finished IS NULL OR g.finished > CURRENT_TIMESTAMP -
INTERVAL '1 day');

$func$ LANGUAGE sql;

I would like to extend the above custom function, so that user info (given
and last names, photo) is returned too.

How to approach this problem please, should I use CTE for this?

Thank you for any hints
Alex

#2Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Alexander Farber (#1)
Re: Selecting records with highest timestamp - for a join

On 10/19/2016 11:35 AM, Alexander Farber wrote:

Good evening,

I have a question please on which kind of statement to use -

In a table I store user info coming from social networks (Facebook,
Twitter, ...):

CREATE TABLE words_social (
sid varchar(255) NOT NULL,

social integer NOT NULL CHECK (0 <= social AND social <= 6),
female integer NOT NULL CHECK (female = 0 OR female = 1),
given varchar(255) NOT NULL CHECK (given ~ '\S'),
family varchar(255),
photo varchar(255) CHECK (photo ~* '^https?://...'),
place varchar(255),
stamp integer NOT NULL, /* HOW TO USE THE LATEST stamp? */

uid integer NOT NULL REFERENCES words_users ON DELETE CASCADE,
PRIMARY KEY(sid, social)
);

I.e. a user can have several records in the above table, but I always
use the most recent one (the one with the highest "stamp") to display
that user in my game.

Then I use a custom function to retrieve current games info for a
particular user:

CREATE OR REPLACE FUNCTION words_get_games(in_uid integer)
RETURNS TABLE (
out_gid integer,
out_created integer,
out_finished integer,
out_player1 integer,
out_player2 integer,
out_played1 integer,
out_played2 integer,
out_score1 integer,
out_score2 integer,
out_hand1 text,
out_hand2 text,
out_letters varchar[15][15],
out_values integer[15][15],
out_bid integer,
out_last_tiles jsonb,
out_last_score integer
) AS
$func$
SELECT
g.gid,
EXTRACT(EPOCH FROM g.created)::int,
EXTRACT(EPOCH FROM g.finished)::int,
g.player1,
g.player2, -- can be NULL
EXTRACT(EPOCH FROM g.played1)::int,
EXTRACT(EPOCH FROM g.played2)::int,
g.score1,
g.score2,
ARRAY_TO_STRING(g.hand1, ''),
REGEXP_REPLACE(ARRAY_TO_STRING(g.hand2, ''), '.', '?', 'g'),
g.letters,
g.values,
g.bid,
m.tiles,
m.score
FROM words_games g LEFT JOIN words_moves m USING(mid)
WHERE g.player1 = in_uid
AND (g.finished IS NULL OR g.finished > CURRENT_TIMESTAMP -
INTERVAL '1 day')
UNION SELECT
g.gid,
EXTRACT(EPOCH FROM g.created)::int,
EXTRACT(EPOCH FROM g.finished)::int,
g.player2,
g.player1, -- can not be NULL
EXTRACT(EPOCH FROM g.played2)::int,
EXTRACT(EPOCH FROM g.played1)::int,
g.score2,
g.score1,
ARRAY_TO_STRING(g.hand2, ''),
REGEXP_REPLACE(ARRAY_TO_STRING(g.hand1, ''), '.', '?', 'g'),
g.letters,
g.values,
g.bid,
m.tiles,
m.score
FROM words_games g LEFT JOIN words_moves m USING(mid)
WHERE g.player2 = in_uid
AND (g.finished IS NULL OR g.finished > CURRENT_TIMESTAMP -
INTERVAL '1 day');

$func$ LANGUAGE sql;

I would like to extend the above custom function, so that user info
(given and last names, photo) is returned too.

How to approach this problem please, should I use CTE for this?

For player1, player2 or both?

Since you are returning a table from words_get_games() you can
experiment by joining it's output to words_social.

Thank you for any hints
Alex

--
Adrian Klaver
adrian.klaver@aklaver.com

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#3Alexander Farber
alexander.farber@gmail.com
In reply to: Adrian Klaver (#2)
Re: Selecting records with highest timestamp - for a join

Adrian, for both player1 and player2 (because I need to display player
photos above the game board).

SQL join with words_social - yes, but how to take the most recent record
from that table?

For example there are user infos from Google+, Facebook, Twitter - but the
user has used Facebook to login lately and would expect her Facebook-photo
to be seen (the record with the highest "stamp" value).

Regards
Alex

On Wed, Oct 19, 2016 at 8:51 PM, Adrian Klaver <adrian.klaver@aklaver.com>
wrote:

Show quoted text

On 10/19/2016 11:35 AM, Alexander Farber wrote:

In a table I store user info coming from social networks:

CREATE TABLE words_social (

sid varchar(255) NOT NULL,

social integer NOT NULL CHECK (0 <= social AND social <= 6),
female integer NOT NULL CHECK (female = 0 OR female = 1),
given varchar(255) NOT NULL CHECK (given ~ '\S'),
family varchar(255),
photo varchar(255) CHECK (photo ~* '^https?://...'),
place varchar(255),
stamp integer NOT NULL, /* HOW TO USE THE LATEST stamp? */

uid integer NOT NULL REFERENCES words_users ON DELETE CASCADE,
PRIMARY KEY(sid, social)
);

I.e. a user can have several records in the above table, but I always
use the most recent one (the one with the highest "stamp") to display
that user in my game.

Then I use a custom function to retrieve current games info for a
particular user:

CREATE OR REPLACE FUNCTION words_get_games(in_uid integer)
RETURNS TABLE (
out_gid integer,
out_created integer,
out_finished integer,
out_player1 integer,
out_player2 integer,
out_played1 integer,
out_played2 integer,
out_score1 integer,
out_score2 integer,
out_hand1 text,
out_hand2 text,
out_letters varchar[15][15],
out_values integer[15][15],
out_bid integer,
out_last_tiles jsonb,
out_last_score integer
) AS
$func$
SELECT
g.gid,
EXTRACT(EPOCH FROM g.created)::int,
EXTRACT(EPOCH FROM g.finished)::int,
g.player1,
g.player2, -- can be NULL
EXTRACT(EPOCH FROM g.played1)::int,
EXTRACT(EPOCH FROM g.played2)::int,
g.score1,
g.score2,
ARRAY_TO_STRING(g.hand1, ''),
REGEXP_REPLACE(ARRAY_TO_STRING(g.hand2, ''), '.', '?',
'g'),
g.letters,
g.values,
g.bid,
m.tiles,
m.score
FROM words_games g LEFT JOIN words_moves m USING(mid)
WHERE g.player1 = in_uid
AND (g.finished IS NULL OR g.finished > CURRENT_TIMESTAMP -
INTERVAL '1 day')
UNION SELECT
g.gid,
EXTRACT(EPOCH FROM g.created)::int,
EXTRACT(EPOCH FROM g.finished)::int,
g.player2,
g.player1, -- can not be NULL
EXTRACT(EPOCH FROM g.played2)::int,
EXTRACT(EPOCH FROM g.played1)::int,
g.score2,
g.score1,
ARRAY_TO_STRING(g.hand2, ''),
REGEXP_REPLACE(ARRAY_TO_STRING(g.hand1, ''), '.', '?',
'g'),
g.letters,
g.values,
g.bid,
m.tiles,
m.score
FROM words_games g LEFT JOIN words_moves m USING(mid)
WHERE g.player2 = in_uid
AND (g.finished IS NULL OR g.finished > CURRENT_TIMESTAMP -
INTERVAL '1 day');

$func$ LANGUAGE sql;

I would like to extend the above custom function, so that user info
(given and last names, photo) is returned too.

How to approach this problem please, should I use CTE for this?

For player1, player2 or both?

Since you are returning a table from words_get_games() you can experiment
by joining it's output to words_social.

#4Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Alexander Farber (#3)
Re: Selecting records with highest timestamp - for a join

On 10/19/2016 12:44 PM, Alexander Farber wrote:

Adrian, for both player1 and player2 (because I need to display player
photos above the game board).

SQL join with words_social - yes, but how to take the most recent record
from that table?

For example there are user infos from Google+, Facebook, Twitter - but
the user has used Facebook to login lately and would expect her
Facebook-photo to be seen (the record with the highest "stamp" value).

I have not dug into your function deep enough to understand all the relationships
involved so I cannot offer anything specific. A generic method:

test[5432]=# create table ts_stamp_test(id serial PRIMARY KEY, uid integer, stamp integer NOT NULL);
CREATE TABLE
test[5432]=# insert into ts_stamp_test (uid, stamp) values (1, 5), (2, 10), (1, 12), (2, 15), (1, 18), (2, 30);
INSERT 0 6
test[5432]=# select * from ts_stamp_test;
id | uid | stamp
----+-----+-------
1 | 1 | 5
2 | 2 | 10
3 | 1 | 12
4 | 2 | 15
5 | 1 | 18
6 | 2 | 30
(6 rows)

test[5432]=# select * from ts_stamp_test where uid = 1 order by stamp desc limit 1;
id | uid | stamp
----+-----+-------
5 | 1 | 18
(1 row)

Regards
Alex

--
Adrian Klaver
adrian.klaver@aklaver.com

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#5David G. Johnston
david.g.johnston@gmail.com
In reply to: Alexander Farber (#1)
Re: Selecting records with highest timestamp - for a join

On Wed, Oct 19, 2016 at 11:35 AM, Alexander Farber <
alexander.farber@gmail.com> wrote:

I.e. a user can have several records in the above table, but I always use
the most recent one (the one with the highest "stamp") to display that user
in my game.

​And if the second most recent has a picture but the most recent one does
not? Do you want to accept the missing value because its on a more recent
record or do you want to take the most recent non-missing value?

​Assuming "most recent not missing" and given:

PRIMARY KEY(sid, social)

​You basically want:

SELECT s_id, first_nonnull(photo ORDER BY stamp DESC, social)
FROM ...
GROUP BY s_id

You need to write a custom first_nonnull function that ignores NULL and a
custom aggregate to go along with it. Examples abound on the Internet.

Note that the Window function first_value doesn't quite do this...you want
to constrain the result to be non-null unless all candidate values are null
(or there are none).

If you have a unique index on (sid, stamp) you could solve the alternative
problem with a simple (sid, max(timestamptz) join back against the social
table.

David J.

#6Alexander Farber
alexander.farber@gmail.com
In reply to: David G. Johnston (#5)
Re: Selecting records with highest timestamp - for a join

Please let me rephrase my question so that it is better understandable -

In PostgreSQL 9.5.3 I keep player infos from various social networks:

# TABLE words_social;
sid | social | female | given | family | photo | place | stamp |
uid
-------+--------+--------+---------+--------+-------+-------+------------+-----
aaaaa | 1 | 0 | Abcde1 | | | | 1470237061 |
1
aaaaa | 2 | 0 | Abcde2 | | | | 1477053188 |
1
aaaaa | 3 | 0 | Abcde3 | | | | 1477053330 |
1
kkkkk | 3 | 0 | Klmnop3 | | | | 1477053810 |
2
kkkkk | 4 | 0 | Klmnop4 | | | | 1477053857 |
2
ggggg | 2 | 0 | Ghijk2 | | | | 1477053456 |
3
ggggg | 3 | 0 | Ghijk3 | | | | 1477053645 |
3
ggggg | 4 | 0 | Ghijk4 | | | | 1477053670 |
3
xxxxx | 4 | 0 | Xyzok | | | | 1470237393 |
4
(9 rows)

The 1,2,3,4 in column "social" means "Facebook", "Twitter", etc.

For a player I can always select her "most recent" info by:

# select * from words_social s1 WHERE stamp = (SELECT max(stamp) FROM
words_social s2 WHERE s1.uid = s2.uid);
sid | social | female | given | family | photo | place | stamp |
uid
-------+--------+--------+---------+--------+-------+-------+------------+-----
aaaaa | 3 | 0 | Abcde3 | | | | 1477053330 |
1
kkkkk | 4 | 0 | Klmnop4 | | | | 1477053857 |
2
ggggg | 4 | 0 | Ghijk4 | | | | 1477053670 |
3
xxxxx | 4 | 0 | Xyzok | | | | 1470237393 |
4
(4 rows)

Then there is another table storing current games (I have omitted some
columns with game data below):

# select gid, created, finished, player1, player2 from words_games;
gid | created | finished | player1 | player2
-----+-------------------------------+----------+---------+---------
1 | 2016-10-21 14:51:12.624507+02 | | 4 | 1
2 | 2016-10-21 14:51:22.631507+02 | | 3 |
(2 rows)

Whenever a user (for example with uid=1) connects to the server, I send her
the games she is taking part in:

# select gid, created, finished, player1, player2 from words_games where
player1=1
union select gid, created, finished, player2, player1 from words_games
where player2=1;
gid | created | finished | player1 | player2
-----+-------------------------------+----------+---------+---------
1 | 2016-10-21 14:51:12.624507+02 | | 4 | 1
(1 row)

My problem: to the above UNION SELECT statement I need to add user infos
from words_social table.

(So that I can display user photos and names above the game board)

So I try this with CTE:

# with user_infos AS (select * from words_social s1 WHERE stamp = (SELECT
max(stamp) FROM words_social s2 WHERE s1.uid = s2.uid))
select g.gid, g.created, g.finished, g.player1, g.player2,
i.given from words_games g join user_infos i on (g.player1=i.uid) where
g.player1=1
union select g.gid, g.created, g.finished, g.player2, g.player1, i.given
from words_games g join user_infos i on (g.player2=i.uid) where g.player2=1;
gid | created | finished | player1 | player2 | given
-----+-------------------------------+----------+---------+---------+--------
1 | 2016-10-21 14:51:12.624507+02 | | 1 | 4 | Abcde3
(1 row)

This works well (I have advanced since me first asking few days ago), but I
still have the following problem -

I am worried that the CTE-table user_infos will get very large, once my
game has many players.

How to rewrite my query, so that I fetch games and users (player1, player2)
for a certain user id (uid) - without making huge intermediate tables?

Thank you
Alex

#7Alexander Farber
alexander.farber@gmail.com
In reply to: Alexander Farber (#6)
Re: Selecting records with highest timestamp - for a join

Should I use LATERAL JOIN?

On Fri, Oct 21, 2016 at 3:24 PM, Alexander Farber <
alexander.farber@gmail.com> wrote:

Show quoted text

In PostgreSQL 9.5.3 I keep player infos from various social networks:

# TABLE words_social;
sid  | social | female |  given  | family | photo | place |   stamp    |
uid
-------+--------+--------+---------+--------+-------+-------
+------------+-----
aaaaa |      1 |      0 | Abcde1  |        |       |       | 1470237061 |
1
aaaaa |      2 |      0 | Abcde2  |        |       |       | 1477053188 |
1
aaaaa |      3 |      0 | Abcde3  |        |       |       | 1477053330 |
1
kkkkk |      3 |      0 | Klmnop3 |        |       |       | 1477053810 |
2
kkkkk |      4 |      0 | Klmnop4 |        |       |       | 1477053857 |
2
ggggg |      2 |      0 | Ghijk2  |        |       |       | 1477053456 |
3
ggggg |      3 |      0 | Ghijk3  |        |       |       | 1477053645 |
3
ggggg |      4 |      0 | Ghijk4  |        |       |       | 1477053670 |
3
xxxxx |      4 |      0 | Xyzok   |        |       |       | 1470237393 |
4
(9 rows)

The 1,2,3,4 in column "social" means "Facebook", "Twitter", etc.

For a player I can always select her "most recent" info by:

# select * from words_social s1 WHERE stamp = (SELECT max(stamp) FROM
words_social s2 WHERE s1.uid = s2.uid);
sid  | social | female |  given  | family | photo | place |   stamp    |
uid
-------+--------+--------+---------+--------+-------+-------
+------------+-----
aaaaa |      3 |      0 | Abcde3  |        |       |       | 1477053330 |
1
kkkkk |      4 |      0 | Klmnop4 |        |       |       | 1477053857 |
2
ggggg |      4 |      0 | Ghijk4  |        |       |       | 1477053670 |
3
xxxxx |      4 |      0 | Xyzok   |        |       |       | 1470237393 |
4
(4 rows)

Then there is another table storing current games (I have omitted some
columns with game data below):

# select gid, created, finished, player1, player2 from words_games;
gid | created | finished | player1 | player2
-----+-------------------------------+----------+---------+---------
1 | 2016-10-21 14:51:12.624507+02 | | 4 | 1
2 | 2016-10-21 14:51:22.631507+02 | | 3 |
(2 rows)

Whenever a user (for example with uid=1) connects to the server, I send
her the games she is taking part in:

# select gid, created, finished, player1, player2 from words_games where
player1=1
union select gid, created, finished, player2, player1 from words_games
where player2=1;
gid | created | finished | player1 | player2
-----+-------------------------------+----------+---------+---------
1 | 2016-10-21 14:51:12.624507+02 | | 4 | 1
(1 row)

My problem: to the above UNION SELECT statement I need to add user infos
from words_social table.

(So that I can display user photos and names above the game board)

So I try this with CTE:

# with user_infos AS (select * from words_social s1 WHERE stamp = (SELECT
max(stamp) FROM words_social s2 WHERE s1.uid = s2.uid))
select g.gid, g.created, g.finished, g.player1, g.player2,
i.given from words_games g join user_infos i on (g.player1=i.uid) where
g.player1=1
union select g.gid, g.created, g.finished, g.player2, g.player1, i.given
from words_games g join user_infos i on (g.player2=i.uid) where g.player2=1;
gid | created | finished | player1 | player2 | given
-----+-------------------------------+----------+---------+-
--------+--------
1 | 2016-10-21 14:51:12.624507+02 | | 1 | 4 |
Abcde3
(1 row)

This works well (I have advanced since me first asking few days ago), but
I still have the following problem -

I am worried that the CTE-table user_infos will get very large, once my
game has many players.