Fetching last n records from Posgresql

Started by Deole, Pushkar (Pushkar)about 10 years ago9 messagesgeneral
Jump to latest

Hi,

Does PostgreSQL support a query to fetch last 'n' records that match the selection criteria. I am trying to fetch records from a table with start date that falls in last 30 days, however, I want to fetch the oldest 'n' records and not the recent ones. I know there is a LIMIT clause which I can use but it will fetch the first 'n' records.
I came across an approach which says that I can reverse the order and then use LIMIT and then order the records back using timestamp as below, but looking at the execution plan, it has to do a sort twice which may affect the performance of query if 'n' is large number:

WITH t AS (
SELECT * FROM mytable ORDER BY record_date ASC LIMIT 5
)
SELECT * FROM t ORDER BY record_date DESC;

Any thoughts/opinions?

Thanks,
Pushkar

#2John R Pierce
pierce@hogranch.com
In reply to: Deole, Pushkar (Pushkar) (#1)
Re: Fetching last n records from Posgresql

On 3/29/2016 11:40 PM, Deole, Pushkar (Pushkar) wrote:

WITH t AS (

SELECT * FROM mytable ORDER BY record_date ASC LIMIT 5

)

SELECT * FROM t ORDER BY record_date DESC;

why do it twice when you can just do....

select * from t order by record_date desc limit 5;

--
john r pierce, recycling bits in santa cruz

In reply to: John R Pierce (#2)
Re: Fetching last n records from Posgresql

select * from t order by record_date desc limit 5;

this will return the recent 5 records.. what I want is the oldest 5 records (in last 30 days)

From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of John R Pierce
Sent: Wednesday, March 30, 2016 12:38 PM
To: pgsql-general@postgresql.org
Subject: Re: [GENERAL] Fetching last n records from Posgresql

On 3/29/2016 11:40 PM, Deole, Pushkar (Pushkar) wrote:

WITH t AS (
SELECT * FROM mytable ORDER BY record_date ASC LIMIT 5
)
SELECT * FROM t ORDER BY record_date DESC;

why do it twice when you can just do....

select * from t order by record_date desc limit 5;

--

john r pierce, recycling bits in santa cruz

#4Sándor Daku
daku.sandor@gmail.com
In reply to: Deole, Pushkar (Pushkar) (#3)
Re: Fetching last n records from Posgresql

On 30 March 2016 at 11:19, Deole, Pushkar (Pushkar) <pdeole@avaya.com>
wrote:

select * from t order by record_date desc limit 5;

this will return the recent 5 records.. what I want is the oldest 5
records (in last 30 days)

*From:* pgsql-general-owner@postgresql.org [mailto:
pgsql-general-owner@postgresql.org] *On Behalf Of *John R Pierce
*Sent:* Wednesday, March 30, 2016 12:38 PM
*To:* pgsql-general@postgresql.org
*Subject:* Re: [GENERAL] Fetching last n records from Posgresql

On 3/29/2016 11:40 PM, Deole, Pushkar (Pushkar) wrote:

WITH t AS (

SELECT * FROM mytable ORDER BY record_date ASC LIMIT 5

)

SELECT * FROM t ORDER BY record_date DESC;

why do it twice when you can just do....

select * from t order by record_date desc limit 5;
--

john r pierce, recycling bits in santa cruz

Then:

select * from t where record_date>now()-'30 day'::interval order by
record_date limit 5;

Regards,
Sándor

Ezt az e-mailt egy Avast védelemmel rendelkező, vírusmentes számítógépről
küldték.
www.avast.com
<https://www.avast.com/sig-email?utm_medium=email&amp;utm_source=link&amp;utm_campaign=sig-email&amp;utm_content=webmail&gt;
<#DDB4FAA8-2DD7-40BB-A1B8-4E2AA1F9FDF2>

#5Moreno Andreo
moreno.andreo@evolu-s.it
In reply to: Deole, Pushkar (Pushkar) (#3)
Re: Fetching last n records from Posgresql

<html>
<head>
<meta content="text/html; charset=utf-8" http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<div class="moz-cite-prefix">Il 30/03/2016 11:19, Deole, Pushkar
(Pushkar) ha scritto:<br>
</div>
<blockquote
cite="mid:78A4D6BFBAA5BA49A5E94DA00A6A76E309864466@AZ-FFEXMB04.global.avaya.com"
type="cite">
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<meta name="Generator" content="Microsoft Word 14 (filtered
medium)">
<style><!--
/* Font Definitions */
@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}
@font-face
{font-family:Tahoma;
panose-1:2 11 6 4 3 5 4 4 2 4;}
@font-face
{font-family:Consolas;
panose-1:2 11 6 9 2 2 4 3 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0in;
margin-bottom:.0001pt;
font-size:12.0pt;
font-family:"Times New Roman","serif";
color:black;}
a:link, span.MsoHyperlink
{mso-style-priority:99;
color:blue;
text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
{mso-style-priority:99;
color:purple;
text-decoration:underline;}
pre
{mso-style-priority:99;
mso-style-link:"HTML Preformatted Char";
margin:0in;
margin-bottom:.0001pt;
font-size:10.0pt;
font-family:"Courier New";
color:black;}
span.HTMLPreformattedChar
{mso-style-name:"HTML Preformatted Char";
mso-style-priority:99;
mso-style-link:"HTML Preformatted";
font-family:"Consolas","serif";
color:black;}
span.EmailStyle19
{mso-style-type:personal-reply;
font-family:"Calibri","sans-serif";
color:#1F497D;}
.MsoChpDefault
{mso-style-type:export-only;
font-size:10.0pt;}
@page WordSection1
{size:8.5in 11.0in;
margin:1.0in 1.0in 1.0in 1.0in;}
div.WordSection1
{page:WordSection1;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]-->
<div class="WordSection1">
<p class="MsoNormal">select * from t order by record_date desc
limit 5;<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">this will return the recent 5 records..
what I want is the oldest 5 records (in last 30 days)</p>
</div>
</blockquote>
so remove "desc", in order to have ascending ordering, thus first 5
records are the five oldest:<br>
<br>
select * from t order by record_date limit 5<br>
<br>
Cheers<br>
Moreno.-<br>
<br>
<br>
</body>
</html>

In reply to: Moreno Andreo (#5)
Re: Fetching last n records from Posgresql

I am sorry I didn’t clarify my requirement properly.. I want the ‘n’ oldest records, however, they should sorted with the recent record first and I want this to happen in the query itself so I don’t have to care about sorting through the application..

From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Moreno Andreo
Sent: Wednesday, March 30, 2016 3:03 PM
To: pgsql-general@postgresql.org
Subject: Re: [GENERAL] Fetching last n records from Posgresql

Il 30/03/2016 11:19, Deole, Pushkar (Pushkar) ha scritto:
select * from t order by record_date desc limit 5;

this will return the recent 5 records.. what I want is the oldest 5 records (in last 30 days)
so remove "desc", in order to have ascending ordering, thus first 5 records are the five oldest:

select * from t order by record_date limit 5

Cheers
Moreno.-

#7Moreno Andreo
moreno.andreo@evolu-s.it
In reply to: Deole, Pushkar (Pushkar) (#6)
Re: Fetching last n records from Posgresql

<html>
<head>
<meta content="text/html; charset=utf-8" http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<div class="moz-cite-prefix">Il 30/03/2016 11:36, Deole, Pushkar
(Pushkar) ha scritto:<br>
</div>
<blockquote
cite="mid:78A4D6BFBAA5BA49A5E94DA00A6A76E309864486@AZ-FFEXMB04.global.avaya.com"
type="cite">
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<meta name="Generator" content="Microsoft Word 14 (filtered
medium)">
<style><!--
/* Font Definitions */
@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}
@font-face
{font-family:Tahoma;
panose-1:2 11 6 4 3 5 4 4 2 4;}
@font-face
{font-family:Consolas;
panose-1:2 11 6 9 2 2 4 3 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0in;
margin-bottom:.0001pt;
font-size:12.0pt;
font-family:"Times New Roman","serif";
color:black;}
a:link, span.MsoHyperlink
{mso-style-priority:99;
color:blue;
text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
{mso-style-priority:99;
color:purple;
text-decoration:underline;}
pre
{mso-style-priority:99;
mso-style-link:"HTML Preformatted Char";
margin:0in;
margin-bottom:.0001pt;
font-size:10.0pt;
font-family:"Courier New";
color:black;}
span.HTMLPreformattedChar
{mso-style-name:"HTML Preformatted Char";
mso-style-priority:99;
mso-style-link:"HTML Preformatted";
font-family:Consolas;
color:black;}
span.EmailStyle19
{mso-style-type:personal;
font-family:"Calibri","sans-serif";
color:#1F497D;}
span.EmailStyle20
{mso-style-type:personal-reply;
font-family:"Calibri","sans-serif";
color:#1F497D;}
.MsoChpDefault
{mso-style-type:export-only;
font-size:10.0pt;}
@page WordSection1
{size:8.5in 11.0in;
margin:1.0in 1.0in 1.0in 1.0in;}
div.WordSection1
{page:WordSection1;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]-->
<div class="WordSection1">
<p class="MsoNormal"><span
style="font-size:11.0pt;font-family:&quot;Calibri&quot;,&quot;sans-serif&quot;;color:#1F497D">I
am sorry I didn’t clarify my requirement properly.. I want
the ‘n’ oldest records, however, they should sorted with the
recent record first and I want this to happen in the query
itself so I don’t have to care about sorting through the
application..<o:p></o:p></span></p>
<p class="MsoNormal"><span
style="font-size:11.0pt;font-family:&quot;Calibri&quot;,&quot;sans-serif&quot;;color:#1F497D"><o:p> </o:p></span></p>
<div>
<div style="border:none;border-top:solid #B5C4DF
1.0pt;padding:3.0pt 0in 0in 0in">
<p class="MsoNormal"><b><span
style="font-size:10.0pt;font-family:&quot;Tahoma&quot;,&quot;sans-serif&quot;;color:windowtext">From:</span></b><span
style="font-size:10.0pt;font-family:&quot;Tahoma&quot;,&quot;sans-serif&quot;;color:windowtext">
<a class="moz-txt-link-abbreviated" href="mailto:pgsql-general-owner@postgresql.org">pgsql-general-owner@postgresql.org</a>
[<a class="moz-txt-link-freetext" href="mailto:pgsql-general-owner@postgresql.org">mailto:pgsql-general-owner@postgresql.org</a>]
<b>On Behalf Of </b>Moreno Andreo<br>
<b>Sent:</b> Wednesday, March 30, 2016 3:03 PM<br>
<b>To:</b> <a class="moz-txt-link-abbreviated" href="mailto:pgsql-general@postgresql.org">pgsql-general@postgresql.org</a><br>
<b>Subject:</b> Re: [GENERAL] Fetching last n records
from Posgresql<o:p></o:p></span></p>
</div>
</div>
<p class="MsoNormal"><o:p> </o:p></p>
<div>
<p class="MsoNormal">Il 30/03/2016 11:19, Deole, Pushkar
(Pushkar) ha scritto:<o:p></o:p></p>
</div>
<blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">
<p class="MsoNormal">select * from t order by record_date desc
limit 5;<o:p></o:p></p>
<p class="MsoNormal"> <o:p></o:p></p>
<p class="MsoNormal">this will return the recent 5 records..
what I want is the oldest 5 records (in last 30 days)<o:p></o:p></p>
</blockquote>
<p class="MsoNormal" style="margin-bottom:12.0pt">so remove
"desc", in order to have ascending ordering, thus first 5
records are the five oldest:<br>
<br>
select * from t order by record_date limit 5<br>
<br>
Cheers<br>
Moreno.-<br>
<br>
<o:p></o:p></p>
</div>
</blockquote>
select * from (select * from t where record_date &gt;=current_date()
- '30 days' order by record_date limit 5) order by record_date desc<br>
<br>
Not tested, but should work... <br>
<br>
Cheers<br>
Moreno.-<br>
</body>
</html>

#8Sándor Daku
daku.sandor@gmail.com
In reply to: Deole, Pushkar (Pushkar) (#6)
Re: Fetching last n records from Posgresql

On 30 March 2016 at 11:36, Deole, Pushkar (Pushkar) <pdeole@avaya.com>
wrote:

I am sorry I didn’t clarify my requirement properly.. I want the ‘n’
oldest records, however, they should sorted with the recent record first
and I want this to happen in the query itself so I don’t have to care about
sorting through the application..

*From:* pgsql-general-owner@postgresql.org [mailto:
pgsql-general-owner@postgresql.org] *On Behalf Of *Moreno Andreo
*Sent:* Wednesday, March 30, 2016 3:03 PM
*To:* pgsql-general@postgresql.org
*Subject:* Re: [GENERAL] Fetching last n records from Posgresql

Il 30/03/2016 11:19, Deole, Pushkar (Pushkar) ha scritto:

select * from t order by record_date desc limit 5;

this will return the recent 5 records.. what I want is the oldest 5
records (in last 30 days)

so remove "desc", in order to have ascending ordering, thus first 5
records are the five oldest:

select * from t order by record_date limit 5

Cheers
Moreno.-

Slight modification then...

select * from (select * from t where record_date>now()-'30 day'::interval
order by record_date limit 5) as t order by record_date desc;

Regards,
Sándor

Ezt az e-mailt egy Avast védelemmel rendelkező, vírusmentes számítógépről
küldték.
www.avast.com
<https://www.avast.com/sig-email?utm_medium=email&amp;utm_source=link&amp;utm_campaign=sig-email&amp;utm_content=webmail&gt;
<#DDB4FAA8-2DD7-40BB-A1B8-4E2AA1F9FDF2>

#9Francisco Olarte
folarte@peoplecall.com
In reply to: Deole, Pushkar (Pushkar) (#1)
Re: Fetching last n records from Posgresql

Hi Pushkar:

On Wed, Mar 30, 2016 at 8:40 AM, Deole, Pushkar (Pushkar)
<pdeole@avaya.com> wrote:

Does PostgreSQL support a query to fetch last ‘n’ records that match the
selection criteria. I am trying to fetch records from a table with start
date that falls in last 30 days, however, I want to fetch the oldest ‘n’
records and not the recent ones. I know there is a LIMIT clause which I can
use but it will fetch the first ‘n’ records.

I came across an approach which says that I can reverse the order and then
use LIMIT and then order the records back using timestamp as below, but
looking at the execution plan, it has to do a sort twice which may affect
the performance of query if ‘n’ is large number:

To get at the last N records you generally have to approaches, read all of them

Do you have indexes on the record date? Because in this case it seems
that could be solved by a reverse index scan, In my case with a
somehow big table:

$ \d carrier_cdrs_201603
Table "public.carrier_cdrs_201603"
Column | Type | Modifiers
---------+--------------------------+-----------
...
setup | timestamp with time zone |
...
Indexes:
"idx_carrier_cdrs_201603_setup" btree (setup)
...
$ explain select * from carrier_cdrs_201603 order by setup desc limit 1000;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.42..46.25 rows=1000 width=81)
-> Index Scan Backward using idx_carrier_cdrs_201603_setup on
carrier_cdrs_201603 (cost=0.42..33585.03 rows=732883 width=81)
(2 rows)

$ explain select * from ( select * from carrier_cdrs_201603 order by
setup desc limit 1000 ) last_1000 order by setup;
QUERY
PLAN
----------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=106.08..108.58 rows=1000 width=81)
Sort Key: carrier_cdrs_201603.setup
-> Limit (cost=0.42..46.25 rows=1000 width=81)
-> Index Scan Backward using idx_carrier_cdrs_201603_setup
on carrier_cdrs_201603 (cost=0.42..33585.03 rows=732883 width=81)
(4 rows)

$ explain with last_1000 as ( select * from carrier_cdrs_201603 order
by setup desc limit 1000 ) select * from last_1000 order by setup;
QUERY
PLAN
------------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=116.08..118.58 rows=1000 width=184)
Sort Key: last_1000.setup
CTE last_1000
-> Limit (cost=0.42..46.25 rows=1000 width=81)
-> Index Scan Backward using idx_carrier_cdrs_201603_setup
on carrier_cdrs_201603 (cost=0.42..33585.03 rows=732883 width=81)
-> CTE Scan on last_1000 (cost=0.00..20.00 rows=1000 width=184)
(6 rows)

The faster for me seems to be the subquery way, with timings and
usaing 10k records it says:

$ explain analyze select * from ( select * from carrier_cdrs_201603
order by setup desc limit 10000 ) last_10000 order by setup;

QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=1223.09..1248.09 rows=10000 width=81) (actual
time=29.646..35.780 rows=10000 loops=1)
Sort Key: carrier_cdrs_201603.setup
Sort Method: quicksort Memory: 1791kB
-> Limit (cost=0.42..458.70 rows=10000 width=81) (actual
time=0.015..20.707 rows=10000 loops=1)
-> Index Scan Backward using idx_carrier_cdrs_201603_setup
on carrier_cdrs_201603 (cost=0.42..33627.38 rows=733773 width=81)
(actual time=0.013..8.835 rows=10000 loops=1)
Total runtime: 41.913 ms
(6 rows)

And I fear its scanning and feeding into the sort, and accounting for
a part of the scan time in the sort phase as just the inner query
gives:

$ explain analyze select * from carrier_cdrs_201603 order by setup
desc limit 10000 ;

QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.42..458.70 rows=10000 width=81) (actual
time=0.015..20.938 rows=10000 loops=1)
-> Index Scan Backward using idx_carrier_cdrs_201603_setup on
carrier_cdrs_201603 (cost=0.42..33627.38 rows=733773 width=81)
(actual time=0.013..8.803 rows=10000 loops=1)
Total runtime: 27.020 ms
(3 rows)

So, 14 ms to sort 10k records seems like a reasonable price to pay.

As you see, only one sort, in whichever order I do it, and postgres
sorts really fast. This is very difficult to avoid. A smarter
optimizer could turn the sort into a reverse, but it seems difficult.
Or you could try to use a cursor, goto to the last record, and then
skip N backwards and go fro there, but IMHO it's not worth the
complexity, and, at least in my case, it is slower for 1000 records,
but YMMV

$ begin;
BEGIN
Time: 61.229 ms
$ declare last_1000 scroll cursor for select * from
carrier_cdrs_201603 order by setup;
DECLARE CURSOR
Time: 61.025 ms
$ move last in last_1000;
MOVE 1
Time: 282.142 ms
$ move backward 1000 in last_1000;
MOVE 1000
Time: 61.969 ms
$ fetch all from last_1000;
Time: 248.071 ms
$ close last_1000;
CLOSE CURSOR
Time: 60.922 ms
$ commit;
COMMIT
Time: 60.814 ms

Note how once you account for my 60ms RTT It's taking 220 ms to go to
the end, and 188 to fetch the result, while:

cdrs=# select * from ( select * from carrier_cdrs_201603 order by
setup desc limit 1000 ) last_1000 order by setup;
Time: 248.566 ms

I can do the select in just 188 too. ( This are just 1000 records, but
without explain analyze a nice chunk of the time is spent sending them
over my 60 ms RTT connection ).

Anyway, try things, measure, post results so we know what happens.

Francisco Olarte.

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general