Is data import from Oracle table to Postgres table in Ora2pg consecutive or random?
Hello friends,
I want to migrate a large Oracle table to PostgreSQL, which is
approximately 200GB in size and includes a LOB segment. This table has a
primary key column named "ID," which increments by one unit (similar to a
sequence). During the migration of this large table, sometimes the process
terminates prematurely (due to network issues, etc.). As a result, I have
to restart the import process either from the beginning or by using a WHERE
clause in the ora2pg configuration file to import the remaining data from
where it was previously terminated.
However, I've noticed that the import process in ora2pg doesn't seem to
follow a consecutive order, starting from a small ID number and progressing
to a larger ID number. Instead, it appears to import data randomly from
various rows of the table. This makes it impossible to continue importing
data from the point where it was terminated using a WHERE clause in ora2pg.
My questions are:
- Is it correct that ora2pg imports data from the Oracle table to the
PostgreSQL table randomly and not sequentially from a smaller ID to a
larger ID?
- Or can we add additional parameters to the ora2pg.conf file to control
this process and ensure that the data is imported sequentially, following
the primary key from smallest to largest?
Thank you,
Hans
Show quoted text
Hello friends,
I want to migrate a large Oracle table to PostgreSQL, which is
approximately 200GB in size and includes a LOB segment. This table has a
primary key column named "ID," which increments by one unit (similar to a
sequence). During the migration of this large table, sometimes the process
terminates prematurely (due to network issues, etc.). As a result, I have
to restart the import process either from the beginning or by using a WHERE
clause in the ora2pg configuration file to import the remaining data from
where it was previously terminated.However, I've noticed that the import process in ora2pg doesn't seem to
follow a consecutive order, starting from a small ID number and progressing
to a larger ID number. Instead, it appears to import data randomly from
various rows of the table. This makes it impossible to continue importing
data from the point where it was terminated using a WHERE clause in ora2pg.My questions are:
- Is it correct that ora2pg imports data from the Oracle table to the
PostgreSQL table randomly and not sequentially from a smaller ID to a
larger ID?
- Or can we add additional parameters to the ora2pg.conf file to control
this process and ensure that the data is imported sequentially, following
the primary key from smallest to largest?
Thank you,
Hans
On 2023-09-08 17:19:01 +0700, duc hiep ha wrote:
I want to migrate a large Oracle table to PostgreSQL, which is approximately
200GB in size and includes a LOB segment. This table has a primary key column
named "ID," which increments by one unit (similar to a sequence). During the
migration of this large table, sometimes the process terminates prematurely
(due to network issues, etc.). As a result, I have to restart the import
process either from the beginning or by using a WHERE clause in the ora2pg
configuration file to import the remaining data from where it was previously
terminated.However, I've noticed that the import process in ora2pg doesn't seem to follow
a consecutive order, starting from a small ID number and progressing to a
larger ID number. Instead, it appears to import data randomly from various rows
of the table. This makes it impossible to continue importing data from the
point where it was terminated using a WHERE clause in ora2pg.My questions are:
- Is it correct that ora2pg imports data from the Oracle table to the
PostgreSQL table randomly and not sequentially from a smaller ID to a larger
ID?
Oracle doesn't in general order tables by primary key (it does offer
"index organized tables", but the default is "heap organized"), so a
simple "select * from tablename" will return rows in semi-random order.
I think the rowid is in ascending order (but I can't test that at the
moment) so you may be able to use the rowid in your where clause.
- Or can we add additional parameters to the ora2pg.conf file to control this
process and ensure that the data is imported sequentially, following the
primary key from smallest to largest?
I can't answer that.
hp
--
_ | Peter J. Holzer | Story must make more sense than reality.
|_|_) | |
| | | hjp@hjp.at | -- Charles Stross, "Creative writing
__/ | http://www.hjp.at/ | challenge!"
El día martes, septiembre 12, 2023 a las 02:19:19 +0200, Peter J. Holzer escribió:
- Or can we add additional parameters to the ora2pg.conf file to control this
process and ensure that the data is imported sequentially, following the
primary key from smallest to largest?
AFAIK, a simple SELECT in PostgreSQL without an ORDER BY will return the rows
in random order.
matthias
--
Matthias Apitz, ✉ guru@unixarea.de, http://www.unixarea.de/ +49-176-38902045
Public GnuPG key: http://www.unixarea.de/key.pub
On 12 Sep 2023, at 14:26, Matthias Apitz <guru@unixarea.de> wrote:
El día martes, septiembre 12, 2023 a las 02:19:19 +0200, Peter J. Holzer escribió:
- Or can we add additional parameters to the ora2pg.conf file to control this
process and ensure that the data is imported sequentially, following the
primary key from smallest to largest?AFAIK, a simple SELECT in PostgreSQL without an ORDER BY will return the rows
in random order.
It will return the rows in some order, without guarantees about randomness or
asc/desc ordering. Any query which relies on ordering, whichever is wanted,
should include an ORDER BY clause.
--
Daniel Gustafsson
I have not tried this in a while but I think a SELECT with a "hint" will return rows in the order of the index in the hint. This does NOT work for distributed queries.
Show quoted text
On 09/12/2023 9:10 AM EDT Daniel Gustafsson <daniel@yesql.se> wrote:
On 12 Sep 2023, at 14:26, Matthias Apitz <guru@unixarea.de> wrote:
El día martes, septiembre 12, 2023 a las 02:19:19 +0200, Peter J. Holzer escribió:
- Or can we add additional parameters to the ora2pg.conf file to control this
process and ensure that the data is imported sequentially, following the
primary key from smallest to largest?AFAIK, a simple SELECT in PostgreSQL without an ORDER BY will return the rows
in random order.It will return the rows in some order, without guarantees about randomness or
asc/desc ordering. Any query which relies on ordering, whichever is wanted,
should include an ORDER BY clause.--
Daniel Gustafsson