Thread: Is data import from Oracle table to Postgres table in Ora2pg consecutive or random?

Hello friends,

I want to migrate a large Oracle table to PostgreSQL, which is approximately 200GB in size and includes a LOB segment. This table has a primary key column named "ID," which increments by one unit (similar to a sequence). During the migration of this large table, sometimes the process terminates prematurely (due to network issues, etc.). As a result, I have to restart the import process either from the beginning or by using a WHERE clause in the ora2pg configuration file to import the remaining data from where it was previously terminated.

However, I've noticed that the import process in ora2pg doesn't seem to follow a consecutive order, starting from a small ID number and progressing to a larger ID number. Instead, it appears to import data randomly from various rows of the table. This makes it impossible to continue importing data from the point where it was terminated using a WHERE clause in ora2pg.

My questions are:

- Is it correct that ora2pg imports data from the Oracle table to the PostgreSQL table randomly and not sequentially from a smaller ID to a larger ID?
- Or can we add additional parameters to the ora2pg.conf file to control this process and ensure that the data is imported sequentially, following the primary key from smallest to largest?
Thank you,
Hans

Hello friends,

I want to migrate a large Oracle table to PostgreSQL, which is approximately 200GB in size and includes a LOB segment. This table has a primary key column named "ID," which increments by one unit (similar to a sequence). During the migration of this large table, sometimes the process terminates prematurely (due to network issues, etc.). As a result, I have to restart the import process either from the beginning or by using a WHERE clause in the ora2pg configuration file to import the remaining data from where it was previously terminated.

However, I've noticed that the import process in ora2pg doesn't seem to follow a consecutive order, starting from a small ID number and progressing to a larger ID number. Instead, it appears to import data randomly from various rows of the table. This makes it impossible to continue importing data from the point where it was terminated using a WHERE clause in ora2pg.

My questions are:

- Is it correct that ora2pg imports data from the Oracle table to the PostgreSQL table randomly and not sequentially from a smaller ID to a larger ID?
- Or can we add additional parameters to the ora2pg.conf file to control this process and ensure that the data is imported sequentially, following the primary key from smallest to largest?
Thank you,
Hans

Re: Is data import from Oracle table to Postgres table in Ora2pg consecutive or random?

From
"Peter J. Holzer"
Date:
On 2023-09-08 17:19:01 +0700, duc hiep ha wrote:
> I want to migrate a large Oracle table to PostgreSQL, which is approximately
> 200GB in size and includes a LOB segment. This table has a primary key column
> named "ID," which increments by one unit (similar to a sequence). During the
> migration of this large table, sometimes the process terminates prematurely
> (due to network issues, etc.). As a result, I have to restart the import
> process either from the beginning or by using a WHERE clause in the ora2pg
> configuration file to import the remaining data from where it was previously
> terminated.
>
> However, I've noticed that the import process in ora2pg doesn't seem to follow
> a consecutive order, starting from a small ID number and progressing to a
> larger ID number. Instead, it appears to import data randomly from various rows
> of the table. This makes it impossible to continue importing data from the
> point where it was terminated using a WHERE clause in ora2pg.
>
> My questions are:
>
> - Is it correct that ora2pg imports data from the Oracle table to the
> PostgreSQL table randomly and not sequentially from a smaller ID to a larger
> ID?

Oracle doesn't in general order tables by primary key (it does offer
"index organized tables", but the default is "heap organized"), so a
simple "select * from tablename" will return rows in semi-random order.

I think the rowid is in ascending order (but I can't test that at the
moment) so you may be able to use the rowid in your where clause.

> - Or can we add additional parameters to the ora2pg.conf file to control this
> process and ensure that the data is imported sequentially, following the
> primary key from smallest to largest?

I can't answer that.

        hp

--
   _  | Peter J. Holzer    | Story must make more sense than reality.
|_|_) |                    |
| |   | hjp@hjp.at         |    -- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |       challenge!"

Attachment
El día martes, septiembre 12, 2023 a las 02:19:19 +0200, Peter J. Holzer escribió:

> > - Or can we add additional parameters to the ora2pg.conf file to control this
> > process and ensure that the data is imported sequentially, following the
> > primary key from smallest to largest?

AFAIK, a simple SELECT in PostgreSQL without an ORDER BY will return the rows
in random order.

    matthias

-- 
Matthias Apitz, ✉ guru@unixarea.de, http://www.unixarea.de/ +49-176-38902045
Public GnuPG key: http://www.unixarea.de/key.pub



Re: Is data import from Oracle table to Postgres table in Ora2pg consecutive or random?

From
Daniel Gustafsson
Date:
> On 12 Sep 2023, at 14:26, Matthias Apitz <guru@unixarea.de> wrote:
>
> El día martes, septiembre 12, 2023 a las 02:19:19 +0200, Peter J. Holzer escribió:
>
>>> - Or can we add additional parameters to the ora2pg.conf file to control this
>>> process and ensure that the data is imported sequentially, following the
>>> primary key from smallest to largest?
>
> AFAIK, a simple SELECT in PostgreSQL without an ORDER BY will return the rows
> in random order.

It will return the rows in some order, without guarantees about randomness or
asc/desc ordering.  Any query which relies on ordering, whichever is wanted,
should include an ORDER BY clause.

--
Daniel Gustafsson




I have not tried this in a while but I think a SELECT with a "hint" will return rows in the order of the index in the
hint. This does NOT work for distributed queries. 


> On 09/12/2023 9:10 AM EDT Daniel Gustafsson <daniel@yesql.se> wrote:
>
>
> > On 12 Sep 2023, at 14:26, Matthias Apitz <guru@unixarea.de> wrote:
> >
> > El día martes, septiembre 12, 2023 a las 02:19:19 +0200, Peter J. Holzer escribió:
> >
> >>> - Or can we add additional parameters to the ora2pg.conf file to control this
> >>> process and ensure that the data is imported sequentially, following the
> >>> primary key from smallest to largest?
> >
> > AFAIK, a simple SELECT in PostgreSQL without an ORDER BY will return the rows
> > in random order.
>
> It will return the rows in some order, without guarantees about randomness or
> asc/desc ordering.  Any query which relies on ordering, whichever is wanted,
> should include an ORDER BY clause.
>
> --
> Daniel Gustafsson