Lessons From OOW09 #3 – What’s New with Streams?

The big news with Streams is the GoldenGate acquisition. Oracle acquired GoldenGate, a smaller company with a solution very similar to what Oracle Streams does. During OpenWorld, Oracle announced that GoldenGate is the new strategic direction and that future development will focus on integrating Streams with GoldenGate. Oracle will still support Streams, fix bugs, etc. – but don’t expect new features other than the GoldenGate integration.

I missed the session where GoldenGate explained what it’s all about, but I’m planning to invite a GoldenGate representative to give us a presentation and explain exactly what they do.

In the meantime, some interesting new stuff in the old Streams:

  1. One-to-many replication should be faster and more reliable in 11gR2.
  2. New statement DML handlers – allow manipulating LCRs using SQL statements instead of the PL/SQL code used in the past. According to Oracle, this should be about four times faster. One common use case of DML handlers that can now be implemented with statement handlers is converting “delete” statements into an update that marks a row as “deleted” by modifying a varchar column.
  3. Keep Columns – a new rule-based transformation that lets you specify which columns should be preserved in an LCR. In the past you could specify columns to drop, but not which columns to keep.
  4. Built-in script for recording changes in a log table. This is an incredibly common use case for Streams, and Oracle now has a single command that automatically sets up the necessary statement handlers and keep-columns transformations. Just call DBMS_STREAMS_ADM.MAINTAIN_CHANGE_TABLE.
  5. XStream – Oracle says it’s a new API that allows external programs to insert changes into, and pull changes from, Streams. It’s cool, but I’m not convinced it’s new, since I’ve heard about a similar feature in previous versions under a different name.
  6. Streams performance advisor – that’s an 11gR1 feature, but I didn’t know about it. It’s a PL/SQL package and a bunch of views that can be used to report on Streams performance. It should also be able to detect your topology automatically. You use it by running DBMS_STREAMS_ADVISOR_ADM.ANALYZE_CURRENT_PERFORMANCE and checking a bunch of views.
  7. If I followed correctly, the new advisor package uses the UTL_SPADV package, which you can also use for performance analysis or monitoring.
  8. DBMS_COMPARISON – a package that can be used to compare tables between two servers (one should be 11g and the other 10g or 11g). Can be useful when troubleshooting Streams.
  9. Streams cross-DB tracing – lets you track a message when troubleshooting an apply that doesn’t work properly. You enable message tracking in a session (SET MESSAGE TRACKING) and then monitor the views to see what happens to that message.
  10. 11g has greatly improved, more detailed apply error messages. This is probably my favorite new feature 🙂 Most of the time I no longer need to print the LCR to debug an issue.
  11. Not sure if this is 11gR1 or R2 – but apparently propagation is no longer managed by “jobs”; it has moved to the new scheduler, making it much more manageable.
  12. A bunch of nice improvements overall. I’m looking forward to seeing what GoldenGate does and why it is so much better.
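To make item 8 a bit more concrete, here is a minimal sketch of DBMS_COMPARISON usage. The comparison name, schema, table, and DB link below are all invented for the example:

```sql
-- Define a comparison between local HR.EMPLOYEES and the same table
-- on the remote database reached via the REMOTE_DB link:
BEGIN
  DBMS_COMPARISON.CREATE_COMPARISON(
    comparison_name => 'cmp_employees',
    schema_name     => 'HR',
    object_name     => 'EMPLOYEES',
    dblink_name     => 'REMOTE_DB');
END;
/

-- Run the comparison and report whether the two copies are in sync:
DECLARE
  consistent BOOLEAN;
  scan_info  DBMS_COMPARISON.COMPARISON_TYPE;
BEGIN
  consistent := DBMS_COMPARISON.COMPARE(
    comparison_name => 'cmp_employees',
    scan_info       => scan_info,
    perform_row_dif => TRUE);  -- record per-row differences, not just yes/no
  IF consistent THEN
    DBMS_OUTPUT.PUT_LINE('Tables are in sync');
  ELSE
    DBMS_OUTPUT.PUT_LINE('Differences found, scan id: ' || scan_info.scan_id);
  END IF;
END;
/
```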


Streams on RAC

We’ve had a RAC system as a Streams source for 18 months now. But just today I configured a RAC as a Streams target.

It was somewhat of an anticlimax since there is absolutely nothing interesting to do.

We do downstream capture, so I had to place the archive logs on a shared drive. Every place where I used the SID before, I now used the DB name.

Capture and apply processes both started on the same node. When I stopped that node, I saw the other node modify a service called ‘SYS${streams user}.{streams queue name}.{target db name}’, changing it to run on the remaining node.

Then the capture and apply processes started on the remaining node and everything continued as usual.

As I said, no big deal. I just wanted to let everyone know that it is no big deal.

When Archive Logs Are Not Copied To Standby

I’ve debugged this same issue at least once a month since January 2008, and I always debug it from scratch, because I’m a DBA with a bad memory and I never remember how I solved it the last 18 times.

So, if you added a second (or third) archive log destination and then you see errors in the alert log that look like:

Check that the primary and standby are using a password file
and remote_login_passwordfile is set to SHARED or EXCLUSIVE,
and that the SYS password is same in the password files.
returning error ORA-16191


Then check:

  1. That you are using password files
  2. That the name of the password file is $ORACLE_HOME/dbs/orapw${ORACLE_SID}
  3. That the password of sys is the same on all the servers involved in the archive copy
  4. If one of the databases is 11g, make sure you set sec_case_sensitive_logon to false
  5. Also check that you created the password file with ignore_case=y
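Here is the checklist as commands, so I never have to reconstruct it again. The SYS password is a placeholder; run the orapwd line from the OS shell on each server involved:

```sql
-- From the OS shell on EACH server (note ignore_case=y, 11g only):
--   orapwd file=$ORACLE_HOME/dbs/orapw$ORACLE_SID password=<same_sys_password> ignore_case=y

-- In the database, verify the password file mode is SHARED or EXCLUSIVE:
SHOW PARAMETER remote_login_passwordfile

-- On 11g, disable case-sensitive logons:
ALTER SYSTEM SET sec_case_sensitive_logon = FALSE;
```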

I think this covers my usual debug process. It normally takes me an hour to remember that 11g has this thing with case-sensitive passwords, and what the parameter is called, and that there is another thingy to do with the password files… Hopefully, not anymore.

A Year with Streams

Whenever Tom Kyte is asked about replication, he has a default answer: Don’t.

Well, last month marked the first anniversary of our use of Streams-based replication to create a reporting DB.

In many ways, I’m very proud of this solution. Our reporting DB allows our software-as-a-service to offer very flexible integrations with all kinds of reporting systems without having us worry about impact on our production environment. Our senior DBA designed a terrific architecture that allows for relatively pain-free maintenance on the system. We have several customers using this solution and it seems to be making some profit.

For the one-year anniversary, we celebrated by announcing a formal SLA for the service. SLA discussions are always interesting. We are looking at three important components:

  1. Setup times – How long after the customer requests the service do we expect to deliver it?
  2. Availability – How many hours a month do we expect the service to be down?
  3. Time to replication – A change was made on production; how soon can we show it on the reporting DB?

Setup times are our most volatile component – we range from 15 minutes when everything goes well to around two months when we run into an Oracle bug, Oracle offers no workaround, and patch after patch fails to solve the problem.

With that kind of variance, offering any kind of SLA seems a bit futile. Next week I’m at HotSos; maybe Robyn Sands has the solution. BTW, I find it incredible that until last month I’d never heard of Robyn Sands, and now not only do I hear about her all over the place, but the more I hear about her, the more I like her. A lady who’s an Oracle pro, knows her stats, and wants to stop root-cause madness. I can’t wait to meet her.

Back to the point: we don’t see how we can offer an SLA on setup time, which is bad. What about availability after the service is up? We are doing much better here, in the sense that downtimes are predictable. The problem is that we have many of them. We have a mandatory two hours of downtime a month due to an Oracle bug that requires us to rebuild the environment from scratch every month. We also have additional monthly downtime for applying the various patches we need to solve other bugs.

In short, Streams is still buggy. So even if you design your replication to be very safe, put primary keys on all your tables, and get everything else right, you will still have trouble maintaining high availability and predictable setup times. This for a product that has been out since 9i.

What about replication times? Here I feel like the guy who jumped out the 50th-floor window. Somewhere around the 30th floor, someone looks out his window and asks him, “Hey! How are you doing?” and he replies, “So far, so good!” So far, we’ve been able to resolve all the performance problems we had with replication (#1 tip – don’t use “like” in replication rules).

With all the troubles we are having with Streams, would I use this solution again? I think that given all the constraints we had to work with, replicating data to a reporting DB was the best solution.

I would have preferred a different solution. I would have liked our product to have a strong, consistent web services API that would allow integration with external systems without any need for direct access to the database. Unfortunately, the web services API arrived late, and we have to support versions that don’t have it yet. Also, integrating with a web services API seems to require more work than database-level integration, and the business was very eager to offer an easier solution.

Another solution that might have been preferable (although I’m less sure) is a reporting utility where we could load customized queries for our customers, run them as a low-priority process on production, and send the customers the results. The problem with this solution is that it may require more work than Streams (keeping an eye on all that custom code running on production), and guaranteeing response times for the reports would have been an issue. With the reporting database we somehow managed not to give any guarantee regarding the performance of the reports.

To sum things up, when streams is good it is very very good, and when it is bad, it is still somewhat better than the alternatives. At least it gives me lots of blog material.

Streams without Primary Key

We all know that every table should have a primary key. It is one of those database-architecture-101 basics. You need it for data integrity and you need it for cross-references between tables. DBAs and developers argue about many things, but I’ve yet to hear anyone argue that a primary key is not needed.

Which is why I was very surprised to discover a schema for an application that contained 200 tables without primary keys. Apparently the application was not very picky and had no problem with duplicate values in these tables. The tables were tiny, so there were no performance issues. I decided to leave the issue alone, and all was well.

Until we decided to use Oracle Streams to replicate the schema to another DB.

Oracle Documentation says:

“If possible, each table for which changes are applied by an apply process should have a primary key. When a primary key is not possible, Oracle recommends that each table have a set of columns that can be used as a unique identifier for each row of the table. If the tables that you plan to use in your Oracle Streams environment do not have a primary key or a set of unique columns, then consider altering these tables accordingly.”

And then it says:

“In the absence of substitute key columns, primary key constraints, and unique key constraints, an apply process uses all of the columns in the table as the key columns, excluding LOB, LONG, and LONG RAW columns. In this case, you must create an unconditional supplemental log group containing these columns at the source database. Using substitute key columns is preferable when there is no primary key constraint for a table because fewer columns are needed in the row LCR.”

I think the documentation doesn’t make a clear enough statement here. Let me restate what they are trying to say:

  1. You cannot use Oracle Streams to replicate a table which has duplicate rows. It will not work.
  2. If you don’t have unique constraint on your table in some form, you will have duplicate rows.

The first sentence is important. Streams replication will not work with duplicates. There is no way around it. Here’s why:

Remember that with Streams replication you are taking DML transactions that run on one schema (I’ll ignore DDL for now, as it is irrelevant to this topic) and attempting to run them on another schema. This is done by mining the redo log and converting DML statements into LCRs. DML LCRs are also called row LCRs, and for good reason – each DML statement is converted into one or more LCRs, each changing at most one row.

How does the apply process know which row? Remember that you cannot use the ROWID, because the change will be applied in a different schema.

If each row is uniquely identified by a set of columns, and if you configured Streams correctly by enabling supplemental logging of these columns, then the information in the supplemental log will be placed in the LCR and used to uniquely identify the row you are trying to modify.
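For example, if rows in a key-less table happen to be unique by a pair of columns, you can tell Streams so explicitly. The table and column names below are hypothetical:

```sql
-- On the source: log the identifying columns unconditionally,
-- so they always appear in the LCR:
ALTER TABLE app.orders_nopk
  ADD SUPPLEMENTAL LOG GROUP log_group_orders (customer_id, order_date) ALWAYS;

-- On the apply side: declare the same columns as the substitute key:
BEGIN
  DBMS_APPLY_ADM.SET_KEY_COLUMNS(
    object_name => 'app.orders_nopk',
    column_list => 'customer_id,order_date');
END;
/
```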

But if the information in the DML statement and in the supplemental log is not enough to uniquely identify the row – perhaps because the row is not unique at all, because your application allows duplicates – then apply will fail, and in the apply error queue you will see ORA-01422: exact fetch returns more than requested number of rows. Apply used all the information it had, expecting it to define a single row, but due to the duplicates it found more than one. It cannot apply a row LCR to more than one row, so it fails.

Oracle does offer a workaround of sorts – apply has a parameter called allow_duplicate_rows. Setting it to true makes the apply process update or delete just one of the duplicates it found. Just one of them. The original DML updated or deleted all of them.

Let’s reiterate – if you have duplicate rows in your original table, the Streams replica of the table will have these duplicates as well.

If you try to update or delete these rows in the original table, you will update or delete all of the identical rows. In the replica, however, either apply will fail and you won’t delete any row, or it will succeed and delete only one of the rows, depending on the allow_duplicate_rows parameter. In either case, your replicated table is no longer identical to the original, because Oracle Streams replication does not work with duplicate data.
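For the record, the workaround is set like any other apply parameter. The apply name below is made up:

```sql
-- Make apply change just one of the duplicate rows instead of erroring
-- with ORA-01422 (the replica still drifts from the source!):
BEGIN
  DBMS_APPLY_ADM.SET_PARAMETER(
    apply_name => 'my_apply',
    parameter  => 'allow_duplicate_rows',
    value      => 'Y');
END;
/
```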

Troubleshooting Streams @ Openworld Unconference

Thanks to everyone who attended the session – you were an attentive, intelligent, and supportive audience; I couldn’t have hoped for better. I was especially stoked to see Lewis Cunningham in the audience, since he is an expert on the topic. Thank you, Lewis, for the encouraging nods throughout the session 🙂
I definitely got an appetite for speaking a bit more, and I’m now furiously scribbling and sending abstracts. I hope to see you at my future presentations.

I promised to upload my material, so here we go:


  • Powerpoint (including the component diagram)
  • Script for creating the replication environment (intentionally buggy!) and also a (working) script for removing the replication. This creates replication from the HR schema to a new schema called MYHR, but only replicates one table.
  • Script with troubleshooting queries used in the session.

Streams Replication At Oracle Openworld Unconference

I’m planning to give a session about Oracle Streams at the Openworld Unconference. Thursday at 1pm.

I’ll give a short “intro to streams” and then I’ll demonstrate basic troubleshooting techniques. I have streams configured on my laptop, and I’m planning to break it repeatedly for your amusement 🙂

Please drop by. The Oracle Unconference looks very empty this year, and I’m worried about an empty room even more than I’m worried about speaking in front of people. I’ll be very happy to answer lots of audience questions, even if they’re not directly related to Streams.

Streams and Grants

I haven’t blogged about Streams in a while. Since the last time I wrote about Streams, our configuration has changed a lot, but I’ll write about that another time. It’s more fun to write about war stories than about nice clean architectures with ultra-easy setup scripts.

Yesterday another happy customer went live on our reporting database. After the ultra-easy setup, everything seemed to work fine. But today between 11am and 3pm, we accumulated over 20 errors in the dba_apply_error table. All of them were “ORA-00942: table or view does not exist”. All of them for “create view” and “create table” DDL.

Important streams fact of the day: Schema level replication does not replicate object grants given to that schema or on that schema. Even if you replicate DDL.

To resolve the issue we had to give the grants manually on the reporting database, and then use dbms_apply_adm.execute_all_errors to rerun all the DDLs that failed. It was very neat to see that once the grants were given, execute_all_errors worked perfectly and recreated all the views we missed.
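The fix looked roughly like this; the object and user names are invented for the example:

```sql
-- On the reporting database: grant the privileges the failing DDL needs...
GRANT SELECT ON app_owner.base_table TO report_user;

-- ...then re-execute every transaction sitting in the error queue:
BEGIN
  DBMS_APPLY_ADM.EXECUTE_ALL_ERRORS;
END;
/

-- Confirm the error queue is empty:
SELECT COUNT(*) FROM dba_apply_error;
```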

If we anticipated more privilege changes on the replicated schemas, we would have had to find a permanent solution for their replication – perhaps adding a global DDL rule with a filter to replicate grants only for these specific schemas. Luckily for me, privilege changes are not expected in the future.

Streams – Part II

Two weeks after the last Streams post, I need to give some updates and corrections.

1) I’ve written that:

“And you should also check which archived logs can be deleted:”

Actually, that is not a good way to check which archive logs can be deleted. According to Metalink note 373066.1, RMAN uses dba_capture.required_checkpoint_scn to determine which logs can be purged, so I suggest using it when doing managed backups as well:
case when (r.next_scn > c.required_checkpoint_scn) then 'NO' else 'YES' end purgable
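In context, the check might look like the query below. I’m assuming the v$archived_log column is next_change# (aliased to next_scn in the original snippet):

```sql
-- A log is purgeable once every SCN it covers is below the capture
-- process's required checkpoint SCN:
SELECT r.name,
       CASE WHEN r.next_change# > c.required_checkpoint_scn
            THEN 'NO' ELSE 'YES' END purgable
FROM   v$archived_log r, dba_capture c
ORDER  BY r.sequence#;
```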

2) I made some severe mistakes when trying to extend the Streams implementation by replicating a few more tables. I decided to start from scratch by running dbms_streams_adm.remove_streams_configuration.

This does not work. The solution was found in Morgan’s Library:
-- the following provided by Michael M. Brennan from RSA Security who
-- reports that REMOVE_STREAMS_CONFIGURATION does not work unless the
-- propagation has been dropped first.
SELECT propagation_name FROM dba_propagation;
exec dbms_propagation_adm.drop_propagation('PROPAGATION$_143');
exec dbms_streams_adm.remove_streams_configuration;

3) Unfortunately, when I tried to start a new capture process, I found myself with the following unexpected problem:
SQL> select required_checkpoint_scn from dba_capture;

REQUIRED_CHECKPOINT_SCN
-----------------------
                      0

Not good. This means that my capture process is looking for the archive logs for SCN 0. I have no control over required_checkpoint_scn and have no clue how this happened. Oracle Support is still researching it.

But it does mean that I can refer you to two Metalink articles that can help with debugging (or at least with opening SRs on Streams-related issues):

Note 313279.1 “Capture Process Looping In Initializing/Dictionary Initialization State” shows how to start a trace against the capture process and how to interpret the result trace file.
Note 273674.1 “Streams Configuration Report and Health Check Script” contains cute scripts that summarize all the Streams configuration into a nice HTML page. I didn’t find the script very useful for my debugging, but it is absolutely mandatory when opening a Streams-related SR. They will not talk to you before you send them the result of the script, and they’ll ask you to run it again whenever they run out of ideas on what to say next.

Oracle Streams Replication Example

I promised to show off our Streams implementation, because I found so few examples out there. Keep in mind that your Streams implementation will probably look nothing like mine, and even if it does, you can get the same results in very different ways. This example is not an excuse to avoid reading the docs.

Our goal: Set up a reporting server, with replication of schemas arriving from many source DBs. We also want to filter out and not replicate some of the less critical tables.

Decisions, decisions: When implementing streams there are several things you want to ask yourself: Do you want bi-directional replications? Do you want to capture the changes on the source DB, destination DB or a third server? Do you do the instantiation (first copy of data) with DataPump or RMan?

We went with one-directional replication, capturing will be done on the destination server, and we’ll use DataPump to initially move the schemas to the destination.

Create users: You want to create a user on both the source and destination servers that will own all the Streams-related objects. Also, you want to give it its own tablespace (or at least keep its objects away from the SYSTEM tablespace), because in case of errors lots of data can accumulate in Streams queues. The docs recommend making the tablespace autoextend, but I’m very much against that. After creating the strmadmin user, the rest of my instructions should be run as this user.

CREATE TABLESPACE streams_tbs DATAFILE '/data/oradata/DB05/streams_tbs.dbf' SIZE 25M;

CREATE USER strmadmin
  IDENTIFIED BY strmadminpw
  DEFAULT TABLESPACE streams_tbs
  QUOTA UNLIMITED ON streams_tbs;

GRANT DBA TO strmadmin;

BEGIN
  DBMS_STREAMS_AUTH.GRANT_ADMIN_PRIVILEGE(
    grantee          => 'strmadmin',
    grant_privileges => true);
END;
/

DBLinks: You will need a DBLink from source to destination DB. For convenience, I created a DBLink from destination to source which I’ll use during the configuration of Streams and will drop afterwards. Of course, the servers should be able to tnsping each other before this will work.

--on source
CONNECT strmadmin/strmadminpw;
select count(*) from dba_tables@REP01;

-- on target
CONNECT strmadmin/strmadminpw;
select count(*) from dba_tables@DB05;
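Creating the links themselves is plain CREATE DATABASE LINK; the aliases REP01 and DB05 must resolve via tnsnames on each side:

```sql
-- On the source (DB05), pointing at the destination:
CREATE DATABASE LINK REP01
  CONNECT TO strmadmin IDENTIFIED BY strmadminpw
  USING 'REP01';

-- On the destination (REP01), the temporary link back to the source:
CREATE DATABASE LINK DB05
  CONNECT TO strmadmin IDENTIFIED BY strmadminpw
  USING 'DB05';
```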

Directory Objects: I decided to use DataPump for instantiation and to use DBMS_STREAMS_ADM to create a configuration script. This means I needed two directory objects on the destination (for the script and DataPump) and one on the source (just DataPump). I used existing directories, so I had no need to create them, but in general you create a directory like so:

create directory EXP_DP_VOL as '/exp_dp';

Copying archived redo logs: You need to get the archived redo logs from the source server to the server doing the capturing (the destination server in my case, but it can be a 3rd server). There are two decisions to make here: do you use ARCH or LGWR to write the log files to the remote location? And will the new archiving destination be MANDATORY or OPTIONAL (i.e. does the write have to succeed before the corresponding redo log at the source can be overwritten)?
Also, when specifying the destination directory, you want it to be separate from the directory where REP01’s own redo logs are archived, and make extra sure that the directory really exists as you wrote it.
I specified a non-existent directory; this resulted in errors in the alert log about failure to archive (scary!), and after a few retries the archiver stopped attempting to write to the new destination. So, after creating the directory I had to reset the log_archive_dest_2 parameter (i.e. set it to '' and then change it back again).

On the source server, we run:
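The command itself didn’t survive in this post, but it was something along these lines – service and directory names are placeholders from my environment, and NOREGISTER plus TEMPLATE are what a downstream-capture destination typically needs:

```sql
-- Ship archived logs to the downstream database via the REP01 service.
-- ARCH and OPTIONAL were our choices from the decisions above:
ALTER SYSTEM SET log_archive_dest_2 =
  'SERVICE=REP01 ARCH OPTIONAL NOREGISTER
   TEMPLATE=/data/oradata/REP01/db05_arch/db05_%t_%s_%r.arc';

ALTER SYSTEM SET log_archive_dest_state_2 = ENABLE;
```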


Create instantiation and configuration script: Since we replicate (almost) entire schemas, we could use DBMS_STREAMS_ADM.MAINTAIN_SCHEMAS to do both the instantiation and the configuration of Streams.
I used the option that creates a script with the instructions (instead of running everything directly). I had many problems with the script and ended up using only parts of it, doing the other parts in different ways. I’m not sure if that means using a script is a bad idea (since I had so many problems) or a good idea (since the same problems would have been more annoying without the script). To create the script, I ran the following command on the destination database. If you are using a 3rd server for capture, run it there:

BEGIN
  DBMS_STREAMS_ADM.MAINTAIN_SCHEMAS(
    schema_names => 'DEV110',
    source_directory_object => 'EXPORTS_DIR',
    destination_directory_object => 'EXP_DP_VOL',
    source_database => 'DB05',
    destination_database => 'REP01',
    perform_actions => false,
    script_name => 'src_pre_inst_streams.sql',
    script_directory_object => 'EXP_DP_VOL',
    dump_file_name => 'dev110.dmp',
    capture_queue_table => 'rep_capture_queue_table',
    capture_queue_name => 'rep_capture_queue',
    capture_queue_user => NULL,
    apply_queue_table => 'rep_dest_queue_table',
    apply_queue_name => 'rep_dest_queue',
    apply_queue_user => NULL,
    capture_name => 'capture_110dev',
    propagation_name => 'prop_110dev',
    apply_name => 'apply_110dev',
    log_file => 'export_110dev.clg',
    include_ddl => true);
END;
/

Running the resulting script: Try running the script you just created on the same DB where you ran the procedure that created it. Note that it asks for connection details for three sites. The first should be the source, the second the destination, and the third can be garbage, as it is not used in the script at all.

If the stars are aligned right, maybe you’ll run this script to the finish and end up with a working schema replication. In my case, it ran fine until the DataPump part, having created all the capture processes and queues. However, the DataPump export failed with very unusual errors. I spent two days trying to get it to work, then gave up and simply used DataPump manually to export the schema from the source and import it into the destination. After that I continued running the rest of the script, and things were fine.

Fun and games: Now that you have your own schema replication, it is time to enjoy it. Create new tables, create as select, update data, insert new data, delete data, drop tables, create procedures and types. Everything replicates!

Just remember that if you don’t commit, it won’t replicate. The reasons should be obvious. Also, if you are testing on a low-volume system, you probably want to run alter system switch logfile every few minutes so you can see your changes sooner.

Watch the replication process: While you are replicating changes, you probably want to watch how they move from source to destination:

You can see when the last message was read from redo archive:

You can see when the changes were entered into the capture queue:

And you can see when the apply process started working on them:

Another nice thing to try is to generate errors. For example, you can delete a table from the destination, make a change to it in the source, and see how it doesn’t replicate. The error will be found here:

And you should also check which archived logs can be deleted:
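The queries themselves were lost from this post; these sketches against the standard Streams views should reproduce them (the exact columns worth watching vary a bit by version):

```sql
-- When was the last redo message read by capture:
SELECT capture_name, capture_message_create_time
FROM   v$streams_capture;

-- When were changes enqueued into the capture queue:
SELECT capture_name, enqueue_message_create_time
FROM   v$streams_capture;

-- How far apply has gotten:
SELECT apply_name, applied_message_create_time
FROM   dba_apply_progress;

-- Apply errors land here:
SELECT apply_name, local_transaction_id, error_message
FROM   dba_apply_error;

-- Which archived logs are no longer needed by capture:
SELECT r.name,
       CASE WHEN r.next_change# > c.required_checkpoint_scn
            THEN 'NO' ELSE 'YES' END purgable
FROM   v$archived_log r, dba_capture c;
```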

Excluding Tables:
The last thing I did was filter out the tables I did not need to replicate. These are tables the application uses as queues or temp space – tons of inserts and deletes, but the data is useless for reports.
I used the DBMS_STREAMS_ADM.ADD_TABLE_RULES procedure. I chose to do the filtering during capture (you can also do it during apply).
The important bits: table_name MUST contain the schema name. Otherwise the procedure will assume the table belongs to strmadmin, and nothing will get filtered.
inclusion_rule => FALSE is the part that indicates we don’t want this table.

I also passed include_tagged_lcr => FALSE in the same call.

You can then see the rule you created by running select * from DBA_RULES.
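Putting the bits together, the call looked roughly like this – the table name is a stand-in for one of our queue tables, and the streams/queue names match the MAINTAIN_SCHEMAS setup above:

```sql
-- Negative rule: capture everything in the schema EXCEPT this table.
BEGIN
  DBMS_STREAMS_ADM.ADD_TABLE_RULES(
    table_name         => 'DEV110.APP_MSG_QUEUE',  -- schema name is mandatory!
    streams_type       => 'capture',
    streams_name       => 'capture_110dev',
    queue_name         => 'strmadmin.rep_capture_queue',
    include_dml        => TRUE,
    include_ddl        => TRUE,
    include_tagged_lcr => FALSE,
    inclusion_rule     => FALSE);  -- FALSE = create a negative (exclusion) rule
END;
/
```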

What now: The process I just described got me safely past a POC. There is obviously lot more to be done before this solution goes production. The performance of the capturing, and what is the impact of lags is of concern. Also, the entire system needs to be recreated from scratch whenever we do “alter database open reset logs” on the source DB (hopefully not often). But the immediate next step for me is to prepare a nice presentation to management showing what a great solution we prepared and how useful it will be for the business and how much this will be worth the investment.