Datatombraider's Blog

2017/05/09

Memory leak in asmb in version 12.1.0.2

Filed under: Oracle — datatombraider @ 21:51

After upgrading our cluster from GI 11.2.0.4 to GI 12.1.0.2 we had problems due to insufficent memory. Every node has 144 GB memory, 80 GB for SGA, remaining memory for applications. We traced the memory consumption of the applications and noticed two processes growing indefinetely:

USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
oracle 11879 0.1 19.2 33592852 28646580 ? Ss Apr04 47:46 oracle+ASM4_asmb_+asm4 (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))
oracle 11875 0.0 19.5 34051676 29101320 ? Ss Apr04 34:15 asm_asmb_+ASM4

These 2 processes consumed 38% of available memory and kept growing.

We hit the bug described in “SQLPlus 12c Memory usage Grows (Leaks) While Running Long Query (Doc ID 1919000.1)”. The title is misleading, but the solutions given in the Document fixed the issue, asmb’s memory consumption returned to normal and stays at that level.

Advertisements

2014/02/07

Standby DB ignores specified diskgroup and creates datafiles in db_create_file_dest

Filed under: Oracle — datatombraider @ 23:54

The issue described here is documented in “Oracle® Data Guard Concepts and Administration 11g Release 2 (11.2)” and not a bug but the intended behaviour. It’s easily overlooked, so it might be worth sharing.

In a tiered storage environment you usually use different diskgroups. We have +FAST for critical data and +SLOW for historical data. If we create a new tablespace or add a new data file to an existing tablespace we have to specify +SLOW for historical data:

create tablespace bla datafile ‘+SLOW’ size 10m;

In the primary database the datafile is correctly created in diskgroup +SLOW but on the standby using standby_file_management=AUTO it will be created in +FAST:

PRIMARY> select file_name from dba_data_files where tablespace_name=’BLA’;

FILE_NAME
——————————————————————————–
+SLOW/pri/datafile/bla.287.838042203

STANDBY> select file_name from dba_data_files where tablespace_name=’BLA’;

FILE_NAME
——————————————————————————–
+FAST/stdby/datafile/bla.777.838042205

Setting db_file_name_convert to whatever values has no effect. If the standby database uses OMF it creates the data files always in db_create_file_dest, no matter what was specified on the primary. There’s no way to change it because it’s the documented and intended behaviour.

This issue is potentially dangerous because different diskgroups may have different performance and/or recovery specifications. If the data files on the standby are created silently in the wrong diskgroup, it could led to a bad surprise after a switchover. Nothing i want 😉

Therefore i decided to drop the usage of db_create_files on databases using multiple diskgroups for data files.

The disadvantage is that you always have to specify the desired diskgroup, but the advantage outweighs this little inconvenience.

2013/03/31

Move tablespace to different ASM diskgroup

Filed under: Oracle — datatombraider @ 20:03
Moving a tablespace to a different AS; diskgroup can easily accomplished by using RMAN. Here are the steps for RAC. For single instance just use the step on the primary site only.
On primary
Copy all datafiles comprising the tablespace, set the tablespaces offline, switch to the copy and recover the copy to bring it up-to-date.
RMAN> backup as copy tablespace <tbs_name> format '<+NEW_DISKGROUP>';
RMAN> sql 'alter tablespace <tbs_name> offline';
RMAN> switch tablespace <tbs_name> to copy;
RMAN> recover tablespace <tbs_name>;
RMAN> sql 'alter tablespace <tbs_name> online';
On Standby
Stop log-apply because you cannot fiddle with tablespaces and datafiles if log-apply- is running.
DGMGRL> edit database '<DB_UNIQUE_NAME><db_unique_name>' set state=log-apply-off;
Now repeat the copy/switch steps from the primary except the recovery
RMAN> backup as copy tablespace <tbs_name> format '<+NEW_DISKGROUP>';
RMAN> switch tablespace <tbs_name> to copy;
and restart log-apply to start recovery:
DGMGRL> edit database '<DB_UNIQUE_NAME><db_unique_name>' set state=online;
Now delete the old datafiles. Note, datafile and datafile copy have been switched, so you have to delete the datafile copy.
RMAN> list datafilecopy all;
RMAN> delete datafilecopy '';

If you're brave or crazy, you could delete all copies in one shot:
RMAN> delete datafilecopy all;


Patch 10110625 is broken

Filed under: Linux,Oracle — datatombraider @ 19:47

The post-install script of Patch 10110625 for 11.1.0.7 is wrong. It sets current schema to SYS and calls rdbms/admin/prvtblib.plb, which recreates the body of package BSLN_INTERNAL without specifying the schema which end up in an error message because the package and it’s body are in schema DBSNMP, not SYS and recreating a body without a package header does not work.
Issue can easily be fixed by setting current_schema to DBSNMP in postinstall.sql.

The same patch for 11.2.0.3 is correct and works fine.

Update: patch for 11.1.0.7 was re-released, silently i might add, works now as it should.

2012/04/29

RDAC Multipath Driver on OEL5 UEK

Filed under: Linux,Oracle — datatombraider @ 19:55

RDAC Multipath Driver doesn’t compile out of the box on OEL5 with UEK, at least version 09.03.0C05.0504 doesn’t. But with some minor modifications it does and it even works 🙂 Here’s the patch (due to my or WordPress stupidity i cannot upload plain text files, so you have to remove the extension .doc, it’s actually plain text):

rdac-uek.patch

rdac-09.03.0C05.0638-uek-patch

2011/06/15

Real-World Performance Videos

Filed under: Oracle — datatombraider @ 21:08

I highly recommend Greg Rahn’s blog “Structured Data” to everyone working with Oracle database. If you don’t know it, go check it out.

And especially the last two posts

http://structureddata.org/2011/06/15/real-world-performance-videos-on-youtube-data-warehousing/

http://structureddata.org/2011/06/15/real-world-performance-videos-on-youtube-oltp/

are very interesting for both DBAs and developers.

 

2011/06/14

RHEL 6.1, ATI and compiz

Filed under: Linux — datatombraider @ 22:21

After upgrading my RHEL6.0 installation to RHEL6.1 compiz made some trouble. the menus and screens of some application, most notable openoffice and firefox looked distorted, like they weren’t refreshed properly. Compiz made some trouble in the past, so i first decided to disable it. But after some time i missed the compiz features like wall and zoom, so i tried to solve the issue and succeeded, hooray.

The fix was very easy: start compiz with ‘–indirect-rendering’. The best way to do it is to change ‘/usr/bin/compiz-gtk’ (i use GNOME). The script checks if ‘–indirect-rendering’ is necessary but for unknown reasons it doesn’t work properly on my ATI card. Here’s the change:

$ cat /usr/bin/compiz-gtk

function runCompiz() {
gtk-window-decorator &
#    exec compiz –ignore-desktop-hints glib gconf gnomecompat $@
exec compiz –indirect-rendering –ignore-desktop-hints glib gconf gnomecompat $@
}

After the change everything is working just fine.

2011/01/31

ASO and SSL

Filed under: Oracle — datatombraider @ 19:31

The Advanced Security Option, an option with costs of course, has the nice feature to secure database connections using SSL (authentication and encryption). Unfortunately TLS does not work (11.2.0.2 64bit on RHEL5). if you select TLS in NETMGR, it inserts ssl_version=3.1 in sqlnet.ora but SQL*Net has some problems with it. All connections fail with ‘ORA-12560: TNS:protocol adapter error’ which doesn’t say much.

sqlnet.log has some more details:

VERSION INFORMATION:
TNS for Linux: Version 11.2.0.2.0 – Production
sdp
Time: 29-JAN-2011 15:57:53
Tracing to file: ^D<9B><9B>
Tns error struct:
ns main err code: 12560

TNS-12560: TNS:protocol adapter error
ns secondary err code: 0
nt main err code: 549

TNS-00549: value specified for the SSL version is not valid
nt secondary err code: 0
nt OS err code: 0

note the tracefile name, looks like uninitialized memory to me but that’s not important for now. basically the ssl version is not valid.

the trace file (level=support) has some more funny information:

2011-01-29 15:57:53.857338 : ntzGetStringParameter:found value for “ssl_version” configuration parameter: “3.1”
2011-01-29 15:57:53.857356 : ntzGetStringParameter:exit
2011-01-29 15:57:53.857373 : ntzConvertToNumeric:entry
2011-01-29 15:57:53.857400 : ntzConvertToNumeric:value specified for SSL client authentication (“3.1”) is not boolean
2011-01-29 15:57:53.857427 : ntzConvertToNumeric:failed with error 549

it seems only numeric values are allowed and because NETMGR put ‘3.1’ in it, the parser gets confused (i didn’t configure ssl_client_authentication at all). the workaround is not to specify ssl_version but the ssl_ciphers used by TLS, for instance

SSL_CIPHER_SUITES = (SSL_RSA_WITH_AES_256_CBC_SHA)

After making the changes on client and server, i can establish ssl-connections using TLS ciphers.

It’s actually a known problem (Bug 9682150: SSL_VERSION=3.1 IS CAUSING ORA-12560 IN SSL AUTHENTICATION), opened 4-May-2010 but still not fixed, which is a shame.

2010/11/07

Fun with dates and dbms_profiler

Filed under: Oracle — datatombraider @ 13:24

Some time ago i came across an issue with the dbms_profiler package. The package records unit timestamps in a table called PLSQL_PROFILER_UNITS. The problem is, for anonymous packages, which don’t have a timestamp, the package somehow manages to insert invalid dates:

UNIT_TIMESTAMP       UNIT_TIMESTAMP_CHAR  UNIT_TIMESTAMP_DUMP
——————– ——————– —————————————-
00-DECEMBER-00       0000-00-00 00:00:00  Typ=12 Len=7: 0,0,0,0,0,0,0

From the documentation: “unit_timestamp   DATE   In the future will be used to detect changes to unit between runs” but i would expect either no values or valid values. The funny thing is, that to_char() happily converts the bogus date to ‘0000-00-00 00:00:00’ but, at least, to_date() fails to convert it back to a date (“ORA-1843: not a valid month”).

Negative DBID?

Filed under: Oracle — datatombraider @ 13:07

recently i’ve noticed that the RFS process of one of my databases logs a negative DBID:

RFS[1]: Opened log for thread 2 sequence 16 dbid -95040917 branch 733392363

RFS[7]: Selected log 11 for thread 2 sequence 17 dbid -95040917 branch 733392363

Strange but if you’re familiar with coding you probably know the problem called integer overflow (something you don’t want to see in your database software).

It seems Oracle stores the DBID in a 32 bit unsigned integer, so my DB ID can be calculated by adding the reported DBID to 232:

232 = 4.294.967.296 + (-95.040.917) = 4.199.926.379

which happens to be my correct DBID:

SQL> select dbid from v$database;
DBID
———-
4199926379

Storing a 16-digit DBID in a 10-digit number seems to be a bad idea, hopefully it happens only for displaying or logging purposes 🙂

Next Page »

Blog at WordPress.com.