Thinking about Design Patterns

I have this friend who is an ambitious young corporate climber. When I was starting out as a team lead, I was totally overwhelmed by the office politics I was suddenly exposed to. Naturally, I turned to my friend for advice. She told me to read Machiavelli’s “The Prince”. So I did. It’s an interesting read, but not that amazing as far as management advice goes. When I later tried to discuss the book with my friend, I found out that she had never actually read the book herself – she just thought it was good advice.

Six years later, I still believe people when they tell me that I have to read a book. I’m naive like that.

So I spent the last two weeks reading “The Timeless Way of Building”. It’s an architecture book. Architecture as in cities and buildings. The reason I spent two weeks reading a professional book intended for a different profession is that at some point in history (1994, I believe), some people thought that the ideas in the book were relevant to software development. These days you can’t really be a Java developer without being fully fluent in the development pattern language.

And of course, everyone was saying “You have to read The Timeless Way of Building. It will change the way you think about software.” From my days in development, I still remember quite a bit about design patterns. I never really liked that particular approach to software development, but I never figured out why. After reading the authoritative source on patterns, I can say the following:

  1. Christopher Alexander had some good ideas about patterns. The book is readable to non-architects and is very enlightening. I recommend reading it if you are interested in what makes some cities and buildings feel better than others.
  2. I am pretty sure that his advice on how to design good buildings is not really applicable to the software development field.
    A lot of his ideas are based on the fact (which he did the research to prove) that people intuitively know how buildings should be designed, and that when you ask a large number of people “How do you imagine you will feel in such a room?”, you’ll get an overwhelming consensus. This is far from being true in the software field.
  3. What is commonly known today as software design patterns is so far removed from what Christopher Alexander recommended for architects that software developers should really go and find a different name for what they are doing.

I’m still rather shocked by the differences between Christopher Alexander’s patterns and what design patterns look like today. These are not the few tiny differences that occur whenever ideas are translated from one domain to another. Some of the changes are profound.

First of all, Christopher Alexander says that patterns describe the way people already do things. “Night Life” and “Parallel Streets” existed before the book “A Pattern Language” was written. I’m not at all sure this is the case for design patterns. People buy design pattern books to learn the patterns themselves, not just the language or which patterns are better than others.

Second, patterns should have an intuitive meaning and an intuitive name. Again, you don’t need a book to know what a “Bus Stop” or “Small Parking Lots” is. You may want to read the book to find out why they are a good idea, or how to make a good bus stop, but you know what it is. I don’t believe that anyone knew what an “Abstract Factory” was before reading a document about design patterns. Even patterns that have been used for decades got a fancy name. It can take a while to figure out that a Singleton is a global variable. One of the simplest and most common patterns in software development, “A function that does exactly one task”, is missing from software design patterns. “A Loop” is also a pattern which is missing in action. All this gives the wrong impression that patterns are very complicated and something that can be mastered by experts only – which is exactly the opposite of what Christopher Alexander intended.
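
To make the Singleton remark concrete, here is a minimal sketch in Python. The names `Settings`, `SETTINGS`, `set_via_singleton` and `set_via_global` are made up for illustration and do not come from any pattern catalog. It only shows that both versions end up as one piece of shared, globally reachable state.

```python
# Hypothetical names throughout; an illustration, not code from any pattern book.

class Settings:
    """A classic Singleton: the class only ever hands out one shared instance."""
    _instance = None

    def __new__(cls):
        if cls._instance is None:
            cls._instance = super().__new__(cls)
            cls._instance.values = {}
        return cls._instance


# The same idea without the pattern vocabulary: a module-level global.
SETTINGS = {}


def set_via_singleton(key, value):
    Settings().values[key] = value


def set_via_global(key, value):
    SETTINGS[key] = value


set_via_singleton("db", "orcl")
set_via_global("db", "orcl")

# Every caller sees the same shared state in both versions.
print(Settings() is Settings())       # True
print(Settings().values == SETTINGS)  # True
```

The Singleton adds a class and a construction rule, but what the rest of the program gets is still a single value that anyone can reach and change.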

Third, patterns are abstract concepts. They are always implemented in a different way, because the entire idea is to be sensitive to the context, which is never the same twice. There is a pattern called “Six-foot balcony”, but it would be wrong to mass-manufacture six-foot balconies and start attaching them to buildings. Six-foot balcony is the idea; the exact shape of the balcony will be designed to match the building, the view, the trees, the sun, etc.
So it is rather annoying to discover that all patterns have “implementation examples”, which developers enjoy copying into their code. I’m all for code reuse, but this is not what patterns mean. Wikipedia has a decent description of what defines a pattern, and “being implementable in one or two simple classes that can be copy-pasted” is not part of it.

Executive summary: “Timeless Way of Building” is an interesting book on architecture, with some good insights about how humans like to live and a bit of a Zen feel. You will not learn anything about software development from reading it. If you already know software design patterns, you will be struck by how different the ideas in the book seem.


Everything a Junior DBA Should Know – a Book Review

The book is called “Beginning Oracle Database 11g Administration”.

I know what you are thinking. A book called “Beginning Oracle Database 11g Administration” cannot possibly be a good book. It just sounds like one of those books. You know the kind of book I’m talking about –  “DBA in 21 days”, “Oracle 11g handbook”, etc. Big, boring, completely useless.

In this case, looks are so misleading that I feel compelled to correct this first impression. This is not a big, boring, useless book. This is a brilliant and funny book and it is the most perfect book for newbie DBAs. I wish someone had given me this book a few years ago. If I ever write a book, I want it to be just like this one. I’m definitely going to make management buy it for every new junior DBA hired in our department (not that we are hiring new DBAs, but that’s a different and very sad story).

I will now proceed to explain (in detail!) why this book is so perfect, and a few of the less perfect things about it.

In my opinion, the thing that makes a technical book fun and brilliant (as opposed to boring and useless) is the author’s voice. My favorite books have a very specific voice to them. The book sounds pretty much like the author sounds in a presentation or conversation or email.

Tom Kyte’s book has that “make a complex subject crystal clear with one good example” voice that I associate with AskTom. Cost-Based Oracle Fundamentals sounds as obsessively detail-oriented as its author, Jonathan Lewis. Laurent Schneider’s Advanced SQL book sounds exactly like Laurent – “The example is self-explanatory, words are a waste of time”.

Books that don’t have that personal voice, that sound totally mechanical and robotic, are boring books, and since they probably bored their authors as well, they are usually not very good books.

“Beginning Oracle Database 11g Administration” sounds exactly like its author – Iggy Fernandez. And Iggy is one of the funniest people I know. He’s also a very no-nonsense type of guy, very realistic about what databases, customers and DBA jobs are like. And he has tons of experience. I’ve known Iggy for about a year, but you’ll be able to figure all this out after reading just a few pages – the book is amusing and slightly cynical, and it contains lots of real-world stories, all ending with “This happened to me, don’t let it happen to you!” There are a bunch of exercises at the end of each chapter to test your knowledge, oh, and a nice quote at the beginning of each chapter. Did I mention that this is exactly the type of book I wish I could write?

So, what’s in the book?

The book starts at an unusual place – the first chapter is about relational theory. This is a very welcome change from the usual “SGA, buffer cache, shared pool, 5 processes, data files, redo files, control files” chapter that normally opens the newbie books. The chapter introduces important concepts about relations, ACID and data integrity. I liked that – starting with basic concepts before diving into Oracle mechanisms is a great way to start.

Next is a chapter about SQL and PL/SQL. It’s a decent introduction, but one of my least favorite chapters. First of all, the chapter about SQL and PL/SQL contains exactly 2 pages about PL/SQL and one example. I understand the limitations involved in writing an introductory book, but it still didn’t seem right. To make things even more annoying – 4 pages are dedicated to “Criticisms of SQL”. That’s the downside of reading a book with a personal author voice – it is clear that the author fell in love with Chris Date, but objectively, Date’s theories are somewhat of an advanced topic and seem out of place in a beginner-level book. Not that I’m complaining – I found the topic fascinating 🙂

One other issue I found is that the first example in the chapter attempts to demonstrate efficient vs. inefficient SQL by showing a query that, due to a function in the where clause, will fail to use an index. It’s a great example, but indexes have not been introduced in the book at this point and may be unfamiliar to the reader. This kind of problem happens several times in the book – examples and explanations use concepts that have not been introduced yet. Not a big deal, but it may stump complete beginners.
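
For readers who have not met this problem before, here is a rough sketch of the idea. It is not the example from the book, and it uses SQLite through Python’s `sqlite3` module as a stand-in, so the optimizer is not Oracle’s and the plan text will look different. The table `emp`, the index `emp_name_idx` and the helper `plan` are hypothetical names invented for the illustration.

```python
import sqlite3

# SQLite stands in for Oracle here; table and index names are made up.
conn = sqlite3.connect(":memory:")
conn.execute("create table emp (id integer primary key, name text, salary integer)")
conn.execute("create index emp_name_idx on emp (name)")
conn.executemany(
    "insert into emp (name, salary) values (?, ?)",
    [(f"name{i}", i) for i in range(1000)],
)


def plan(sql):
    """Return the 'detail' column of EXPLAIN QUERY PLAN as one string."""
    rows = conn.execute("explain query plan " + sql).fetchall()
    return " | ".join(row[3] for row in rows)


# Comparing the bare column lets the database look the value up in the index.
indexed_plan = plan("select * from emp where name = 'name42'")

# Wrapping the column in a function hides it from the plain index on name.
wrapped_plan = plan("select * from emp where upper(name) = 'NAME42'")

print(indexed_plan)
print(wrapped_plan)
```

The first plan mentions `emp_name_idx` and the second does not, which is the whole point of the book’s example. In Oracle the usual ways out are to rewrite the predicate so the column stands alone, or to create a function-based index on the expression.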

Chapter 3 gives the expected overview of Oracle architecture and with that the “concept” chapters are finished.

Chapter 4 is called “Planning” and is one of the reasons I wish I had this book five years ago. It covers license costs, the different editions, different architecture choices (like RAC and Data Guard) and, most importantly, sizing. This is the only general-purpose DBA book I’ve seen that covers the essential issue of sizing.

Chapter 5 is installation and 6 is database creation. They contain all the usual screenshots of the usual installation and DBCA graphical interfaces. At the end of chapter 6 is a nice surprise – a listing that shows how to create a database without DBCA, using the almost forgotten “create database” command.

Chapter 7 covers physical design – partitions, indexes, clusters, etc. Chapter 8 covers user creation, some of the basics of privilege management (although options like “with grant” are left out) and also tools like export, Data Pump and SQL*Loader. I loved the explanation of Data Pump, because the listing contained many of the errors DBAs run into when using it. Much better than the examples in which everything works!

This ends the section about implementation. The next section is about support.

Chapter 9 covers some of the administration tools – SQL Developer and EM. I would have dropped this chapter in favor of a PL/SQL chapter, but that’s me.

Chapter 10 is about monitoring. Like planning, monitoring is a huge and critical part of the DBA job, and most books don’t give it the attention it deserves. This chapter is another reason this book is a must-have for new DBAs. Iggy avoids the trap of saying “EM takes care of monitoring. Here’s how you configure EM”. The chapter is completely tool-independent and covers the most important things a DBA should monitor. (I know it took me a few years to figure out that backups should be monitored, and very carefully too!) The chapter is also full of the queries that you’ll want to use for monitoring, which makes it a very valuable and useful reference.

Chapter 11 is about troubleshooting. It starts with a systematic overview of the troubleshooting process (because troubleshooting is a process and not something that is done once), and the process Iggy gives is identical to the one I give in my “Troubleshooting Streams” presentation. We did not steal from each other – it’s the very well known scientific process and everyone should follow it. It’s a very promising start, but it gets much better. Iggy gives troubleshooting best practices and a hilarious dialog between a DBA and a user demonstrating the troubleshooting process. The chapter finishes with a list of resources that can be used for troubleshooting, and some advice about troubleshooting common error codes (although ORA-01555 is not that common anymore). The explanation of troubleshooting ORA-600 also demonstrates some Metalink basics. (This is the only Oracle book I know that explains Metalink, which is amazing, considering that many DBAs use Metalink as often as they use SQL*Plus.)

Chapters 12 and 13 are a decent overview of backup and recovery topics.

Chapter 14 covers the things DBAs normally do on a regular basis – backups, statistics collection, data archiving, cleanup of trace files, reviewing audit records, password changes, capacity management and patching. Putting all this in a single chapter is unusual, and I’m not 100% sure it works. Statistics make more sense in the context of tuning, backups were already discussed, etc. I would have reduced this chapter to a single-page checklist and a strong recommendation to automate as many of these tasks as possible.

Chapter 15 is another great one. This is the only chapter in the book that is as useful to senior DBAs as it is to beginners. The chapter covers the correct way to run a DBA team and make management happy. It covers ITIL basics and the recommended IT management processes. It explains what the business usually expects from the DBA team. And it gives very good ideas on how to build a documentation library and, more importantly, why.

The last section is about database tuning. Books can be written about this topic, and Iggy only gives a basic beginner-level intro to the subject. Thankfully he avoids saying anything that will require unlearning later on (no ratio-based tuning!).

Chapter 16 covers instance tuning. It starts with the systematic process one should follow (instead of just messing with random hidden parameters) and then discusses the main techniques used – DB Time analysis, the Oracle wait interface and Statspack.

Chapter 17 is about SQL statement tuning – it discusses the use of session traces to find these queries and some tuning techniques such as indexes, hints and collecting better statistics. The example that ends this chapter is very in-depth and very useful.

I hope that by now everyone who actually bothered reading my slightly longish review understands why I love this book so much. I can’t overemphasize how fun and readable the book is, and how useful it will be for DBAs with 1-2 years of experience.

In the interest of transparency, I’d like to mention that Iggy is a friend of mine and he gave me a copy of his book for free. He did not ask me to review it; it’s just that if a friend of mine happens to write the best beginner DBA book in existence, I think everyone should know.

Few Links of the Week and a Biased Recommendation

Because Pythian’s Log Buffer is missing due to an unexpected appendix, and I can’t leave my readers with nothing to do all weekend.

Jonathan Lewis suggests browsing through V$SQL and provides a nice script.

OracleNerd explains how to shop for cars using SQL. I thought I was nerdy for planning my purchases in Excel, but SQL is by far the nerdier method.

Marco Gralike gives a very cute HTTPURI example.

Tanel Poder shows the memory overhead of generating rows with “connect by” and shows a short and sweet solution.

Laurent Schneider’s Advanced Oracle SQL Programming book is now available for sale on Amazon. I’ve been lucky to get the chance to review the book. It covers some of the most advanced and exciting aspects of SQL programming, and it is full of useful, practical examples. Almost everyone who uses Oracle SQL regularly in his work can benefit from the information and ideas in this book. I know my programming skills improved significantly from reviewing it (and even my co-workers noticed!).

Not many posts this week. Maybe it is related to Euro2008?

Sources about Streams

Streams is one of the least documented Oracle features. I usually start my research into a new area by reading a few examples of how others used the feature. In the case of Streams, I found very few examples and had to work directly from the Oracle documentation (the horrors!).

Here are the sources I’ve used while studying streams:

Lewis C. has the only useful example of Streams I was able to find, in two parts. It was a good start, but very soon I learned that Streams is so customizable that my use case was going to be very different from his.

Oracle’s Streams Replication Administrator’s Guide – Also known as “Everything you ever wanted to know about Streams replication”. I found myself spending more time with this document than I did with my husband. At this point in the project I can cite entire sections verbatim. This is the Streams replication Bible.

Oracle’s Streams Concepts and Administration Guide – That’s the second most important document. I started my project by reading the concepts part of it, and then referred to the administration part during the implementation stage. The chapter about monitoring was especially useful.

PL/SQL Packages Reference, DBMS_STREAMS_ADM – Documents the actual functions I’ve used. Of course, you can’t live without it – because you’ll always want to do things a bit differently from the examples in the administration guide.

If I were doing this project on 11g, I’d probably also try reading Oracle Database 2 Day + Data Replication and Integration Guide. But I only found out about this one today.

I’ll publish my own streams example in a day or two, so there will be at least one more example to work with.

Preinstall Checks and a Book Review

This post is two topics for the price of one 🙂

1. A smart co-worker discovered that you can use RDA to check the prerequisites before installing Oracle. What used to be an annoying, time-consuming and error-prone task became almost fun.

Before installing the database, get RDA and type: ./ -Tdv hcve

This will generate a nice HTML page, with a list of tests that ran, which ones failed, why they failed and what you should do about it. Now you can send relevant parts of the report to your system administrator and ask for fixes.

2. Over the Thanksgiving holiday, I read Joel Spolsky’s “The Best Software Writing”. It is a difficult book to review, because it is a collection of essays chosen to represent different aspects of the software world, and as such they vary in appeal and quality. Topics include programming languages, programming careers, the effort required to ship a product, marketing and management. Naturally, everyone will be interested in a different subset of the essays.

I have a problem with the premise behind the book – Joel seems to believe that there is very little good writing about software out there, because good programmers are bad communicators, and therefore the very best writing should be showcased. I believe that a lot of good programmers, not to mention good managers, are also very good writers, and that the quality of essays in the book was not significantly better than what I get almost daily in my RSS reader.

I was somewhat disappointed by this book because I bought it thinking it would be similar to “Oracle Insights”, a bunch of war stories from the field that are entertaining and educational (and in some cases a bit painful). Unfortunately, the essays that Joel chose were not as good as the stories that OakTable has provided. Despite Joel’s assertion at the beginning of the book about the importance of telling stories, the book lacked vivid real-life stories, and in many cases it also lacked meaningful lessons and conclusions.

In short, if you are managing a development team, Joel’s book may interest you. If you are working with databases, Oracle Insights will be a better investment.

Things you learn while studying for OCP

So, these days I’m studying for my Oracle certification. I’m studying with a friend, and we use a book from Oracle Press to prepare. We are both experienced DBAs, so we make a game of finding mistakes in the book. It’s a fun game, and it keeps us alert while going over rather boring material.

Yesterday I read that Oracle doesn’t allocate space for a column, even if it is fixed size, until there is data in it.
While this is certainly space efficient, it seemed like a very difficult way to manage space – you have to keep moving rows around when people update them. So we suspected that the book was making a mistake, and that Oracle allocates the space for a fixed-size column when the row is inserted, even if the column is empty.

Time for verification:
create table t1
(X integer, Y integer);

I created an empty table, with two fixed size columns, and checked the size:

SQL> select owner, segment_name, bytes,blocks, extents from DBA_SEGMENTS
where owner='CHEN';

OWNER       SEGMENT_NAME        BYTES        BLOCKS      EXTENTS
----------  ----------------    ----------   ---------   -----------
CHEN         T1                 65536         8             1

Nothing surprising. So let’s insert some data, but only into the first column. Keep the second empty:

begin
    for i in 1..30000 loop
        insert into t1 (X) values (i);
    end loop;
end;
/

And check size again:

SQL> select owner, segment_name, bytes,blocks, extents from DBA_SEGMENTS
where owner='CHEN';

OWNER       SEGMENT_NAME        BYTES        BLOCKS      EXTENTS
----------  ----------------    ----------   ---------   -----------
CHEN         T1                 458752        56             7

Look, the table just got bigger.
Just for the heck of it, let’s see what happens when I insert one row with both columns. Will this be enough to trigger allocation in all rows?

insert into t1 values (1,1);
SQL> select owner, segment_name, bytes,blocks, extents from DBA_SEGMENTS
where owner='CHEN';

OWNER       SEGMENT_NAME        BYTES        BLOCKS      EXTENTS
----------  ----------------    ----------   ---------   -----------
CHEN         T1                 458752        56             7

Nope. Nothing much changed.
So let’s update all rows with a value in the second column and see what happens:

update t1 set Y=1;
SQL> select owner, segment_name, bytes,blocks, extents from DBA_SEGMENTS
where owner='CHEN';

OWNER       SEGMENT_NAME        BYTES        BLOCKS      EXTENTS
----------  ----------------    ----------   ---------   -----------
CHEN         T1                 851968        104            13

So, the table nearly doubled. The book was correct – space for the second column was not allocated when I did the first insert, only after the update.
I assume that the rows grew in place until their blocks ran out of free space, and that Oracle then had to migrate rows into blocks in the new extents, so the update probably involved moving a bunch of rows around. Something to keep in mind when trying to figure out why an update is using too much I/O.
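
The DBA_SEGMENTS figures above are enough for a quick sanity check of how much the segment grew. The sketch below uses only the numbers from the listings. The block size and extent size it prints are derived from those numbers and are not stated anywhere in the post, and the helper name `describe` is made up.

```python
# (bytes, blocks, extents) exactly as reported by DBA_SEGMENTS in the listings above
empty_table = (65536, 8, 1)
after_insert = (458752, 56, 7)
after_update = (851968, 104, 13)


def describe(bytes_, blocks, extents):
    """Derive per-block and per-extent sizes from one DBA_SEGMENTS row."""
    return {
        "block_size": bytes_ // blocks,
        "extent_size": bytes_ // extents,
        "blocks_per_extent": blocks // extents,
    }


for row in (empty_table, after_insert, after_update):
    print(describe(*row))
# each line prints {'block_size': 8192, 'extent_size': 65536, 'blocks_per_extent': 8}

growth = after_update[0] / after_insert[0]
print(f"growth factor after the update: {growth:.2f}")  # 1.86
```

So the segment went from 7 extents to 13, a bit less than double, and it grew in whole 64 KB extents. That is also why a single extra row changed nothing: it fit into space that was already allocated.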

Oracle Documents

I’ve run into a nice post about RTFM on the BAAG journal. Yes, I know this is not exactly news, but I think it’s going to resonate with many DBAs. However, I suspect that the RTFM post oversimplifies what is actually a rather painful issue.

Let’s start with the fact that many DBAs are not native English speakers. They know enough English to get along nicely, but reading technical documents can still be a slow and painful activity. Perhaps slower than waiting for a kind soul on a mailing list.

Then, there is the fact that getting meaningful answers out of Oracle documentation is a bit of an art form. There are both OTN documents and documents in Metalink. They sometimes contradict each other, so you need to verify which one is more up to date, while making sure it matches your version.

If you need to know how to configure automatic memory management or how to rebuild an index, the documentation is pretty good. If you need to know what happens when you set SGA_MAX_SIZE to zero while using automatic memory management, you are in for a significant search, at the end of which you will have more information and still end up having to make a guess which may or may not be correct. I remember looking for a good quote that would allow me to prove to my boss that if we rebuild an index with the parallel option, all queries using this index will be parallelized to the same degree. I couldn’t find one, although it could be inferred by combining several paragraphs from two different books in the right way.

There is a reason for the huge market for Oracle books other than the official documentation: the official documentation is difficult to read and sometimes is not even good enough as a reference. There are a bunch of websites, blogs and magazine articles that explain information that is already contained in Oracle documents. A co-worker is learning PL/SQL and asked for a good book. You can bet I told him to get Steven Feuerstein’s PL/SQL book and didn’t tell him to download Oracle’s book from OTN.

So, I agree with Simon Haslam at BAAG that people need to RTFM. I just wouldn’t recommend Oracle’s official documentation for that purpose.

Becoming a better DBA

I’ve read Coskan’s post about invalid DBAs and now I’m a bit worried that I’m nearly invalid. Too many of the traits he mentioned hit too close to home.

So, here’s how I plan to recompile my professional skills:

  1. Read the concepts guide. I’ve read parts of it, but it is probably time to give it a more thorough read. Pretty much everyone agrees that one cannot be a good DBA without reading it.
  2. Learn to do wait interface tuning. Start by reading the book and trying out the ideas in the test system. I know the theory, but currently I do all the wait interface analysis using Ignite for Oracle. A marvelous tool, but a DBA should know how to get along without fancy tools.
  3. Get more acquainted with Oracle security. Maybe start by reading Pete Finnigan’s papers.
  4. Get Certified! This will instantly make me a better DBA, no?

This short list will probably keep me occupied well into the end of the year. Especially the performance tuning part – Millsap’s book is not exactly a short and easy read for the next flight. Hopefully I’ll be able to start the next year as a slightly less invalid DBA.

OS Bottlenecks

I’m reading “Oracle Insights” these days, and I simply can’t recommend it enough. The book is a collection of articles written by very smart and experienced DBAs. The articles range over many different topics, but most center around issues with performance and tuning. Every article contains countless insights and ideas, most of them so practical you can try them at work the next day. But the best thing is that the book is so well written, so readable and so fun that, like any good book, you can barely put it down. You will find yourself laughing out loud and crying over broken systems. It’s really nothing like any Oracle book you have ever read.

Gaja Vaidyanatha wrote an article called Compulsive Tuning Disorder about the right and wrong way to tune a system, and inside this article was a little gem about identifying OS bottlenecks. For me, this short section alone was worth the price of the book.

Vaidyanatha advocates using two tools for collecting metrics – sar and vmstat – and looking at just two metrics – CPU run queue and memory scan rate. He claims that these are the only two metrics that are important for database tuning – how many processes are waiting for CPU at a given time, and how hard the memory manager has to work to find more memory. He then goes on a bit about the exact meaning of these metrics, what the ideal values are and a bit on the important insights you get from the CPU utilization breakdown that sar gives (for example, don’t add processing power if you have lots of iowait).

So simple, and it makes total sense. My life has been changed and improved after reading about 2 pages. Tomorrow I’ll add monitors to track these metrics on our production servers and see if it is as good in practice as it is in theory. I just love finding simple reasonable ideas that are written so well.
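
As a starting point for such a monitor, here is a minimal sketch that pulls the two metrics out of vmstat output. Everything in it is an assumption made for illustration: the sample text is invented (laid out like Solaris vmstat, where the run queue is the `r` column and the scan rate is the `sr` column), the function names are made up, and the thresholds are placeholders rather than the ideal values from the article.

```python
# Invented sample in Solaris-style vmstat layout; not output from a real server.
SAMPLE = """\
 kthr      memory            page            disk          faults      cpu
 r b w   swap  free  re  mf pi po fr de sr s0 s1 s2 s3   in   sy   cs us sy id
 1 0 0 812344 90112  3  12  0  0  0  0  0  0  0  0  0  402  911  310  9  3 88
 6 0 0 801200 20480 40 210 15 30 55  0 240 2  0  0  0  655 1800  720 61 22 17
"""


def parse_vmstat(text):
    """Yield (run_queue, scan_rate) per sample line; scan_rate is None if there is no 'sr' column."""
    header = None
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "r":
            header = fields  # the column-name line
            continue
        if header is None or not fields[0].isdigit():
            continue  # group-title line such as 'kthr memory page ...'
        run_queue = int(fields[header.index("r")])
        scan_rate = int(fields[header.index("sr")]) if "sr" in header else None
        yield run_queue, scan_rate


def suspicious(samples, cpu_count, max_scan_rate):
    """Keep samples where more processes wait than there are CPUs, or the scan rate is high."""
    return [
        (r, sr)
        for r, sr in samples
        if r > cpu_count or (sr is not None and sr > max_scan_rate)
    ]


# cpu_count and max_scan_rate are placeholder thresholds, not recommendations.
print(suspicious(parse_vmstat(SAMPLE), cpu_count=4, max_scan_rate=200))  # [(6, 240)]
```

On a real server the text would come from running vmstat at an interval instead of from a constant. Linux vmstat has no `sr` column, in which case the sketch only checks the run queue.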