Kafka or Flume?

A question that keeps popping up is “Should we use Kafka or Flume to load data to Hadoop clusters?”

This question implies that Kafka and Flume are interchangeable components. It makes as much sense to me as “Should we use cars or umbrellas?”. Sure, you can hide from the rain in your car and you can use your umbrella when moving from place to place. But in general, these are different tools intended for different use-cases.

Flume’s main use-case is to ingest data into Hadoop. It is tightly integrated with Hadoop’s monitoring system, file system, file formats, and utilities such a Morphlines. A lot of the Flume development effort goes into maintaining compatibility with Hadoop. Sure, Flume’s design of sources, sinks and channels mean that it can be used to move data between other systems flexibly, but the important feature is its Hadoop integration.

Kafka’s main use-case is a distributed publish-subscribe messaging system. Most of the development effort is involved with allowing subscribers to read exactly the messages they are interested in, and in making sure the distributed system is scalable and reliable under many different conditions. It was not written to stream data specifically for Hadoop, and using it to read and write data to Hadoop is significantly more challenging than it is in Flume.

To summarize:
Use Flume if you have an non-relational data sources such as log files that you want to stream into Hadoop.
Use Kafka if you need a highly reliable and scalable enterprise messaging system to connect many multiple systems, one of which is Hadoop.


“A beginning is the time for taking the most delicate care that the balances are correct.”

It is spring. Time for planting new seeds. I started on a new job last week, and it seems that few of my friends and former colleagues are on their way to new adventures as well. I’m especially excited because I’m starting not just a new job – I will be working on a new product, far younger than Oracle and even MySQL. I am also making first tiny steps in the open-source community, something I’ve been looking to do for a while.

I’m itching to share lessons I’ve learned in my previous job, three challenging and rewarding years as a consultant. The time will arrive for those, but now is the time to share what I know about starting new jobs. Lessons that I need to recall, and that my friends who are also in the process of starting a new job may want to hear.

Say hello
I’m usually a very friendly person and after years of attending conferences I’m very comfortable talking to people I’ve never met before. But still, Cloudera has around 200 people in the bay area offices, which means that I had to say “Hello, I’m Gwen Shapira the new Solutions Architect, who are you?” around 200 times. This is not the most comfortable feeling in the world. Its important to go through the majority of the introductions in the first week or two, later on it becomes a bit more awkward. So in the first week it will certainly seem like you are doing nothing except meeting people, chatting a bit and franctically memorizing names and faces. This is perfectly OK.

Get comfortable being unproductive
The first week in a new job feels remarkably unproductive. This is normal. I’m getting to know people, processes, culture, about 20 new products and 40 new APIs. I have incredibly high expectations of myself, and naturaly I’m not as fast installing Hadoop cluster as I am installing RAC cluster. It takes me far longer to write Python code than it does to write SQL. My expectations create a lot of pressure, I internally yell at myself for taking an hour or so to load data into Hive when it “should” have taken 5 minutes. But of course, I don’t know how long it “should” take, I did it very few times before. I’m learning and while learning has its own pace, it is an investment and therefore productive.

Have lunch, share drinks
The best way to learn about culture is from people, and the best way to learn about products is from the developers who wrote them and are passionate about how they are used. Conversations at lunch time are better than tackling people in the corridor or interrupting them at their desk. Inviting people for drinks are also a great way to learn about a product. Going to someones cube and asking for an in-depth explanation of Hive architecture can be seen as entitled and bothersome. Sending email to the internal Hive mailing list and saying “I’ll buy beer to anyone who can explain Hive architecture to me” will result in a fun evening.

If its not overwhelming, you may be in the wrong job
I’m overwhelmed right now. So many new things to learn. First there are the Hadoop ecosystem products, I know some but far from all of them, and I feel that I need to learn everything in days. Then there is programming. I can code, but I’m not and never have been a proficient programmer. My colleagues are sending out patches left and right. It also seems like everyone around me is a machine learning expert. When did they learn all this? I feel like I will never catch up.

And that is exactly how I like it.

Make as many mistakes as possible
You can learn faster by doing, and you can do faster if you are not afraid of failing and making mistakes. Mistakes are more understandable and forgivable when you are new. I suggest using this window of opportunity and accelerate your learning by trying to do as much as possible. When you make a mistake smile and say “Sorry about that. I’m still new. Now I know what I should and shouldn’t do”

Take notes
When you are new a lot of things will look stupid. Sometimes just because they are very different from the way you are used to things in a previous job. Don’t give in to the temptation to criticise everything, because you will look like a whiner. No one likes whiner. But take note of them, because you will get used to them soon and never see things with “beginner mind” again. In few month take a look at your list, if things still look stupid, it will be time to take on a project or two to fix them.

I may be new at this specific job, but I still have a lot to contribute. I try hard to look for opportunities and I keep finding out that I’m more useful than I thought. I participate in discussions in internal mailing lists, I make suggestions, I help colleagues solve problems. I participate in interviews and file tickets when our products don’t work as expected. I don’t wait to be handed work or to be sent to a customer, I look for places where I can be of use.

I don’t change jobs often. So its quite possible that I don’t know everything there is to know about starting a new job. If you have tips and suggestions to share with me and my readers, please comment!

Environment Variables in Grid Control User Defined Metrics

This post originally appeared at the Pythian blog.

Emerson wrote: “Foolish consistency is the hobgoblin of small minds”. I love this quote, because it allows me to announce a presentation titled “7 Sins of Concurrency” and then show up with only 5. There are places where consistency is indeed foolish, while other times I wish for more consistency.

Here is a nice story that illustrates both types of consistency, or lack of.

This customer Grid Control installed in their environment. We were asked to configure all kinds of metrics and monitors for several databases, and we decided to use the Grid Control for this. One of the things we decided to monitor is the success of the backup jobs.

Like many others, this customer runs his backup jobs from cron and the cron job generates an RMAN logfile. I thought that a monitor that will check the logfile for RMAN- and ORA- errors will be just the thing we need.

To be consistent, I could have moved the backup jobs to run from Grid Control scheduler instead of cron. But in my opinion, this would have been foolish consistency – why risk breaking perfectly good backups? Why divert my attention from the monitoring project to take on side improvements?

To have Grid Control check the log files, I decided to use OS UDM: Thats a “User Defined Metric” that is defined on “host” targets and allows to run a script on the server. I wrote a very small shell script that finds the latest log, greps for errors and counts them. The script returns the error count to Grid Control. More than 0 errors is a critical status for the monitor. I followed the instructions in the documentation to the letter – and indeed, everything easily worked. Hurray!

Wait. There’s a catch (and a reason for this blog post). I actually had two instances that are backed up, and therefore two logs to check. I wanted to use the same script and just change the ORACLE_SID in the environment.

No worries. The UI has a field called “Environment” and the documentation says: “Enter any environmental variable(s) required to run the user-defined script.”

One could imagine, based on the field name and the documentation, that if I type: “ORACLE_SID=mysid” in this field, and later run “echo $ORACLE_SID” in my script, the result would be “mysid”.

Wrong. What does happen is that $ORACLE_SID is empty. $1, on the other hand, is “{ORACLE_SID=mysid}”.

To get the environment variable I wanted, I had to do: tmp=(`echo $1 | tr ‘{}’ ‘  ’`); eval $tmp

It took me a while to figure this out as this behavior is not documented and I found no usage examples that refer to this issue.

Consistency between your product, the UI and the documentation is not foolish consistency. I expect the documentation and field descriptions to help me do my job and I’m annoyed when it doesn’t.

At least now this behavior is documented somewhere so future googlers may have easier time.

The Senile DBA Guide to Troubleshooting Sudden Growth in Redo Generation

I just troubleshooted a server where the amounts of redo generated suddenly exploded to the point of running out of disk space.

After I was done, the problem was found and the storage manager pacified, I decided to save the queries I used. This is a rather common issue, and the scripts will be useful for the next time.

It was very embarrassing to discover that I actually have 4 similar but not identical scripts for troubleshooting redo explosions. Now I have 5 🙂

Here are the techniques I use:

  1. I determine whether there is really a problem and the times the excessive redo was generated by looking at v$log_history:
    select trunc(FIRST_TIME,'HH') ,sum(BLOCKS) BLOCKS , count(*) cnt
    ,sum(decode(THREAD#,1,BLOCKS,0)) Node1_Blocks
    ,sum(decode(THREAD#,1,1,0)) Node1_Count
    ,sum(decode(THREAD#,2,BLOCKS,0)) Node2
    ,sum(decode(THREAD#,2,1,0)) Node2_Count
    ,sum(decode(THREAD#,3,BLOCKS,0)) Node3
    ,sum(decode(THREAD#,3,1,0)) Node3_Count
    from v$archived_log
    where FIRST_TIME >sysdate -30
    group by trunc(FIRST_TIME,'HH')
    order by trunc(FIRST_TIME,'HH') desc
  2. If the problem is still happening, I can use Tanel Poder’s Snapper to find the worse redo generating sessions. Tanel explains how to do this in his blog.
  3. However, Snapper’s output is very limited by the fact that it was written specifically for situations where you cannot create anything on the DB. Since I’m normally the DBA on the servers I troubleshoot, I have another script that actually gets information from v$session, sorts the results, etc.
    create global temporary table redo_stats
    ( runid varchar2(15),
      sid number,
      value int )
    on commit preserve rows;
    truncate table redo_stats;
    insert into redo_stats select 1,sid,value from v$sesstat ss
    join v$statname sn on ss.statistic#=sn.statistic#
    where name='redo size'
    insert into redo_stats select 2,sid,value from v$sesstat ss
    join v$statname sn on ss.statistic#=sn.statistic#
    where name='redo size'
    select *
            from redo_stats a, redo_stats b,v$session s
           where a.sid = b.sid
           and s.sid=a.sid
             and a.runid = 1
             and b.runid = 2
             and (b.value-a.value) > 0
           order by (b.value-a.value)
  4. Last technique is normally the most informative, and requires a bit more work than the rest, so I save it for special cases. I’m talking about using logminer to find the offender in the redo logs. This is useful when the problem is no longer happening, but we want to see what was the problem last night. You can do a lot of analysis with the information in logminer, so the key is to dig around and see if you can isolate a problem. I’m just giving a small example here. You can filter log miner data by users, segments, specific transaction ids – the sky is the limit.
    -- Here I pick up the relevant redo logs.
    -- Adjust the where condition to match the times you are interested in.
    -- I picked the last day.
    select 'exec sys.dbms_logmnr.add_logfile(''' || name ||''');'
    from v$archived_log
    where FIRST_TIME >= sysdate-1
    and THREAD#=1
    and dest_id=1
    order by FIRST_TIME desc
    -- Use the results of the query above to add the logfiles you are interest in
    -- Then start the logminer
    exec sys.dbms_logmnr.add_logfile('/u10/oradata/MYDB01/arch/1_8646_657724194.dbf');
    exec sys.dbms_logmnr.start_logmnr(options=>sys.dbms_logmnr.DICT_FROM_ONLINE_CATALOG);
    -- You can find the top users and segments that generated redo
    select seg_owner,seg_name,seg_type_name,operation ,min(TIMESTAMP) ,max(TIMESTAMP) ,count(*)
    from v$logmnr_contents 
    group by seg_owner,seg_name,seg_type_name,operation
    order by count(*) desc
    -- You can get more details about the specific actions and the amounts of redo they caused.
    select LOG_ID,THREAD#,operation, data_obj#,SEG_OWNER,SEG_NAME,TABLE_SPACE,count(*) cnt ,sum(RBABYTE) as RBABYTE ,sum(RBABLK) as RBABLK 
    from v$logmnr_contents 
    group by LOG_ID,THREAD#,operation, data_obj#,SEG_OWNER,SEG_NAME,TABLE_SPACE
    order by count(*)
    -- don't forget to close the logminer when you are done
    exec sys.dbms_logmnr.end_logmnr;
  5. If you have ASH, you can use it to find the sessions and queries that waited the most for “log file sync” event. I found that this has some correlation with the worse redo generators.

    -- find the sessions and queries causing most redo and when it happened
    select SESSION_ID,user_id,sql_id,round(sample_time,'hh'),count(*) from V$ACTIVE_SESSION_HISTORY
    where event like 'log file sync'
    group by  SESSION_ID,user_id,sql_id,round(sample_time,'hh')
    order by count(*) desc
    -- you can look the the SQL itself by:
    select * from DBA_HIST_SQLTEXT
    where sql_id='dwbbdanhf7p4a'

Lessons From OOW09 #2 – Consolidation Tips

The session was called “All in One” and it was given by Husnu Sensoy. A young and very accomplished DBA from Turkey. I chatted with him during ACE dinner and it turns out we have many colleagues in common. This was probably the most useful presentation I’ve heard this OpenWorld. As I am going into a large consolidation project for next year, I am glad I can learn from the experience of someone like Husnu who already gone through this and is very willing to share the experience.

His presentation is shared on his blog, so I’ll just give the parts that I consider to be highlights. You will probably learn more by reading his slides. He had tons of good content and he talks very very fast, so I’m sure I missed a bunch of good stuff. Since his content is readily available, I’m mixing in a lot of my own thoughts here.

The problems he set out to solve:

  • Too many DBs and too few DBAs.
  • Some servers are doing almost nothing.
  • Some servers have no HA.

Pick candidates for consolidation based on: License costs, utilization, data center location, dependencies, I/O characteristics, risk levels.

The driver for our consolidation were license costs – our new machines had two quad cores instead of one dual core, so license costs suddenly quadrapled and we were forced into cost saving consolidations. We mostly used data center location and risk levels to decide on the plan. Our most OLTP system, the one that is most sensitive to slow-downs, will remain unconsolidated for now.

Prior to consolidation collect lots of system/performance metrics. They will help pick candidates, plan capacity, test and later troubleshoot.

Don’t forget to talk to DBAs and business reps when making the consolidation plans, they will have their own ideas and this can be important input.

Additive linear models are recommended for capacity planning. He gave lots of guidelines on how to do this. Pages 26-36 in his PDF have the details. I could have sworn he recommended to stay below the 65% utilization when planning for CPU capacity, but I cannot see it in his slides. In any case – do this, because any higher than that and the linear additive model is questionable.

Also pay attention to the part about preferring larger servers and less RAC nodes, since RAC adds complexity. And to the part about every storage system delivering about 70-80% of spec. Actually, this is more true for the EMC system he used. Our Netapps seem to be up to spec.

Don’t mix sequential and random IO (i.e. OLTP and DW) is a good idea. A lot of places can’t really do this because of the way their apps are designed.

Benchmarking the new system to test the capacity plan is a great idea. I’d love to see more concrete information on how to benchmark, maybe a whole other presentation on this. One of the things that worry me most about our consolidation plans is that I’m not sure how good our tests will be. Husnu recommended HammerOra, which I’ll check out.

Crash tests. We did those ages ago when we moved to RAC architecture and then again for Netapp clusters. Was lots of fun and maybe its time to do this again. Husnu advised to ask support for a list of good test scenarios. I recommend taking your sysadmins, storage admins and net admins for few beers and asking them for scenarios – they generally come up with very creative stuff.

Good tip: In 11g, memory target does not work with huge pages. You are using huge pages, right?

Write the backup and recovery document as the first document on the new system.

Pages 76-86 have good advice on merging databases. We’ll be doing that too and I was glad to see that we came up with the same plans and same problems as Husnu.

His last advice is the best: “You never know how long this is going to take.” So true! Who could have known that we will be delayed for 3 month by an IT security group that popped up from no-where with requirement that we will pass certain security audits that we’ve never heard of. Life in a big organization can be full of surprises, so be prepared 🙂

Lessons From OOW09 #1 – Shell Script Tips

During OpenWorld I went to a session about shell scripting. The speaker, Ray Smith, was excellent. Clear, got the pace right, educating and entertaining.

His presentation was based on the book “The Art of Unix Programming” by one Eric Raymond. He recommended reading it, and I may end up doing that.

The idea is that shell scripts should obey two important rules:

  1. Shell scripts must work
  2. Shell scripts must keep working (even when Oracle takes BDUMP away).

Hard to object to that 🙂

Here’s some of his advice on how to achieve these goals (He had many more tips, these are just the ones I found non-trivial and potentially useful. My comments in italics.)

  1. Document dead ends, the things you tried and did not work, so that the next person to maintain the code won’t try them again.
  2. Document the script purpose in the script header, as well as the input arguments
  3. Be kind – try to make the script easy to read. Use indentation. Its 2009, I’m horrified that “please indent” is still a relevant tip.
  4. Clean up temporary files you will use before trying to use them:

    function CleanUpFiles {
    [ $LOGFILE ] && rm -rf ${LOGFILE}
    [ $SPOOLFILE ] && rm -rf ${SPOOLFILE}
  5. Revisit old scripts. Even if they work. Technology changes. This one is very controversial – do we really need to keep chasing the latest technology?
  6. Be nice to the users by working with them – verify before taking actions and keep user informed of what the script is doing at any time. OPatch is a great example.
  7. Error messages should explain errors and advise how to fix them
  8. Same script can work interactively or in cron by using: if [ tty -s ] …
  9. When sending email notifying of success or failure, be complete. Say which host, which job, what happened, how to troubleshoot, when is the next run (or what is the schedule).
  10. Dialog/Zenity – tools that let you easily create cool dialog screens
  11. Never hardcode passwords, hostname, DB name, path. Use ORATAB, command line arguments or parameter files.I felt like clapping here. This is so obvious, yet we are now running a major project to modify all scripts to be like that.
  12. Be consistent – try to use same scripts whenever possible and limit editing permissions
  13. Use version control for your scripts. Getting our team to use version control was one of my major projects this year.
  14. Subversion has HTTP access, so the internal knowledge base can point at the scripts. Wish I knew that last year.
  15. Use deployment control tool like CFEngine. I should definitely check this one out.
  16. Use getopts for parameters. Getopts looked to complicated when I first checked it out, but I should give it another try.
  17. Create everything you need every time you need it. Don’t fail just because a directory does not exist. Output what you just did.
  18. You can have common data files with things like hosts list or DB lists that are collected automatically on regular basis and that you can then reference in your scripts.
  19. You can put comments and descriptions in ORATAB

I Can Has Training Budget

We know how it goes – there is a recession, and companies try to reduce expanses. The next thing you know, your training budget is all gone. Or maybe there is some training budget left, but now 6 DBAs share a sum that is not enough for one Oracle University course. How do you convince your managers that paying for your training is the best investment they can make?

Start by convincing yourself. Remember that your manager probably got to his position because he is good at reading people, so if you don’t really want the training, or don’t really believe you need this training, he may see that and you lost. You have to be 100% sure that you want this training because it will really allow you to improve the way you work.

As an example, lets assume you want to go to Linux Administration course. Its an interesting case, because it is not even evident that a DBA should go to such course.

Then think about your boss for a bit – what parts of the job are most important to him? what are his pet projects? pet peeves.

Once you have your desire for the course and your bosses desires in mind, make a list of all the benefits you can see from going to the course. The important thing is to highlight how the things you want to learn will help with the projects that are most important to your boss, or will address his specific pain points.

So, if your boss loves automation say: “I will learn more shell linux tools so I’ll be able to write better automation scripts”.
If he is a capacity planning person, say: “I will be able to better monitor the OS so we can be more proactive about provisioning”.
If he is a big fan of RAC, say: “With my improved Linux knowledge, I’ll be able to understand low-leve clusterware issues and solve them faster!”

Now you need to decide if you make your pitch face to face or by email. I prefer email. Information I put in the email:
* Course title and instructor (or school name)
* Dates/Times
* Location
* Price
* The list of 3-5 reasons I need this course (as you prepared in the previous paragraph).

Until he makes his decision, keep mentioned once or twice a day how the things you do now will be much better after you take the course: “I still don’t understand how to debug coredumps after the process crashes, but the Linux course may help”, “It takes me 2 hours to copy old files to the second disk, but I’ll probably learn how to do it faster in the Linux course”. Don’t force it, but keep an eye open for opportunities to explain and demonstrate the value you see in the course.

And a questionable tactics that sometimes works: Get an ultra-expensive course rejected before asking for a reasonably-priced course. “I can totally understand you don’t have the budget to send me to Collaborate in Denver, but what about one day training given by our local usergroup at a near-by location?”. I’m not sure if this tactic works because the manager feels guilty about rejecting my request, or if the lower-price seminar just looks better in comparison. I’m not even sure if I recommend it, really. Consider and act at your own risk 😉