Category: Tech, Code & AI

  • The Second Lecture

    So the second lecture was on Software Engineering. A big word that, unfortunately, means big problems. The class was introduced to Agile/Scrum, which is a newer methodology than the old waterfall/spiral SDLC models that I was taught in school.

    I used Agile/Scrum about two years ago in my previous job and my experiences were much like those of the ex-students who presented. I worked with teams of about 2 to 6 people. Honestly, not many teams can get past 10 pax because of $$$. I must say 80 is an awesomely huge group!

    So here are my observations:

    • The common effort multiplier is between 2.5 and 3, not because people are slow or bad at estimating, but because they did not consider the time taken for communication and other context-switching overheads. However, as the team gets better and better at estimating its effort, this multiplier can go down to about 2. Don’t forget, the Project Manager doesn’t do the actual work but still gets paid. 🙂 So where does his effort go?
    • The biggest problem with estimating effort is with companies billing by the hour. I was constantly questioned for high estimates (thanks to my 3x multiplier) because they ballooned the cost of a project, and I was pressured to push them down. But guess what? The project always overran, i.e. the original high estimate was correct.
    • Agile works well with small projects too. You may not need to hold the daily scrums religiously, but breaking the work down into bite-sized parts is the key to easing project management.
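
    To make the multiplier concrete, here is a minimal sketch of the arithmetic. The function name and the 40-hour task are hypothetical, picked only for illustration; the multipliers are the ones mentioned above.

```python
def padded_estimate(raw_hours: float, multiplier: float = 3.0) -> float:
    """Scale a raw 'pure work' estimate to cover communication and context switching."""
    if raw_hours < 0 or multiplier < 1:
        raise ValueError("raw_hours must be >= 0 and multiplier >= 1")
    return raw_hours * multiplier


raw = 40.0  # hypothetical task: one week of uninterrupted work
for m in (3.0, 2.5, 2.0):
    print(f"multiplier {m}: quote {padded_estimate(raw, m):.0f} hours")
```

    The gap between the 3x and 2x quotes is a full week of work on this one task, which is roughly what gets argued away when an estimate is pushed down.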

    Wei Man is right. Geeks are bad at estimating effort, but we have to know what it takes to do something so that we can manage ourselves. Time, energy and life are finite and therefore our efforts are finite as well. If you don’t learn the skill of estimating effort, I can 100% assure you that you’ll overrun your projects. This comes from personal experience of not getting paid and even almost being sued. 🙁

    ***

    On the documentation part, Prof. Ben is right. There’s a job market out there for people writing documentation. This type of job is called Technical Writing. If you’re good with language, maybe this is a job you can pursue. There aren’t many of these companies around and their clients are usually huge (Aerospace, Military, etc.) so you get paid pretty decently. Not too late to change courses now.

  • Aspiration 5: My Friends All Hate Me

    Just for laughs 😀

    http://apps.facebook.com/ajsdfasfj/aspiration5.php

    BTW, the fb:wallpost tag is broken (it does not show up as it’s supposed to)

  • Teaming and Waving

    Looks like I’ve gotten my first team for the Facebook assignment and am still trying to figure out a team for the second Seminar assignment. Hopefully I can get that settled as well so I can get this off my back and concentrate on the actual work rather than kay-pohing about people’s lives.

    Meanwhile I’m poking around Google Wave. Actually not really very excited yet. I’m more confused than excited – I can’t seem to find a practical use for it at the moment. I will try to explore it more. It looks useful as an internal Wiki kind of thing. I won’t use it to replace my regular IM though. Problem here is, Wave is invite-only and I have limited invites so I cannot realize the full collaborative power of this tool yet. :S

    Short and sweet post. I need to get back to work. Sigh.

  • First Peek at Amazon EC2

    I just got my Amazon EC2 account today and poked around a bit. Technically, it’s just a super cluster of virtualized servers running a (very likely) hacked copy of the open source Xen with an AJAX-enabled web management interface. The servers are undoubtedly Intel Xeons.

    [root@domU-12-31-39-09-2E-31 ~]# uname -a
    Linux domU-12-31-39-09-2E-31 2.6.18-xenU-ec2-v1.2 #2 SMP Wed Aug 19 09:04:38 EDT 2009 i686 i686 i386 GNU/Linux
    [root@domU-12-31-39-09-2E-31 ~]# cat /proc/cpuinfo
    processor    : 0
    vendor_id    : GenuineIntel
    cpu family    : 6
    model        : 23
    model name    : Intel(R) Xeon(R) CPU           E5430  @ 2.66GHz
    stepping    : 10
    cpu MHz        : 2666.760
    cache size    : 6144 KB
    fdiv_bug    : no
    hlt_bug        : no
    f00f_bug    : no
    coma_bug    : no
    fpu        : yes
    fpu_exception    : yes
    cpuid level    : 13
    wp        : yes
    flags        : fpu tsc msr pae mce cx8 apic mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx lm constant_tsc up pni monitor ds_cpl vmx est tm2 cx16 xtpr lahf_lm
    bogomips    : 5335.77

    There’s nothing technically amazing here but it’s interesting how Amazon put it together into a pay-per-use revenue model. It seems like they got the billing portion right.

    Personally, I don’t quite like the management of it though. If you’ve used VMware ESX or Citrix XenServer you might agree with me.

    For example, I couldn’t alter my firewall configuration once my instance was deployed. I created my first instance with a default firewall rule that drops everything, so in desperation I created another one.

    Then I realized I couldn’t delete an instance either. It took me a while to figure out that there is actually a command line client tool written in Java that allows me to delete an instance. In fact, the client tool has way more capabilities than the funky AJAX web interface.

    Here’s the Getting Started Guide. You need to read this to learn how to set up the authentication mechanisms. I presume most of us here can set up the Java environment variables no problem.

    Here’s the EC2 Command Line Tools Reference.

    It took me quite a while to find these links so do bookmark them.

    ***

    Just a quick start for everyone here since the authentication part is a hassle. The documentation had a bunch of talk cock before they got to the point.

    1. You’ll need to log in to AWS, then go to the Your Account > Security Credentials menu in the top right-hand corner.
    2. Scroll down and look under the Access Credentials heading.
    3. Click the X.509 Certificates tab.
    4. Click Create a new Certificate.
    5. Download both the Private Key File and Certificate File.
    6. Get down to your command prompt.
    7. Change to the directory where you unzipped the EC2 API tools.
    8. Make sure JAVA_HOME and PATH are both set.
    9. Set EC2_HOME to the directory in step 7 above.
    10. Change to the bin directory within the EC2 API tools directory.

    You’re all set to run the .cmd files (on Windows) or the non-.cmd files (on MacOS/Linux).

    ***

    Update: Here’s a freebie for the MacOS X users – paste these into ~/.bash_profile so you don’t have to specify your key and cert all the time. (Edit where necessary.)

    export JAVA_HOME=/Library/Java/Home
    export EC2_HOME=~/Downloads/ec2-api-tools-1.3-46266
    export EC2_PRIVATE_KEY=~/Downloads/pk-XXXX.pem
    export EC2_CERT=~/Downloads/cert-XXXX.pem
    export PATH=$PATH:$EC2_HOME/bin
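
    Before running any of the tools, it can save some head-scratching to check that the variables above are actually set in the current shell. This is a minimal sketch and not part of the EC2 tools; the helper name is hypothetical, and the variable names are the ones from the exports above.

```python
import os
from typing import List, Mapping

# Variables the setup steps in this post rely on.
REQUIRED_VARS = ("JAVA_HOME", "EC2_HOME", "EC2_PRIVATE_KEY", "EC2_CERT")


def missing_ec2_vars(env: Mapping[str, str]) -> List[str]:
    """Return the required variable names that are unset or empty in env."""
    return [name for name in REQUIRED_VARS if not env.get(name)]


missing = missing_ec2_vars(os.environ)
if missing:
    print("Not set:", ", ".join(missing))
else:
    print("All EC2 tool variables are set.")
```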

  • Hardening Linux and Apache Servers for DDoS

    In my earlier entry I discussed an interesting topic on firewalls and why we don’t need them. I put a small LAMP server to the test and got my results.

    Attack Information:

    • Type: TCP SYN flood
    • Max performance: 26Kpps (8Mbps)
    • Source IP Spoofing: Yes

    Victim A Specifications:

    • VMware Guest on a Single Core Opteron 1.8GHz Sun X2100
    • CentOS 4.x + Apache 2.x
    • 768MB RAM
    • Tuned (see below)

    Here’s what I’ve added to tune the Linux TCP stack in /etc/sysctl.conf:


    net.ipv4.tcp_abort_on_overflow = 1
    net.ipv4.tcp_fin_timeout = 15
    net.ipv4.tcp_low_latency = 1
    net.ipv4.tcp_syncookies = 1
    net.ipv4.tcp_max_syn_backlog = 2048
    net.ipv4.tcp_synack_retries = 3
    net.ipv4.tcp_sack = 0
    net.ipv4.ip_conntrack_max = 65535
    net.core.rmem_max = 16777216
    net.core.wmem_max = 16777216
    net.ipv4.tcp_rmem = 4096 87380 16777216
    net.ipv4.tcp_wmem = 4096 65536 16777216
    net.ipv4.ip_local_port_range = 1024 65000
    net.ipv4.tcp_keepalive_intvl = 15
    net.ipv4.tcp_keepalive_probes = 4
    net.ipv4.tcp_keepalive_time = 1800
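
    Each sysctl key corresponds to a file under /proc/sys, with the dots turned into slashes, so it is easy to check whether a setting took effect after running sysctl -p. The sketch below only parses a fragment and prints where to look; the helper names are hypothetical and the sample is a subset of the settings above.

```python
def parse_sysctl(text: str) -> dict:
    """Parse 'key = value' lines, skipping blanks and comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", ";")):
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


def proc_path(key: str) -> str:
    """Map a sysctl key to its file under /proc/sys."""
    return "/proc/sys/" + key.replace(".", "/")


sample = """
net.ipv4.tcp_syncookies = 1
net.ipv4.tcp_max_syn_backlog = 2048
net.ipv4.tcp_synack_retries = 3
"""

for key, value in parse_sysctl(sample).items():
    print(f"{proc_path(key)} should read {value}")
```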

    Here’s what I’ve added to the top of my iptables configuration in /etc/sysconfig/iptables as well:


    -N SYN
    -A SYN -m limit --limit 20/s --limit-burst 50 -j RETURN
    -A SYN -j DROP
    -A INPUT -p tcp --syn -j SYN

    * Note: During my testing, I added a log entry before dropping the packet. This floods the logs and kills the CPU and I/O, so I highly discourage doing so.
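
    To get a feel for what --limit 20/s --limit-burst 50 does to a flood, here is a simplified token bucket model. It is only a sketch of the idea: the real limit match keeps its credits in kernel time units, and the 1000 SYN/s flood below is a made-up example.

```python
def simulate_limit(arrival_times, rate=20.0, burst=50):
    """Count packets passed and dropped by a token bucket.

    The bucket starts full with `burst` tokens and refills at `rate`
    tokens per second; each passed packet costs one token.
    """
    tokens = float(burst)
    last = None
    passed = dropped = 0
    for t in arrival_times:
        if last is not None:
            tokens = min(float(burst), tokens + (t - last) * rate)
        last = t
        if tokens >= 1.0:
            tokens -= 1.0
            passed += 1   # corresponds to -j RETURN
        else:
            dropped += 1  # corresponds to -j DROP
    return passed, dropped


# Hypothetical flood: 1000 SYNs per second for 2 seconds, evenly spaced.
flood = [i / 1000.0 for i in range(2000)]
passed, dropped = simulate_limit(flood)
print(f"passed={passed} dropped={dropped}")
```

    Note that the limit match keeps one bucket for the whole rule, not one per source address, so during a flood legitimate SYNs compete with the attack for the same 20 per second.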

    I repeated the same test on another VM running on a much more powerful Dell 2850, with no modifications to the kernel or iptables.

    Victim B Specifications:

    • VMware Guest on a 2 x Dual Core Xeon 3.2GHz Dell 2850
    • CentOS 5.x + Apache 2.x
    • 256MB RAM
    • No Tuning

    Results:

    • Victim A held up under a 16Kpps SYN flood (approx 5Mbps) but slowed down a little
    • Victim A still responded under a 26Kpps SYN flood (approx 8Mbps) but was extremely slow
    • Victim B held up under a 26Kpps SYN flood (approx 8Mbps) and did not slow down at all
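
    For anyone wondering how the Kpps and Mbps figures relate, here is a quick conversion. The 40-byte packet size is an assumption (a bare 20-byte IP header plus a 20-byte TCP header, no options, no Ethernet framing); the post does not record the actual size, but it lines up with the numbers above.

```python
def mbps(pps: float, packet_bytes: int) -> float:
    """Convert a packet rate to megabits per second for a fixed packet size."""
    return pps * packet_bytes * 8 / 1_000_000


SYN_BYTES = 40  # assumption: IP header (20) + TCP header (20), no options

for rate in (16_000, 26_000):
    print(f"{rate} pps is about {mbps(rate, SYN_BYTES):.1f} Mbps")
```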

    At this point, I couldn’t generate any more SYN packets as I lacked the hardware to do so, but the results suggest fairly clearly that reasonably powerful LAMP hardware can take on modest DDoS attacks if configured correctly. I would expect bare-metal hardware with decent CPU performance to hold up against much more than what I’ve tested.

    Time to ditch that firewall!

  • Being Ignorant About DDoS and Why Firewalls Suck

    I’ve just attended a one day “seminar” with folks at Arbor Networks and it has been insightful.

    It seems people are still pretty ignorant about DDoS attacks. Unlike the 1999 CIH virus that was programmed to take out a computer by corrupting its BIOS EEPROM, most of the viruses, worms, malware and whatnot on the Internet today are around for one simple reason – money. Obviously if you’re good enough to write worms, you’d think “why write a worm for fun, when I can make money?” These worms infect computers to build Botnets, and Botnets are sold for real money on the black market to take down sites (via a DDoS), send spam, and all sorts of other things.

    There was one point in particular that caught my attention, though: firewalls (or in fact any type of inline device such as load balancers) are potential targets for DDoS attacks. To make matters worse, the higher up the OSI layers the firewall’s capabilities go, the worse it gets in terms of performance and reliability.

    Believe it or not, firewalls are vulnerable to serious security issues like buffer overflows just like any other server or appliance with an IP address. So it turns out that firewalls are the biggest marketing scam in the history of IT security because companies have spent millions and millions of dollars on stuff that doesn’t offer much more protection than, say, iptables.

    Just about a month ago, I spoke to one of our customers who experienced a DDoS attack launched towards their co-location in the USA. The DDoS traffic was approximately 500Mbps and it completely took out the firewall. This site provided online payment services to customers and was up and down for days. Their firewall was tiny in comparison to the DDoS they got – its on-paper specs state a performance of 90Mbps or 30Kpps at 2.8K sessions/sec with a max of 8K sessions at any time. Of course, these are lab specifications and real world traffic wouldn’t be as forgiving.

    A simple DDoS attack that’s merely 10Mbps in traffic volume would already generate tens of thousands of packets per second when made up of minimum-size UDP or ICMP packets with a 1-byte payload, which is in the same region as this firewall’s 30Kpps lab rating. Taking down such a firewall would be a breeze. In fact, a single modern day computer on a broadband connection could probably do the job.
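
    Here is a rough check of the packet rate for a 10Mbps flood of 1-byte-payload UDP packets. The exact figure depends on what gets counted per packet, so both sizes below are assumptions: 29 bytes counts only the IPv4 and UDP headers plus the payload, while 84 bytes is what a minimum-size Ethernet frame takes up on the wire once the preamble and inter-frame gap are included.

```python
def packets_per_second(bits_per_second: float, packet_bytes: int) -> float:
    """Packet rate that fills the given bandwidth with fixed-size packets."""
    return bits_per_second / (packet_bytes * 8)


ATTACK_BPS = 10_000_000  # the 10Mbps attack from the text
IP_LEVEL = 20 + 8 + 1    # IPv4 header + UDP header + 1-byte payload
ON_WIRE = 64 + 8 + 12    # minimum Ethernet frame + preamble + inter-frame gap

print(f"IP-level count: {packets_per_second(ATTACK_BPS, IP_LEVEL):,.0f} pps")
print(f"On-wire count:  {packets_per_second(ATTACK_BPS, ON_WIRE):,.0f} pps")
```

    Either way, the result sits uncomfortably close to the 30Kpps on the firewall’s spec sheet, and those are lab numbers.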

    If it was a TCP SYN flood, it would have been way easier. Sending 2K TCP SYN packets per second is child’s play, so filling the firewall’s state table really takes no more than 10 seconds.
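
    The state table arithmetic is simple enough to write down. This is a best-case sketch for the attacker that assumes no half-open entries expire while the table is filling; the 8K and 2K figures are the ones from the text.

```python
def seconds_to_fill(table_size: int, syn_per_second: float) -> float:
    """Time to occupy every session slot, assuming no entries expire meanwhile."""
    return table_size / syn_per_second


print(seconds_to_fill(8_000, 2_000))  # 4.0
```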

    I had a chat with my wife who audits financial institutions (FIs) based on the PCI-DSS standard. Most FIs providing payment card services have to conform to this standard. The standard, however, mandates a firewall. Unfortunately, most FIs have a pretty average Internet connectivity pipe – somewhere in the range of 20Mbps to 100Mbps. They scale their firewalls to their connectivity, so what they have, well, closely resembles the one I described earlier.

    So why were firewalls invented?

    Early operating systems didn’t provide packet filtering capabilities, so the early firewalls were really just stateless packet filters that basically routed (not NAT’ed) traffic and dropped unwanted requests to services that weren’t supposed to be public, based on simple IP, protocol and port numbers. Then the idea of NAT came about (remember the days of WinRoute?) to allow multiple computers on a LAN to share a single IP address on a WAN link. Some smart guy then figured, “oh well, let’s put servers on a private subnet and use the NAT technology to map public and private address spaces. This way, we’re safer!” Arguably, that was the dumbest idea ever and is a PITA to manage, but millions of servers are configured this way today. Over time, these features were slowly incorporated into the all-in-one junkbox we now call the Firewall. Sweet.

    Personally, I don’t have a firewall sitting in front of my servers. All my servers are individually configured to run iptables (or ipfilter on Solaris, etc.). I am going to test the Linux TCP stack with Apache from a default CentOS install to see how much SYN flood it can hold up before giving up and maybe post some results here, including what I tweaked in the kernel.