Malware at Midyear: a Summary
July 7th, 2010. Posted by Francois Paget
Now that we’ve reached the middle of the year, it’s time to take a look at our malware collection. During the first half of the year, 10 million samples entered our database. That’s certainly no decrease compared with last year.

With approximately 54,800 new samples arriving per day, the total size of our collection is almost 12 terabytes. At the end of 2007, in contrast, with only 5.8 million samples, the total size was just 1.1TB.
In June 2008, I posted a blog entry titled “And I say we are detecting between 400,000 and 10,000,000 malware!” Two years later, I think it is worth reviewing what has changed since then.
First, let’s look at the main malware families: at left, my 2008 graph; at right, the current one.

From these we can see that malware developers have lost their creative spirit. Malware designers create their apps to make money, not for style. Because the old techniques still work, it is not necessary to be inventive, just repetitive. For example, it is not rare to see more than 10,000 Koobface variants in a single month.
My next 2008 graph compared our figures with those of AV-test.org. At that time, I had only a short span of four months to work with, shown at left.

Two years later, we can observe another increase. Since June 2009, our collection has outgrown the AV-test database by two million. That is when we started getting more samples from more sources.
To conclude my 2008 blog, I joked that by the end of 2008 we would be detecting somewhere between 450,000 malware (a figure related to the signature DAT readme file) and 22,000,000 malware (the count of malware samples in our collections). Between these extremes I also introduced another malware measurement scale that some anti-virus vendors use for comparison: the number of main variants.

Today when we quantify the malware world, the consensus is to use the number of unique files in our collections distinguished by their MD5 hash (or checksum). On June 30, we counted 43,337,677 unique binary files. Perhaps we’ll reach 54 million by the end of December.
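To make the counting method concrete, here is a minimal sketch of how unique files can be tallied by MD5 hash. It is only an illustration, not a description of our actual tooling: the function names md5_of_file and count_unique_samples and the idea of walking a single directory tree are assumptions made for the example.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def count_unique_samples(root):
    """Count distinct files under root (hypothetical sample directory) by MD5."""
    seen = set()
    for path in Path(root).rglob("*"):
        if path.is_file():
            # Byte-identical files share a digest and are counted once.
            seen.add(md5_of_file(path))
    return len(seen)
```

Two files with identical bytes count once regardless of their names, while changing a single byte yields a new hash and therefore a new "unique" sample. This is one reason why counts based on hashes run so much higher than counts of families or main variants.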