This year, the PushDO malware made a powerful comeback. Our team has been analyzing and monitoring this malware through a sinkhole system.
The common way for a bot to contact its Command & Control server is to use a static list of preconfigured domain names that are hardcoded in the bot’s binary and correspond to the locations of the Command & Control servers. Over time, however, cybercriminals came up with a new idea: embedding an algorithm, called a Domain Generation Algorithm, into the malware samples to generate domain names dynamically. The generated domain names must change over time. While the botmaster may choose which of the generated domain names to register, the bots must try all of them until a valid Command & Control server is found.
In short, a DGA (Domain Generation Algorithm) is an algorithm that generates pseudo-random domain names in order to obtain a list of candidate domain names for Command & Control servers. Using a DGA provides several advantages, such as overcoming domain blacklisting, resisting domain takedowns by simply registering another domain generated by the same DGA, and hindering dynamic analysis and the extraction of C&C domain names. Because of these advantages, DGAs became popular among cybercriminals and many malware families started to use them, including Conficker, Bobax, Sinowal, Kraken, Srizbi, Zeus, BankPatch, Bonnana, TDL, Bamital, and Flashback.
However, DGAs also opened an opportunity for malware researchers: it became possible to sinkhole a botnet by simply purchasing a domain name generated by a DGA. In the past, sinkholing was also possible by working with ISPs, registrars, and DNS/DynDNS providers, which could redirect requests for malicious domain names to the servers of AV vendors or malware researchers. This second option is less efficient than the first because the time needed to prove that the domain names are used for malicious purposes might be long enough for the botmasters to change their Command & Control servers.
PushDO makes use of both techniques mentioned above. It first tries to contact a preconfigured domain name, which is hardcoded in its binary; only if it fails to establish communication with that domain does it fall back to its Domain Generation Algorithm.
It was also interesting to observe that some PushDO versions hide their malicious hardcoded domain name among a long list of clean hardcoded domain names. Besides protecting the malicious domain name against static analysis, this also prevents it from being observed through dynamic analysis. The first goal is achieved by computing a hash function over each hardcoded domain name until one matches a hardcoded hash value corresponding to the malicious domain name. The second goal is achieved by generating a lot of web traffic to all the clean domain names, so that the malicious one passes unnoticed when web packets are captured and analyzed. Here are some examples of C&C domain names found among the hardcoded clean domain names: ane***s.com, re***al.com, by***ty.com.
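The hash-matching lookup can be illustrated with a minimal sketch. It assumes CRC32 as a stand-in for the hash (the function PushDO actually uses is not described here), and the domain names and the helper name find_hidden_domain are hypothetical.

```python
import zlib


def find_hidden_domain(domains, target_hash):
    """Return the first domain whose hash equals target_hash, or None."""
    for name in domains:
        # CRC32 is only a stand-in; the real sample uses its own hash function.
        if zlib.crc32(name.encode("ascii")) == target_hash:
            return name
    return None


# Hypothetical data: invented names, with one "hidden" entry among decoys.
DOMAINS = ["clean-one.example", "hidden-cc.example", "clean-two.example"]
# Only this value would be stored in the binary, never the name itself.
TARGET_HASH = zlib.crc32(b"hidden-cc.example")

hidden = find_hidden_domain(DOMAINS, TARGET_HASH)
```

Because only the hash is stored, a plain string search over the binary cannot tell the malicious entry from the clean ones.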
In general, DGAs rely on one of two techniques to ensure that the generated domains change over time: a time-based seed for the generation of the domain names, or a non-deterministic algorithm. PushDO uses a time-dependent, deterministic DGA. It generates 30 different domain names per day.
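To make the idea of a time-dependent, deterministic DGA concrete, here is a minimal sketch. It is not PushDO’s algorithm: the SHA-256 seeding, the 10-letter labels, the ".example" suffix and the name generate_domains are all assumptions chosen for illustration. Only the count of 30 names per day comes from the analysis.

```python
import hashlib
from datetime import date

DOMAINS_PER_DAY = 30  # value taken from the analysis


def generate_domains(day, count=DOMAINS_PER_DAY):
    """Deterministically derive `count` domain names from a calendar date."""
    domains = []
    for index in range(count):
        # The date is the only seed, so every bot computes the same list.
        seed = f"{day.isoformat()}:{index}".encode("ascii")
        digest = hashlib.sha256(seed).digest()
        # Hypothetical mapping: first 10 digest bytes -> lowercase letters.
        label = "".join(chr(ord("a") + b % 26) for b in digest[:10])
        domains.append(label + ".example")
    return domains


todays_domains = generate_domains(date(2013, 5, 1))
```

Since the output depends only on the date, anyone who recovers the algorithm, whether botmaster or researcher, can compute the list for any day in advance.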
To find a valid Command and Control (C&C) server, the bot first tries to connect to the 30 domain names generated for the current day. The domain names for a specific day are tried in the exact order in which they are generated. If it cannot find a valid Command and Control server among the current day’s domain names, it searches through the previous 30 days (starting with the previous day and moving one day at a time into the past) and then through the next 15 days (starting with the next day and moving one day at a time into the future). It stops either when a valid Command and Control server is found or when all (1+30+15) * 30 = 1380 domain names have been tried unsuccessfully. In the second case, it waits for 2 minutes before trying the same 1380 domain names considered active for the current day again.
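The search order described above can be sketched as follows. The sketch only enumerates (day, index) slots and leaves domain generation aside; it assumes that all past days are exhausted before any future day is tried, and the function names are hypothetical.

```python
from datetime import date, timedelta

DOMAINS_PER_DAY = 30
DAYS_BACK = 30
DAYS_AHEAD = 15


def candidate_days(today):
    """Yield days in search order: today, then past days, then future days."""
    yield today
    for offset in range(1, DAYS_BACK + 1):
        yield today - timedelta(days=offset)
    for offset in range(1, DAYS_AHEAD + 1):
        yield today + timedelta(days=offset)


def candidate_slots(today):
    """Yield every (day, index) pair the bot would try, in order."""
    for day in candidate_days(today):
        for index in range(DOMAINS_PER_DAY):
            yield day, index


slots = list(candidate_slots(date(2013, 5, 1)))
total = len(slots)  # (1 + 30 + 15) * 30
```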
To make analysis harder, the symmetric key used to protect the communication between the C&C and its bots is encrypted with RSA. The PushDO bot contains its own private key and the server’s public key, both hardcoded into its binary. The public key is used to encrypt the data sent to the server, and the private key is used to decrypt the response received from the server.
If a valid package is sent to the server, a valid response will be received. If the package doesn’t have the correct structure, a response with size 0 will be returned by the server.
Building the package
In the analyzed samples, we found several variants of the information and structures sent to a valid C&C. First, the malware gathers some information from its own binary file and from the victim’s computer and assembles it into a buffer. This buffer has 3 parts.
The first part consists of 9 dwords. The values of some of these dwords are not checked, but their presence is mandatory. The first and the fourth dwords are hardcoded to 1, and any change results in an invalid answer from the server. The second dword can vary: for the first package type it can be 3, 4, 5, 6 or 7, and for the second type it is a value greater than 8. The rest of the dwords can be replaced with any random number.
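As an illustration of this first part, the sketch below packs 9 dwords with the constraints listed above. Little-endian byte order is an assumption (the analysis does not state it), and the name build_dword_header is hypothetical.

```python
import random
import struct


def build_dword_header(type_field):
    """Pack the 9 leading dwords; unchecked fields are filled randomly."""
    if not (3 <= type_field <= 7 or type_field > 8):
        raise ValueError("second dword must be 3..7 or greater than 8")
    dwords = [random.getrandbits(32) for _ in range(9)]
    dwords[0] = 1           # first dword: fixed value checked by the server
    dwords[1] = type_field  # second dword: depends on the package type
    dwords[3] = 1           # fourth dword: fixed value checked by the server
    # "<9I" = nine unsigned 32-bit integers, little-endian (assumed).
    return struct.pack("<9I", *dwords)


header = build_dword_header(5)
```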
The second part of the buffer contains some strings. The structure is fixed:
The first item in this second part differentiates between the two types of packages we found. This item is the matched domain from the hardcoded list. In the first case the domain is stored in a fixed 40-byte field; in the second case it follows the structure above.
The next strings with the above mentioned structures are:
Current process name
3 hardcoded dwords or a hardcoded RCLSID
A hardcoded RCLSID or operating system information
The interesting thing about this second part is that only its structure is verified. If we replace the strings and dwords with something else, we still get a valid answer from the server.
The third part consists of a random number of random bytes (at most 0x50). This limit also does not appear to be checked.
For each package sent to the server, the bot generates a new RC4 key and encrypts the buffer built above with it. This key is then encrypted with the RSA algorithm using the public key of the server.
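For reference, a minimal pure-Python sketch of standard RC4 with a fresh key per package is shown below. The 16-byte key length and the sample buffer are assumptions, since the analysis does not state them, and the RSA step that wraps the key is omitted.

```python
import os


def rc4(key, data):
    """Standard RC4; the same call both encrypts and decrypts."""
    # Key-scheduling algorithm (KSA).
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) % 256
        s[i], s[j] = s[j], s[i]
    # Pseudo-random generation algorithm (PRGA), XORed with the data.
    i = j = 0
    out = bytearray()
    for byte in data:
        i = (i + 1) % 256
        j = (j + s[i]) % 256
        s[i], s[j] = s[j], s[i]
        out.append(byte ^ s[(s[i] + s[j]) % 256])
    return bytes(out)


session_key = os.urandom(16)           # new key per package (length assumed)
buffer = b"example buffer contents"    # placeholder for the 3-part buffer
encrypted = rc4(session_key, buffer)
```

Because RC4 is a stream cipher, applying rc4 again with the same key recovers the original buffer.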
The encrypted buffer is encapsulated in the package sent to the server, which has the following structure.
The structure of the packages sent to the C&C servers
The marker is a random number with a specific property: (N-2) % 7883 = 0. If this condition is not met, the server does not answer with a valid response. The seed for the XOR encryption is a random dword number.
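The marker property is easy to express in code. The sketch below assumes the marker is an unsigned 32-bit value, which the analysis does not state explicitly; the function names are hypothetical.

```python
import random

MODULUS = 7883


def make_marker():
    """Return a random 32-bit value N such that (N - 2) % 7883 == 0."""
    # Largest k for which k * MODULUS + 2 still fits in 32 bits (assumed width).
    max_k = (0xFFFFFFFF - 2) // MODULUS
    return random.randrange(0, max_k + 1) * MODULUS + 2


def is_valid_marker(n):
    """Check the property the server appears to verify."""
    return (n - 2) % MODULUS == 0


marker = make_marker()
```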
The server response
The responses from the server are HTML templates, most likely stolen from legitimate websites. The malicious payload is inserted as a Base64 string at a random offset.
The malicious part can be identified by two markers:
False JFIF header.
Whether or not this header is present, we can find the validity marker, the XOR seed, the length of the encrypted key, the encrypted key itself and a buffer encrypted with the RC4 algorithm. The encrypted RC4 key and its length are themselves encrypted with the XOR encryption function. After recovering the RC4 key with the private key and decrypting the buffer, the resulting data contains a varying number of gzipped malicious files.
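As a rough sketch of the final unpacking step, the code below splits a decrypted buffer into its gzip members. It assumes the files are stored as back-to-back gzip streams with nothing in between; the actual layout of the decrypted data is not described here and may differ. The name split_gzip_members is hypothetical.

```python
import gzip
import zlib


def split_gzip_members(blob):
    """Decompress consecutive gzip members and return them as a list."""
    files = []
    while blob:
        # MAX_WBITS | 16 tells zlib to expect a gzip header and trailer.
        decompressor = zlib.decompressobj(zlib.MAX_WBITS | 16)
        files.append(decompressor.decompress(blob) + decompressor.flush())
        # Whatever follows the finished member is the next candidate.
        blob = decompressor.unused_data
    return files


# Harmless stand-in data for illustration.
sample = gzip.compress(b"first file") + gzip.compress(b"second file")
members = split_gzip_members(sample)
```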
During the time we monitored the C&C servers, we obtained approximately 1000 different malicious files. One response can contain up to 7 or 8 malicious files. We were served four types of malware: a spam component (Cutwail, ~900 files), an update component (PushDO family, ~10 files), a DoS trojan component (~40 files), and a password stealer component (Fareit family, ~5 files).
Once downloaded and decrypted, these malicious files are executed, infecting the victim’s computer with different payloads and possibly preparing the ground for further malicious activity.