Wednesday, June 22, 2016

From ROP to LOP bypassing Control Flow Enforcement

Once upon a time breaking the stack (here) was a matter of indexes and executable memory areas (here). Then came DEP (here), a protection which prevents a particular memory area from being executable. This is the fantastic story of ROP (Return Oriented Programming), on which I have been working for a long time, writing exploits and re-writing "resurrectors" (software engines able to convert old exploits into brand new ROP-enabled exploits); please take a look: here, here, here, here, here and here. Now it's time for a new kind of stack protection named Control-Flow Enforcement (CFE), designed by Intel. CFE aims to prevent control-flow hijacking by using a "canary" stack... oops, this was the old way to call it... right, let me repeat the sentence... by using a "shadow stack", which compares return addresses, and "Indirect Branch Tracking", which tracks every valid indirect call/jmp target in the program.

Well, I made a joke by mentioning the ancient canary word, which might remind you how useless it was to add a canary control value (4 bytes, actually) to protect the entire stack, but this time things are structurally different. We are not facing a canary that the user could adjust with "store commands" such as MOV, PUSH, POP and XSAVE; the shadow stack is a user/kernel memory space used exclusively by "control flow commands" such as CALL and RET (near and far), etc.



When shadow stacks are enabled, the CALL instruction pushes the return address on both the data and shadow stack. The RET instruction pops the return address from both stacks and compares them. If the return addresses from the two stacks do not match, the processor signals a control protection exception (#CP). Note that the shadow stack only holds the return addresses and not parameters passed to the call instruction. To provide this protection the page table protections are extended to support an additional attribute for pages to mark them as “Shadow Stack” pages.  (Figure1 from here)
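The CALL/RET behaviour described above can be illustrated with a toy model. This is only a sketch in Python 3: the `ToyCpu` class and the `ControlProtectionError` exception are hypothetical names made up for the illustration, and the real mechanism lives in hardware, not in software.

```python
class ControlProtectionError(Exception):
    """Stands in for the #CP exception signalled by the processor."""


class ToyCpu:
    def __init__(self):
        self.data_stack = []    # reachable by ordinary stores (MOV, PUSH, ...)
        self.shadow_stack = []  # touched only by call() and ret()

    def call(self, return_address):
        # CALL pushes the return address on both stacks.
        self.data_stack.append(return_address)
        self.shadow_stack.append(return_address)

    def ret(self):
        # RET pops from both stacks and compares the two values.
        from_data = self.data_stack.pop()
        from_shadow = self.shadow_stack.pop()
        if from_data != from_shadow:
            raise ControlProtectionError(
                "return address mismatch: %#x != %#x" % (from_data, from_shadow))
        return from_data


cpu = ToyCpu()
cpu.call(0x401000)
cpu.data_stack[-1] = 0xdeadbeef  # simulated overflow overwriting the return address
try:
    cpu.ret()
except ControlProtectionError as exc:
    print("blocked:", exc)
```

The overwrite only reaches the data stack, so the comparison in `ret()` fails, which is the point of keeping the second copy out of reach of store instructions.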
Just to make things a little harder (but it is going to be very useful to introduce a way to bypass the shadow stack), let me introduce a more comprehensive stack defence framework, defined by Abadi et al. and called the Control-Flow Integrity framework. Below I borrow the classification described by Bingchen Lan et al. in their paper (available here), which reports 4 kinds of Control-Flow Integrity (CFI) policies:
  • CFI-call. The target address of an indirect call has to point to the beginning of a function. For instance, indirect call is constrained to the limited addresses, which are specified through statically scanning the binary for function entries.
  • CFI-jump. The target address of an indirect jump should be either the beginning of another function or inside the function where this jump instruction lies. For instance, Branch Regulation prevents jumps across function boundaries to stop attackers from modifying the addresses of indirect jumps.
  • CFI-ret. In coarse-grained CFI, the target address of a ret instruction should point to the location right after any call site. Shadow stack further enhances this constraint, i.e. the ret instruction accurately corresponds to the location after the legitimate call site in its caller.
  • CFI-heuristics. Apart from enforcing specific policies on indirect branches as CFI-call, CFI-jump and CFI-ret do, some CFI solutions tend to detect attacks by validating the number of consecutive sequences of small gadgets.
During the past few years many attack mechanisms bypassed the CFI policies; let me sum them up in the following table.

Figure 2. Comparing attack strategies: the green "check" means the technique can bypass the defence policy, the red "x" means it cannot.

Let's assume we are able to implement the CFI-ret and CFI-jump (or CFI-heuristics) techniques in a single system. We might apparently guarantee Control Flow Integrity! Well, it was "kind of true" until Bingchen Lan, Yan Li, Hao Sun, Chao Su, Yao Liu and Qingkai Zeng introduced, in a well done paper (here), the LOP (Loop Oriented Programming) technique. The main idea is to choose entire functions as gadgets instead of using short code fragments or unaligned instructions. In this way the call instruction targets the beginning of a function, bypassing the CFI-call policy. Moreover, CFI-heuristics expects the execution flow of a victim application to consist of multiple short code fragments, as in ROP and JOP. Since no short code is involved in LOP, and it is possible to select long gadgets with many instructions in them, LOP can also bypass CFI-heuristics. The process of chaining gadgets exactly follows the normal caller-callee (call-ret pairing) paradigm. The loop gadget acts as a proxy (dispatcher), repeatedly invoking different functional gadgets which eventually return to their original caller, bypassing the CFI-ret policy. Meanwhile there is only one jump instruction used by LOP. This jump instruction originally implements the loop functionality and it is left untouched by LOP. Hence, CFI-jump is also ineffective against LOP. The following picture shows the difference between CROP and LOP.
Figure 3. CROP VS LOP (from here)


It is now interesting to define what a loop gadget looks like. So, let's define a loop gadget as a complete working function having 3 key elements:

  1. A loop statement
  2. An indirect call instruction within the loop
  3. An index instruction within the loop statement.
The following example is taken from initterm() in msvcrt.dll, a Microsoft Windows dynamic library.

Figure 4: Example of LOP gadget


The attacker first sets up the starting address and the ending address of a dispatch "table", then hijacks the control flow to the loop gadget. The loop gadget makes the index pointer point to the start address of the dispatch table. It takes the next gadget address and uses an indirect call to invoke the addressed functional gadget. Right after the call, execution returns to the instruction located just after the indirect call in the loop, through a legal ret instruction. The loop gadget then updates the index so that it addresses the next gadget, and finally compares the index value with the "end address".
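To make the dispatching loop concrete, here is a small Python 3 model of an initterm()-style loop gadget. It is only a sketch: `loop_gadget`, `dispatch_table` and the lambda "gadgets" are hypothetical names for the illustration, and skipping empty entries is an assumption about how such a loop treats null slots.

```python
def loop_gadget(table, start, end):
    """Toy model of an initterm()-style loop: call every entry in [start, end)."""
    index = start                  # index pointer set to the start address
    while index < end:             # loop statement, compared against the end address
        target = table[index]      # fetch the next gadget address
        if target is not None:     # assumption: empty slots are skipped
            target()               # indirect call; the callee returns right here
        index += 1                 # index instruction: move to the next entry


trace = []
dispatch_table = [
    lambda: trace.append("gadget A"),
    None,
    lambda: trace.append("gadget B"),
    lambda: trace.append("gadget C"),
]
loop_gadget(dispatch_table, 0, len(dispatch_table))
print(trace)
```

Every functional gadget is entered at its beginning through a call and leaves through a matching return, which is why the call/ret pairing checked by a shadow stack stays intact.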

Figure 5. Comparing attack strategies: the green "check" means the technique can bypass the defence policy, the red "x" means it cannot.

We can now add an additional row to the attack comparison table, as shown in Figure 5, introducing LOP as the ultimate way to bypass Control Flow Integrity techniques. Happy hunting!

Sunday, May 29, 2016

Process Hollowing

Back in 2011 blogs (here, here, here) and papers (here, here, here, here) described a widely used malware technique to hide malicious actions, called Process Hollowing. Nowadays we are experiencing some "flashbacks" to this delightful technique, so I decided to write a little bit about it, just in case someone needs a "refresh".

Process hollowing is a technique used by some malware in which a legitimate process is loaded on the system solely to act as a container for hostile code. At launch, the legitimate code is deallocated and replaced with malicious code.
Process Hollowing (from here)
The beauty of this technique is that it helps a malicious process hide among conventional processes. But let's walk through the technique:

Step1.
The malware starts a legitimate process by calling CreateProcess with the CREATE_SUSPENDED flag set in fdwCreate.


// This function is used to run a new program. It creates a new process
// and its primary thread. The new process runs the specified executable
// file.
BOOL CreateProcess(
LPCWSTR pszImageName,
LPCWSTR pszCmdLine,
LPSECURITY_ATTRIBUTES psaProcess,
LPSECURITY_ATTRIBUTES psaThread,
BOOL fInheritHandles,
DWORD fdwCreate,
LPVOID pvEnvironment,
LPWSTR pszCurDir,
LPSTARTUPINFOW psiStartInfo,
LPPROCESS_INFORMATION pProcInfo
);
// fdwCreate
// [in] Specifies additional flags that control the priority
// and the creation of the process.

//
// CREATE_SUSPENDED fdwCreate flag
// The primary thread of the new process is created in a suspended state,
// and does not run until the ResumeThread function is called.
Step2.
The process has been created and it's in suspended state. Now it's time to hollow the legitimate code from memory in the hosted process. We might use the following API (ZwUnmapViewOfSection).

// NtUnmapViewOfSection and ZwUnmapViewOfSection are two versions of
// the same Windows Native System Services routine.


// The ZwUnmapViewOfSection routine unmaps a view of a section from
// the virtual address space of a subject process.

// a view can be a whole or partial mapping of a section object in 
// the virtual address space of a process.


NTSTATUS ZwUnmapViewOfSection(
__in HANDLE ProcessHandle,
__in_opt PVOID BaseAddress
);

Step3.
The Malware then allocates  memory for the new code by classically using VirtualAllocEx. The Malware should ensure the code is marked as writable and executable (by using flProtect).



// Reserves or commits a region of memory within the virtual address 
// space of a specified process.


LPVOID WINAPI VirtualAllocEx(
__in HANDLE hProcess,
__in_opt LPVOID lpAddress,
__in SIZE_T dwSize,
__in DWORD flAllocationType,
__in DWORD flProtect
);

// Memory Protection Constant PAGE_EXECUTE_READWRITE = 0x40
// Enables execute, read-only, or read/write access to the committed 
// region of pages.

Step4.
Now it's time to write the malicious code into the hollow host process using the romantic WriteProcessMemory.


// Writes data to an area of memory in a specified process. The entire 
// area to be written to must be accessible or the operation fails.


BOOL WriteProcessMemory(
HANDLE hProcess,
LPVOID lpBaseAddress,
LPVOID lpBuffer,
DWORD nSize,
LPDWORD lpNumberOfBytesWritten
);


Step5.
In order to camouflage the malware, the author should restore the normal page protection scheme, setting Read/Execute protections like any other normal process, by using VirtualProtectEx.
// Changes the protection on a region of committed pages in the virtual 
// address space of a specified process.

BOOL WINAPI VirtualProtectEx(
__in HANDLE hProcess,
__in LPVOID lpAddress,
__in SIZE_T dwSize,
__in DWORD flNewProtect,
__out PDWORD lpflOldProtect
);

It should also set the remote context to point to the new code section. The SetThreadContext API is used for this purpose.

// Sets the context for the specified thread.
BOOL WINAPI SetThreadContext(
__in HANDLE hThread,
__in const CONTEXT *lpContext
);


Step6.
It's time to resume the suspended thread (ResumeThread) and "game over" !

// Decrements a thread's suspend count. When the suspend count is 
// decremented to zero, the execution of the thread is resumed.

DWORD WINAPI ResumeThread(
__in HANDLE hThread
);

We've just fired up a brand new (and potentially malicious) process!

Focusing on detection, it is going to be hard with static signatures (such as the romantic AntiVirus signatures), but when it is possible to dynamically analyse system calls (as in a sandboxed environment) the detection rate increases drastically.
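As an illustration of the dynamic approach, the sketch below (Python 3) checks whether the API calls from Step1 to Step6 show up, in order, inside a recorded call trace. It assumes the sandbox exports a plain list of already normalised API names; `HOLLOWING_SEQUENCE` and `contains_in_order` are hypothetical names for this example, and a real detector would need far more context (handles, target process, arguments) to avoid false positives.

```python
# Calls from Step1..Step6; VirtualProtectEx is left out since it is a
# camouflage step and not strictly required by the technique.
HOLLOWING_SEQUENCE = [
    "CreateProcess",
    "ZwUnmapViewOfSection",
    "VirtualAllocEx",
    "WriteProcessMemory",
    "SetThreadContext",
    "ResumeThread",
]


def contains_in_order(trace, pattern):
    """Return True if every name in pattern appears in trace, in that order."""
    remaining = iter(trace)
    # any() consumes the iterator up to the match, so the next search
    # starts right after the previously matched call.
    return all(any(call == wanted for call in remaining) for wanted in pattern)


suspicious = [
    "LoadLibrary", "CreateProcess", "ZwUnmapViewOfSection", "VirtualAllocEx",
    "WriteProcessMemory", "VirtualProtectEx", "SetThreadContext", "ResumeThread",
]
benign = ["CreateProcess", "WriteProcessMemory", "ResumeThread"]

print(contains_in_order(suspicious, HOLLOWING_SEQUENCE))
print(contains_in_order(benign, HOLLOWING_SEQUENCE))
```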

Monday, May 16, 2016

Notorious Hacking Groups in mid 2016

From time to time people ask me which are the most "notorious hacking groups". In February 2015 I wrote a little bit about the most notorious groups of 2015 (here), but today things have changed a little bit. It is hard to answer such a question since we need a strong definition of "notorious": do we mean the best known groups ever? Or the most successful ones? Or, again, the ones who attack a few big organisations, or the ones who successfully attack millions of user PCs? OK, we might go on forever, so I'll give my personal point of view (which is debatable) based on my findings and on my daily activities.

The following list is not complete at all and it never will be, but if you want to start from scratch looking for "notorious" groups, here is a nice start:

Pawn Storm (Operation PawnStorm) is for sure one of the most interesting hacking groups we can observe nowadays.
It is an active economic and political cyber espionage operation that targets a wide range of high-profile entities, from government institutions to media personalities. Its activities were first seen as far back as 2004, but recent developments have revealed more concrete details about the operation itself, including its origins and targets.
Regin. I wrote about Regin (here) and at that time it was mainly considered an attack. Nowadays, after several observed attacks, we think it is rather a group of people who built sophisticated attack tools.
Regin, first identified in 2008, is a highly complex threat used by the APT group for large-scale data collection and intelligence-gathering campaigns. The development and operation of this threat would have required a significant investment of time and resources. Threats of this nature are rare and the discovery of Regin serves to highlight how significant investments continue to be made into the development of tools for use in intelligence-gathering. Many components of the Regin tools remain undiscovered, and additional functionality and versions may exist.
Emissary Panda. Discovered in 2015 but active since 2013, Emissary Panda is a Chinese hacking group targeting US military and defense infrastructures as well as critical infrastructures in the USA. The attackers used contractors' managers and directors to exfiltrate classified information from secret projects.

Potao Group. The group behind the well-known "Operation Potao Express" (here). The group mostly targets governments and news agencies in Russia, Belarus and Ukraine. The attacks were even used to spy on members of MMM, a Ponzi scheme company popular in Russia.
The attacks conducted using the Win32/Potao malware family span the past 5 years, the first detections dating back to 2011. The attackers are, however, still very active, with the most recent infiltration attempts detected by ESET in July 2015.
Waterbug. Discovered and described by Symantec (here), Waterbug has been operating since 2005.
Waterbug is likely a state-sponsored group which uses an attack network (“Venom”) that consists of 84 compromised domains (websites). The watering-hole websites used by the Waterbug group are located in many different countries. The greatest number of compromised websites are found in France (19%), Germany (17%), Romania (17%), and Spain (13%).
DragonFly. Discovered and first mitigated by Symantec (here), the group mainly attacks energy suppliers:
Dragonfly, likely a group of hackers operating out of Eastern Europe since 2011, bears the hallmarks of a state-sponsored operation. Analysis of the compilation timestamps on the malware used by the attackers indicate that the group mostly worked between Monday and Friday, with activity mainly concentrated in a nine-hour period that corresponded to a 9am to 6pm working day in the UTC +4 time zone.
Sandworm. Known for its most famous (so far) APT, called BlackEnergy (here). Operating from Russia against Ukraine during the political conflict, Sandworm is a skilled group specialised in sandbox evasion tricks and document (OLE) worms.

GovRat. The group behind several attacks on governments, discovered and mitigated by InfoArmor (here).
Several English-speaking developers began creating custom malware and using it as a group in 2015. GovRAT is the name they gave the malware – which is used primarily for cyber espionage, and is also the code name of the group, the hackers using it for infections. 

Besides these groups, plenty of famous groups, small and big, are out there; some of them are notorious as well while others are still hidden, so please consider this list incomplete and based on personal experience rather than on a scientific review process.

Wednesday, April 20, 2016

Looking For Caves in Windows Executables

Most of my readers know exactly what code caves are, while many other people out there (maybe occasional readers) could wonder why I am writing about code caves in 2016, since it is a well known technique (published in 2006) to inject a malicious payload inside Windows Portable Executables. Well, today I want to disclose a super simple Python script that I used to calculate the sizes of the caves (runs of \x00 bytes) in Windows executables. Code caves are places where attackers could inject shellcode and execute it, deflecting the normal program behaviour. Moreover I would like to discuss a little bit the average size of the free spots available in some of the best known executables shipped with the Windows OS.

Two bits on CodeCaves (just to revise it)
A codecave can best be defined as "a redirection of program execution to another location and then returning back to the area where program execution had previously left." In a sense, a codecave is no different in concept than a function call, except for a few minor differences. If a codecave and a function call are so similar, why do we need codecaves at all then? The reason we need codecaves is because source code is rarely available to modify any given program. As a result, we have to physically (or virtually) modify the executable at an assembly level to make changes. At this point a few alarms and whistles may be going off for a few readers. What legitimate reason would we ever have to do so, modify an existing program for which no source is available? Consider the following hypothetical, but not too farfetched, scenario: A company that has been using the same software system they developed for the past 10 years. The software system they are using has served them well, but it is time to upgrade it to reflect a mandatory change in the output data format. The only problem is the original programmers are long gone and there are no hopes of getting the original source code to update the program. Now, this company has trained it's now veteran employees and grown the past 10 years using this specific software system, so a complete rewrite would be quite disastrous to the company. Retraining all their employees to a new system and having to reprogram things differently is not only time consuming but very costly. It would take about a year to do such and this is out of the time frame that the company has. The worst part of it all is that you are the programmer that was hired to solve this issue. You could just throw up your hands and say it is not possible, but that would not do much to help your professional career. 
Instead, imagine if there was a way that you could keep using the same program, but you have an additional DLL that is used to dynamically update the output data from the company's program so it fits the new standard that is required. Best of all, it is a solution that can be implemented well before your deadline and requires minimal changes to be made to the company's existing procedures of using the program. Enter codecaves.

Find Caves

The following script (Find_PE_Caves) takes a directory as input and looks for all the PE files in it. It then takes every PE file and looks for caves of multiple sizes in it. It first searches for caves of 21 bytes (the size of the smallest shellcode available today) and then goes up to caves of 1024 bytes. It ends up writing stats files on how many caves it found in the given files.


#!/usr/bin/env python2

#========================================================================#
#               THIS IS NOT A PRODUCTION RELEASED SOFTWARE               #
#========================================================================#
# Purpose of Find_PE_Caves is to show how to measure the code caves      #
# available in Windows PE files. Please do not use it in any             #
# production environment                                                 #
# Author: Marco Ramilli                                                  #
# eMail: XXXXXXXX                                                        #
# WebSite: marcoramilli.blogspot.com                                     #
# Use it at your own risk                                                #
#========================================================================#

#==============================Disclaimer: ==============================#
#THIS SOFTWARE IS PROVIDED BY THE AUTHOR "AS IS" AND ANY EXPRESS OR      #
#IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED          #
#WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE  #
#DISCLAIMED. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT,      #
#INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES      #
#(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR      #
#SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)      #
#HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,     #
#STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING   #
#IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE      #
#POSSIBILITY OF SUCH DAMAGE.                                             #
#========================================================================#

#-------------------------------------------------------------------------
#------------------- GENERAL SECTION -------------------------------------
#-------------------------------------------------------------------------
import sys
import re
try:
    import pyprind
except ImportError:
    print 'pyprind not installed, see https://github.com/rasbt/pyprind'
    sys.exit()
try:
    import pefile
    import peutils
except ImportError:
    print 'pefile not installed, see http://code.google.com/p/pefile/'
    sys.exit()
try:
    import magic
except ImportError:
    print 'python-magic is not installed, file types will not be available'
    sys.exit()
import os
import glob

#----------------------------------------------------------------------
#----------------     Starting Coding   -------------------------------
#----------------------------------------------------------------------

def open_file(arg,mode):
    """
    Opens a file and returns its whole content.
    """
    try:
        # 'with' closes the handle even if read() fails
        with open(arg, mode) as handle:
            content = handle.read()
    except IOError as e:
        print str(e)
        sys.exit(1)
    return content


def get_executables(files):
    """
    Filters the only executable files from a files array
    """
    exec_files = []
    for file in files:
        if "executable" in magic.from_file(file):
            exec_files.append(file)
    return exec_files


def get_sections(binary_file):
    """
    Gets file sections => thanks to PE.
    Returns an multiDimensional array: [binary_file, sections_exe, sections_data]
    """
    sections_exe = []
    sections_data = []
    pe = pefile.PE(data=binary_file)
    sections = pe.sections
    for section in sections:
        # 0x20000000 IMAGE_SCN_MEM_EXECUTE
        # 0x40000000 IMAGE_SCN_MEM_READ
        # 0x00000020 IMAGE_SCN_CNT_CODE
        if all(section.Characteristics & n for n in 
            [0x20000000, 0x40000000, 0x00000020]):
            sections_exe.append(section)
        else:
            sections_data.append(section)
    return [binary_file, sections_exe, sections_data]


def get_codecaves(section,binary,size):
    """
    Looks for caves in a specific PE section of a binary file
    Returns the caves array [section, offsets]
    """
    codecaves = []
    raw_offset = section.PointerToRawData
    length = section.SizeOfRawData
    data = binary[raw_offset:raw_offset + length]
    offsets = [m.start() for m in re.finditer('\x00'*(size), data)]
    if offsets:
        codecaves.append(section)
        codecaves.append(offsets)
    return codecaves


def search_for_codecaves(sections_to_look_for, size):
    """
    Looks for caves in PE sections
    Returns codecaves array
    """
    for section in sections_to_look_for[1]:#exec_sections
        codecaves = get_codecaves(section, sections_to_look_for[0], size)
        if codecaves:
            return codecaves

    for section in sections_to_look_for[2]:
        codecaves = get_codecaves(section, sections_to_look_for[0], size)
        if codecaves:
            return codecaves


def save_files(data):
    """
    Saves a CSV file with comma separated stats
    Watch out: it creates as many files as analysed files
    """

    for d in data:
        print("[+] Saving plotting file for : %s" % (d[0]))
        fw = open(os.path.basename(d[0]) + ".csv", 'a')
        for point in d[1]:
            fw.write(str(point[0]) + "," + str(point[1]) + "\n")
        fw.close()


if __name__ == "__main__":
    #http://shell-storm.org/shellcode/files/shellcode-841.php
    shellcode_minimal_lenght = 21 
    shellcode_max_lenght = 1024
    max_progress = shellcode_max_lenght - shellcode_minimal_lenght
    stats = []

    if len(sys.argv) != 2:
        print "Usage: %s FILE_OR_DIRECTORY\n" % (sys.argv[0])
        print """
                 %s will search for caves inside FILE_OR_DIRECTORY
                 and will save in the current
                 directory files with stats
              """ % (sys.argv[0])
        sys.exit()

    object = sys.argv[1]
    files  = []

    if os.path.isdir(object):
        for root, dirs, filenames in os.walk(object):
            for name in filenames:
                files.append(os.path.join(root, name))
    elif os.path.isfile(object):
        files.append(object)
    else:
        print "You must supply a file or directory!"
        sys.exit()

    files = get_executables(files)

    print("")
    print("==========================================")
    print("==========  Doing hard work here =========")
    print("==========================================")
    print("")

    for f in files:
        print ("[+] Calculating carvings for : %s" % (f))
        bar = pyprind.ProgBar(max_progress)
        points = []
        binary_file = open_file(f,"rb")
        #[binary_file, exe_sections, data_section]
        sections_to_look_for = get_sections(binary_file)   

        for size in range(shellcode_minimal_lenght,1025):
            codecaves = search_for_codecaves(sections_to_look_for, size)
            if codecaves:
                codecaves_per_size  = [size, len(codecaves[1])]
            else:
                codecaves_per_size  = [size, 0]
            points.append(codecaves_per_size)
            bar.update()
        stats.append([f, points])

    save_files(stats)
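
The script above scans the data again for every size from 21 to 1024. As a hedged alternative, the Python 3 sketch below measures every run of zero bytes in a single pass and derives the per-size counts from those lengths. The names `cave_lengths` and `caves_per_size` are made up for this example. Note that it counts runs, while the original counts non-overlapping matches of a given size, so a single long run counts once here and possibly several times there.

```python
import re

ZERO_RUN = re.compile(rb"\x00+")


def cave_lengths(data, min_size=21):
    """Return the length of every run of zero bytes at least min_size long."""
    lengths = (m.end() - m.start() for m in ZERO_RUN.finditer(data))
    return [n for n in lengths if n >= min_size]


def caves_per_size(data, min_size=21, max_size=1024):
    """For each size, count the zero runs large enough to hold that many bytes."""
    lengths = cave_lengths(data, min_size)
    return {size: sum(1 for n in lengths if n >= size)
            for size in range(min_size, max_size + 1)}


# Synthetic buffer: runs of 30, 10 and 50 zero bytes separated by opcodes.
sample = b"\x90" * 4 + b"\x00" * 30 + b"\xcc" + b"\x00" * 10 + b"\xc3" + b"\x00" * 50
print(cave_lengths(sample))
print(caves_per_size(sample)[31])
```

To apply it to a real file, pass the raw bytes of a single section (for instance the slice computed in get_codecaves) instead of the synthetic buffer.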

Analyzing Famous Windows Executables

I have been using the aforementioned script against some of the best known portable executables shipped with Microsoft Windows, looking for ready-to-go caves in order to figure out where payloads could be hidden (only for research purposes). The analyzed files are the following ones:
For each analyzed PE a simple ASCII graph shows the size distribution. Every ASCII graph shows the cave sizes on the x-axis and the number of caves of the given size on the y-axis.

 
Defrag.exe:
A high number of small caves is observable. The number of caves decreases exponentially with size (e^-x), but there is still a big spot for shellcode larger than 1024 bytes. This file could be a great "trojan holder".





Autologon.exe:
More caves were found for many sizes. If you have multiple stage payloads, Autologon.exe is a great place to store shellcode! The holes are really big, like in Defrag.exe, but still many of them are ready to be filled.









Cacheset.exe:
Many small caves as well as big ones have been found. Ideal for injecting many fragmented multi-stage payloads.









Winobj:
Interesting patterns were found here. We observe more big caves than small ones. Winobj is ideal for hiding big payloads, even bigger than 1024 bytes.



 




psexec:
Maybe one of the executables most used by sysadmins, it is super helpful to run remote code. We observed many medium-sized caves, while no caves bigger than 7xx bytes have been found. It would be a great candidate for reverse_tcp or reverse_http payloads (size 250 to 380 bytes).










I do not have conclusions here, I admit: I was ready to bet on Microsoft. I thought Microsoft would not ship code with caves in it. But I was wrong; fortunately I do not like gambling!

Monday, March 21, 2016

Recovering Files From Brand New Crypt0l0cker

Today I want to share a quick and dirty analysis of a brand new Crypt0l0cker version made for the Italian market and spread over emails (such as: ENEL Bolletta). Unfortunately I do not have much time to invest in this analysis, but we will see how we might be able to recover most of the encrypted data.



New Crypt0l0cker Version

Sample signatures:
[*] MD5 aafc1dcd976f91b50e1f71017b8ab10f
[*] SHA-1 56e623b2d2a4abb09cfc23d754e0095f9a71a9cb
[*] SHA-256 f44310005b4d75b15df0126e954c68456e7882ee6081cfd3e39f4267f86b44d9

Two antivirus engines out of fifty-six were able to tag the new malware as a suspicious executable. Baidu defines the new Crypt0l0cker as a generic Trojan (well, actually this version does "Trojan" too, please keep reading) and Kaspersky defines it as "General Dangerous".


Antivirus Detection
Giving the new Crypt0l0cker sample (which happens to be an .EXE showing a PDF icon) to a packer signature engine, you find out two useful pieces of information: (a) the sample has likely been compiled with Visual C++ and (b) no known packers have been involved.

PEiD OEP Plugin

Let's move directly to the OEP and see what we find there! A decrypting loop with anti-debug traps is found (please see the Graph Overview in the following picture). It's time to statically read the code, patch the anti-debug trap and later on fire up the executable to decrypt the memory stub.

Decrypting loop within anti-debug traps

Patching the "isDebugPresent-return-function" and running the sample in a virtualized environment, we might observe interesting behaviours. I won't spend time describing how to reverse this sample; this time I am mostly interested in behaviour analysis. Letting the code run, you will see the malware injecting a DLL into explorer.exe, getting administrative privileges and exchanging modules and keys with its C&C over ngrok tunnels.

PERSONAL NOTE: I have seen many samples scanning for ngrok connections (mostly to find vulnerable systems), but I had never seen ransomware using ngrok to communicate with its command and control and exchange keys. Ngrok is a secure introspectable tunnel to "localhost" which exposes "localhost services" remotely. Ngrok is mostly used by developers to share preliminary results with their clients and, for that reason, it is installed on developers' machines. If the ransomware's author used ngrok from his own developer machine, we might be able to find him. But this is another "story"...

Following the malware network activity you might appreciate a brand new Domain Name Generation Algorithm based on the ngrok network:

- DNS Query(16807d6e.ngrok.com)
- DNS Query(d07a6607.ngrok.io)
- DNS Query(ofywoxonega.neokred.org)
- DNS Query(ilqde.neokred.org)

The Injected DLL communicates through the following address by exchanging encryption keys and modules: 

** OUT,TCP - HTTP,192.168.1.69,171.25.193.9:80,C:\Windows\SysWOW64\explorer.exe
** IN,TCP - HTTP,171.25.193.9:80,192.168.1.69,C:\Windows\SysWOW64\explorer.exe
** OUT,TCP - HTTPS,192.168.1.69,198.211.127.225:443,C:\Windows\SysWOW64\explorer.exe
** IN,TCP - HTTPS,198.211.127.225:443,192.168.1.69,C:\Windows\SysWOW64\explorer.exe
** DNS Query(ibog.neokred.org)
** OUT,TCP - HTTPS,192.168.1.69,86.59.21.38:443,C:\Windows\SysWOW64\explorer.exe
** IN,TCP - HTTPS,86.59.21.38:443,192.168.1.69,C:\Windows\SysWOW64\explorer.exe

PERSONAL NOTE: it would be super interesting to spend time reversing the DNGA used in this sample. If you have time to spend on this project please contact me and I'll send you the sample.
Like most malware out there, it sets itself in "autorun" to survive system reboots. It copies itself into c:/windows/ and it changes the regkey to load the saved software (itself) on every system reboot (machine\software\microsoft\Windows\CurrentVersion\Run\utivikyh). The sample disables system securities and software quality-client as well, but this is outside my investigation topic. Before sending requests to the C&C it harvests a lot of information about the victim, such as hardware components and keyboard layouts. The malware is weaponized through modules it downloads from the C&C. What has been found during this quick and dirty analysis is the following:

   * Keylogger functionality.
   * Gets system default language ID.
   * Gets input locale identifiers.
   * Gets computer name.
   * Encrypts data.
   * Decrypts data.
   * Checks for debuggers.
   * Deletes activity traces.
   * Anti-Malware Analyzer routine: Disk information query.
   * Privilege Escalation Techniques.

The malware sample implements some elementary evasion techniques, such as profiling user behaviour by acquiring the window in use (GetForegroundWindow) and applying heuristics to windows and mouse activity. It saves many (I do not have evidence that it is "all") of the ongoing heuristics (or "counters", as the developer prefers to call them) into the following file:
C:\Users\user\AppData\Local\Microsoft\Windows\Temporary Internet Files\counters.dat

The malware sample looks for email clients on the victim's machine and tries to read out email addresses. Does it maybe send the newly found email addresses to the Command and Control in order to feed a central victim list? (I have no evidence of that; further analysis is needed.)

The sample heavily uses sleep(60000) all around the code to defeat simple sandbox analyses. But let's analyse the interesting part: how this version of Crypt0l0cker encrypts files! Observing the syscall tree we see encryption parallelization! My best guess is that the malware writer uses parallelization to speed up the entire encryption process. Indeed the victim's CPU usage rises to 90% and the Crypt0l0cker process increases its number of threads and sub-processes as soon as the infection starts!

Faster encryption means a higher probability of encrypting user data before the victim stops the infection process. The more files are encrypted, the higher the probability that the user will pay to have them back!

Each process performs the following simplified encryption path:


Observed Encryption Behaviour


As a first stage, the malware reads the file into a dynamically sized input buffer. While it is encrypting the content it has read, it deletes the original file and creates a new one directly on the hard drive, which is then filled with the encrypted content. This encryption path is vulnerable to file carving, since the malware deletes fileA and creates a new fileA.encrypted which will (statistically, but not with certainty) be located in different hard drive blocks. So if you are a victim of this Crypt0l0cker version (refer to the hashes) and you do not have shadow copies (File History, for Windows 8 users) and/or backups, you might try file carving, which will statistically work :D !
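
File carving recovers deleted files by scanning raw disk bytes for known file signatures instead of relying on the filesystem. As a minimal sketch, assuming the deleted originals are JPEG images stored contiguously, the following scans a raw byte buffer for the JPEG start (FF D8 FF) and end (FF D9) markers. The name `carve_jpegs` and the size cap are illustrative assumptions; real carving tools handle fragmentation, embedded thumbnails and many more formats.

```python
JPEG_START = b"\xff\xd8\xff"  # JPEG start-of-image signature
JPEG_END = b"\xff\xd9"        # JPEG end-of-image marker


def carve_jpegs(raw, max_size=20 * 1024 * 1024):
    """Return candidate JPEG byte strings found in a raw disk dump."""
    carved = []
    pos = 0
    while True:
        start = raw.find(JPEG_START, pos)
        if start == -1:
            break
        end = raw.find(JPEG_END, start + len(JPEG_START))
        if end == -1:
            break
        end += len(JPEG_END)
        if end - start <= max_size:
            carved.append(raw[start:end])
            pos = end
        else:
            # Implausibly large candidate: retry from the next byte.
            pos = start + 1
    return carved


if __name__ == "__main__":
    dump = b"\x00" * 16 + JPEG_START + b"image-data" + JPEG_END + b"\x00" * 16
    print(len(carve_jpegs(dump)), "candidate file(s) found")
```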

Older ransomware (such as: torrentlocker, teslacrypt, bitlocker, etc.) uses a different but more effective approach. Some of them, in order to increase speed, encrypt only a specific part of the file; others overwrite the original file in place without creating a new one. In such a situation file carving is not going to work.

Summing Up:

- We obtained a brand new version of Crypt0l0cker.

- We did not have enough time to invest in reversing the DNGA (based on ngrok) and the specific functionalities this original Crypt0l0cker implements, BUT...

- We obtained two main interesting results:
  1. It is the first time we have observed malware implementing a DNGA based on private encrypted tunnels such as ngrok.
  2. This Crypt0l0cker version uses a vulnerable read-encrypt-write algorithm, which might decrease its effectiveness and leaves it vulnerable to file carving.

Conclusions:
If you have been infected by the Italian version of Crypt0l0cker, try file carving and you will probably get your data back.


IoC:
Extension:
.encrypted

DNS Queries: 
16807d6e.ngrok.com
d07a6607.ngrok.io
ofywoxonega.neokred.org
ilqde.neokred.org

Temp File:
C:\Users\user\AppData\Local\Microsoft\Windows\Temporary Internet Files\counters.dat



Tuesday, February 16, 2016

Ransomware: a general view after field experiences

Even though ransomware is not one of my favorite topics, since it is simple malware without specific targets (at least until today), I am currently observing a huge increase in this threat in companies, agencies and among private users as well. For this reason I decided to write a little about it in my personal 'CyberSecurity Timeline' (.. well... my blog :). I am not going to describe a specific kind of ransomware or show you spectacular code or reversing techniques; in this "post" I just want to wrap up many experiences on this topic and collect some more general thoughts and memories.

According to Netfort and TrendMicro, ransomware is nothing new in cyber security; indeed:
it was 1989, the year of the “AIDS” trojan, aka. “Aids Info Disk” or “PC Cyborg Trojan” which replaced the AUTOEXEC.BAT file and it would then count the number of times the machine had booted, once it reached 90 days it would then hide directories and encrypt the names of all the files on the C: drive and rendered the system to be unusable. It would then display a message to the user asking them to “renew the license” and contact PC Cyborg Corporation for payment, this involved sending $189 to a post office box in Panama!
  During the past decade two main kinds of Ransomware were observed:
  1. Locker Ransomware. The aim of these threats is to deny access to the victim's entire machine. One of the most famous examples of this category is the FBI Locker.
  2. Data Ransomware. The aim of these threats is to deny access to the victim's data. One of the most famous examples of this category is CryptoLocker.



Both threats play on the user's need to get access to something she wants (for example her PC or her data), exploiting the 'attack momentum' by asking for a small amount of money. The user might be fooled because she believes she will get the data/machine back by simply paying a few bucks, and she promises herself to be more careful in the future.

What kind of platform do they infect ?

Nowadays ransomware is known for: Microsoft Windows (regardless of version), Mac OS X, Linux (mostly Debian and RedHat based) and Android. Most of the infected systems are personal computers and mobile devices, but servers (such as FTP servers, domain controllers and HTTP servers) are affected too.

Another interesting question (at least from my personal point of view) is: how do they get into my device ?

Understanding how they propagate through machines is a fundamental step in preventing them! Unfortunately they do not have a single favorite propagation vector. From the victim's perspective I have observed many propagation vectors based on email and social engineering tricks. But many of the best-known ransomware families, such as (but not limited to) NanoLocker, Crypt0L0ker, CryptoWall and TeslaCrypt, are also spread through exploit kits (mainly via malvertising and watering holes) and downloaders.

What are the most common payment methods ?

A key point in the ransomware economy is the 'payment method'. The most widespread one I see nowadays is Bitcoin or Litecoin which, if associated with a laundry service, can guarantee a reasonable level of anonymity. A totally different topic is the decision to pay or not to pay the attacker. It is not easy for victims to decide whether or not to pay the ransom demand to get their files back. With data now being essential to many organizations, not paying the demands and losing data could have catastrophic effects, such as closing a business down. On the other hand, paying the ransom demand only encourages even more crypto ransomware campaigns. Some ransomware, such as (but not limited to) CTBLocker, offers a "try and buy" capability in order to goad the customer ... hem.. the victim.

How do the ransomware writers earn money ?

In 2009 a Symantec report found that almost 3% of victims paid the ransom demand. The report also found that one of the smaller ransomware players managed to infect 68,000 computers in just one month, which could have resulted in victims being defrauded of up to US$400,000 in total. In March 2014, Symantec found that Trojan.Cryptowall earned at least US$34,000 in its first month of operations. A further study of Cryptowall by other information security researchers found that by August 2014, Cryptowall had earned more than US$1.1 million. In June 2015, data from the FBI’s Internet Crime Complaint Center (IC3) showed that between April 2014 and June 2015, it had received 992 Cryptowall-related complaints. The victims were a mix of end users and businesses, and the resulting losses from these cases amounted to more than US$18 million.
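
As a back-of-the-envelope check on the first two figures, the snippet below works out how many victims would have paid and what average ransom that implies. It assumes (my assumption, not something the report states here) that the roughly 3% payment rate applies to the 68,000-machine campaign and that the full US$400,000 was collected.

```python
infected_machines = 68000
payment_rate_percent = 3   # "almost 3%", assumed to hold for this campaign
max_total_usd = 400000     # upper bound quoted in the report

# Integer arithmetic keeps the victim count exact.
paying_victims = infected_machines * payment_rate_percent // 100
implied_avg_ransom_usd = max_total_usd / paying_victims

print("paying victims:", paying_victims)
print("implied average ransom (USD): %.0f" % implied_avg_ransom_usd)
```

Under those assumptions roughly two thousand victims paid about two hundred dollars each, which is consistent with the "few bucks" pricing described earlier.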

While all ransomware is designed to extort money, it can differ considerably in both techniques and technologies.

What technique do ransomware use to infect the target system ?

As far as I have observed, encryption is the most used technique:
Old ransomware (such as SimpleLocker) carries a symmetric key inside its code and uses that key (typically AES256) to encrypt data. This technique makes the malware self-contained ('orthogonal'), meaning it does not need to interact with external sources to start its job, so it is "ready to encrypt" as soon as it reaches the target. On the other hand, this technique is weak once a reverse engineer gets hold of the sample: once a cyber analyst finds the encryption key, he will be able to write the right "decryptor" program, freeing the victims without paying the ransom.

Modern ransomware downloads a public asymmetric key (such as RSA) to encrypt the victim's files. Only the attacker will be able to decrypt the victim's files, thanks to the asymmetric encryption technique. On one hand this technique is much "safer" from the attacker's perspective, since he does not need to worry about key discovery; on the other hand it is slower at encrypting than the symmetric key technique. Encryption speed is a fundamental concern for ransomware writers, since a longer encryption time means a higher probability of being detected and stopped.

Current ransomware (for example CryptoDefence) implements a mixed approach, using asymmetric key techniques to exchange a symmetric key which is then used to encrypt the target data. In this way the ransomware needs an internet connection to communicate with its Command and Control system, in order to download keys and to report the 'end of encryption' once its job is done. Both the C&C and the network communication introduce two more identification factors that might be used against the ransomware writers to detect and block their malware.

Encryption strategies:

What to encrypt first is a question every ransomware writer has to face. Indeed, if the ransomware encrypts files in random order it might hit big files, which take more resources and time to encrypt. This increases the probability of being identified and blocked.
Old ransomware did not care about file size: at that time the ransomware threat was not so widespread and it could afford the risk of being identified and blocked.

More recent ransomware first sorts the target folder and starts the encryption phase from the smallest file. In this scenario the ransomware increases the probability of encrypting many more files before being identified, which proportionally increases the probability of getting cash from the victim!

Recently I observed some variants of TeslaCrypt and TorrentLocker which encrypt only the first 1024 bytes of a file and then move to the next one. This hybrid technique is used to increase the probability of encrypting files the victim will pay for, even if big files (VMs, image files, ISOs, etc.) are found on the target machine.
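
From the defender's side, a file whose first 1024 bytes have been encrypted tends to show a head that looks like random data. The following is a minimal triage sketch, assuming Shannon entropy close to 8 bits per byte is a usable signal; the threshold of 7.0 and the function names are illustrative assumptions. Compressed formats (ZIP, JPEG, ...) also have high entropy, so this is a hint, not a proof.

```python
import math
from collections import Counter


def shannon_entropy(data):
    """Shannon entropy of a byte string, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def head_looks_encrypted(data, head_size=1024, threshold=7.0):
    """True if the first head_size bytes look like random (encrypted) data."""
    head = data[:head_size]
    # Files shorter than the head size are skipped: too little data to judge.
    return len(head) == head_size and shannon_entropy(head) >= threshold


if __name__ == "__main__":
    plain = b"plain text header " * 100
    print(head_looks_encrypted(plain))
```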


How do they communicate to the victim ?

Usually ransomware implements 3 different victim communication channels:
  1. Browser channel. The ransomware replaces your browser home page and/or injects itself into the browser process and answers every internet request with its own ransom page.
  2. File channel. The ransomware writes a lot of "README" files, which happen to be the only files the victim can still read.
  3. Message box. Ransomware writers might decide to communicate the request to the victim by opening a MessageBox directly through an OS syscall.

How do they communicate to C&C to unlock files once the ransom has been paid ?

There are many different scenarios for communications back to the attacker. Some ransomware does not need to communicate with its own command and control at all: the attacker learns about the payment through a covert channel in the bitcoin blockchain. In other cases the communication might happen through plain HTTP or even through single UDP packets. Again, it varies a lot: some communication methods are more sophisticated than others, but each one works pretty well and is not complex to implement. CyberIntelligence.org released a nice tool to monitor Cryptowall, one of the most widespread ransomware families. On the web site you will find the C&C tracker, the spreading URLs and even new samples, but it is not going to be enough. Ransomware is sold as a service, like Tox and many others, and it will be super difficult to trace all of them.

I do not have conclusions on this specific topic, only a personal view of the threat. Ransomware is a mature threat (so many incarnations out there): you can easily find libraries and kits for building ransomware at an average price of 150 bucks (or even less on Dark Markets)! Usually once a technology reaches this grade of maturity it becomes "local". From my personal point of view we will see an increase in localized ransomware threats, starting from languages and ending up with the targeting of specific organizations.

Sunday, December 20, 2015

Spotting Malicious Node Relays

TOR is a well-known piece of "software" that protects communications by dispatching packets across different relays spread over the world and run by a network of volunteers. Because of its high level of anonymity, TOR has been used over the past years to cover malicious actions by physical and cyber attackers. TOR, especially through its browser implementation (the TOR Browser), is also known as one of the main (meaning most used) ways to get access to the Dark WEB, where "malicious" people buy and sell illegal stuff through dark markets. Each relay belonging to the network can decide, depending on its own configuration, whether to be an ExitPoint (in the following picture, the last machine contacting "Bob") or just a middle relay (in the following picture, a TOR node highlighted by a "green cross"). If the relay decides to be an ExitNode it will expose its own IP address to the public world; it is usually a good idea to alert the local police and your ISP about that in order to avoid penalties.


From TheTorProject.org

During the past year mass media such as television shows, radio stations, YouTube channels, Facebook groups, etc. disclosed many dark market addresses, swelling the flow of curious people to the DarkWeb and consequently exposing them to numerous new attack scenarios. Indeed, new attackers set up Exit Nodes or Relay Nodes in order to spy on and/or compromise the communication flows passing through them. The attack can happen in many ways, but the most used ones (at the time of writing) are mainly three:

  1. DNS Poisoning: This technique consists of redirecting DNS lookups for well-known web sites to crafted fake pages containing exploit kits able to compromise the user's browser.
  2. File Patching: This technique consists of altering the requested file on its way back to the requester by adding malicious content to it: this happens directly on the ExitPoint/Relay before the file is delivered to the original requester.
  3. Certificate Substitution (SSL - MITM): This technique consists of substituting the real web site certificate with a fake one in order to decrypt the communication flow and intercept credentials and parameters.

Working in CyberSecurity means being aware of such attacks and being able to decide whether or not to pass through TOR relays. Please be aware that TOR is not the only anonymous network in the DarkWeb !
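
For the certificate substitution case, one simple signal is the fingerprint of the certificate a server presents. The sketch below uses only the Python standard library and assumes both certificates are already available as PEM strings (for example from `ssl.get_server_certificate`); the helper names are hypothetical. Keep in mind that large sites may legitimately serve different certificates from different front ends, so a mismatch is a reason to look closer rather than proof of an attack.

```python
import hashlib
import ssl


def cert_fingerprint(pem_cert):
    """SHA-256 fingerprint (hex) of a PEM-encoded certificate."""
    der = ssl.PEM_cert_to_DER_cert(pem_cert)  # strip the PEM armor
    return hashlib.sha256(der).hexdigest()


def certificate_changed(baseline_pem, observed_pem):
    """True if the certificate seen now differs from the trusted baseline."""
    return cert_fingerprint(baseline_pem) != cert_fingerprint(observed_pem)


if __name__ == "__main__":
    # Placeholder bytes stand in for real DER certificates in this demo.
    trusted = ssl.DER_cert_to_PEM_cert(b"trusted-certificate-bytes")
    seen = ssl.DER_cert_to_PEM_cert(b"substituted-certificate-bytes")
    print(certificate_changed(trusted, seen))
```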

My goal was to figure out when my TOR flow was passing through malicious relays. For this reason I decided to write a little python script able to make some quick and dirty checks (DNS poisoning, file patching and SSL-MITM), which are significant indicators of the status of the TOR circuit in use. The script is 2 years old and was undisclosed until now. I decided to publish it since scientific researchers have implemented an advanced version of my FindMalExit.py. Please have a read here for the full paper on that topic.

The IDEA.
Well, actually it is a pretty simple idea: "let's grab certificates, IP addresses and files without passing through the TOR network (or passing through trusted circuits) and then replicate the process passing through all available relays. Compare the results and check if somebody is changing the 'ground'."
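
Stripped of all networking, the idea boils down to comparing two sets of observations. The sketch below is not the script's actual code, just a minimal illustration: it assumes each observation is stored under a string key (a hypothetical naming scheme such as "dns:..." or "file:...") and reports every key whose value differs from the trusted baseline.

```python
def find_mismatches(baseline, observed):
    """Return {key: (expected, seen)} for every value that differs."""
    mismatches = {}
    for key, expected in baseline.items():
        seen = observed.get(key)  # None if the relay returned nothing
        if seen != expected:
            mismatches[key] = (expected, seen)
    return mismatches


if __name__ == "__main__":
    # Illustrative values only (documentation IP range, fake hashes).
    trusted = {"dns:www.torproject.org": "192.0.2.10", "file:psexec.exe": "aa11"}
    via_relay = {"dns:www.torproject.org": "192.0.2.10", "file:psexec.exe": "bb22"}
    for key, (expected, seen) in find_mismatches(trusted, via_relay).items():
        print("ALERT %s: expected %s, got %s" % (key, expected, seen))
```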

The IMPLEMENTATION:
Below please find my poor code. Please remember it is non-production code, so do not use it in production environments! (This code is just a first release of a bigger project now maintained by Yoroi.) I decided to publish the code in HTML format (so I can easily comment on it); if you need it in a more common form, check my github repo (here).


#!/usr/bin/env python2

#========================================================================#
#               THIS IS NOT A PRODUCTION RELEASED SOFTWARE               #
#========================================================================#
# Purpose of findMaliciousRelayPoints is to prove that it is possible to #
# discover TOR malicious Relays Points. Please do not use it in          #
# any production  environment                                            #
# Author: Marco Ramilli                                                  #
# eMail: XXXXXXXX                                                        #
# WebSite: marcoramilli.blogspot.com                                     #
# Use it at your own risk                                                #
#========================================================================#

#==============================Disclaimer: ==============================#
#THIS SOFTWARE IS PROVIDED BY THE AUTHOR "AS IS" AND ANY EXPRESS OR      #
#IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED          #
#WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE  #
#DISCLAIMED. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT,      #
#INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES      #
#(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR      #
#SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)      #
#HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,     #
#STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING   #
#IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE      #
#POSSIBILITY OF SUCH DAMAGE.                                             #
#========================================================================#




#-------------------------------------------------------------------------
#------------------- GENERAL SECTION -------------------------------------
#-------------------------------------------------------------------------
import StringIO
import tempfile
import time
import hashlib
import traceback
from   geoip         import  geolite2
import stem.control

TRUSTED_HOP_FINGERPRINT = '379FB450010D17078B3766C2273303C358C3A442' 
#trusted hop
SOCKS_PORT              = 9050
CONNECTION_TIMEOUT      = 30  # timeout before we give up on a circuit

#-------------------------------------------------------------------------
#---------------- File Patching Section ----------------------------------
#-------------------------------------------------------------------------
import pycurl

check_files               = {
                             "http://live.sysinternals.com/psexec.exe",
                             "http://live.sysinternals.com/psping.exe", }
check_files_patch_results = []

class File_Check_Results:
    """
    Analysis Results against File Patching
    """
    def __init__(self, url, filen, filepath, exitnode, found_hash):
        self.url           = url
        self.filename      = filen
        self.filepath      = filepath
        self.exitnode      = exitnode
        self.filehash      = found_hash


#------------------------------------------------------------------------
#------------------- DNS Poison Section ---------------------------------
#------------------------------------------------------------------------
import dns.resolver
import socks
import socket

check_domain_poison_results = []
domains                     = {
                                 "www.youporn.com",
                                 "youporn.com",
                                 "www.torproject.org",
                                 "www.wikileaks.org",
                                 "www.i2p2.de",
                                 "torrentfreak.com",
                                 "blockchain.info",
}

class Domain_Poison_Check:
    """
    Analysis Results against Domain Poison
    """
    def __init__(self, domain):
        self.domain  = domain
        self.address = []
        self.path    = []

    def pushAddr(self, add):
        self.address.append(add)

    def pushPath(self, path):
        self.path = path

#-----------------------------------------------------------------------
#------------------- SSL Strip Section ---------------------------------
#-----------------------------------------------------------------------
import OpenSSL
import ssl

check_ssl_strip_results   = []
ssl_strip_monitored_urls = {
                            "www.google.com",
                            "www.microsoft.com",
                            "www.apple.com",
                            "www.bbc.com",
}

class SSL_Strip_Check:
    """
    Analysis Result against SSL Strip
    """
    def __init__(self, domain, public_key, serial_number):
        self.domain        = domain
        self.public_key    = public_key
        self.serial_number = serial_number


#----------------------------------------------------------------------
#----------------     Starting Coding   -------------------------------
#----------------------------------------------------------------------


def sslCheckOriginal():
    """
    Download the original Certificate without TOR connection
    """
    print('[+] Populating SSL for later check')
    for url in ssl_strip_monitored_urls:
        try:
            cert = ssl.get_server_certificate((str(url), 443))
            x509 = OpenSSL.crypto.load_certificate(OpenSSL.crypto.FILETYPE_PEM, cert)
            p_k  = x509.get_pubkey()
            s_n  = x509.get_serial_number()

            print('[+] Acquired Certificate: %s' % url)
            print('    |_________> serial_number %s' % s_n)
            print('    |_________> public_key %s' % p_k)

            check_ssl_strip_results.append(SSL_Strip_Check(url, p_k, s_n))

        except Exception as err:
            print('[-] Error while acquiring certificates on setup phase: %s' % err)
            traceback.print_exc()
    return time.time()


def fileCheckOriginal():
    """
    Download the reference ("original") copy of each file and store its hash.
    NOTE: query() goes through the local SOCKS proxy, so this baseline is
    fetched over a circuit chosen by Tor rather than strictly "without TOR".
    """

    print('[+] Populating File Hashing for later check')
    for url in check_files:
        try:
            data = query(url)
            file_name = url.split("/")[-1]
            _,tmp_file = tempfile.mkstemp(prefix="exitmap_%s_" % file_name)

            with open(tmp_file, "wb") as fd:
                fd.write(data)
            # Hash once, after the file is closed and fully flushed to disk.
            file_hash = sha512_file(tmp_file)
            print('[+] Saving File  \"%s\".' % tmp_file)
            check_files_patch_results.append( File_Check_Results(url, file_name, tmp_file, "NO", file_hash) )
            print('[+] First Time we see the file..')
            print('    |_________> exitnode : None'       )
            print('    |_________> :url:  %s' % str(url)     )
            print('    |_________> :filePath:  %s' % str(tmp_file))
            print('    |_________> :file Hash: %s' % str(file_hash))
        except Exception as err:
            print('[-] Error ! %s' % err)
            traceback.print_exc()
    return time.time()


def resolveOriginalDomains():
    """
        Resolving DNS For original purposes
    """
    print('[+] Populating Domain Name Resolution for later check ')

    try:
        for domain in domains:
            response = dns.resolver.query(domain)
            d = Domain_Poison_Check(domain)
            print('[+] Domain: %s' % domain)
            for record in response:
                print(' |____> maps to %s.' % (record.address))
                d.pushAddr(record)
            check_domain_poison_results.append(d)
        return time.time()
    except Exception as err:
        print('[-] Exception: %s' % err)
        traceback.print_exc()
        return time.time()


def query(url):
  """
  Uses pycurl to fetch a site using the proxy on the SOCKS_PORT.
  """
  output = StringIO.StringIO()
  curl = pycurl.Curl()  # named "curl" so it does not shadow this function
  curl.setopt(pycurl.URL, url)
  curl.setopt(pycurl.PROXY, 'localhost')
  curl.setopt(pycurl.PROXYPORT, SOCKS_PORT)
  # SOCKS5_HOSTNAME lets the proxy (Tor) resolve the host name for us
  curl.setopt(pycurl.PROXYTYPE, pycurl.PROXYTYPE_SOCKS5_HOSTNAME)
  curl.setopt(pycurl.CONNECTTIMEOUT, CONNECTION_TIMEOUT)
  curl.setopt(pycurl.WRITEFUNCTION, output.write)

  try:
    curl.perform()
    return output.getvalue()
  except pycurl.error as exc:
    raise ValueError("Unable to reach %s (%s)" % (url, exc))
  finally:
    curl.close()  # always release the handle



def scan(controller, path):
  """
  Scan Tor Relays Point to find File Patching
  """
  # Defined up front so the final return works even if circuit creation fails.
  start_time = time.time()
  circuit_id = None

  def attach_stream(stream):
    if stream.status == 'NEW':
      try:
        controller.attach_stream(stream.id, circuit_id)
        #print('[+] New Circuit id (%s) attached and ready to be used!' % circuit_id)
      except Exception as err:
        controller.remove_event_listener(attach_stream)
        controller.reset_conf('__LeaveStreamsUnattached')

  try:

    print('[+] Creating a New TOR circuit based on path: %s' % path)
    circuit_id = controller.new_circuit(path, await_build = True)
    controller.add_event_listener(attach_stream, stem.control.EventType.STREAM)
    controller.set_conf('__LeaveStreamsUnattached', '1')  # leave stream management to us
    start_time = time.time()

    socks.setdefaultproxy(socks.PROXY_TYPE_SOCKS5, "127.0.0.1", 9050)
    socket.socket = socks.socket

    ip = query('http://ip.42.pl/raw').strip()
    if ip:
        match   = geolite2.lookup(ip)  # None when the address is unknown
        country = match.country if match is not None else 'unknown'
        print('\n \n')
        print('[+] Performing FilePatch,  DNS Spoofing and Certificate Checking\
              passing through --> %s (%s) \n \n' % (str(ip), str(country))  )

    time_FileCheck = fileCheck(path)
    print('[+] FileCheck took: %0.2f seconds'  % ( time_FileCheck - start_time))

    #time_CertsCheck  = certsCheck(path)
    #print('[+] CertsCheck took: %0.2f seconds' % ( time_CertsCheck - start_time))

    time_DNSCheck  = dnsCheck(path)
    print('[+] DNSCheck took: %0.2f seconds'   % ( time_DNSCheck - start_time))

  except Exception as  err:
    print('[-] Error while scanning through path %s: %s' % (path, err))
  finally:
    # Always hand stream management back to Tor, even if the scan failed.
    controller.remove_event_listener(attach_stream)
    controller.reset_conf('__LeaveStreamsUnattached')

  return time.time() - start_time

def certsCheck(path):
    """
    SSL Strip detection
    TODO: It's still a weak control. Need to collect and to compare public_key()
    """
    print('[+] Checking Certificates')
    try:
        socks.setdefaultproxy(socks.PROXY_TYPE_SOCKS5, "127.0.0.1", 9050)
        socket.socket = socks.socket

        for url in ssl_strip_monitored_urls:
            cert = ssl.get_server_certificate((str(url), 443))
            x509 = OpenSSL.crypto.load_certificate(OpenSSL.crypto.FILETYPE_PEM, cert)
            p_k  = x509.get_pubkey()
            s_n  = x509.get_serial_number()
            for stored_cert in check_ssl_strip_results:
                if str(url) == str(stored_cert.domain):
                    if str(stored_cert.serial_number) != str(s_n):
                        print('[+] ALERT Found SSL Strip on uri (%s) through path %s ' % (url, path))
                        break
                    else:
                        print('[+] Certificate Check seems to be OK for %s' % url)

    except Exception as err:
        print('[-] Error: %s' % err)
        traceback.print_exc()

    # The socket module has no close(); calling it here raised AttributeError.
    return time.time()

def dnsCheck(path):
    """
    DNS Poisoning Check
    """
    try:
        socks.setdefaultproxy(socks.PROXY_TYPE_SOCKS5, "127.0.0.1", 9050)
        socket.socket = socks.socket

        print('[+] Checking DNS ')
        for domain in domains:
            # NOTE: replacing socket.socket does not reroute gethostbyname(),
            # so this lookup probably uses the local resolver, not the exit.
            ipv4 = socket.gethostbyname(domain)
            for p_d in check_domain_poison_results:
                if str(p_d.domain) == str(domain):
                    # True if the address matches any baseline record
                    found = any(str(ipv4) == str(d_ip) for d_ip in p_d.address)
                    if not found:
                        print('[+] ALERT:DNS SPOOFING FOUND: name: %s ip: %s  (path: %s )' % (domain, ipv4, path) )
                    else:
                        print('[+] Check DNS (%s) seems to be OK' % domain)
    except Exception as err:
        print('[-] Error: %s' % err)
        traceback.print_exc()

    # The socket module has no close(); calling it here raised AttributeError.
    return time.time()


def fileCheck(path):
    """
    Downloading file through TOR circuits doing the hashing
    """
    print('[+] Checking For File patching ')
    for url in check_files:
        try:
            # File retrieve
            data = query(url)
            file_name = url.split("/")[-1]
            _,tmp_file = tempfile.mkstemp(prefix="exitmap_%s_" % file_name)
            with open(tmp_file, "wb") as fd:
                fd.write(data)
            # Hash once, after the file is closed and fully flushed to disk.
            found_hash = sha512_file(tmp_file)
            for i in check_files_patch_results:
                if str(i.url) == str(url):
                    if str(i.filehash) != str(found_hash):
                        print('[+] ALERT File Patch FOUND !')
                        print('    | exitnode path : %s' % str(path)           )
                        print('    |_________> url: %s' % str(i.url)           )
                        print('    |_________> filePath: %s' % str(tmp_file)   )
                        print('    |_________> expected fileHash: %s' % str(i.filehash))
                        print('    |_________> found fileHash: %s' % str(found_hash))
                        #check_files_patch_results.append( File_Check_Results(url, file_name, tmp_file, path, found_hash) )
                    else :
                        print('[+] File (%s) seems to be ok' % i.url)
                    break

        except Exception as err:
            print('[-] Error ! %s' % err)
            traceback.print_exc()
    return time.time()


def sha512_file(file_name):
    """
    Calculate SHA512 over the given file.
    """

    # The original used hashlib.sha256() despite the function name.
    hash_func = hashlib.sha512()

    with open(file_name, "rb") as fd:
        hash_func.update(fd.read())

    return hash_func.hexdigest()


if __name__ == '__main__':

    start_analysis = time.time()
    print("""

  |=====================================================================|
  | Find Malicious Relay Nodes is a python script made for checking 3   |
  | unique kinds of fraud such as:                                      |
  | (1) File Patching                                                   |
  | (2) DNS Poison                                                      |
  | (3) SSL Stripping (MITM SSL)                                        |
  |=====================================================================|
         """)

    print("""
  |=====================================================================|
  |                 Initialization Phase                                |
  |=====================================================================|
       """)
    dns_setup_time             = resolveOriginalDomains()
    print('[+] DNS Setup Finished: %0.2f' % (dns_setup_time - start_analysis))
    file_check_original_time   = fileCheckOriginal()
    print('[+] File Setup Finished: %0.2f' % (file_check_original_time - start_analysis))
    ssl_checking_original_time = sslCheckOriginal()
    print('[+] Acquiring Certificates  Setup Finished: %0.2f' % (ssl_checking_original_time - start_analysis))

    print("""
  |=====================================================================|
  |                 Analysis  Phase                                     |
  |=====================================================================|
          """)

    print('[+] Connecting and Fetching possible Relays ...')
    with stem.control.Controller.from_port() as controller:
      controller.authenticate()

      net_status = controller.get_network_statuses()


      for descriptor in net_status:
        try:
          fingerprint = descriptor.fingerprint

          print('[+] Selecting a New Exit Point:')
          print('[+] |_________> FingerPrint: %s ' % fingerprint)
          print('[+] |_________> Flags: %s ' % descriptor.flags)
          print('[+] |_________> Exit_Policies: %s ' % descriptor.exit_policy)

          if 'EXIT' in (flag.upper() for flag in descriptor.flags):
              print('[+] Found Exit Point. Performing Scan through EXIT: %s' % fingerprint)
              if descriptor.exit_policy is None:
                  print('[+] No Exit Policies found ... no certs checking')
                  time_taken = scan(controller, [TRUSTED_HOP_FINGERPRINT, fingerprint])
          else:
              #print('[+] Not Exit Point found. Using it as Relay passing to TRUST Exit Point')
              pass
              #time_taken = scan(controller, [fingerprint, TRUSTED_HOP_FINGERPRINT])
          #print('[+] Finished Analysis for %s finished  => %0.2f seconds' % (fingerprint, time_taken))

        except Exception as exc:
            print('[-] Exception on  FingerPrint: %s => %s' % (fingerprint, exc))
            traceback.print_exc()



The RESULTS
I am not going to publish my results, since Tor relays change over time and what I found using this script might be inaccurate and imprecise: more checks must be done. Moreover, it would be unpleasant to accuse specific relays (ergo IPs, ergo owners) of being "malicious". But I am going to endorse part of the results described by Philipp Winter and Stefan Lindskog in their paper (here).

From Spoiled Onions: Exposing Malicious Tor Exit Relays (Philipp Winter and Stefan Lindskog)
Many of the malicious relays were found in Russia, Turkey and Hong Kong. Not every malicious relay used both techniques to compromise the traffic flow, but at least one was found. By far the most used technique is SSL-strip MITM, mainly used to spy on channels. Few file patching techniques were identified. This kind of attack is useful for spreading malware over the network and, together with DNS poisoning, is used more to "attack" than to "spy".
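
For readers who want to see the file patching check in isolation, here is a minimal sketch of the idea used by the script: hash a file fetched directly, hash the same file fetched through an exit relay, and compare the two digests. The names sha512_of_file and is_patched are hypothetical helpers written for this example (they are not part of the script above), and reading in chunks is my own choice to avoid loading big files in memory.

```python
import hashlib


def sha512_of_file(path, chunk_size=65536):
    """Return the hex SHA-512 digest of a file, reading it in chunks."""
    digest = hashlib.sha512()
    with open(path, "rb") as fd:
        # iter() with a sentinel stops when read() returns empty bytes (EOF)
        for chunk in iter(lambda: fd.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_patched(baseline_hash, fetched_path):
    """True when the file fetched through an exit differs from the baseline.

    baseline_hash is assumed to be the hex digest of the same file
    downloaded over a trusted (non-Tor) connection.
    """
    return sha512_of_file(fetched_path) != baseline_hash
```

A mismatch only says that the bytes differ: a mirror serving a newer version of the file would trigger it as well, which is one more reason why more checks must be done before calling a relay malicious.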

I hope you enjoy the script, which is quite old and needs a code refactoring session but is still interesting (at least from my personal point of view).