How to write YARA rules to improve security and malware detection

YARA is not a replacement for antivirus software, but it does help you detect problems more efficiently and allow for more customization. Learn how to write YARA rules to improve security and incident response.

Image: iStock / vadimrysev

of First article about YARA, Defined what kind of tool it is and in what context it can be used. Detecting malware on your network or endpoints, assisting in incident response and monitoring, file classification, and even detecting sensitive data breaches. It also showed how to install. Now let’s create a rule to get the most out of it.

look: Google Chrome: Security and UI Tips You Need to Know (TechRepublic Premium)

Start with an empty template

YARA rules are text files and follow a very basic but powerful syntax.

YARA rules always contain three parts.

  • Meta part: This part contains general or specific information that is not processed but helps the user understand what it is.
  • String part: This part contains all the strings that need to be searched in the file.
  • Condition part: This part defines the matching conditions. It can match one or more strings, but it can also be more complicated, as described later in this article.

From my experience, it is highly recommended to create an empty template that you will always use to start creating new rules. As you can see, you need to enter the contents of some variables and add the required conditions.

rule samplerule
author="Cedric Pernet"
reference="any useful reference"

This template allows you to quickly edit the metadata and rule names (named samplerule in this example). The metadata can be whatever the user wants to place there. In my case, I always use a blog report with a version number, date, a reference that could be a malware hash, or what I want to detect, and an author field.

Now that the metadata has been written, let’s start writing the first rule.

First rule

YARA rules are a combination of string elements and conditions. NS String Text string, hexadecimal string, or Regular expressions..

NS conditions Like any programming language, it is a Boolean expression. The best known are AND, OR and NOT. You can also use relational operators, arithmetic operators, and bitwise operators.

The first rule is:

rule netcat_detection
author="Cedric Pernet"
reference="netcat is a free tool available freely online"
$str1="gethostpoop fuxored" // this is very specific to the netcat tool
$str2="nc -l -p port [options]"
$str1 or $str2

Now let’s talk about this rule, titled netcat_detection.

After normal metadata, string splitting contains two variables str1 and str2. Of course, you can give these variables any name you like. It also contains a comment at the end of the first variable to explain how to add a comment.

The condition part contains the following conditions: Must match either str1 or str2.

This may have been written in a more comfortable way.

any of ($str*)

This is useful if you have a lot of different variables and you want to match one of them.

Run the first rule

Next, let’s execute the rule saved as a file named rule1.yar. I would like to run it against a folder that contains several different files. Two of them are 32-bit and 64-bit versions of netcat software (Figure A). The test system is an Ubuntu Linux distribution, but Yara is fine because it can be easily installed on a Linux, Mac, or Windows operating system.

Figure A


Run YARA rules on folders to detect specific software.

As expected, YARA runs and returns the names of all the files that match the rule.

Of course, you can put as many YARA rules as you need in one file, which is more comfortable than using many different rule files.

Running YARA with the -s option will display the exact string that matches those files (Figure B):

Figure B


Run YARA with the -s option to display the matching string.

By the way, finding a tool like netcat somewhere in your corporate network may actually be worth investigating. Its basic tool is to allow computers to connect and exchange data on specific ports and attackers. Of course, it could also be used by IT personnel and Red Team staff, so investigations were conducted to determine why it was detected on machines in the corporate network.

More complex strings

Matching the basic strings is enough to find the files in your system. However, the string may have different encodings on different systems, or it may have been slightly triggered by an attacker. For example, one of the minor changes is to use random case to change the case of a string. Fortunately, YARA can handle this easily.

In the following YARA string part, the strings will match in any case.

$str1="thisisit" nocase

Condition $ str1 will match all cases used (“ThisIsIt”, “THISISIT”, “thisisit”, “ThIsIsiT”, etc.).

If the string is encoded using 2 bytes per character, you can use the “wide” modifier. Of course, it can be combined with other modifiers.

$str1="thisisit" nocase wide

To search for strings in both ASCII and wide formats, you can use the modifier “ascii” in combination with wide.

$str1="thisisit" ascii wide

Hexadecimal string

Hexadecimal strings are easy to use.

$str1={ 75 72 65 6C 6E 20 }
$str2={ 75 72 65 6C ?? 20 }
$str3={ 75 72 [2-4] 65 6C }

Here are three different hexadecimal variables. The first is to find the exact sequence of hexadecimal strings. The second is two? Use the wildcard represented by. It is a character and searches for a string with any hexadecimal value. stand.

look: Password Violations: Why Pop Culture and Passwords Don’t Mix (Free PDF) (TechRepublic)

The third string searches for the first 2 bytes, then for jumps of 2-4 characters, and then for the last 2 bytes. This is very useful when some sequences are different in different files, but show a predictable number of random bytes between two known bytes.

Regular expressions

Regular expressionsIs very useful in detecting specific content that can be written in different ways, just like any other programming language. In YARA, it is defined using strings that start and end with a slash (/) character.

Let’s look at a meaningful example.

With malware binaries, developers left behind debug information, especially the famous ones. PDB string.

It reads:

D: worksheet Malware_v42 Release malw.pdb

The idea here is not only to create a rule that matches this malware, but also to create all the different versions of it in case the version number changes. We also decided to exclude the “D” drive from the rules, as the developer could put the “D” drive on a different drive.

Come up with a regular expression (Figure C):

Figure C


A rule that matches all versions of malware based on PDB strings and results.

For demonstration purposes, I created a file named newmalwareversion.exe that contains three different PDB strings, each with a different version number. Our rules match all of them.

Note that the character in the string is doubled because is a special character that needs to be escaped, like in C.

More complex conditions

conditions It’s smarter than collating one or more strings. You can use conditions to count strings, specify offsets to search for strings, match file sizes, and use loops.

Here are some examples I commented on for explanation:

2 of ($str*) // will match on 2 of several strings named str followed by a number
($str1 or $str2) and ($text1 or $text2) // example of Boolean operators
#a == 4 and #b > 6 // string a needs to be found exactly four times and string b needs to be found strictly more than six times
$str at 100 // string str needs to be located within the file at offset 100
$str in (500..filesize) // string str needs to be located between offset 500 and end of file.
filesize > 500KB // Only files which are more than 500KB big will be considered


This article describes the most basic features of YARA. Of course, it was a kind of programming language, so I couldn’t document everything. The file matching possibilities provided by YARA are endless. The more familiar an analyst is to YARA, the more he can get a feel for YARA and improve his skills in creating more efficient rules.

This language is very easy to write and use, so it’s important to know what you really want to detect. It’s been a few years since I’ve seen security researchers publish YARA rules in the appendices of research papers and blog posts to allow anyone to collate malicious content on their computers and servers. It’s becoming more and more common. YARA rules can also collate content that is not malicious but needs to be carefully monitored. For example, render YARA to a data loss detection tool or a malicious content detector, such as an internal document.Please feel free to contact us YARA document See all the possibilities offered by the tool.

Disclosure: I work for Trend Micro, but the views expressed in this article are mine.

See also How to write YARA rules to improve security and malware detection

Back to top button