Monday, December 7, 2020

The most dangerous software weaknesses in 2020

Mitre has released a list of the most dangerous software weaknesses.

https://cwe.mitre.org/top25/archive/2020/2020_cwe_top25.html

Item #2 on that list is "Out-of-bounds Write" (CWE-787). I have investigated this a little and looked at the related vulnerabilities that have been published.
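
For reference, here is a minimal example (my own, not from the CWE page) of what an out-of-bounds write looks like:

    void foo()
    {
        int buf[10];
        buf[10] = 0;  // <- out-of-bounds write: valid indices are 0..9
    }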


Vulnerabilities

As a static analysis tool developer I wanted to investigate CWE-787 vulnerabilities (would Cppcheck detect them, and if not, why not?). Details about vulnerabilities can be found in a few locations, but I used the National Vulnerability Database (NVD).

I wanted to find vulnerabilities that meet these criteria:

  • CWE-787
  • Published in 2020
  • Vulnerability in an open source project, so I can look at the code
  • C++ code

To start with I used this search:

https://nvd.nist.gov/vuln/search/results?cwe_id=CWE-787&pub_start_date=01%2F01%2F2020

That search resulted in ~800 vulnerabilities, and I manually looked through the results to find vulnerabilities in open source C++ projects. All such vulnerabilities I could find are listed below:

CVE-2020-25693 - CImg
CVE-2020-25219 - libproxy
CVE-2020-15365 - libraw
CVE-2020-15306 - openexr
CVE-2020-11764 - openexr
CVE-2020-11763 - openexr
CVE-2020-11762 - openexr
CVE-2020-0451 - Android
CVE-2020-0409 - Android
CVE-2020-0245 - Android
CVE-2020-0120 - Android
CVE-2020-0218 - Android
CVE-2020-0131 - Android
CVE-2020-0124 - Android
CVE-2020-0118 - Android
CVE-2020-0094 - Android
CVE-2020-0079 - Android
CVE-2020-0078 - Android
CVE-2020-0053 - Android
CVE-2020-0046 - Android
CVE-2020-0033 - Android
CVE-2020-0027 - Android

So there were in total ~22 such vulnerabilities in 2020 (the year has not ended yet). I do not think ~22 is a lot; this is the #2 most dangerous software weakness, and the search covered all open source C++ projects.

Please feel free to let me know if there is a vulnerability I missed.


Conclusion

You can certainly make mistakes in C++ too, but it is not highly unsafe. At least, CWE-787 vulnerabilities are rare. If you disagree, please feel free to take a look at the NVD.


Saturday, July 27, 2019

Static analysis -- safe interfaces

My plan is to add more checks for "safe interfaces" in Cppcheck. Some analysis has already been added, and more will be added soon.

Whole program analysis - Detect existing problems

Static analysis tools normally use whole program analysis to determine possible argument values for functions.

Example code:

    int average(int value1, int value2)
    {
        return (value1 + value2) / 2;
    }

    int main()
    {
        average(100,200);
        return 0;
    }

If you analyze this code with a static analysis tool you will not get any warnings. There is no bug.

Safe analysis

Cppcheck will be able to check that functions are robust.

There can be a signed integer overflow in the average function. For instance, if it were called like this:

    average(INT_MAX, 1);


The latest Cppcheck will tell you about the signed integer overflow if you turn on the "safe checks".
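
For reference, here is a minimal sketch (my own, not Cppcheck's suggestion) of one way to make average() robust for all inputs, by doing the addition in a wider type:

    // assumes long long is wider than int, as on all common platforms
    int average(int value1, int value2)
    {
        return (int)(((long long)value1 + (long long)value2) / 2);
    }

The average of two int values always fits in an int, so the cast back to int is safe.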

Configuration

I believe that for most functions, you will only want to get warnings based on whole program analysis.

The "safe analysis" will not be used unless it is turned on.

I envision that "safe analysis" might be turned on in cases such as these:
  • If you design a library and want the interface to be robust; you want to check all possible calls.
  • If you design a utility function/class and want it to be robust for all possible usage.
  • During code review, you could temporarily turn it on to see potential UB.
  • If you are going to release critical software, I believe it would be good to turn on such analysis for the "alpha" release and check all the reported warnings.
We need detailed configuration options. It should be possible to turn the analysis on permanently for specific functions or classes. More annotation options will also be needed to help the "safe analysis". I have not fully figured out how to configure it, but to start with, the primary way will be through the Cppcheck GUI (in the project settings dialog).

Tuesday, June 19, 2018

New job

On June 7th I started my new job.

In my previous job I was an "embedded systems software engineer". Most customer projects used C only, no C++, and ran on a realtime operating system (no Windows/Linux). Performance was not a big issue in those applications.

From now on I will be a "C++ software engineer". I will be developing video software, working entirely in C++ and using modern C++ features a lot. I believe performance will be more important than before.

Cppcheck

Of course, I want to catch as many bugs as possible with Cppcheck at work.

When I worked with "embedded C code", my primary focus was to catch bugs in C code. I wanted Cppcheck to handle C++ as well, but it was not my primary focus. When C code is handled better, it usually has a positive effect on C++ code too.

Now that I will be working with modern C++ I assume that I will have more focus on modern C++ than on C. We are not going to drop support for C, but for me personally it will be more interesting to work on modern C++ analysis from now on.

There are many contributors in Cppcheck and we all have our own priorities.

Thursday, May 3, 2018

My ideas for improved analysis in Cppcheck

Introduction

These are some improvements to the Cppcheck analysis infrastructure that I believe will be implemented in the future.

There is no timeline. These ideas will probably not be implemented in the next few months.

Whole program analysis integrated in ValueFlow

ValueFlow is the Cppcheck component that determines the possible values of every expression in a source file.

This ValueFlow component is used by many checks, for instance: division by zero, null pointer dereference, array index out of bounds, uninitialized variables, ...

I believe that whole program analysis will be integrated in the ValueFlow analysis.
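
A small sketch (my own example) of what that integration could catch: the value 0 flows from the call site into helper(), where it causes a division by zero:

    int helper(int x)
    {
        // with whole program analysis: the only call passes x=0,
        // so this division divides by zero
        return 100 / x;
    }

    int main()
    {
        return helper(0);
    }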

ValueFlow analysis for containers

I want Cppcheck to track contents / number of elements of C++ containers.

Tracking contents:

    #include <string>

    char foo()
    {
        std::string s = "hello";
        return s[10];  // <- array index out of bounds
    }
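
Tracking the number of elements would catch similar bugs in other containers. A sketch of my own, assuming the analysis knows the vector holds two elements:

    #include <vector>

    int bar()
    {
        std::vector<int> v = {1, 2};
        return v[2];  // <- out of bounds: v contains 2 elements
    }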

Annotations and attributes

I want to handle widely used annotations and attributes. We should use the information that is already available:
  • SAL
  • If C++ contracts become popular, I want to handle those too
If we handle SAL, I believe we can complement the analysis done by Microsoft.

Extend library with further functionality

I believe the library will be extended so the semantics of functions and classes can be described better.

With better knowledge about the functions and classes, the analysis will be stronger.
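
To give an idea of what such descriptions could look like, here is a sketch of a library (.cfg) entry for a hypothetical function foo(); the elements used here (def, function, arg, not-null, valid) exist in the library format today:

    <?xml version="1.0"?>
    <def>
      <!-- hypothetical function: argument 1 must not be null,
           argument 2 must be in the range 1..100 -->
      <function name="foo">
        <arg nr="1">
          <not-null/>
        </arg>
        <arg nr="2">
          <valid>1:100</valid>
        </arg>
      </function>
    </def>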


Sunday, April 29, 2018

Catching conversion bugs

Detecting conversion bugs (loss of precision and loss of sign) is far from a solved problem.

In this blog post I hope to convince you that the "old school" conversion checks are bad. These are used in many projects today.

Here is unsafe code that has potential loss of sign:

(1)
    signed int svar;

    void foo(unsigned int uvar)
    {
        if (svar > uvar)
        {
             /* do stuff */
        }
    }

Assuming that svar can have arbitrary values, there can be loss of sign in the comparison. When a signed int value and an unsigned int value are compared in C/C++ code, the signed value is converted to unsigned. This means that a negative value is converted to a large positive value before the comparison, and you get the wrong comparison result.
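
A minimal demonstration (my own example):

    #include <stdio.h>

    int main(void)
    {
        int svar = -1;
        unsigned int uvar = 1;
        /* svar is converted to unsigned int (a huge value) before
           the comparison, so the result is "true" although -1 < 1 */
        printf("%d\n", svar > uvar);  /* prints 1 */
        return 0;
    }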

Logically, this code is the same as (1):

(2)
    int svar;

    void foo(unsigned int uvar)
    {
        if ((unsigned int)svar > uvar)
        {
             /* do stuff */
        }
    }

Please note that this code is just as unsafe as the first code.

So if you want to compare a signed and an unsigned variable that have arbitrary values, how do you do it safely? If you can cast the operands to a larger signed data type, the code becomes safe:

(3)
    int svar;

    void foo(unsigned int uvar)
    {
        if ((long long)svar > (long long)uvar)
        {
             /* do stuff */
        }
    }

The other way to write safe code is to add an extra comparison:

(4)
    int svar;

    void foo(unsigned int uvar)
    {
        if (svar > 0 && svar > uvar)
        {
             /* do stuff */
        }
    }

The advantage of (4) over (3) is that it avoids casts. Casts could easily cause bugs or conceal bugs in the future.
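
As a side note: if C++20 is available, std::cmp_greater in <utility> compares a signed and an unsigned value mathematically correctly, without casts:

    #include <utility>

    int svar;

    void foo(unsigned int uvar)
    {
        // std::cmp_greater compares the mathematical values, so a
        // negative svar never becomes a large unsigned value
        if (std::cmp_greater(svar, uvar))
        {
             /* do stuff */
        }
    }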

Old school tools

Let us look at the old school tools that will warn about loss of sign and loss of precision.

Old school tools simply assume that code without a cast is unsafe and code with a cast is safe. They warn about (1) and (4) but not about (2) and (3). This means:
  • Warnings are written about both safe and unsafe code.
  • You are not safe at all just because the tool is silent.
In my experience, most of the warnings from old school tools are wrong; the true positive ratio is maybe 1% or so. Variables do not have arbitrary values. If the signed expression cannot be negative, there cannot be loss of sign and no warning should be written.

When there is a warning, the developer must suppress it with a cast. Such casts are unsafe, as they can introduce new problems. For instance, I often see size_t cast to int to fix such warnings. Even if that cast to int is safe today, it can conceal bugs or cause bugs in the future.

If the warning is really correct and the developer wants to fix the bug, it is important that the right cast is chosen. I think mistakes are sometimes made; the result will probably not be (3) but some variation of (2). Unfortunately, old school tools do not enforce that the problem is fixed properly.

Proper checking

People should use proper static analysis instead to detect these problems.

A proper check would use valueflow analysis to determine what the possible values of the operands are. If the signed operand can be negative then there can be loss of sign.
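
For example (a sketch of my own), a valueflow-based check should stay silent here, because svar can never be negative:

    void process(unsigned int uvar)
    {
        for (int svar = 0; svar <= 100; ++svar)
        {
            // signed/unsigned comparison, but svar is always in [0,100],
            // so there is no possible loss of sign; an old school tool
            // would still warn here
            if (svar > uvar)
            {
                 /* do stuff */
            }
        }
    }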

I would expect that proper analysis would show only ~1% as many warnings as the old school tools, because old school tools are wrong so often. You would get far fewer casts in the code, which would make it much safer; you would only use casts where they are really needed.

For instance, there is a proper check (alpha.core.Conversion) in the Clang analyzer that warns when a negative value is implicitly converted to unsigned and when a large value is implicitly truncated.

There is no proper check in Cppcheck right now for this. It might be added in the future.
