Tuesday, June 19, 2018

New job

On June 7th I started my new job.

In my previous job I was an "embedded systems software engineer". Most customer projects used C only (no C++) and ran on a real-time operating system rather than Windows or Linux. Performance was not a big issue in those applications.

From now on I will be a "C++ software engineer", developing video software. I will work entirely in C++ and use modern C++ features a lot. I believe that performance will be more important than before.

Cppcheck

I of course want to catch as many bugs as possible with Cppcheck at work.

When I worked with "embedded C code", my primary focus was to catch bugs in C code. I wanted Cppcheck to handle C++ also, but it was not my primary focus. When C code is handled better it usually has a positive effect on C++ code also.

Now that I will be working with modern C++ I assume that I will have more focus on modern C++ than on C. We are not going to drop support for C, but for me personally it will be more interesting to work on modern C++ analysis from now on.

There are many contributors in Cppcheck and we all have our own priorities.

Thursday, May 3, 2018

My ideas for improved analysis in Cppcheck

Introduction

These are some improvements to the Cppcheck analysis infrastructure that I believe will be implemented in the future.

There is no timeline; these ideas will probably not be implemented in the next few months.

Whole program analysis integrated in ValueFlow

The ValueFlow component in Cppcheck determines the possible values of every expression in a source file.

This ValueFlow component is used by many checks, for instance division by zero, null pointer dereference, array index out of bounds, and uninitialized variables.
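As an illustration (this is not Cppcheck's internal code, just a sketch of the kind of facts ValueFlow derives), consider a division check asking whether the divisor can be zero:

```cpp
// Sketch: on each path, ValueFlow knows which values x can have.
int scale(int x)
{
    if (x != 0)
        return 100 / x;  // safe: ValueFlow proves x != 0 on this path
    return 0;            // ValueFlow knows x == 0 here; dividing by x
                         // on this path would be reported as division by zero
}
```

The division-by-zero check itself stays simple: it only has to ask ValueFlow whether 0 is among the possible values of the divisor.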

I believe that whole program analysis will be integrated into the ValueFlow analysis.

ValueFlow analysis for containers

I want Cppcheck to track the contents and the number of elements of C++ containers.

Tracking contents:

    char foo()
    {
        std::string s = "hello";
        return s[10];  // <- array index out of bounds
    }
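Tracking the number of elements would work in the same spirit. A hypothetical sketch of what such an analysis would know at each statement:

```cpp
#include <vector>

int firstElement()
{
    std::vector<int> v;   // analysis could track: v.size() == 0
    v.push_back(42);      // now: v.size() == 1
    // return v[3];       // <- would be reported: index 3, but size is 1
    return v[0];          // safe: index 0 < size 1
}
```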

Annotations and attributes

I want to handle widely used annotations and attributes. We should use the information that is already available:
  • SAL
  • If C++ contracts become popular I want to handle those
If we handle SAL, I believe we can complement the analysis done by Microsoft.
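For example, the SAL annotation `_In_reads_(n)` states that a pointer parameter refers to at least n readable elements; a checker that understands SAL can verify callers against that contract. A small portable sketch (the stub macro is only there so the code builds outside the Microsoft toolchain):

```cpp
#include <cstddef>

// Outside MSVC the SAL macros are not defined; stub them out so this
// sketch compiles anywhere. Under MSVC the real annotation is used.
#ifndef _In_reads_
#define _In_reads_(n)
#endif

// Contract: buf points to at least n readable ints.
int sum(_In_reads_(n) const int *buf, std::size_t n)
{
    int total = 0;
    for (std::size_t i = 0; i < n; ++i)
        total += buf[i];
    return total;
}
```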

Extend the library with further functionality

I believe the library will be extended so that the semantics of functions and classes can be described better.

With better knowledge about the functions and classes, the analysis will be stronger.


Sunday, April 29, 2018

Catching conversion bugs

Detecting conversion bugs (loss of precision and loss of sign) is far from a solved problem.

In this blog post I hope to convince you that the "old school" conversion checks, which are still used in many projects today, are bad.

Here is unsafe code that has potential loss of sign:

(1)
    signed int svar;

    void foo(unsigned int uvar)
    {
        if (svar > uvar)
        {
             /* do stuff */
        }
    }

Assuming that svar can have arbitrary values, there can be loss of sign in the comparison. When a signed int value and an unsigned int value are compared in C/C++, the signed value is converted to unsigned. This means that a negative value becomes a large positive value before the comparison, and you get the wrong comparison result.
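The wrong result is easy to demonstrate. With a 32-bit int, -1 converts to 4294967295 before the comparison:

```cpp
// Returns the result of comparing -1 (signed) with 1 (unsigned).
bool compareWrong()
{
    int svar = -1;
    unsigned int uvar = 1;
    // Usual arithmetic conversions: svar is converted to unsigned int,
    // so -1 becomes 4294967295 (with 32-bit int) and "-1 > 1" is true.
    return svar > uvar;
}
```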

Logically, this code is the same as (1):

(2)
    int svar;

    void foo(unsigned int uvar)
    {
        if ((unsigned int)svar > uvar)
        {
             /* do stuff */
        }
    }

Please note that this code is just as unsafe as the first code.

So if you want to safely compare a signed and an unsigned variable that may have arbitrary values, how do you do it? If you can cast both operands to a larger signed data type, the code becomes safe:

(3)
    int svar;

    void foo(unsigned int uvar)
    {
        if ((long long)svar > (long long)uvar)
        {
             /* do stuff */
        }
    }

The other way to write safe code is to add an extra comparison:

(4)
    int svar;

    void foo(unsigned int uvar)
    {
        if (svar > 0 && svar > uvar)
        {
             /* do stuff */
        }
    }

The advantage of (4) over (3) is that it avoids casts. Such casts could easily cause or conceal bugs in the future.

Old school tools

Let us look at the old school tools that will warn about loss of sign and loss of precision.

The old school tools simply assume that code without a cast is unsafe and code with a cast is safe. They warn about (1) and (4) but not about (2) and (3). This means:
  • Warnings are written about both safe and unsafe code.
  • You are not safe at all just because the tool is silent.
In my experience, most of the warnings from old school tools are wrong; the true positive ratio is maybe 1% or something like that. In practice, variables do not have arbitrary values. If the signed expression can't be negative then there can't be loss of sign, and no warning should be written.
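For example, a value-aware check should stay silent on the following code, while an old school tool warns on the comparison:

```cpp
// i is always in [0, 7], so the signed/unsigned comparison is safe;
// a check that knows the possible values of i should not warn here.
int countBelow(unsigned int uvar)
{
    int count = 0;
    for (int i = 0; i < 8; ++i) {
        if (i < uvar)   // old school tools warn; it is actually safe
            ++count;
    }
    return count;
}
```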

When there is a warning, the developer must suppress it with a cast. Such casts are unsafe, as they can introduce new problems. For instance, I often see size_t cast to int to fix such a warning. Even if that cast is safe today, it can conceal bugs or cause bugs in the future.
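A sketch of how such a cast conceals a bug (assuming a platform with 32-bit int and 64-bit size_t):

```cpp
#include <cstddef>

// The cast silences the loss-of-precision warning, but if len ever
// exceeds INT_MAX the value is silently mangled at run time.
int toInt(std::size_t len)
{
    return (int)len;  // a "fix" that conceals a future truncation bug
}
```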

If the warning really is correct and the developer wants to fix the bug, it is important that the developer chooses the right cast. I think it is likely that mistakes are sometimes made; the result will probably not be (3) but some variation of (2). Unfortunately, the old school tools do not enforce that the problem is fixed properly.

Proper checking

People should use proper static analysis instead to detect these problems.

A proper check would use ValueFlow analysis to determine the possible values of the operands. If the signed operand can be negative, then there can be loss of sign.

I would expect proper analysis to show only about 1% as many warnings as the old school tools, because the old school tools are wrong so often. You would get far fewer casts in the code, which would make it much safer; casts would only be used where really needed.

There is, for instance, a proper check (alpha.core.Conversion) in the Clang analyzer that warns when a negative value is implicitly converted to unsigned and when a large value is implicitly truncated.

There is no proper check in Cppcheck right now for this. It might be added in the future.
